Natural Language Processing: starting from scratch and discarding conventional wisdom.
We propose a unified neural network architecture and learning algorithm that
can be applied to various natural language processing tasks including
part-of-speech tagging, chunking, named entity recognition, and semantic role
labeling. This versatility is achieved by trying to avoid task-specific
engineering and therefore disregarding a lot of prior knowledge. Instead of
exploiting man-made input features carefully optimized for each task, our
system learns internal representations on the basis of vast amounts of mostly
unlabeled training data. This work is then used as a basis for building a
freely available tagging system with good performance and minimal computational
requirements.
http://arxiv.org/abs/1103.0398
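To make the "unified architecture" concrete, here is a minimal sketch of the kind of window-based tagger the paper describes: a shared word-embedding lookup table feeding a small feed-forward network that scores one tag per window centre. This is written in modern PyTorch for readability (the authors shipped their own implementation), and every size (`emb_dim`, `window`, `hidden`, `n_tags`) is an illustrative assumption, not the authors' configuration.

```python
import torch
import torch.nn as nn

class WindowTagger(nn.Module):
    """Sketch of a window-based tagger: a shared embedding lookup table
    feeds a small feed-forward network that scores tags for the centre word."""
    def __init__(self, vocab_size, emb_dim=50, window=5, hidden=300, n_tags=45):
        super().__init__()
        # Shared lookup table: the internal representation that the paper
        # argues can be learned from unlabeled data and reused across tasks.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.hidden = nn.Linear(window * emb_dim, hidden)
        self.out = nn.Linear(hidden, n_tags)

    def forward(self, window_ids):
        # window_ids: (batch, window) word indices around the centre word.
        x = self.embed(window_ids).flatten(start_dim=1)  # concatenate window embeddings
        return self.out(torch.tanh(self.hidden(x)))      # per-tag scores

# Hypothetical usage: score tags for a batch of eight 5-word windows.
model = WindowTagger(vocab_size=100_000)
scores = model(torch.randint(0, 100_000, (8, 5)))  # shape (8, n_tags)
```

The same trunk (lookup table plus hidden layer) can serve POS tagging, chunking, and NER by swapping only the output layer, which is the sense in which the architecture is "unified".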
In this contribution, we try to excel on multiple benchmarks while avoiding task-specific engineering. Instead we use a single learning system able to discover adequate internal representations. In fact we view the benchmarks as indirect measurements of the relevance of the internal representations discovered by the learning procedure, and we posit that these intermediate representations are more general than any of the benchmarks. Our desire to avoid task-specific engineered features led us to ignore a large body of linguistic knowledge. Instead we reach good performance levels in most of the tasks by transferring intermediate representations discovered on large unlabeled datasets. We call this approach "almost from scratch" to emphasize the reduced (but still important) reliance on a priori NLP knowledge.
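The "transferring intermediate representations" step amounts to initializing the lookup table with embeddings trained on large unlabeled corpora before supervised training. A hedged sketch, continuing the tagger above and assuming the pretrained embeddings have already been computed and saved (the file name and its `(vocab_size, emb_dim)` shape are illustrative assumptions):

```python
import torch

# Assumed pre-computed tensor of shape (vocab_size, emb_dim), e.g. produced
# by a language-model-style objective over unlabeled text.
pretrained = torch.load("embeddings.pt")

with torch.no_grad():
    model.embed.weight.copy_(pretrained)  # transfer the learned representations

# Supervised fine-tuning on the labelled task then proceeds as usual.
```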