Thursday, April 26, 2012

Natural Language Processing: starting from scratch and discarding conventional wisdom.

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks, including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
http://arxiv.org/abs/1103.0398
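
For concreteness, here is a minimal sketch of the window-approach network the paper describes, written in PyTorch rather than the authors' original C implementation (SENNA): each word in a fixed-size window is mapped to a learned embedding through a lookup table, the window's embeddings are concatenated, and a small HardTanh network scores the candidate tags for the center word. The vocabulary, embedding, and layer sizes below are illustrative assumptions, not the paper's exact settings.

    import torch
    import torch.nn as nn

    class WindowTagger(nn.Module):
        """Window-approach tagger: lookup table + HardTanh MLP over a word window."""
        def __init__(self, vocab_size=100_000, embed_dim=50, window=5,
                     hidden_dim=300, num_tags=45):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)  # lookup-table layer
            self.mlp = nn.Sequential(
                nn.Linear(window * embed_dim, hidden_dim),
                nn.Hardtanh(),                    # non-linearity used in the paper
                nn.Linear(hidden_dim, num_tags),  # one score per candidate tag
            )

        def forward(self, word_ids):
            # word_ids: (batch, window) indices of the words around the target word
            x = self.embed(word_ids)        # (batch, window, embed_dim)
            x = x.flatten(start_dim=1)      # concatenate the window's embeddings
            return self.mlp(x)              # (batch, num_tags) tag scores

    tagger = WindowTagger()
    scores = tagger(torch.randint(0, 100_000, (1, 5)))  # one 5-word window
    print(scores.shape)                                  # torch.Size([1, 45])

Because the same lookup table and hidden layers can feed different output layers, one network of this shape can serve all four tasks; only the tag inventory changes.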
In this contribution, we try to excel on multiple benchmarks while avoiding task-specific engineering. Instead, we use a single learning system able to discover adequate internal representations. In fact, we view the benchmarks as indirect measurements of the relevance of the internal representations discovered by the learning procedure, and we posit that these intermediate representations are more general than any of the benchmarks. Our desire to avoid task-specific engineered features led us to ignore a large body of linguistic knowledge. Instead, we reach good performance levels in most of the tasks by transferring intermediate representations discovered on large unlabeled datasets. We call this approach “almost from scratch” to emphasize the reduced (but still important) reliance on a priori NLP knowledge.
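
The transfer step is what makes this "almost" from scratch. The sketch below (same illustrative sizes as above, with random stand-in vectors in place of embeddings actually trained on unlabeled text) shows the idea: representations learned on a large unlabeled corpus are copied into the supervised tagger's lookup table before fine-tuning on the labeled task.

    import torch
    import torch.nn as nn

    vocab_size, embed_dim = 100_000, 50            # illustrative sizes, as above
    # Stand-in for representations learned on a large unlabeled corpus; the
    # paper trains them with a pairwise ranking language-model criterion.
    pretrained = torch.randn(vocab_size, embed_dim)

    embed = nn.Embedding(vocab_size, embed_dim)    # the tagger's lookup table
    with torch.no_grad():
        embed.weight.copy_(pretrained)             # transfer the representations
    # The lookup table stays trainable, so the embeddings are fine-tuned
    # together with the task-specific layers on the labeled data.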