Natural language processing is often called an "AI-complete" task, in the sense that truly processing language (i.e. comprehending, translating, or generating it) requires full understanding, which is itself the ultimate goal of Artificial Intelligence. For those who seek solutions to practical problems, this is not a desirable property of NLP. However, it is possible to address reduced versions of the NLP problem without first having solved all of the other arbitrarily difficult AI problems. There are various ways to restrict the NLP problem: restrict the semantic domain, restrict the expressiveness of the syntax, focus on only one aspect of NLP at a time (e.g. phoneme recognition, Part-of-Speech tagging, morphological analysis), seek only approximate solutions (e.g. by replacing a complex cognitive model with a statistical component), and so on. The work described in this monograph pursues the latter two approaches with significant success.
The beauty of statistical techniques for NLP is that, in principle, they require only training data, not manual reprogramming, to solve new or extended versions of the same problem. For instance, a Part-of-Speech tagger should be as easily trainable for any subset of English (e.g. legal, medical, or engineering texts) as for the original subset in which it was developed. Moreover, it should be applicable to other languages as well, after modifying the tagset and possibly the feature set. The drawbacks of statistical systems, however, are also significant. It is difficult to solve the more complex NLP problems statistically with acceptable accuracy. It is difficult to obtain enough training data for models with large feature sets. It is a significant challenge to create computationally tractable models that cope with significant combinations of features. And it is seldom clear a priori how to design the feature set or what statistical model to use. All these difficulties notwithstanding, significant progress has been made in statistical methods for speech recognition, Part-of-Speech tagging, lexical disambiguation, Prepositional Phrase (PP) attachment, and even end-to-end machine translation.