NLTK
2.1 Introduction
The Natural Language Toolkit (NLTK) was developed in conjunction with a Computational Linguistics course at the University of Pennsylvania in 2001. It is a collection of modules and corpora, released under an open-source license, that lets users learn and conduct research in NLP. NLTK can be used not only as a teaching tool but also as a ready-made analytical tool or as the basis for developing applied text-processing systems. Today it is widely used in linguistics, artificial intelligence, machine learning projects, and related fields. Using NLTK has many advantages. The most important is that it is entirely self-contained. Not only does it provide raw and annotated versions of real-world data in the form of …
It is a tree containing chunks and tokens, where every chunk is a sub-tree containing just tokens.
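The shape described here can be sketched in plain Python. This is an illustrative sketch only, not NLTK's implementation: the names `Chunk`, `ChunkTree`, and `Token`, the sample sentence, and its part-of-speech tags are all assumptions made up for the example. NLTK itself represents such structures with its own `Tree` class.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# A token is a (word, part-of-speech tag) pair; the tags here are illustrative.
Token = Tuple[str, str]


@dataclass
class Chunk:
    """A sub-tree that holds only tokens (hypothetical helper, not an NLTK class)."""
    label: str
    tokens: List[Token] = field(default_factory=list)


@dataclass
class ChunkTree:
    """Root of the structure: its children are either chunks or bare tokens."""
    label: str
    children: List[Union[Chunk, Token]] = field(default_factory=list)

    def leaves(self) -> List[Token]:
        """Return every token in left-to-right order."""
        result: List[Token] = []
        for child in self.children:
            if isinstance(child, Chunk):
                result.extend(child.tokens)  # a chunk contributes all its tokens
            else:
                result.append(child)         # a bare token sits directly under the root
        return result


# "the little dog barked at the cat": two noun-phrase chunks plus two loose tokens.
tree = ChunkTree("S", [
    Chunk("NP", [("the", "DT"), ("little", "JJ"), ("dog", "NN")]),
    ("barked", "VBD"),
    ("at", "IN"),
    Chunk("NP", [("the", "DT"), ("cat", "NN")]),
])

words = [word for word, _tag in tree.leaves()]
chunk_labels = [c.label for c in tree.children if isinstance(c, Chunk)]
print(words)
print(chunk_labels)
```

The tree is deliberately only two levels deep, because each chunk holds tokens and never other chunks.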
2.2 Why NLTK and Python as Tools for NLP
2.2.1 Python
Python is a simple yet powerful programming language with excellent functionality for processing linguistic data. Python can be downloaded for free from http://www.python.org/. Installers are available for all platforms. Python has a shallow learning curve, its syntax and semantics are transparent, and it has good string-handling functionality. As an interpreted language, Python facilitates interactive exploration.
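As a small illustration of the string handling mentioned above, the following sketch uses only built-in string methods and the standard library. The sample sentence is made up for the example, and splitting on whitespace is a deliberate simplification rather than a proper tokenizer.

```python
from collections import Counter

# Made-up sample text for illustration.
text = "The quick brown fox jumps over the lazy dog. The dog sleeps."

# Split on whitespace, strip surrounding punctuation, and lowercase each word.
tokens = [word.strip(".,;:!?").lower() for word in text.split()]

# Count how often each word form occurs.
freq = Counter(tokens)

# Collect the distinct word forms ending in "s", in alphabetical order.
ends_in_s = sorted({t for t in tokens if t.endswith("s")})

print(freq.most_common(2))
print(ends_in_s)
```

Each of these statements can also be typed one at a time at the interactive interpreter prompt, which is the exploratory style of work the paragraph above refers to.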
As an object-oriented language, Python permits data and methods to be encapsulated and re-used easily. As a dynamic language, Python permits attributes to be added to objects on the fly, and permits variables to be typed dynamically, facilitating rapid development. Python comes with an extensive standard library, including components for graphical programming, numerical processing, and web connectivity. Python is heavily used in industry, scientific research, and education around the world. Python is often praised for the way it facilitates productivity, quality, and maintainability of software. Of course, Python is not the only programming language in the world used