Chapter 7 Text and Web Mining
1) DARPA and MITRE teamed up to develop capabilities to automatically filter text-based information sources to generate actionable information in a timely manner.
ANSWER: ???
Diff: 2 Page Ref: 288
2) A vast majority of all business data are captured and stored in structured text documents.
ANSWER: ???
Diff: 2 Page Ref: 289
3) Text mining is important to competitive advantage because knowledge is power, and knowledge is derived from text data sources.
ANSWER: ???
Diff: 2 Page Ref: 289
4) The purpose and processes of text mining are different from those of data mining because with text mining the inputs to the process are data files such as Word documents, PDF files, text excerpts, and XML files.
ANSWER: ???
Diff: 3 Page Ref: 289
5) The benefits of text mining are greatest in areas where very large amounts of textual data are being generated, such as law, academic research, finance, and medicine.
ANSWER: ???
Diff: 2 Page Ref: 289
6) Unstructured data has a predetermined format. It is usually organized into records as categorical, ordinal, and continuous variables and stored in databases.
ANSWER: ???
Diff: 3 Page Ref: 290
7) Stemming is the process of reducing inflected words to their base or root form.
ANSWER: ???
Diff: 1 Page Ref: 290
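Note on item 7: the idea of reducing inflected words to a common base form can be sketched with a toy suffix stripper. This is a minimal illustration only, not the Porter stemmer or any method named in the chapter; the suffix list, the minimum stem length of three characters, and the function name naive_stem are all assumptions made for the example.

```python
# Toy suffix-stripping stemmer (an illustrative assumption, NOT the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical suffix list, checked in this order


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least three characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


# Four inflected forms of the same word collapse to one base form.
stems = [naive_stem(w) for w in ["Walking", "walked", "walks", "walk"]]
print(stems)
```

A rule set this small mishandles many English words (for example, irregular forms such as "ran"), which is why practical systems rely on more carefully designed stemmers.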
8) Stop words, such as a, am, the, and was, are words that are filtered out prior to or after processing of natural language data.
ANSWER: ???
Diff: 2 Page Ref: 290
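Note on item 8: stop-word filtering can be sketched as a simple set lookup over tokens. The stop list below contains only the four examples given in the item; real stop lists are much longer. The whitespace tokenizer and the function name remove_stop_words are assumptions made for the example.

```python
from typing import List

# Stop list limited to the four examples in the item; an assumption for illustration.
STOP_WORDS = {"a", "am", "the", "was"}


def remove_stop_words(text: str) -> List[str]:
    """Lowercase the text, split on whitespace, and drop tokens found in STOP_WORDS."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


filtered = remove_stop_words("The report was a summary")
print(filtered)
```

Here the filter runs before any further processing; the same function could equally be applied to the output of a later step, which matches the "prior to or after processing" wording of the item.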
9) The goal of natural language processing (NLP) is syntax-driven text manipulation.
ANSWER: ???
Diff: 2 Page Ref: 292
10) Two advantages associated with the implementation of NLP are word sense disambiguation and syntactic ambiguity.
ANSWER: ???
Diff: 2 Page Ref: 293
11) By applying a learning algorithm to parsed text, researchers from Stanford University's NLP lab have developed methods that can automatically identify