Chapter 5 Text and Web Mining
1) DARPA and MITRE teamed up to develop capabilities to automatically filter text-based information sources to generate actionable information in a timely manner.
Answer: TRUE
Diff: 2 Page Ref: 190
2) A vast majority of business data is captured and stored in text documents that are structured.
Answer: FALSE
Diff: 2 Page Ref: 192
3) Text mining is important to competitive advantage because knowledge is power, and knowledge is derived from text data sources.
Answer: TRUE
Diff: 2 Page Ref: 192
4) The purpose and processes of text mining are different from those of data mining because with text mining the inputs to the process are data files such as Word documents, PDF files, text excerpts, and XML files.
Answer: FALSE
Diff: 3 Page Ref: 192
5) The benefits of text mining are greatest in areas where very large amounts of textual data are being generated, such as law, academic research, finance, and medicine.
Answer: TRUE
Diff: 2 Page Ref: 192
6) Unstructured data has a predetermined format. It is usually organized into records as categorical, ordinal, and continuous variables and stored in databases.
Answer: FALSE
Diff: 2 Page Ref: 193
7) Stemming is the process of reducing inflected words to their base or root form.
Answer: TRUE
Diff: 1 Page Ref: 193
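Illustration for question 7: a minimal sketch of stemming, assuming a toy suffix list. The function name `naive_stem` and the `SUFFIXES` tuple are hypothetical and chosen for illustration; this is not the Porter stemmer or any method described in the textbook.

```python
# Toy suffix-stripping stemmer (illustrative only; real stemmers such as
# Porter's use many more rules). SUFFIXES is an assumed, simplified list.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


words = ["filtering", "filtered", "filters", "filter"]
stems = [naive_stem(w) for w in words]
print(stems)  # every inflected form reduces to the same root, "filter"
```

A rule list this short will mishandle many words (for example, it does not undo doubled consonants), which is why practical stemmers rely on larger, ordered rule sets.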
8) Stop words, such as "a," "am," "the," and "was," are words that are filtered out prior to or after processing of natural language data.
Answer: TRUE
Diff: 2 Page Ref: 193
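Illustration for question 8: a minimal sketch of stop-word filtering, assuming whitespace tokenization and a stop list containing only the four example words from the question. The names `STOP_WORDS` and `remove_stop_words` are hypothetical; production stop lists are much longer and often domain-specific.

```python
# Assumed stop list: only the four examples named in the question.
STOP_WORDS = {"a", "am", "the", "was"}


def remove_stop_words(text: str) -> list:
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


filtered = remove_stop_words("The report was a summary")
print(filtered)  # ['report', 'summary']
```

Here the filter runs before any further processing; the same function could equally be applied to the output of a later step, matching the "prior to or after" wording in the question.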
9) The goal of natural language processing (NLP) is syntax-driven text manipulation.
Answer: FALSE
Diff: 2 Page Ref: 196
10) Two advantages associated with the implementation of NLP are word sense disambiguation and syntactic ambiguity.
Answer: FALSE
Diff: 2 Page Ref: 196
11) By applying a learning algorithm to parsed text, researchers from Stanford University's NLP lab have developed methods that can