Questions for the opening vignette (Mining Text for Security and Counterterrorism) 1. How can text mining be used in a crisis situation? Text mining makes it easy for the end user to take the knowledge discovered by the analytics tools and embed it in a concise and useful form in an intelligence product. MITRE would allow the user to select various text mining tools and, with a few mouse clicks, assemble them to create a complex filter that fulfills whatever knowledge discovery function is currently needed. An analyst might use text mining to discover important nuggets of information in a large collection of news sources. 2. What is the Genoa project? What is the motivation behind projects like Genoa? The Genoa project, part of DARPA’s Total Information Awareness program, seeks to provide advanced tools and techniques to rapidly analyze information related to a current situation to support better decision making. Genoa provides knowledge discovery tools to better “mine” relevant information sources for discovery of patterns in the form of actionable information. 3. What is TopCat? What does TopCat do? TopCat is a system developed by MITRE that identifies different topics in a collection of documents and displays the key “players” for each topic. TopCat uses association rule mining technology to identify relationships among people, organizations, locations, and events. This tool enables an analyst to discover, for example, an association between people involved in a bombing incident. 4. What is a summarization filter? The summarization filter, developed with DARPA funding, identifies and aggregates descriptions of people from a collection of documents by means of an efficient syntactic analysis, the use of a thesaurus, and some simple natural language-processing techniques. It also extracts from these documents salient sentences related to these people by weighting sentences based on the presence of the names of people as well as the location and proximity of terms in a document, their frequency, and their correlations with other terms in the document collection. 5. Comment on the future of text mining tools for counterterrorism. Text mining tools have been part of national intelligence initiatives such as the information awareness program for decades. What has been achieved in the intelligence field is a good indicator of what can potentially be accomplished with the use of knowledge discovery tools and techniques in the near future.
Chapter 5 Section Review Questions
Section 5.1 1. What is text mining? Text mining is the application of data mining to nonstructured or less structured text files. It entails the generation of meaningful numeric indices from unstructured text and then processing those indices using various data-mining algorithms. How does it differ from data mining? With text mining, the input to the process is a collection of unstructured data files such as Word documents, PDF files, text excerpts, XML files, and so on. 2. Why is the popularity of text mining as a BI tool increasing? Businesses that effectively and efficiently tap into their text data sources will have the necessary knowledge to make better decisions, leading to a competitive advantage over those businesses that lag behind. 3. What are some of the popular application areas of text mining? ○ Info extraction ○ Topic tracking ○ Summarization ○ Categorization ○ Clustering ○ Concept Linking ○ Question Answering
Section 5.2 1. What is natural language processing? It studies the problem of “understanding” the natural human language using more formal representations that are easier for computer programs to manipulate. 2. How does NLP relate to text mining? Text mining uses NLP to understand text. NLP algorithms interpret text with the help of lexical databases such as WordNet, which hold information such as word definitions, synonyms, and relationships to other words. 3. What are some of the benefits and challenges of NLP? NLP improves CRM by using Sentiment Analysis to help detect a customer’s favorable or unfavorable opinion of products. However, even with NLP, computers may never fully understand human languages the same way humans do, because of challenges such as: ○ Parts-of-Speech Tagging ○ Text Segmentation ○ Word Sense Disambiguation ○ Syntactic Ambiguity ○ Imperfect or Irregular input ○ Speech Acts 4. What are the most common tasks addressed by NLP? ○ Info Retrieval ○ Info Extraction ○ Named-entity recognition ○ Question Answering ○ Automatic Summarization ○ Natural Language Generation ○ Natural Language Understanding ○ Machine Translation ○ Foreign Language Writing ○ Foreign Language Reading ○ Speech Recognition ○ Text-to-Speech ○ Text Proofing ○ Optical Character Recognition
Section 5.3 1. List and briefly discuss some of the text mining applications in marketing. Text mining can be used to analyze unstructured data from call centers to increase cross-selling and up-selling. Text mining can also determine a customer’s opinion of products from their interactions with call centers and from customer reviews. 2. How can text mining be used in security and counterterrorism? Text mining can be used to identify content through any means of communication (phone, email, fax, etc.). Text mining can also be used for Deception Detection, comparing statements from new persons of interest with statements made by past criminals. 3. What are some promising text mining applications in biomedicine? Experimental techniques such as DNA microarray analysis, serial analysis of gene expression (SAGE), and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins.
Section 5.4 1. What are the main steps in the text mining process? ○ Establish the Corpus: collect and organize the domain-specific unstructured data ○ Create the Term-Document Matrix: Introduce structure to the corpus ○ Extract Knowledge: Discover novel patterns from the T-D matrix 4. What are the main knowledge extraction methods from corpus? The main categories of knowledge extraction methods are classification, clustering, association, and trend analysis.
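The first two steps of the process above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three-document corpus, the tiny stop-word list, and the helper names `tokenize` and `build_tdm` are all made up for the example, and a real project would add stemming and a far larger stop-word list before moving on to step 3 (knowledge extraction).

```python
from collections import Counter

# Step 1: establish the corpus (hypothetical toy documents).
corpus = {
    "doc1": "Investment risk and project management",
    "doc2": "Software engineering and project management",
    "doc3": "Software development risk",
}

# Illustrative stop-word list; an assumption, not a standard list.
STOP_WORDS = {"and", "the", "of", "a"}


def tokenize(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_tdm(docs):
    """Step 2: build a term-document matrix.

    Returns (terms, matrix) where matrix maps each document name to a
    row of raw term counts, one column per term in `terms`.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    terms = sorted(set().union(*counts.values()))
    matrix = {name: [c[t] for t in terms] for name, c in counts.items()}
    return terms, matrix


terms, tdm = build_tdm(corpus)
print(terms)
for name, row in tdm.items():
    print(name, row)
```

Each row is one document and each column one term, so `tdm["doc1"]` comes out as `[0, 0, 1, 1, 1, 1, 0]` against the sorted term list. Classification, clustering, association, or trend analysis would then operate on these rows.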
Section 5.6 1. What are some of the main challenges the Web poses for knowledge discovery? ○ The Web is too big for effective data mining ○ The Web is too complex ○ The Web is too dynamic ○ The Web is not specific to a domain ○ The Web has everything 2. What is Web mining? How does it differ from regular data mining? Web mining is the process of discovering intrinsic relationships from Web data, which are expressed in the form of textual, linkage, or usage information. (look at definition for data mining) 3. What are the three main areas of Web mining? ○ Web Content Mining: unstructured textual content of the Web pages ○ Web Structure Mining: the Uniform Resource Locator links contained in the Web pages ○ Web Usage Mining: the detailed description of a Web site’s visits
Section 5.7 1. What is Web content mining? How does it differ from text mining? Web content mining refers to the extraction of useful information from Web pages. The information gathered may include document characteristics similar to what is used in text mining, but it may include additional concepts such as the document hierarchy. 2. Define Web structure mining, and differentiate it from Web content mining. Web structure mining is the process of extracting useful information from the links embedded in Web documents. 3. What are the main goals of Web structure mining? It is used to identify authoritative pages and hubs, which are the cornerstones of the contemporary page-rank algorithms that are central to popular search engines such as Google and Yahoo! 4. What are hubs and authorities? What is the HITS algorithm? Hub: One or more Web pages that provide a collection of links to authoritative pages. Authorities: Web pages that are identified as particularly popular based on links by other Web pages and directories. HITS algorithm: The most popular, publicly known and referenced algorithm in Web mining, which is used to discover hubs and authorities.
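The hub and authority idea can be made concrete with a simplified sketch of the iterative score updates behind HITS. It is not a full implementation: the real algorithm works on a query-focused subgraph of the Web, while the link graph, page names, and iteration count here are hypothetical.

```python
import math


def hits(links, iterations=50):
    """Compute hub and authority scores by repeated mutual updates.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auths[dst] += hubs[src]
        # Hub score: sum of the authority scores of pages linked to.
        hubs = {p: sum(auths[d] for d in links.get(p, [])) for p in pages}
        # Normalize both score sets so the values stay bounded.
        for scores in (auths, hubs):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hubs, auths


# Hypothetical link graph: page -> pages it links to.
web = {
    "portal": ["news", "shop", "wiki"],
    "blog": ["news", "wiki"],
    "news": ["wiki"],
}
hubs, auths = hits(web)
print("top authority:", max(auths, key=auths.get))
print("top hub:", max(hubs, key=hubs.get))
```

In this toy graph "wiki" receives the most inbound links and ends up as the top authority, while "portal" links to the most authoritative pages and ends up as the top hub.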
Section 5.8 1. Define Web usage mining. Web usage mining is the extraction of useful information from data generated through Web page visits and transactions. 2. In an e-commerce environment, what are the potential applications of Web usage mining? ○ Determine the lifetime value of clients ○ Design cross-marketing strategies across products ○ Evaluate promotional campaigns. ○ Target electronic ads and coupons at user groups based on user access patterns. ○ Predict user behavior based on previously learned rules and users’ profiles. ○ Present dynamic information to users based on their interests and profiles. 3. What is clickstream? Why is it important in Web usage mining? A clickstream is data that provide a trail of the user’s activities and show the user’s browsing patterns. It is important in Web usage mining because it helps us understand user behavior, which in turn helps improve the Web site and increase customer value. 4. What types of information do Web servers collect when users visit a Web page and engage in an interactive activity? 5. Identify value-added features developed by major e-commerce sites that may be based on Web usage mining.
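As a rough illustration of how clickstream data can support "predict user behavior", the sketch below counts which page most often follows another in each user's trail. The record layout `(user_id, timestamp_seconds, page)`, the sample clicks, and both function names are assumptions made for the example; real Web server logs carry more fields and need session splitting first.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream records: (user_id, timestamp_seconds, page).
clicks = [
    ("u1", 0, "/home"), ("u1", 30, "/tents"), ("u1", 95, "/cart"),
    ("u2", 10, "/home"), ("u2", 40, "/tents"), ("u2", 70, "/boots"),
    ("u3", 5, "/home"), ("u3", 50, "/tents"), ("u3", 120, "/cart"),
]


def page_transitions(records):
    """Count how often each page is followed by another in a user's trail."""
    trails = defaultdict(list)
    # Sort by user, then time, so each trail is in visit order.
    for user, _ts, page in sorted(records, key=lambda r: (r[0], r[1])):
        trails[user].append(page)
    transitions = defaultdict(Counter)
    for pages in trails.values():
        for current, nxt in zip(pages, pages[1:]):
            transitions[current][nxt] += 1
    return transitions


def most_likely_next(transitions, page):
    """Return the most frequently observed next page, or None if unseen."""
    if page not in transitions:
        return None
    return transitions[page].most_common(1)[0][0]


transitions = page_transitions(clicks)
print(most_likely_next(transitions, "/tents"))
```

With these sample clicks, two of the three visitors to "/tents" went on to "/cart", so that is the page the sketch predicts next.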
Figure 5.1 Topics Derived from Clustering 60,000 News Stories
[pic]
Figure 5.5 The 3-step text mining process
[pic]
Figure 5.6 A simple term-document matrix
[pic]
Chapter 5 Vocab
|Analytical Model: |Mathematical models into which data are loaded for analysis. |
|Analytical Techniques: |Methods that use mathematical formulas to derive an optimal solution directly or to predict a certain |
| |result, mainly in solving structured problems. |
|Association: |A category of data mining algorithm that establishes relationships about items that occur in a given |
| |record. |
|Authoritative Pages: |Web pages that are identified as particularly popular based on links by other Web pages and directories. |
|Classification: |Supervised induction used to analyze the historical data stored in a database and to automatically |
| |generate a model that can predict future behavior. |
|Clickstream Analysis: |The analysis of data that occur in the Web environment. |
|Clickstream Data: |Data that provide a trail of the user’s activities and show the user’s browsing patterns (e.g., which |
| |sites are visited, which pages, how long). |
|Clustering: |Partitioning a database into segments in which the members of a segment share similar qualities. |
|Corpus: |In linguistics, a large and structured set of texts (usually stored and processed electronically) |
| |prepared for the purpose of conducting knowledge discovery. |
|Customer Experience |Applications designed to report on the overall user experience by detecting Web application issues and |
|Management: |problems, by tracking and resolving on-site performance and availability, by enabling real-time alerting |
| |and monitoring, and by supporting deep diagnosis of observed visitor behavior. |
|Deception Detection: |A way of identifying deception (intentionally propagating beliefs that are not true) in voice, text, |
| |and/or body language of humans. |
|Hub: |One or more Web pages that provide a collection of links to authoritative pages. |
|Hyperlink-Induced Topic |The most popular, publicly known and referenced algorithm in Web mining, which is used to discover hubs |
|Search (HITS): |and authorities. |
|Inverse Document |A common and very useful transformation of indices in a term-by-document matrix that reflects both the |
|Frequency: |specificity of words (document frequencies) as well as the overall frequencies of their occurrences (term|
| |frequencies). |
|Natural Language |Using a natural language processor to interface with a computer-based system. |
|Processing (NLP) | |
|Part-of-Speech Tagging: |The process of marking up the words in a text as corresponding to a particular part of speech (such as |
| |nouns, verbs, adjectives, adverbs, etc.) based on a word’s definition and the context of its use. |
|Polysemes: |Words, also called homonyms, that are syntactically identical (i.e., spelled exactly the same) but have |
| |different meanings (e.g., bow can mean “to bend forward,” “the front of a ship,” “the weapon that shoots |
| |arrows,” or “a kind of tied ribbon”). |
|Search Engine: |A program that finds and lists Web sites or pages (designated by URLs) that match some user-selected |
| |criteria. |
|Sentiment Analysis: |The technique used to detect favorable and unfavorable opinions towards specific products and services |
| |using a large number of textual data sources (consumer feedback in the form of Web pages) |
|Sequence Discovery: |The identification of associations over time. |
|Singular Value |Closely related to principal components analysis, it reduces the overall dimensionality of the input |
|Decomposition (SVD): |matrix (number of input documents by number of extracted terms) to a lower dimensional space, where each |
| |consecutive dimension represents the largest degree of variability (between words and documents). |
|Speech Synthesis: |The technology by which computers convert text to voice (i.e. speak). |
|Stemming: |A process of reducing words to their respective root forms in order to better represent them in a |
| |text-mining project. |
|Stop Words: |Words that are filtered out prior to or after processing of natural language data (i.e., text) |
|Term-Document Matrix |A frequency matrix created from digitized and organized documents (the corpus) where the columns |
|(TDM): |represent the terms and rows represent individual documents. |
|Text Mining: |The application of data mining to nonstructured or less structured text files. It entails the generation|
| |of meaningful numeric indices from the unstructured text and then processing those indices using |
| |various data-mining algorithms. |
|Tokenizing: |Categorizing a block of text (token) according to the function it performs. |
|Trend Analysis: |The collecting of information and attempting to spot a pattern, or trend, in the information. |
|Unstructured Data: |Data that do not have a predetermined format and are stored in the form of textual documents. |
|Voice of Customer (VOC): |Applications that focus on “who and how” questions by gathering and reporting direct feedback from site |
| |visitors, by benchmarking against other sites and offline channels, and by supporting predictive modeling|
| |of the future visitor behavior. |
|Web Analysis: |The application of business analytics activities to Web-based processes, including e-commerce. |
|Web Content Mining: |The extraction of useful information from Web pages. |
|Web Crawler: |An application used to read through the content of a web site automatically. |
|Web Mining: |The discovery and analysis of interesting and useful information from the Web, about the Web, and usually|
| |through Web-based tools. |
|Web Structure Mining: |The development of useful information from the links included in the Web documents. |
|Web Usage Mining: |The extraction of useful information from data generated through Web page visits and transactions. |
|Wiki: |A piece of server software available in a Web site that allows users to freely create and edit Web page |
| |content using any Web browser. |
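The glossary entry for inverse document frequency describes a transformation of raw term-document counts. One common formulation, assumed here rather than taken from these notes, weights a count tf for a term that appears in df of N documents as (1 + ln tf) × ln(N / df), and as 0 when the term is absent. The counts and the helper name `idf_weight` below are hypothetical.

```python
import math


def idf_weight(term_count, n_docs, doc_freq):
    """Weight a raw count as (1 + ln(tf)) * ln(N / df); 0 if the term is absent."""
    if term_count == 0:
        return 0.0
    return (1 + math.log(term_count)) * math.log(n_docs / doc_freq)


# Hypothetical raw counts: rows are documents, columns are terms.
terms = ["project", "risk", "software"]
counts = [
    [2, 1, 0],
    [1, 0, 3],
    [1, 0, 0],
]

n_docs = len(counts)
# Document frequency: in how many documents each term appears at least once.
doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(terms))]
weighted = [
    [idf_weight(row[j], n_docs, doc_freq[j]) for j in range(len(terms))]
    for row in counts
]

for row in weighted:
    print([round(w, 3) for w in row])
```

Because "project" occurs in every document, its weight drops to zero everywhere, while the rarer terms "risk" and "software" keep positive weights. This is the sense in which the transformation rewards specificity.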
Chapter 6
Questions for the Opening Vignette (BI Eastern Mountain Sports) 1. What was the original BI system used for? The BI system included business performance management and dashboards. It collects raw data from multiple sources, processes them into data, and conducts analyses that include comparing performance to operational metrics in order to assess the health of the business. 2. Why was it beneficial to integrate it with social software? The BI system was not performing well, mainly due to insufficient data access, communication, and collaboration among all participants. 3. Differentiate between internal and external integrations in this case and their contributions to EMS. 4. What are the benefits of the integration to suppliers?
Chapter 6 Section Review Questions
Section 6.1 1. What are the major types of BI implementation influencing factors? a. Reporting and analysis tools i. Features and functionality ii. Scalability and deployability iii. Usability and manageability iv. Ability to customize applications b. Database i. Scalability and performance ii. Manageability and availability iii. Security and customization iv. Ability to write back c. Extraction, transformation, and load (ETL) tools i. Ability to read any source ii. Efficiency and productivity iii. Cross-platform support d. Costs involved i. Hardware costs (actual or opportunity) ii. Costs of software (ETL, database, applications, and front end) iii. Internal development costs iv. External development costs v. Internal training vi. Ongoing maintenance e. Benefits i. Time savings and operational efficiencies ii. Lower cost of operations iii. Improved customer service and satisfaction iv. Improved operational and strategic decision making v. Improved employee communications and satisfaction vi. Improved knowledge sharing 4. What are some of the critical success factors in BI projects? Business-driven methodology and project management, clear vision and planning, committed management support and sponsorship, data management and quality issues, mapping the solutions to the user requirements, performance considerations of the BI system, and a robust and extensible framework.
Section 6.2 1. List several types of integration. Functional integration: the provision of different support functions as a single system through a single, consistent interface. Physical integration: the seamless integration of several systems into one functioning system. 2. Describe the need for BI integration. ● A) Implementing BI: For BI systems to operate, they usually need to be connected to data sources, utilities, other applications, and so on. Such connections must be accomplished effectively and efficiently. ● B) Increasing the capabilities of the BI applications: Several BI development tools may complement each other. Each tool performs the subtasks at which it is the best. For example, BA can be used to recommend an optimal resource-allocation plan, and an attached dashboard can provide the control system that will alert management to deviations from the plan. ● C) Enabling real-time decision support: By having tight integration, it is possible to support decision making in a real-time environment. An example is a transportation system that uses wireless communication and Web services to foster data flow. ● D) Enabling more powerful applications: An example is using intelligent systems to provide real-time capabilities. ● E) Facilitating system development: Tighter integration allows faster application development and communication among system components. ● F) Enhancing support activities: Several support activities can improve the operations of BI applications. For example, blogs, Twitter, wikis, and RSS feeds provide communication and collaboration support.
Section 6.4 1. What is on-demand BI? Unlimited computing power and storage capacity that, like electricity, water, and telephone services, can be obtained on demand, used, and reallocated for any application and that are billed on a pay-per-use basis. 2. What are its major benefits? a. Ability to handle fluctuating demand i. Flexible use of the BI technology pool b. Reduced investment/cost i. Hardware (servers and peripherals) ii. Software (more features for less) iii. Maintenance (centralized timely updates) c. Embodiment of recognized best practices d. Better flexibility and connectivity with other systems via SaaS infrastructure e. Better ROI 3. What are the limitations of on-demand systems? ○ Integration of vendors’ software with the company’s software may be difficult ○ The vendor can go out of business, leaving the company without a service ○ It is difficult or even impossible to modify hosted software for a better fit with the users’ needs ○ Upgrading may become a problem ○ You may relinquish strategic data to strangers (lack of privacy/security of corporate data)
Section 6.5 1. List some legal issues of BI. Who is liable if an enterprise finds itself bankrupt as a result of using the advice of an intelligent BI analysis? Will the enterprise itself be held responsible for not testing the BI system adequately before entrusting it with sensitive data? Will auditing and accounting firms share the liability for failing to apply adequate auditing tests? Will the software developers of systems be jointly liable? 2. Describe privacy concerns in BI. The right to privacy is not absolute. Privacy must be balanced against the needs of society. The public’s right to know is superior to the individual’s right to privacy. 3. Explain privacy concerns on the Web. 4. List ethical issues in BI. ○ Electronic surveillance ○ Ethics in BI design ○ Invasion of individuals’ privacy ○ Use of proprietary databases ○ Use of intellectual property such as knowledge and expertise ○ Accuracy of data, information, and knowledge ○ Accessibility to information and knowledge ○ Use of corporate computers for non-work-related purposes ○ How much decision making to delegate to computers.
Section 6.7 1. Define Web 2.0. The popular term for advanced Internet technology and applications, including blogs, wikis, RSS, and social bookmarking. One of the most significant differences between Web 2.0 and the traditional World Wide Web is greater collaboration among Internet users and other users, content providers, and enterprises. 2. List the major characteristics of Web 2.0. ● The ability to tap into the collective intelligence of users. The more users contribute, the better. ● Data is made available in new or never-intended ways. Web 2.0 data can be remixed or “mashed up”. ● Web 2.0 relies on user-generated and user-controlled content and data (enhanced collaboration). ● Lightweight programming techniques and tools let nearly anyone act as a Web site developer. ● The virtual elimination of software-upgrade cycles makes everything a perpetual beta or work-in-progress and allows rapid prototyping, using the Web as an application development platform. ● Users can access and manage applications entirely through a browser. ● An architecture of participation and digital democracy encourages users to add value to the application as they use it. ● There is a major emphasis on social networks and computing. ● Information sharing and collaboration are greatly supported. ● This allows for rapid and continuous creation of new business models. ● Other characteristics include “dynamic content, rich user experience, metadata, scalability, open source, and freedom (net neutrality)”.
Section 6.8 1. Define social network. A social network is a place where people create their own space, or homepage, on which they write blogs; post pictures, videos, or music; share ideas; and link to other Web locations they find interesting. 2. List some major social network sites. Facebook, Myspace, Twitter, Orkut, Google+ 3. Describe the global nature of social network sites. 4. Describe mobile social networking. Social networking where members converse and connect with one another using cell phones or other mobile devices
Section 6.9 1. What is a virtual world? Artificial worlds created by computer systems in which the user has the impression of being immersed. 2. What are the advantages and disadvantages of providing decision support through virtual worlds? Advantages: ◦ Easy access and low cost ◦ Experienced and dedicated designer/builders ◦ Tools and venues for communications-driven decision support (DecisionSupportWorld.com) ◦ A large, dedicated user base ◦ Impression management/creativity enhancement ◦ Time compression ◦ Easy data integration from real life using RSS feeds ◦ Encourages active participation and experiential learning Disadvantages: ◦ Learning time and training costs ◦ Distractions are numerous ◦ Pranksters and spam are common ◦ Technology problems persist ◦ Chat is a very slow communication tool ◦ Resistance to use ◦ Addiction ◦ Participation in most virtual environments requires downloading a "plug-in" 3. What activities of a physical trade show can be experienced in a virtual event? Which ones cannot be replicated? Available: ● Make instant voice calls via Skype ● Send emails to the exhibitor ● Videos and audio greetings ● Business cards Not Available: ○ Personally getting someone’s attention and dealing with more complex problems 4. What type of data analysis might you perform on data about users in a specific virtual world setting (e.g., a company island in Second Life, a virtual trade-show booth)?
Section 6.10 1. What is the logic of combining BI and social networks?
-CDM is a category of decision-support system for non-routine, complex decisions that require iterative human interactions.
-Ad hoc tagging regarding value, relevance, credibility, and decision context can substantially enrich both the decision process and the content that contributes to the decisions.
-Tying BI to decisions and outcomes that can be measured will enable organizations to better demonstrate the business value of BI.
2. Why is it referred to as collaborative decision making? 3. What can you learn from Figure 6.5 (p. 261)?
4. What are the major benefits of CDM? The key benefit is "collapsing space" and the ability to blend synchronous and asynchronous activities. For conferences and events, the benefits are having all the relevant information and people on demand, which removes the constraints of schedule and geography.
Section 6.11 1. What is RFID? A generic technology that refers to the use of radio frequency waves to identify objects. 2. What kinds of data are read/recorded through RFID? RFID is a new member of the automatic identification technologies family, which also includes the ubiquitous barcodes and magnetic strips.
3. What can a company learn by reading RFID at a distribution center?
Functions in a distribution center: receiving, put-away, picking, and shipping.
Sequence of operations at a receiving dock: ○ Unloading the contents of the trailer ○ Verification of the receipt of goods against expected delivery (purchase order) ○ Documentation of the discrepancy ○ Application of labels to the pallets, cases, items ○ Sorting of goods for put-away or cross-dock
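The verification and discrepancy-documentation steps at the receiving dock lend themselves to a small sketch. It rests on several assumptions: each RFID read is modeled as a `(tag_id, sku)` pair, the purchase order as a SKU-to-quantity mapping, and the SKUs, tag IDs, and function name are invented for illustration. Real tags carry an electronic product code that would first have to be resolved to a product.

```python
from collections import Counter


def receiving_discrepancies(expected, reads):
    """Compare RFID reads at the dock against the purchase order.

    expected: {sku: quantity ordered}
    reads: list of (tag_id, sku) pairs; repeated reads of one tag count once.
    Returns {sku: received - expected} for every SKU that does not match.
    """
    # A tag is often read several times as it passes the portal; keep one.
    unique_tags = {tag: sku for tag, sku in reads}
    received = Counter(unique_tags.values())
    report = {}
    for sku in set(expected) | set(received):
        diff = received.get(sku, 0) - expected.get(sku, 0)
        if diff != 0:
            report[sku] = diff
    return report


# Hypothetical purchase order and dock reads.
purchase_order = {"TENT-2P": 3, "BOOT-42": 2}
dock_reads = [
    ("tag-001", "TENT-2P"), ("tag-002", "TENT-2P"), ("tag-002", "TENT-2P"),
    ("tag-003", "BOOT-42"), ("tag-004", "BOOT-42"), ("tag-005", "STOVE-1"),
]
print(receiving_discrepancies(purchase_order, dock_reads))
```

Here one tent is missing (a difference of -1) and one unordered stove arrived (+1), while the boots match the order and are left out of the report.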
Section 6.12 1. Define reality mining. Data mining of location-based data. 2. What types of data are used in reality mining? Location information from cars, buses, taxis, mobile phones, cameras, and personal navigation devices, collected using technologies such as GPS, WiFi, and cell tower triangulation. The growth of location-enabled services has resulted in a massive database of historical and real-time streaming location information. 3. Briefly describe how the data are used to create profiles of users.
Table 6.1 Sample RFID Data
[pic]
Chapter 6 Vocab
|Application Service Provider (ASP): |A software vendor that offers leased software applications to organizations. |
|Brainstorming: |The process by which people generate ideas, usually supported by software (e.g., developing |
| |alternative solutions to a problem). Also known as idea generation. |
|Collaborative Decision Making (CDM):|A new style of decision support that integrates BI and social software. |
|Data Integration: |Integration that comprises three major processes: data access, data federation, and change |
| |capture. When these three processes are correctly implemented, data can be accessed and made |
| |accessible to an array of ETL, analysis tools, and data warehousing environments. |
|Functional Integration: |The provision of different support functions as a single system through a single, |
| |consistent interface. |
|Interactivity: |A characteristic of software agents that allows them to interact (and/or collaborate) with each |
| |other without having to rely on human intervention. |
|Knowledge Base: |A collection of facts, rules, and procedures organized into schemas. A knowledge base is the |
| |assembly of all of the information and knowledge about a specific field of interest. |
|Knowledge Management: |The active management of the expertise in an organization. It involves collecting, |
| |categorizing, and disseminating knowledge. |
|Middleware: |Software that links application modules from different computer languages and platforms. |
|Mobile Social Networking: |Members converse and connect with one another using cell phones or other mobile devices. |
|Multidimensionality: |The ability to organize, present, and analyze data by several dimensions, such as sales by |
| |region, by product, by salesperson, and by time (four dimensions). |
|Physical Integration: |The seamless integration of several systems into one functioning system. |
|Privacy: |In general, the right to be left alone and the right to be free of unreasonable personal |
| |intrusions. Information privacy is the right to determine when, and to what extent, information|
| |about oneself can be communicated to others. |
|Problem Solving: |A process in which one starts from an initial state and proceeds to search through a problem |
| |space to identify a desired goal. |
|Reality Mining: |Data mining of location-based data. |
|RFID: |A generic technology that refers to the use of radio frequency waves to identify objects. |
|Robot: |A machine that has the capability of performing manual functions without human intervention. |
|Social Network Analysis Software: |The mapping and measuring of relationships and information flows among people, groups, |
| |organizations, computers, and other information- or knowledge-processing entities. The nodes in |
| |the network are the people and groups, whereas the links show relationships or flows between the|
| |nodes. SNA provides both visual and mathematical analysis of relationships. |
|Software as a Service (SaaS): |Software that is rented instead of sold. |
|Utility (On Demand) Computing: |Unlimited computing power and storage capacity that, like electricity, water, and telephone |
| |services, can be obtained on demand, used, and reallocated for any application and that are |
| |billed on a pay-per-use basis. |
|Virtual Community: |A group of people with similar interests who interact with one another using the Internet. |
|Virtual Team: |A team whose members are in different places while in a meeting together. |
|Virtual Worlds: |Artificial worlds created by computer systems in which the user has the impression of being |
| |immersed. |
|Web 2.0: |The popular term for advanced Internet technology and applications, including blogs, wikis, RSS,|
| |and social bookmarking. One of the most significant differences between Web 2.0 and the |
| |traditional World Wide Web is greater collaboration among Internet users and other users, |
| |content providers, and enterprises. |
Articles
Data Quality ● Encompasses technologies and business practices ● Data quality initiatives are often reactionary and driven by liability. Drivers include: ○ Business intelligence ○ Customer service ○ Global supply chain ○ IT system consolidations and migrations ○ Compliance (regulatory reporting, subpoenas) ● High quality data provides leverage ● What is it? ● Quality of data’s content and structure ● Standard technology and business practices: ○ Name-and-address cleansing ○ Matching ○ House-holding ○ De-duplication ○ Standardization ○ Appending 3rd party data ● Benefits of High Quality Data: ○ Greater confidence in analytic systems (BI) ○ Less reconciliation time ○ Single version of the truth (buzz phrase) ○ Increased customer satisfaction ○ Increased revenues ● Data Governance ○ Increasingly an organizational unit ○ Referred to as Enterprise Data Quality or Management (EDM) ○ Job titles: Data stewards, governors ○ Organization-wide mandates ○ Governance committee ○ Poor data originates in both IT and the business. Most common sources of data quality problems: ■ Inconsistent data definition ■ Data entry ● DQ Myths and Misconceptions: ○ Not just error correction ○ Not a one-time action ○ Not about perfection ● DQ Activities and Terms ○ Data profiling ○ Upstream and downstream ○ Garbage in, garbage out ○ What is the most problematic type of data?
Customer Relationship Management ● More than selecting a technology vendor ● CRM is about mutual benefit: a win-win for the company and the customer ● Definition of CRM: the infrastructure that enables the delineation of and increase in customer value, and the collection of means with which to motivate customers to remain loyal and, indeed, to buy again ● CRM components ○ Strategy ○ Business Processes ○ Information ○ Organization ○ Technology ● CRM Categories ○ Sales Force Automation ○ Customer Care ○ E-Business Analysis ○ Business Analysis ○ Campaign Management and Planning ● What is the loyalty effect? Author Frederick Reichheld described it in his book The Loyalty Effect, and in the Harvard Business Review, explaining that a mere 5 percent increase in customer loyalty can result in up to a 125 percent increase in profits. Executives read this and rushed to their CRM vendors, hoping to ride the wave of increased customer loyalty. ● Why viral marketing? The goal is not only to improve your customers’ experience and get them to buy again; it is about turning them into raving fans. Studies show that an unhappy customer tells an average of nine other people about his bad experience. Conversely, the happy Harley-Davidson customer tattoos his forearm with the company’s logo. While your company might never be that lucky, the results of a CRM effort will not only improve your customers’ perception of your company; they will undoubtedly improve your bottom line.