2014 IEEE International Conference on Big Data

Random Walks on Adjacency Graphs for Mining
Lexical Relations from Big Text Data
Shan Jiang
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, IL, 61801 USA
sjiang18@illinois.edu

ChengXiang Zhai
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, IL, 61801 USA
czhai@cs.uiuc.edu

Abstract: Lexical relations, or semantic relations of words, are useful knowledge fundamental to all applications since they help to capture inherent semantic variations of vocabulary in human languages. Discovering such knowledge in a robust way from arbitrary text data is a significant challenge in big text data mining. In this paper, we propose a novel general probabilistic approach based on random walks on word adjacency graphs to systematically mine two fundamental and complementary lexical relations, i.e., paradigmatic and syntagmatic relations between words from arbitrary text data. We show that representing text data as an adjacency graph opens up many opportunities to define interesting random walks for mining lexical relation patterns, and propose specific random walk algorithms for mining paradigmatic and syntagmatic relations. Evaluation results on multiple corpora show that the proposed random walk-based algorithms can discover meaningful paradigmatic and syntagmatic relations of words from text data.
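To make the idea of a random walk on a word adjacency graph concrete, the following is a minimal sketch and not the authors' algorithm, which this excerpt does not specify. It assumes that edges link immediately adjacent words, that transition probabilities are proportional to adjacency counts, that a single forward step approximates a syntagmatic relation, and that a forward step followed by a backward step approximates a paradigmatic relation (words sharing a right-hand context). All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_adjacency(sentences):
    """Count adjacent word pairs; right[a][b] and left[b][a] both count 'a b'."""
    right = defaultdict(Counter)
    left = defaultdict(Counter)
    for sent in sentences:
        for a, b in zip(sent, sent[1:]):
            right[a][b] += 1
            left[b][a] += 1
    return right, left


def normalize(counts):
    """Turn raw counts into a probability distribution (empty in, empty out)."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def syntagmatic(word, right):
    """Assumed one-step forward walk: P(next word | word)."""
    return normalize(right.get(word, Counter()))


def paradigmatic(word, right, left):
    """Assumed two-step walk: forward to a right neighbor, then backward
    to the other words that precede that same neighbor."""
    scores = defaultdict(float)
    for ctx, p_forward in normalize(right.get(word, Counter())).items():
        for other, p_backward in normalize(left.get(ctx, Counter())).items():
            if other != word:
                scores[other] += p_forward * p_backward
    return dict(scores)


# Toy corpus, invented for illustration only.
corpus = [
    ["the", "cat", "eats", "fish"],
    ["the", "dog", "eats", "meat"],
    ["the", "cat", "drinks", "milk"],
]
right, left = build_adjacency(corpus)
print(syntagmatic("the", right))
print(paradigmatic("cat", right, left))
```

On this toy corpus, "dog" is reached from "cat" because both words precede "eats". A walk defined over longer paths or over both left and right contexts would be a natural variation, but that choice is outside what the abstract states.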

I. INTRODUCTION
The dramatic growth of text data creates great opportunities for applying computational methods to mine “big text data” to discover all kinds of useful knowledge and support many data analytics applications. Unfortunately, text data are unstructured, and effective discovery of knowledge from text data requires the computer to understand natural languages, which is known to be an extremely difficult task. In this paper, we study how to mine two fundamental and complementary types of interesting semantic relations between words from arbitrary text data in a
