LECTURE 3
Context-Free Grammars
1 Where are Context-Free Grammars (CFGs) Used?
CFGs are a more powerful formalism than regular expressions: every language that can be described by a regular expression can also be described by a grammar (short for context-free grammar here), but grammars can also describe languages that no regular expression can. An example of such a language is the set of strings of well-matched parentheses. Grammars are used to express syntactic rules. The compiler uses these rules to take a stream of tokens (the output of a scanner/lexical analyzer) and check it for syntactic correctness, e.g. checking that each construct is well formed, that all parentheses are matched, or that all keywords are spelled correctly. This process is known as parsing.
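To make the well-matched parentheses example concrete, here is a minimal Python sketch of a recognizer for that language. It assumes one possible grammar for it, S → ( S ) S | ε, and mirrors the two rules directly in a recursive function. The grammar and the names is_balanced and parse_s are illustrative assumptions, not something prescribed by these notes.

```python
def is_balanced(s: str) -> bool:
    """Return True if s is a string of well-matched parentheses.

    Assumed (illustrative) grammar:  S -> ( S ) S | epsilon
    """
    pos = 0  # index of the next unread character of s

    def parse_s() -> bool:
        nonlocal pos
        # Rule S -> ( S ) S : applies only when the next character is "(".
        if pos < len(s) and s[pos] == "(":
            pos += 1                      # consume "("
            if not parse_s():             # inner S
                return False
            if pos >= len(s) or s[pos] != ")":
                return False              # the matching ")" is missing
            pos += 1                      # consume ")"
            return parse_s()              # trailing S
        # Rule S -> epsilon : consume nothing.
        return True

    # Accept only if S was recognized and the whole input was consumed.
    return parse_s() and pos == len(s)


if __name__ == "__main__":
    for text in ["", "()", "(()())", "(()", ")("]:
        print(repr(text), is_balanced(text))
```

The recursion depth grows with the nesting depth of the input, and no fixed, finite amount of memory is enough for arbitrarily deep nesting. This is the informal reason why this language has no regular expression.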
2 Definitions
A context-free grammar G is a 4-tuple (N, T, P, S), where N is a set of nonterminals, T is a set of terminals, P is a set of production rules of the form A→α with A ∈ N and α ∈ (N ∪ T)*, and S ∈ N is a distinguished nonterminal called the start symbol. Sometimes, the set of terminals is also referred to as the alphabet.
Recall that for a set of strings I, the notation I*, the Kleene closure of I, refers to the set of all strings obtained by concatenating zero or more elements of I, with repetition allowed, in any order. For example, if I = {a, b, A, B}, then I* = {ε, a, b, A, B, aa, bb, AA, BB, ab, ba, aA, Aa, aB, Ba, bA, Ab, bB, Bb, ...}, where ε is the empty string.
Here are other definitions related to context-free grammars and languages: A derivation using the rule A→α is the process of obtaining a new string from a string w by replacing an occurrence of A in w with α. A sentence is a string consisting of only terminal symbols. A valid sentence with respect to grammar G is a sentence that can be derived from the start symbol S using the production rules of G. A leftmost (rightmost) derivation is
Ramki Thurimella ©