Most compilers and interpreters contain a component called a parser that extracts the meaning of a program prior to generating the compiled code or performing the interpreted execution.
Following is an example of a context-free grammar, which we call \(G_1\),
[\begin{align} A&\rightarrow 0A1\ A&\rightarrow B\ B&\rightarrow # \end{align} ]
A grammar consists of a collection of substitution rules, also called productions.
The symbol is called a variable (A, B). The string consists of variables and other symbols called terminals (0,1). One variable is designated as the start variable. It usually occurs on the left-hand side of the topmost rule.
For example, grammar \(G_1\) generates the string \(000\#111\). The sequence of substitutions to obtain a string is called a derivation. Such derivation call also be represented by a parse tree.
All string generated in this way constitute the language of the grammar, we write it as \(L(G_1)\). Any language that can be generated by some context-free grammar is called a context-free language (CFL).
Definition
A context-free grammar (CFG) is a \(4\)-tuple \((V,\sum,R,S)\) , where,
If \(A\rightarrow w\) is a rule of the grammar, we say that \(uAv\) yields \(uAv\Rightarrow uwv\). Say that \(u\) derives \(v\), written \(u\Rightarrow^* v\), if \(u=v\) or if a sequence \(u_1,u_2,\cdots,u_k\) exists for \(k\geq 0\) and
[u\Rightarrow u_1 \Rightarrow u_2\cdots\Rightarrow u_k\Rightarrow v ]
The Language of the grammar is \(\{w\in\sum^*|S\Rightarrow^*w\}\), which represents those string that can be derived by the start variable \(S\).
If a grammar generates the same string in several different ways,we say that the string is derived ambiguously in that grammar. If a grammar generates some string ambiguously, we say that the grammar is ambiguous, when we say so, we mean that the string has two different parse trees, not two different derivations.
A derivation of a string \(w\) in a grammar \(G\) is a leftmost derivation if at every step the leftmost remaining variable is the one replaced.
Definition
A string \(w\) is derived ambiguously in context-free grammar \(G\) if it has two or more different leftmost derivations. Grammar \(G\) is ambiguous if it generates some string ambiguously.
Some CFL can be generated only by ambiguous grammars. Such languages are called inherently ambiguous.
Definition
A CFG is in Chomsky normal form if every rule is of the form
[\begin{align} A&\rightarrow BC\ A&\rightarrow a \end{align} ]
where \(a\) is any terminal and \(A\), \(B\), and C are any variables —— except that \(B\) and \(C\) may not the start variable. In addition, we permit the rule \(S\rightarrow\epsilon\), where \(S\) is the start variable.
Theorem
Any context-free language is generated by a CFG in Chomsky normal form.
To prove that a language is context free, we can:
Definition
A pushdown automaton is a \(6\)-tuple \((Q,\sum,\Gamma,\delta,q_0,F)\), where \(Q,\sum,\Gamma\), and \(F\) are all finite sets, and
Some explanation to the definition:
A pushdown automaton \(M=(Q,\sum,\Gamma,\delta,q_0,F)\) computes as follows. It accepts input \(w\) if \(w\) can be written as \(w=w_1w_2\cdots w_m\), where each \(w_i\in \sum_\epsilon\) and sequences of states \(r_0,r_1,\cdots, r_m\in Q\) and strings \(s_0,s_1,\cdots, s_m\in \Gamma ^*\) exist that satisfy the following three conditions. The string \(s_i\) represent the sequence of stack contains that \(M\) has on the accepting branch of the computation.
Theorem
A language is context free if and only some pushdown automaton recognize it.
Theorem
If a language is context free, then some pushdown automaton recognizes it.
Theorem
if a pushdown automaton recognizes some language, then it is context free.
Theorem: Pumping lemma for context-free languages
If \(A\) is a CFL, then there is a number \(p\) (the pumping length) where, if \(s\) is any string in \(A\) of length at least \(p\), then \(s\) may be divided into five pieces \(s=uvxyz\) satisfying the conditions
It states that every context-free language has a special value called the pumping length such that all longer strings in the language can be 'pumped'. The 'pumped' means that the string can be divided into five parts so that the second and the fourth parts may be repeated together any number of times and the resulting string still remains in the language.
Introduction to the Theory of Computation, 3rd edition, Michael Sipser
声明:该文章系转载,转载该文章的目的在于更广泛的传递信息,并不代表本网站赞同其观点,文章内容仅供参考。
本站是一个个人学习和交流平台,网站上部分文章为网站管理员和网友从相关媒体转载而来,并不用于任何商业目的,内容为作者个人观点, 并不代表本网站赞同其观点和对其真实性负责。
我们已经尽可能的对作者和来源进行了通告,但是可能由于能力有限或疏忽,导致作者和来源有误,亦可能您并不期望您的作品在我们的网站上发布。我们为这些问题向您致歉,如果您在我站上发现此类问题,请及时联系我们,我们将根据您的要求,立即更正或者删除有关内容。本站拥有对此声明的最终解释权。