Cheap and Secure Web Hosting Provider : See Now

# [Solved]: From context-free to context-sensitive

, ,
Problem Detail:

I have a context-free language $L(G)$. I'm reading in a book that $L(G') = L(G) - \{{\epsilon}\}$ is context-sensitive but I cannot find a proof or confirmation of this fact; moreover, in other texts the production $S\to\epsilon$ is allowed (as long $S$ isn't on the right hand side of any production) I'm quite confused.

The book is Formal Models of Computation: The Ultimate Limits of Computing by Arthur Charles Fleck and the passage in question says:

Given two context free grammars $G_1$ and $G_2$ first determine if $\epsilon\in L(G_1)$ and $\epsilon\in L(G_2)$. If so, $L(G_1)\cap L(G_2) \neq \emptyset$, otherwise $L(G_1) - \{{\epsilon}\}$ and $L(G_2) - \{{\epsilon}\}$ are context-sensitive.

Maybe what the book says is that they are also context sensitive (since the set of context-free languages is properly contained in the set of the context-sensitive)?

Saying that CF grammars are properly contained in CS grammars is not quite correct, at least according to some authors (Hopcroft-Ullman 1979, page 223). Their reason is that the CS languages cannot contain the empty word $\epsilon$, at least according to a strict definition of what a CS grammar is: the right-hand side must be at least as long as the left-hand side. Other definitions of CS grammars, such as given in wikipedia, do allow the rule $S\to\epsilon$, provided $S$ never appears in a right-hand side. The purpose is to allow CS languages to contain the empty word, so that they define the same family of languages as the linear bounded automata (LBA). And it also makes CF languages a subset of CS languages.

Context-free (CF) languages can contain the empty word.

Thus, for an author belonging to the strict school (no $\epsilon$ in CS languages), only $\epsilon$-free CF languages are contained in the CS languages.

From what I read in your question, my understanding is that the authors of your book belong to the strict school. So they will not consider a CF language as CS, unless they first remove the empty word, which does not change their CF character as shown in the answer by D.W. that indicates how to obtain the CF grammar $G'$ from the CF grammar $G$.

Apparently the author is analyzing the intersection of two CF grammar, and wants to use the fact that (for him) $\epsilon$-free CF languages are contained in the CS languages. So he considers separately the case when the intersection contains $\epsilon$, and the case when it does not contains it. If it does no contain it, he can remove it from the two languages, and then consider them CS languages, for whatever he is doing.

That is my explanation of what you are reading. Of course, a larger piece of context would help ascertain that view. Or even better, the definition of CS language they give.