WebSVN – tendra.SVN – Blame – /trunk/doc/tcpplus/parse.html

Rev	Author	Line No.	Line
2	7u83	1	`<!-- Crown Copyright (c) 1998 -->`
		2	`<HTML>`
		3	`<HEAD>`
		4	`<TITLE>`
		5	`C++ Producer Guide: Parsing C++`
		6	`</TITLE>`
		7	`</HEAD>`
		8	`<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">`
		9
		10	`<H1>C++ Producer Guide</H1>`
		11	`<H3>March 1998</H3>`
		12	`<A HREF="tdf.html"><IMG SRC="../images/next.gif" ALT="next section"></A>`
		13	`<A HREF="error.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>`
		14	`<A HREF="index.html"><IMG SRC="../images/top.gif" ALT="current document"></A>`
		15	`<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">`
		16	`</A>`
		17	`<IMG SRC="../images/no_index.gif" ALT="document index"><P>`
		18	`<HR>`
		19
		20	`<H2>3.4. Parsing C++</H2>`
		21	`<P>`
		22	`The parser used in the C++ producer is generated using the`
		23	`<A HREF="../utilities/sid.html"><CODE>sid</CODE> tool</A>. Because`
		24	`of the large size of the generated code (1.3MB), the <CODE>sid</CODE>`
		25	`output is run through a simple program, <CODE>sidsplit</CODE>, which`
		26	`splits the output into a number of more manageable modules. It also`
		27	`transforms the code to use the <A HREF="style.html#language"><CODE>PROTO</CODE>`
		28	`macros</A> used in the rest of the program.`
		29	`</P>`
		30	`<P>`
		31	`<CODE>sid</CODE> is designed as a parser generator for grammars which can be`
		32	`transformed into LL(1) grammars. The distinguishing feature of these`
		33	`grammars is that the parser can always decide what to do next based`
		34	`on the current terminal. This is not the case in C++; in some circumstances`
		35	`a potentially unlimited look-ahead is required to distinguish, for`
		36	`example, declaration statements from expression statements. In`
		37	`technical terms, C++ is not LL(k) for any fixed k. Fortunately there are relatively`
		38	`few such situations, and <CODE>sid</CODE>`
		39	`provides a mechanism, <A HREF="../utilities/sid.html#predicate">predicates</A>,`
		40	`for bypassing the normal parsing mechanism in these cases. Thus it`
		41	`is possible, although difficult, to express C++ as a <CODE>sid</CODE>`
		42	`grammar.`
		43	`</P>`
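The look-ahead problem described above can be illustrated with a small sketch. It is not taken from syntax.sid and does not use sid's predicate syntax; the token kinds in Tok and the function looks_like_declaration are hypothetical names invented for this illustration. Assuming the input starts with a type name followed by an opening parenthesis, as in T(a); versus T(a)->m = 7;, the function scans to the matching closing parenthesis and inspects the token after it. Because the scan length depends on the nesting depth, no fixed amount of look-ahead is enough, and this is the kind of decision a predicate can take on the grammar's behalf.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token kinds for this sketch; not the producer's terminal set.
enum class Tok {
    TypeName, Ident, Star, LParen, RParen, LBracket, Arrow, Assign, Semi, Other
};

// Predicate-style look-ahead (illustrative only).
// Starting at a type name followed by '(', skip to the matching ')'
// and classify the statement by the token that follows it.
bool looks_like_declaration(const std::vector<Tok>& toks, std::size_t pos)
{
    assert(pos <= toks.size());  // caller must pass a position within the input
    if (pos + 1 >= toks.size()) return false;
    if (toks[pos] != Tok::TypeName || toks[pos + 1] != Tok::LParen) return false;

    // Scan forward, tracking parenthesis depth, until the opening
    // parenthesis at pos + 1 is closed.
    std::size_t i = pos + 1;
    int depth = 0;
    for (; i < toks.size(); ++i) {
        if (toks[i] == Tok::LParen) ++depth;
        else if (toks[i] == Tok::RParen && --depth == 0) break;
    }
    if (i + 1 >= toks.size()) return false;  // unbalanced or truncated input

    // Crude assumption: ';', '=' or '[' after the parenthesised part
    // means a declaration; anything else is treated as an expression.
    Tok next = toks[i + 1];
    return next == Tok::Semi || next == Tok::Assign || next == Tok::LBracket;
}
```

Given the tokens for T(a); the function returns true, and for T(a)->m = 7; it returns false. A real predicate has to cover many more declarator forms; this one only shows the shape of the check.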
		44	`<P>`
		45	`The <CODE>sid</CODE> grammar file, <CODE>syntax.sid</CODE>, is closely`
		46	`based on the ISO C++ grammar. In particular, the same production`
		47	`names have been used. The grammar has been extended slightly to allow`
		48	`common syntactic errors to be detected elegantly. Other parsing errors`
		49	`are handled by <CODE>sid</CODE>'s exception mechanism. At present`
		50	`there is only limited recovery after such errors.`
		51	`</P>`
		52	`<P>`
		53	`The lexical analysis routines in the C++ producer are hand-crafted,`
		54	`based on an initial version generated by the simple lexical analyser`
		55	`generator,`
		56	`<CODE>lexi</CODE>. <CODE>lexi</CODE> has been used more directly`
		57	`to generate the lexical analysers for certain of the other automatic`
		58	`code generating tools, including <CODE>calculus</CODE>, used in the`
		59	`producer.`
		60	`</P>`
		61	`<P>`
		62	`The <CODE>sid</CODE> grammar contains a number of entry points. The`
		63	`most important is <CODE>parse_file</CODE>, which is used to parse`
		64	`a complete C++ translation unit. The syntax for the`
		65	`<A HREF="pragma.html"><CODE>#pragma TenDRA</CODE> directives</A> is`
		66	`included within the same grammar with two entry points,`
		67	`<CODE>parse_tendra</CODE> in normal use, and <CODE>parse_preproc</CODE>`
		68	`for use in preprocessing mode. There are also entry points in the`
		69	`grammar for each of the kinds of <A HREF="token.html#args">token argument</A>.`
		70	`The parsing routines for token and template arguments are largely`
		71	`hand-crafted, based on these primitives.`
		72	`</P>`
		73	`<P>`
		74	`Certain parsing operations are performed before control passes to`
		75	`the`
		76	`<CODE>sid</CODE> grammar. As mentioned above, these include the processing`
		77	`of token and template applications. The other important case concerns`
		78	`nested name specifiers. For example, in:`
		79	`<PRE>`
		80	`class A {`
		81	`    class B {`
		82	`        static int c ;`
		83	`    } ;`
		84	`} ;`
		85
		86	`int A::B::c = 0 ;`
		87	`</PRE>`
		88	`the qualified identifier <CODE>A::B::c</CODE> is split into two terminals,`
		89	`a nested name specifier, <CODE>A::B::</CODE>, and an identifier, <CODE>c</CODE>,`
		90	`which is looked up in the corresponding namespace. Note that it is`
		91	`at this stage that name look-up occurs. An identifier can be mapped`
		92	`to one of a number of terminals, including keywords, type names,`
		93	`namespace names and other identifiers, according to the result of`
		94	`this look-up. If the look-up gives a macro then this is expanded`
		95	`at this stage.`
		96	`</P>`
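The splitting and look-up steps described above can be sketched as follows. This is a minimal illustration, not the producer's code: the Terminal enumeration, split_qualified, classify, and the use of a std::map as a stand-in for a namespace's symbol table are all assumptions made for this example. The split is purely textual, so it would not cope with template arguments that themselves contain "::".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical terminal kinds; the producer's real set is larger.
enum class Terminal { Keyword, TypeName, NamespaceName, Identifier };

// Split "A::B::c" into the nested name specifier "A::B::" and the
// identifier "c". An unqualified name yields an empty specifier.
std::pair<std::string, std::string> split_qualified(const std::string& name)
{
    std::size_t pos = name.rfind("::");
    if (pos == std::string::npos) return {"", name};
    return {name.substr(0, pos + 2), name.substr(pos + 2)};
}

// Map an identifier to a terminal according to a look-up in 'scope'.
// Names that are not found are passed on as ordinary identifiers.
Terminal classify(const std::map<std::string, Terminal>& scope,
                  const std::string& id)
{
    assert(!id.empty());  // looking up an empty name is a caller error
    auto it = scope.find(id);
    return it == scope.end() ? Terminal::Identifier : it->second;
}
```

For the example above, split_qualified("A::B::c") yields the pair "A::B::" and "c", and classify would then be applied to "c" using the table for the scope that A::B:: names.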
		97
		98	`<HR>`
		99	`<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown`
		100	`Copyright © 1998.</I></P>`
		101	`</BODY>`
		102	`</HTML>`
