.\" Crown Copyright (c) 1997
.\"
.\" This TenDRA(r) Manual Page is subject to Copyright
.\" owned by the United Kingdom Secretary of State for Defence
.\" acting through the Defence Evaluation and Research Agency
.\" (DERA). It is made available to Recipients with a
.\" royalty-free licence for its use, reproduction, transfer
.\" to other parties and amendment for any purpose not excluding
.\" product development provided that any such use et cetera
.\" shall be deemed to be acceptance of the following conditions:-
.\"
.\" (1) Its Recipients shall ensure that this Notice is
.\" reproduced upon any copies or amended versions of it;
.\"
.\" (2) Any amended version of it shall be clearly marked to
.\" show both the nature of and the organisation responsible
.\" for the relevant amendment or amendments;
.\"
.\" (3) Its onward transfer from a recipient to another
.\" party shall be deemed to be that party's acceptance of
.\" these conditions;
.\"
.\" (4) DERA gives no warranty or assurance as to its
.\" quality or suitability for any purpose and DERA accepts
.\" no liability whatsoever in relation to any use to which
.\" it may be put.
.\"
.TH sid 1
.SH NAME
sid \- Syntax Improving Device; parser generator.
.SH SYNTAX
.LP
.B sid
[\fIoption\fR]... \fIfile\fR...
.SH DESCRIPTION
.LP
The
.B sid
command is used to turn descriptions of a language into a program for
recognising that language. This manual page details the command line
syntax; for more information, consult the
.B sid
user documentation. The number of files specified on the command line
varies depending upon the output language. The description of the
\fB\-\-language\fR option specifies the number of files for each language.
.SH SWITCHES
.LP
The new version of
.B sid
accepts both short form and long form command line switches.
.LP
Short form switches are single characters, and begin with a \&'-' or \&'+'
character. They can be concatenated into a single command line word, e.g.:
.IP
\fB\-vdl\fR \fIdump-file\fR \fIlanguage-name\fR
.LP
which contains three different switches (\fB\-v\fR, which takes no
arguments; \fB\-d\fR, which takes one argument: \fIdump-file\fR; and
\fB\-l\fR, which takes one argument: \fIlanguage-name\fR).
.LP
Long form switches are strings, and begin with \&'--' or \&'++'. With long
form switches, only the shortest unique prefix need be entered. The long
form of the above example would be:
.IP
\fB\-\-version\fR \fB\-\-dump\-file\fR \fIdump-file\fR
\fB\-\-language\fR \fIlanguage\-name\fR
.LP
In most cases the arguments to a switch should follow the switch as a
separate word. When several short form switches are concatenated into a
single word, their arguments should follow that word in the same order as
the switches (as in the first example). For some options, the argument may
be part of the same word as the switch (such options are shown without a
space between the switch and the argument in the switch summaries below).
In the case of short form switches, such a switch would terminate any
concatenation of switches (either a character would follow it, which would
be treated as its argument, or it would be the end of the word, and its
argument would follow as normal).
.LP
For binary switches, the \&'-' or \&'--' switch prefixes set (enable) the
switch, and the \&'+' or \&'++' switch prefixes reset (disable) the switch.
This is probably back to front, but is in keeping with other programs. The
switches \&'--' or \&'++' by themselves terminate option parsing.
.SH ERROR FILE SYNTAX
.LP
It is possible to change the error messages that
.B sid
uses. In order to do this, make the environment variable
\fISID_ERROR_FILE\fR contain the name of a file that holds the new error
messages.
.LP
The error file consists of zero or more sections. Each section begins
with a section marker (one of \fB%prefix%\fR, \fB%errors%\fR or
\fB%strings%\fR). The prefix section takes a single string (this is to
be the prefix for all error messages). The other sections take zero or
more pairs of names and strings. A name is a sequence of characters
surrounded by single quotes. A string is a sequence of characters
surrounded by double quotes. In the case of the prefix and error
sections, the strings may contain variables of the form \fB${\fIvariable
name\fB}\fR. These variables will be replaced by suitable information
when the error occurs. The backslash character can be used to escape
characters. The following C style escape sequences are recognised:
\&'\fB\\n\fR', \&'\fB\\r\fR', \&'\fB\\t\fR', \&'\fB\\0\fR'. Also, the
sequence \&'\fB\\x\fINN\fR' represents the character with code \fINN\fR
in hex. The hash character introduces a comment that runs to the end of
the line.
.LP
The \fB\-\-show\-errors\fR option may be used to get a copy of the current
error messages.
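.LP
As an illustrative sketch only, an error file following the syntax described
above might look like the fragment below. The error name and the message
text are hypothetical; the real names are best taken from the output of
\fB\-\-show\-errors\fR.
.RS
.nf
# Hypothetical example: the error name below is invented for illustration.
%prefix%
"sid: "
%errors%
\&'hypothetical error name' "cannot open the file\\n"
.fi
.RE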
.SH OPTIONS
.LP
.B sid
accepts the following command line options:
.LP
\fB\-\-dump\-file\fR \fIFILE\fR
.br
\fB\-d\fR \fIFILE\fR
.IP
This option causes intermediate dumps of the grammar to be
written to the file \fIFILE\fR.
.LP
\fB\-\-factor\-limit\fR \fILIMIT\fR
.br
\fB\-f\fR \fILIMIT\fR
.IP
This option limits the number of rules that can be created during the
factorisation process. It is probably best not to change this.
.LP
\fB\-\-help\fR
.br
\fB\-?\fR
.IP
Write an option summary to the standard error.
.LP
\fB\-\-inline\fR \fIINLINES\fR
.br
\fB\-i\fR \fIINLINES\fR
.IP
This option controls what inlining will be done in the output parser.
The inlines argument should be a comma-separated list of the following
words:
.RS 1i
.IP SINGLES
This causes single alternative rules to be inlined. This inlining is no
longer performed as a modification to the grammar (it was in version 1.0).
.IP BASICS
This causes rules that contain only basics (and no exception handlers or
empty alternatives) to be inlined. The restriction on exception
handlers and empty alternatives is rather arbitrary, and may be changed
later.
.IP TAIL
This causes tail recursive calls to be inlined. Without this, tail
recursion elimination will not be performed.
.IP OTHER
This causes other calls to be inlined wherever possible. Unless the
"MULTI" inlining is also specified, this will be done only for
productions that are called once.
.IP MULTI
This causes calls to be inlined, even if the rule being called is called
more than once. Turning this inlining on implies "OTHER". Similarly,
turning off "OTHER" inlining will turn off "MULTI" inlining. For
grammars of any size, this is probably best avoided; if used, the
generated parser may be huge (e.g. a C grammar has produced a file that
was several hundred MB in size).
.IP ALL
.br
This turns on all inlining.
.RE
.IP
In addition, prefixing a word with "NO" turns off that inlining
phase. The words may be given in any case. They are evaluated in
the order given, so:
.RS
.IP
\-\-inline noall,singles
.RE
.IP
would turn on single alternative rule inlining only, whilst:
.RS
.IP
\-\-inline singles,noall
.RE
.IP
would turn off all inlining. The default is as if
.B sid
were invoked with the option:
.RS
.IP
\-\-inline noall,basics,tail
.RE
.LP
\fB\-\-language\fR \fILANGUAGE\fR
.br
\fB\-l\fR \fILANGUAGE\fR
.IP
This option specifies the output language. Currently this should be
one of "ansi\-c", "pre\-ansi\-c", or "test". The default is
"ansi\-c".
.IP
The "ansi\-c" and "pre\-ansi\-c" languages are basically the same. The
only difference is that "ansi\-c" uses function prototypes by default,
and "pre\-ansi\-c" does not. Each language takes two input files, a
grammar file and an actions file, and produces two output files, a C
source file containing the generated parser and a C header file containing
the external declarations for the parser. The C language specific options
are:
.RS
prototypes
proto
no\-prototypes
no\-proto
.RS
These enable or disable the use of function prototypes.
.RE
split
split=\fINUMBER\fR
no\-split
.RS
These enable or disable the output file split option. The generated
files can be very large even without inlining. This option splits the
main output file into a number of components containing about \fINUMBER\fR
lines each (the default being 50000). These components are distinguished
by successively substituting 1, 2, 3, ... for the character '@' in the
output file name.
.RE
numeric\-ids
numeric
no\-numeric\-ids
no\-numeric
.RS
These enable or disable the use of numeric identifiers. Numeric
identifiers replace the identifier name with a number, which is mainly
of use in stopping identifier names getting too long. The disadvantage
is that the code becomes less readable, and more difficult to debug.
Numeric identifiers are not used by default and are never used for
terminal numbers.
.RE
casts
cast
no\-casts
no\-cast
.RS
These enable or disable casting of action and assignment operator
immutable parameters. If enabled, a parameter is cast to its own type
when it is substituted into the action. This will cause some compilers
to complain about attempts to modify the parameter (which can help pick
out attempts at mutating parameters that should not be mutated). The
disadvantage is that not all compilers will reject attempts at mutation,
and that ANSI C does not allow casts to structure and union types, which
means that some code may be illegal. Parameter casting is disabled by
default.
.RE
unreachable\-macros
unreachable\-macro
unreachable\-comments
unreachable\-comment
.RS
These choose whether unreachable code is marked by a macro or a comment.
The default is to mark unreachable code with a comment "/*UNREACHED*/";
however, a macro "UNREACHED;" may be used instead, if desired.
.RE
lines
line
no\-lines
no\-line
.RS
These determine whether "#line" directives should be output to relate the
output file to the actions file. These are generated by default.
.RE
.RE
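.IP
As an illustrative sketch, a parser split into components of about 20000
lines might be requested as shown below. The file names are hypothetical,
and the order of the files is an assumption that follows the description
above (grammar file, actions file, C source file, C header file).
.RS
.nf
sid \-l ansi\-c \-s split=20000 grammar.sid grammar.act parser@.c parser.h
.fi
.RE
.IP
Under that assumption, the '@' in the source file name would be replaced by
1, 2, 3, ... to name the individual components.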
.IP
The "test" language only takes one input file, and produces no
output file. It may be used to check that a grammar is valid. In
conjunction with the dump file, it may be used to check the
transformations that would be applied to the grammar. There are no
language specific options for the "test" language.
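.IP
For instance, assuming a hypothetical grammar file named grammar.sid, a
validity check that also writes the intermediate dumps to a file could be
sketched as:
.RS
.nf
sid \-l test \-d grammar.dump grammar.sid
.fi
.RE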
.LP
\fB\-\-show\-errors\fR
.br
\fB\-e\fR
.IP
Write the current error message list to the standard output.
.LP
\fB\-\-switch\fR \fIOPTION\fR
.br
\fB\-s\fR \fIOPTION\fR
.IP
Pass through \fIOPTION\fR as a language specific option.
.LP
\fB\-\-tab\-width\fR \fINUMBER\fR
.br
\fB\-t\fR \fINUMBER\fR
.IP
This option specifies the number of spaces that a tab occupies. It
defaults to 8. It is only used when indenting output.
.LP
\fB\-\-version\fR
.br
\fB\-v\fR
.IP
This option causes the version number and supported languages to be
written to the standard error stream.
.SH SEE ALSO
.LP
SID users' guide.