.\" 		 Crown Copyright (c) 1997
.\" 
.\" This TenDRA(r) Manual Page is subject to Copyright
.\" owned by the United Kingdom Secretary of State for Defence
.\" acting through the Defence Evaluation and Research Agency
.\" (DERA).  It is made available to Recipients with a
.\" royalty-free licence for its use, reproduction, transfer
.\" to other parties and amendment for any purpose not excluding
.\" product development provided that any such use et cetera
.\" shall be deemed to be acceptance of the following conditions:-
.\" 
.\"     (1) Its Recipients shall ensure that this Notice is
.\"     reproduced upon any copies or amended versions of it;
.\" 
.\"     (2) Any amended version of it shall be clearly marked to
.\"     show both the nature of and the organisation responsible
.\"     for the relevant amendment or amendments;
.\" 
.\"     (3) Its onward transfer from a recipient to another
.\"     party shall be deemed to be that party's acceptance of
.\"     these conditions;
.\" 
.\"     (4) DERA gives no warranty or assurance as to its
.\"     quality or suitability for any purpose and DERA accepts
.\"     no liability whatsoever in relation to any use to which
.\"     it may be put.
.\"
.TH sid 1
.SH NAME
sid \- Syntax Improving Device; parser generator.
.SH SYNTAX
.LP
.B sid
[\fIoption\fR]... \fIfile\fR...
.SH DESCRIPTION
.LP
The
.B sid
command is used to turn descriptions of a language into a program for
recognising that language.  This manual page details the command line
syntax; for more information, consult the
.B sid
user documentation.  The number of files specified on the command line
varies depending upon the output language.  The description of the
\fB\-\-language\fR option specifies the number of files for each language.
.SH SWITCHES
.LP
The new version of
.B sid
accepts both short form and long form command line switches.
.LP
Short form switches are single characters, and begin with a \&'-' or \&'+'
character.  They can be concatenated into a single command line word, e.g.:
.IP
\fB\-vdl\fR \fIdump-file\fR \fIlanguage-name\fR
.LP
which contains three different switches (\fB\-v\fR, which takes no
arguments; \fB\-d\fR, which takes one argument: \fIdump-file\fR; and
\fB\-l\fR, which takes one argument: \fIlanguage-name\fR).
.LP
Long form switches are strings, and begin with \&'--' or \&'++'.  With long
form switches, only the shortest unique prefix need be entered.  The long
form of the above example would be:
.IP
\fB\-\-version\fR \fB\-\-dump\-file\fR \fIdump-file\fR
\fB\-\-language\fR \fIlanguage\-name\fR
.LP
In most cases the argument to a switch follows the switch as a separate
word.  When several short form switches are concatenated into a single
word, their arguments follow that word in the order of the switches (as in
the first example).  For some options, the argument may be part of the
same word as the switch (such options are shown without a space between
the switch and the argument in the switch summaries below).  In the case
of short form switches, such a switch terminates any concatenation of
switches: either a character follows it, which is treated as its argument,
or it is the end of the word, and its argument follows as normal.
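.LP
The following Python sketch is one way to picture how a word of
concatenated short form switches consumes its arguments.  It is an
illustration only, not the parser that
.B sid
itself uses: the table TAKES_ARGUMENT and the function parse_short_word
are hypothetical names, only the three switches from the example above are
modelled, and arguments attached to a switch within the same word are not
handled.
.nf
```python
# Hypothetical table: True means the switch takes one argument.
# Only -v, -d and -l from the example above are modelled.
TAKES_ARGUMENT = {"v": False, "d": True, "l": True}


def parse_short_word(word, following_words):
    """Split one word of concatenated short switches into (switch, argument) pairs.

    Arguments are taken from following_words in the order of the switches.
    Returns the pairs and the words that were not consumed.
    """
    if len(word) < 2 or word[0] not in "-+":
        raise ValueError("not a short form switch word: " + word)
    remaining = list(following_words)
    pairs = []
    for letter in word[1:]:
        if letter not in TAKES_ARGUMENT:
            raise ValueError("unknown switch: " + letter)
        argument = None
        if TAKES_ARGUMENT[letter]:
            if not remaining:
                raise ValueError("missing argument for switch: " + letter)
            argument = remaining.pop(0)
        pairs.append((letter, argument))
    return pairs, remaining


# "grammar.sid" is a made-up file name used only for this illustration.
pairs, rest = parse_short_word("-vdl", ["dump-file", "ansi-c", "grammar.sid"])
print(pairs)
print(rest)
```
.fi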
.LP
For binary switches, the \&'-' or \&'--' switch prefixes set (enable) the
switch, and the \&'+' or \&'++' switch prefixes reset (disable) the switch.
This is probably back to front, but is in keeping with other programs.  The
switches \&'--' or \&'++' by themselves terminate option parsing.
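.LP
The shortest unique prefix rule for long form switches, together with the
set and reset prefixes, can be sketched in Python as follows.  This is a
sketch under stated assumptions rather than the actual implementation: the
switch names are those listed under OPTIONS, the function name
resolve_long_switch is hypothetical, the returned flag is only meaningful
for binary switches, and rejecting an ambiguous prefix with an error is an
assumption about how ambiguity is handled.
.nf
```python
# Long form switch names taken from the OPTIONS section of this page.
LONG_SWITCHES = [
    "dump-file", "factor-limit", "help", "inline", "language",
    "show-errors", "switch", "tab-width", "version",
]


def resolve_long_switch(word):
    """Return (name, enabled) for a long form switch given as a unique prefix."""
    if word[:2] == "--":
        enabled = True      # '--' sets (enables) a binary switch
    elif word[:2] == "++":
        enabled = False     # '++' resets (disables) a binary switch
    else:
        raise ValueError("not a long form switch: " + word)
    prefix = word[2:]
    if not prefix:
        raise ValueError("a bare prefix terminates option parsing")
    if prefix in LONG_SWITCHES:
        return prefix, enabled
    matches = [name for name in LONG_SWITCHES if name.startswith(prefix)]
    if len(matches) != 1:
        # Assumption: an unknown or ambiguous prefix is treated as an error.
        raise ValueError("unknown or ambiguous switch: " + word)
    return matches[0], enabled


print(resolve_long_switch("--lang"))
print(resolve_long_switch("--d"))
```
.fi
.LP
With this table, "--sw" selects the switch option, whereas "--s" is
ambiguous because it is a prefix of both "show-errors" and "switch".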
.SH ERROR FILE SYNTAX
.LP
It is possible to change the error messages that
.B sid
uses.  To do this, set the environment variable
\fISID_ERROR_FILE\fR to the name of a file containing the new error
messages.
.LP
The error file consists of zero or more sections.  Each section begins
with a section marker (one of \fB%prefix%\fR, \fB%errors%\fR or
\fB%strings%\fR).  The prefix section takes a single string (this is to
be the prefix for all error messages).  The other sections take zero or
more pairs of names and strings.  A name is a sequence of characters
surrounded by single quotes.  A string is a sequence of characters
surrounded by double quotes.  In the case of the prefix and error
sections, the strings may contain variables of the form \fB${\fIvariable
name\fB}\fR.  These variables will be replaced by suitable information
when the error occurs.  The backslash character can be used to escape
characters.  The following C style escape sequences are recognised:
\&'\fB\\n\fR', \&'\fB\\r\fR', \&'\fB\\t\fR', \&'\fB\\0\fR'.  Also, the
sequence \&'\fB\\x\fINN\fR' represents the character with code \fINN\fR
in hex.  The hash character (#) starts a comment that runs to the end of
the line.
.LP
The \fB\-\-show\-errors\fR option may be used to get a copy of the current
error messages.
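.LP
The variable substitution described above can be pictured with the
following Python sketch.  It only illustrates the idea and is not taken
from
.B sid
itself: the function name expand and the variable names "file" and "line"
are hypothetical, backslash escapes are not handled, and leaving an unknown
variable untouched is an assumption.
.nf
```python
def expand(template, values):
    """Replace each ${name} in template with the matching entry of values."""
    out = []
    pos = 0
    while True:
        start = template.find("${", pos)
        if start < 0:
            out.append(template[pos:])
            break
        end = template.find("}", start)
        if end < 0:
            # Unterminated variable: keep the rest of the text as it is.
            out.append(template[pos:])
            break
        out.append(template[pos:start])
        name = template[start + 2:end]
        # Assumption: an unknown variable is left in place unchanged.
        out.append(str(values.get(name, template[start:end + 1])))
        pos = end + 1
    return "".join(out)


# Hypothetical message text and variable names.
print(expand("${file}: line ${line}: oops", {"file": "g.sid", "line": 3}))
```
.fi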
.SH OPTIONS
.LP
.B sid
accepts the following command line options:
.LP
\fB\-\-dump\-file\fR \fIFILE\fR
.br
\fB\-d\fR \fIFILE\fR
.IP
This option causes intermediate dumps of the grammar to be
written to the file \fIFILE\fR.
.LP
\fB\-\-factor\-limit\fR \fILIMIT\fR
.br
\fB\-f\fR \fILIMIT\fR
.IP
This option limits the number of rules that can be created during the
factorisation process.  It is probably best not to change this.
.LP
\fB\-\-help\fR
.br
\fB\-?\fR
.IP
Write an option summary to the standard error.
.LP
133
\fB\-\-inline\fR \fIINLINES\fR
134
.br
135
\fB\-i\fR \fIINLINES\fR
136
.IP
137
This option controls what inlining will be done in the output parser.
138
The inlines argument should be a comma seperated list of the following
139
words:
140
.RS 1i
141
.IP SINGLES
142
This causes single alternative rules to be inlined.  This inlining is no
143
longer performed as a modification to the grammar (it was in version 1.0).
144
.IP BASICS
145
This causes rules that contain only basics (and no exception handlers or
146
empty alternatives) to be inlined.  The restriction on exception
147
handlers and empty alternatives is rather arbitrary, and may be changed
148
later.
149
.IP TAIL
150
This causes tail recursive calls to be inlined.  Without this, tail
151
recursion elimination will not be performed.
152
.IP OTHER
153
This causes other calls to be inlined wherever possible.  Unless the
154
"MULTI" inlining is also specified, this will be done only for
155
productions that are called once.
156
.IP MULTI
157
This causes calls to be inlined, even if the rule being called is called
158
more than once.  Turning this inlining on implies "OTHER".  Similarly
159
turning off "OTHER" inlining will turn off "MULTI" inlining.  For
160
grammars of any size, this is probably best avoided; if used the
161
generated parser may be huge (e.g. a C grammar has produced a file that
162
was several hundred MB in size).
163
.IP ALL
164
.br
165
This turns on all inlining.
166
.RE
167
.IP
168
In addition, prefixing a word with "NO" turns off that inlining
169
phase.  The words may be given in any case.  They are evaluated in
170
the order given, so:
171
.RS
172
.IP
173
\-inline noall,singles
174
.RE
175
.IP
176
would turn on single alternative rule inlining only, whilst:
177
.RS
178
.IP
179
\-inline singles,noall
180
.RE
181
.IP
182
would turn off all inlining.  The default is as if SID were invoked
183
with the option:
184
.RS
185
.IP
186
\-inline noall,basics,tail
187
.RE
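.IP
The left-to-right evaluation of the inline words can be modelled with the
short Python sketch below.  It follows the rules stated above (any case, a
"NO" prefix, "MULTI" implying "OTHER", and turning off "OTHER" also turning
off "MULTI"), but it is an illustration only: the names PHASES,
DEFAULT_INLINES and evaluate_inlines are hypothetical, and rejecting an
unknown word with an error is an assumption.
.nf
```python
PHASES = ("SINGLES", "BASICS", "TAIL", "OTHER", "MULTI")
DEFAULT_INLINES = "noall,basics,tail"


def evaluate_inlines(spec, enabled=()):
    """Apply a comma-separated inline list, left to right, to a set of phases."""
    state = set(enabled)
    for word in spec.split(","):
        word = word.strip().upper()      # words may be given in any case
        turn_on = True
        if word.startswith("NO"):
            turn_on = False
            word = word[2:]
        if word == "ALL":
            targets = set(PHASES)
        elif word in PHASES:
            targets = {word}
        else:
            raise ValueError("unknown inline word: " + word)
        if turn_on:
            state |= targets
            if "MULTI" in targets:
                state.add("OTHER")       # MULTI implies OTHER
        else:
            state -= targets
            if "OTHER" in targets:
                state.discard("MULTI")   # turning off OTHER turns off MULTI
    return state


print(sorted(evaluate_inlines("noall,singles")))
print(sorted(evaluate_inlines("singles,noall")))
print(sorted(evaluate_inlines(DEFAULT_INLINES)))
```
.fi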
.LP
\fB\-\-language\fR \fILANGUAGE\fR
.br
\fB\-l\fR \fILANGUAGE\fR
.IP
This option specifies the output language.  Currently this should be
one of "ansi\-c", "pre\-ansi\-c", "ossg\-c", or "test".  The default is
"ansi\-c".
.IP
The "ansi\-c" and "pre\-ansi\-c" languages are basically the same.  The
only difference is that "ansi\-c" initially uses function prototypes,
and "pre\-ansi\-c" doesn't.  The "ossg\-c" language uses macros to
declare and define functions; these macros may be defined to give either
prototypes or non-prototypes.  Each of these three languages takes two
input files, a grammar file and an actions file, and produces two output
files, a C source file containing the generated parser and a C header file
containing the external declarations for the parser.  The C language
specific options are:
.RS
prototypes
proto
ossg\-prototypes
ossg\-proto
no\-prototypes
no\-proto
.RS
These enable or disable the use of function prototypes or the OSSG
prototype macros.
.RE
split
split=\fINUMBER\fR
no\-split
.RS
These enable or disable the output file split option.  The generated
files can be very large even without inlining.  This option splits the
main output file into a number of components containing about \fINUMBER\fR
lines each (the default being 50000).  These components are distinguished
by successively substituting 1, 2, 3, ... for the character '@' in the
output file name.
.RE
numeric\-ids
numeric
no\-numeric\-ids
no\-numeric
.RS
These enable or disable the use of numeric identifiers.  Numeric
identifiers replace the identifier name with a number, which is mainly
of use in stopping identifier names getting too long.  The disadvantage
is that the code becomes less readable, and more difficult to debug.
Numeric identifiers are not used by default and are never used for
terminal numbers.
.RE
casts
cast
no\-casts
no\-cast
.RS
These enable or disable casting of the immutable parameters of actions
and assignment operators.  If enabled, a parameter is cast to its own type
when it is substituted into the action.  This will cause some compilers
to complain about attempts to modify the parameter (which can help pick
out attempts at mutating parameters that should not be mutated).  The
disadvantage is that not all compilers will reject attempts at mutation,
and that ANSI doesn't allow casting to structure and union types, which
means that some code may be illegal.  Parameter casting is disabled by
default.
.RE
unreachable\-macros
unreachable\-macro
unreachable\-comments
unreachable\-comment
.RS
These choose whether unreachable code is marked by a macro or a comment.
The default is to mark unreachable code with a comment "/*UNREACHED*/";
however, a macro "UNREACHED;" may be used instead.
.RE
lines
line
no\-lines
no\-line
.RS
These determine whether "#line" directives should be output to relate the
output file to the actions file.  These are generated by default.
.RE
.RE
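.IP
The naming of the components produced by the split option can be pictured
with the Python sketch below.  It is illustrative only: the function name
split_file_names and the file name "parser@.c" are hypothetical, and both
replacing only the first '@' and rejecting a name without one are
assumptions, since this page does not say how those cases are treated.
.nf
```python
def split_file_names(output_name, components):
    """Return the component file names for an output name containing '@'."""
    if "@" not in output_name:
        # Assumption: a name without '@' cannot be used with splitting.
        raise ValueError("output file name contains no '@' character")
    # Substitute 1, 2, 3, ... for the first '@' in the name.
    return [output_name.replace("@", str(i), 1)
            for i in range(1, components + 1)]


print(split_file_names("parser@.c", 3))
```
.fi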
273
.IP
274
The "test" language only takes one input file, and produces no
275
output file.  It may be used to check that a grammar is valid.  In
276
conjunction with the dump file, it may be used to check the
277
transformations that would be applied to the grammar.  There are no
278
language specific options for the "test" language.
279
.LP
280
\fB\-\-show\-errors\fR
281
.br
282
\fB\-e\fR
283
.IP
284
Write the current error message list to the standard output.
285
.LP
286
\fB\-\-switch\fR \fIOPTION\fR
287
.br
288
\fB\-s\fR \fIOPTION\fR
289
.IP
290
Pass through \fIOPTION\fR as a language specific option.
291
.LP
292
\fB\-\-tab\-width\fR \fINUMBER\fR
293
.br
294
\fB\-t\fR \fINUMBER\fR
295
.IP
296
This option specifies the number of spaces that a tab occupies.  It
297
defaults to 8.  It is only used when indenting output.
298
.LP
299
\fB\-\-version\fR
300
.br
301
\fB\-v\fR
302
.IP
303
This option causes the version number and supported languages to be
304
written to the standard error stream.
305
.SH SEE ALSO
306
.LP
307
SID users' guide.