.HTML "Adding Application Support for a New Architecture in Plan 9
.TL
Adding Application Support for a New Architecture in Plan 9
.AU
Bob Flandrena
bobf@plan9.bell-labs.com
.SH
Introduction
.LP
Plan 9 has five classes of architecture-dependent software:
headers, kernels, compilers and loaders, the
.CW libc
system library, and a few application programs. In general,
architecture-dependent programs
consist of a portable part shared by all architectures and a
processor-specific portion for each supported architecture.
The portable code is often compiled and stored in a library
associated with
each architecture. A program is built by
compiling the architecture-specific code and loading it with the
library. Support for a new architecture is provided
by building a compiler for the architecture, using it to
compile the portable code into libraries,
writing the architecture-specific code, and
then loading that code with
the libraries.
.LP
This document describes the organization of the architecture-dependent
code and headers on Plan 9.
The first section briefly discusses the layout of
the headers and the source code for the kernels, compilers, loaders, and the
system library,
.CW libc .
The second section provides a detailed
discussion of the structure of
.CW libmach ,
a library containing almost
all architecture-dependent code
used by application programs.
The final section describes the steps required to add
application program support for a new architecture.
.SH
Directory Structure
.PP
Architecture-dependent information for the new processor
is stored in the directory tree rooted at \f(CW/\fP\fIm\fP
where
.I m
is the name of the new architecture (e.g.,
.CW mips ).
The new directory should be initialized with several important
subdirectories, notably
.CW bin ,
.CW include ,
and
.CW lib .
The directory tree of an existing architecture
serves as a good model for the new tree.
The architecture-dependent
.CW mkfile
must be stored in the newly created root directory
for the architecture. It is easiest to copy the
mkfile for an existing architecture and modify
it for the new architecture. When the mkfile
is correct, change the
.CW OS
and
.CW CPUS
variables in
.CW /sys/src/mkfile.proto
to reflect the addition of the new architecture.
.SH
Headers
.LP
Architecture-dependent headers are stored in directory
.CW /\fIm\fP/include
where
.I m
is the name of the architecture (e.g.,
.CW mips ).
Two header files are required:
.CW u.h
and
.CW ureg.h .
The first defines fundamental data types,
bit settings for the floating point
status and control registers, and
.CW va_list
processing, which depends on the stack
model for the architecture. This file
is best built by copying and modifying the
.CW u.h
file from an architecture
with a similar stack model.
The
.CW ureg.h
file
contains a structure describing the layout
of the saved register set for
the architecture; it is defined by the kernel.
.LP
Header file
.CW /sys/include/a.out.h
contains the definitions of the magic
numbers used to identify executables for
each architecture. When support for a new
architecture is added, the magic number
for the architecture must be added to this file.
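.LP
As an illustration of what such a definition might look like, the
sketch below derives a magic number from a small per-architecture index.
The
.CW MAGIC
macro, both constant names, the index 29, and the helper
.CW isnewarch
are hypothetical; the formula is modeled on the encoding used in some
editions of
.CW a.out.h ,
so consult the actual header before choosing a value.

```c
#include <stdint.h>

/* Hypothetical encoding: each architecture picks a distinct small index b. */
#define MAGIC(b)	((((4*(b))+0)*(b))+7)

enum {
	OLD_MAGIC = MAGIC(11),	/* stands in for an existing architecture */
	NEW_MAGIC = MAGIC(29)	/* new architecture; 29 is an arbitrary unused index */
};

/* Report whether the first header word identifies the new architecture. */
int
isnewarch(uint32_t magic)
{
	return magic == NEW_MAGIC;
}
```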
.LP
The header format of a bootable executable is defined by
each manufacturer. Header file
.CW /sys/include/bootexec.h
contains structures describing the headers currently
supported. If the new architecture uses a common header
such as COFF,
the header format is probably already defined,
but if the bootable header format is non-standard,
a structure defining the format must be added to this file.
.LP
.SH
Kernel
.LP
Although the kernel depends critically on the properties of the underlying
hardware, most of the
higher-level kernel functions, including process
management, paging, pseudo-devices, and some
networking code, are independent of processor
architecture. The portable kernel code
is divided into two parts: that implementing kernel
functions and that devoted to the boot process.
Code in the first class is stored in directory
.CW /sys/src/9/port
and the portable boot code is stored in
.CW /sys/src/9/boot .
Architecture-dependent kernel code is stored in the
subdirectories of
.CW /sys/src/9
named for each architecture.
.LP
The relationship between the kernel code and the boot code
is convoluted and subtle. The portable boot code
is compiled into a library for each architecture. An architecture-specific
main program is loaded with the appropriate library and the resulting
executable is compiled into the kernel where it is executed as
a user process during the final stages of kernel initialization. The boot process
performs authentication, attaches the name space root to the appropriate
file system and starts the
.CW init
process.
.LP
The organization of the portable kernel source code differs from that
of most other architecture-specific code.
Instead of storing the portable code in a library
and loading it with the architecture-specific
code, the portable code is compiled directly into
the directory containing the architecture-specific code
and linked with the object files built from the source in that directory.
.LP
.SH
Compilers and Loaders
.LP
The compiler source code conforms to the usual
organization: portable code is compiled into a library
for each architecture
and the architecture-dependent code is loaded with
that library.
The common compiler code is stored in
.CW /sys/src/cmd/cc .
The
.CW mkfile
in this directory compiles the portable source and
archives the objects in a library for each architecture.
The architecture-specific compiler source
is stored in a subdirectory of
.CW /sys/src/cmd
with the same name as the compiler (e.g.,
.CW /sys/src/cmd/vc ).
.LP
There is no portable code shared by the loaders.
Each directory of loader source
code is self-contained, except for
a header file and an instruction name table
included from the
directory of the associated
compiler.
.LP
.SH
Libraries
.LP
Most C library modules are
portable; the source code is stored in
directories
.CW /sys/src/libc/port
and
.CW /sys/src/libc/9sys .
Architecture-dependent library code
is stored in the subdirectory of
.CW /sys/src/libc
named the same as the target processor.
Non-portable functions not only
implement architecture-dependent operations
but also supply assembly language implementations
of functions where speed is critical.
Directory
.CW /sys/src/libc/9syscall
is unusual because it
contains architecture-dependent information
for all architectures.
It holds only a header file defining
the names and numbers of system calls
and a
.CW mkfile .
The
.CW mkfile
executes an
.CW rc
script that parses the header file, constructs
assembler language functions implementing the system
call for each architecture, assembles the code,
and archives the object files in
.CW libc .
The assembler language syntax and the system interface
differ for each architecture.
The
.CW rc
script in this
.CW mkfile
must be modified to support a new architecture.
.LP
.SH
Applications
.LP
Application programs process two forms of architecture-dependent
information: executable images and intermediate object files.
Almost all processing is on executable files.
System library
.CW libmach
provides functions that convert
architecture-specific data
to a portable format so application programs
can process this data independently of its
underlying representation.
Further, when a new architecture is implemented,
almost all code changes
are confined to the library;
most affected application programs need only be reloaded.
The source code for the library is stored in
.CW /sys/src/libmach .
.LP
An application program running on one type of
processor must be able to interpret
architecture-dependent information for all
supported processors.
For example, a debugger must be able to debug
the executables of
all architectures, not just the
architecture on which it is executing, since
.CW /proc
may be imported from a different machine.
.LP
A small part of the application library
provides functions to
extract symbol references from object files.
The remainder provides the following processing
of executable files or memory images:
.IP \(bu
Header interpretation.
.IP \(bu
Symbol table interpretation.
.IP \(bu
Execution context interpretation, such as stack traces
and stack frame location.
.IP \(bu
Instruction interpretation including disassembly and
instruction size and follow-set calculations.
.IP \(bu
Exception and floating point number interpretation.
.IP \(bu
Architecture-independent read and write access through a
relocation map.
.LP
Header file
.CW /sys/include/mach.h
defines the interfaces to the
application library. Manual pages
.I mach (2),
.I symbol (2),
and
.I object (2)
describe the details of the
library functions.
.LP
Two data structures, called
.CW Mach
and
.CW Machdata ,
contain architecture-dependent parameters and
a jump table of functions.
Global variables
.CW mach
and
.CW machdata
point to the
.CW Mach
and
.CW Machdata
data structures associated with the target architecture.
An application determines the target architecture of
a file or executable image, sets the global pointers
to the data structures associated with that architecture,
and subsequently performs all references indirectly through the
pointers.
As a result, direct references to the tables for each
architecture are avoided and the application code intrinsically
supports all architectures (though only one at a time).
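.LP
The indirection through the global pointers can be sketched as follows.
This is a minimal illustration in portable C, not the library's code:
the two-field structures, the architectures
.CW archa
and
.CW archb ,
and the functions
.CW selectarch
and
.CW nextpc
are all hypothetical stand-ins for the much larger definitions in
.CW mach.h .

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-ins for the Mach and Machdata structures. */
typedef struct Mach Mach;
typedef struct Machdata Machdata;

struct Mach {
	const char *name;	/* architecture name */
	int pgsize;		/* page size in bytes */
};

struct Machdata {
	int (*instsize)(unsigned long pc);	/* size of the instruction at pc */
};

static int fixed4(unsigned long pc) { (void)pc; return 4; }
static int fixed2(unsigned long pc) { (void)pc; return 2; }

static Mach archa = { "archa", 4096 };
static Mach archb = { "archb", 8192 };
static Machdata archadata = { fixed4 };
static Machdata archbdata = { fixed2 };

/* Global pointers; application code refers only to these. */
Mach *mach = &archa;
Machdata *machdata = &archadata;

/* Select the tables for the named architecture; 0 on success, -1 if unknown. */
int
selectarch(const char *name)
{
	if(strcmp(name, archa.name) == 0){
		mach = &archa;
		machdata = &archadata;
		return 0;
	}
	if(strcmp(name, archb.name) == 0){
		mach = &archb;
		machdata = &archbdata;
		return 0;
	}
	return -1;
}

/* Architecture-neutral code: every reference goes through the pointers. */
unsigned long
nextpc(unsigned long pc)
{
	return pc + machdata->instsize(pc);
}
```
.LP
Because
.CW nextpc
never names a particular architecture, it needs no change when a
third set of tables is added; only the selection code grows.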
.LP
Object file processing is handled similarly: architecture-dependent
functions identify and
decode the intermediate files for the processor.
The application indirectly
invokes a classification function to identify
the architecture of the object code and to select the
appropriate decoding function. Subsequent calls
then use that function to decode each record. Again,
the layer of indirection allows the application code
to support all architectures without modification.
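.LP
A classify-then-decode scheme of this kind might look like the sketch
below. The record layouts, the leading identification bytes, and every
name
.CW Objtype , (
.CW objtab ,
.CW objclassify )
are assumptions made for the example and are not taken from
.CW libmach .

```c
#include <stddef.h>

/* Per-architecture pair: a predicate and a record decoder (hypothetical). */
typedef struct Objtype Objtype;
struct Objtype {
	int (*is)(const unsigned char *hdr, size_t n);
	int (*read)(const unsigned char *rec, size_t n, char *type);
};

/* Assume files for architecture x begin with 0x01 and for y with 0x02. */
static int isx(const unsigned char *h, size_t n) { return n >= 1 && h[0] == 0x01; }
static int isy(const unsigned char *h, size_t n) { return n >= 1 && h[0] == 0x02; }

/* Assume x keeps the symbol type in byte 0 of a record and y in byte 1. */
static int
readx(const unsigned char *r, size_t n, char *type)
{
	if(n < 1)
		return -1;
	*type = (char)r[0];
	return 0;
}

static int
ready(const unsigned char *r, size_t n, char *type)
{
	if(n < 2)
		return -1;
	*type = (char)r[1];
	return 0;
}

static const Objtype objtab[] = {
	{ isx, readx },
	{ isy, ready },
};

/* Classify the file once; return the matching entry or NULL if none matches. */
const Objtype*
objclassify(const unsigned char *hdr, size_t n)
{
	size_t i;

	for(i = 0; i < sizeof objtab / sizeof objtab[0]; i++)
		if(objtab[i].is(hdr, n))
			return &objtab[i];
	return NULL;
}
```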
.LP
Splitting the architecture-dependent information
between the
.CW Mach
and
.CW Machdata
data structures
allows applications to choose
an appropriate level of service. Even though an application
does not directly reference the architecture-specific data structures,
it must load the
architecture-dependent tables and code
for all architectures it supports. The size of this data
can be substantial and many applications do not require
the full range of architecture-dependent functionality.
For example, the
.CW size
command does not require the disassemblers for every architecture;
it only needs to decode the header.
The
.CW Mach
data structure contains a few architecture-specific parameters
and a description of the processor register set.
The size of the structure
varies with the size of the register
set but is generally small.
The
.CW Machdata
data structure contains
a jump table of architecture-dependent functions;
the amount of code and data referenced by this table
is usually large.
.SH
Libmach Source Code Organization
.LP
The
.CW libmach
library provides four classes of functionality:
.LP
.IP "Header and Symbol Table Decoding\ -\ "
Files
.CW executable.c
and
.CW sym.c
contain code to interpret the header and
symbol tables of
an executable file or executing image.
Function
.CW crackhdr
decodes the header,
reformats the
information into an
.CW Fhdr
data structure, and points
global variable
.CW mach
to the
.CW Mach
data structure of the target architecture.
The symbol table processing
uses the data in the
.CW Fhdr
structure to decode the symbol table.
A variety of symbol table access functions then support
queries on the reformatted table.
.IP "Debugger Support\ -\ "
Files named
.CW \fIm\fP.c ,
where
.I m
is the code letter assigned to the architecture,
contain the initialized
.CW Mach
data structure and the definition of the register
set for each architecture.
Architecture-specific debugger support functions and
an initialized
.CW Machdata
structure are stored in
files named
.CW \fIm\fPdb.c .
Files
.CW machdata.c
and
.CW setmach.c
contain debugger support functions shared
by multiple architectures.
.IP "Architecture-Independent Access\ -\ "
Files
.CW map.c ,
.CW access.c ,
and
.CW swap.c
provide access through a relocation map
to data in an executable file or executing image.
Byte-swapping is performed as needed. Global variables
.CW mach
and
.CW machdata
must point to the
.CW Mach
and
.CW Machdata
data structures of the target architecture.
.IP "Object File Interpretation\ -\ "
These files contain functions to identify the
target architecture of an
intermediate object file
and extract references to symbols. File
.CW obj.c
contains code common to all architectures;
file
.CW \fIm\fPobj.c
contains the architecture-specific source code
for the machine with code character
.I m .
.LP
The
.CW Machdata
data structure is primarily a jump
table of architecture-dependent debugger support
functions. Functions select the
.CW Machdata
structure for a target architecture based
on the value of the
.CW type
code in the
.CW Fhdr
structure or the name of the architecture.
The jump table provides functions to swap bytes, interpret
machine instructions,
perform stack
traces, find stack frames, format floating point
numbers, and decode machine exceptions. Some functions, such as
machine exception decoding, are idiosyncratic and must be
supplied for each architecture. Others depend
on the compiler run-time model and several
architectures may share code common to a model. For
example, many architectures share the code to
process the fixed-frame stack model implemented by
several of the compilers.
Finally, some
functions, such as byte-swapping, provide a general capability and
the jump table need only select an implementation appropriate
to the architecture.
.LP
.SH
Adding Application Support for a New Architecture
.LP
This section describes the
steps required to add application-level
support for a new architecture.
We assume
the kernel, compilers, loaders and system libraries
for the new architecture are already in place. This
implies that a code-character has been assigned and
that the architecture-specific headers have been
updated.
With the exception of two programs,
application-level changes are confined to header
files and the source code in
.CW /sys/src/libmach .
.LP
.IP 1.
Begin by updating the application library
header file in
.CW /sys/include/mach.h .
Add the following symbolic codes to the
.CW enum
statement near the beginning of the file:
.RS
.IP \(bu
The processor type code, e.g.,
.CW MSPARC .
.IP \(bu
The type of the executable. There are usually
two codes needed: one for a bootable
executable (i.e., a kernel) and one for an
application executable.
.IP \(bu
The disassembler type code. Add one entry for
each supported disassembler for the architecture.
.IP \(bu
A symbolic code for the object file.
.RE
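.LP
The four kinds of codes listed above might be added along the lines of
the sketch below. Everything here is hypothetical: the architecture is
called
.CW new ,
the
.CW OLD
entries stand in for whatever the header already defines, and the
grouping of all codes in one
.CW enum
with each group restarting at zero is an assumption about the header's
layout.

```c
enum {
	/* processor types */
	MOLD,
	MNEW,		/* added: processor type code */

	/* executable types */
	FNONE = 0,
	FOLD,
	FOLDB,
	FNEW,		/* added: application executable */
	FNEWB,		/* added: bootable executable */

	/* disassembler types */
	AOLD = 0,
	ANEW,		/* added: one entry per supported disassembler */

	/* object file types */
	ObjOld = 0,
	ObjNew,		/* added: object file code */
	Maxobjtype
};
```
.LP
Appending each new code at the end of its group leaves the values of
the existing codes unchanged.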
.LP
.IP 2.
In a file named
.CW /sys/src/libmach/\fIm\fP.c
(where
.I m
is the identifier character assigned to the architecture),
initialize
.CW Reglist
and
.CW Mach
data structures with values defining
the register set and various system parameters.
The source file for a similar architecture
can serve as a template.
Most of the fields of the
.CW Mach
data structure are obvious
but a few require further explanation.
.RS
.IP "\f(CWkbase\fP\ -\ "
This field
contains the address of the kernel
.CW ublock .
The debuggers
assume the first entry of the kernel
.CW ublock
points to the
.CW Proc
structure for a kernel thread.
.IP "\f(CWktmask\fP\ -\ "
This field
is a bit mask used to calculate the kernel text address from
the kernel
.CW ublock
address.
The first page of the
kernel text segment is calculated by
ANDing
the bitwise negation of this mask with
.CW kbase .
.IP "\f(CWkspoff\fP\ -\ "
This field
contains the byte offset in the
.CW Proc
data structure to the saved kernel
stack pointer for a suspended kernel thread. This
is the offset to the
.CW sched.sp
field of a
.CW Proc
table entry.
.IP "\f(CWkpcoff\fP\ -\ "
This field contains the byte offset into the
.CW Proc
data structure
of
the program counter of a suspended kernel thread.
This is the offset to
field
.CW sched.pc
in that structure.
.IP "\f(CWkspdelta\fP and \f(CWkpcdelta\fP\ -\ "
These fields
contain corrections to be added to
the stack pointer and program counter, respectively,
to properly locate the stack and next
instruction of a kernel thread. These
values bias the saved registers retrieved
from the
.CW Label
structure named
.CW sched
in the
.CW Proc
data structure.
Most architectures require no bias
and these fields contain zeros.
.IP "\f(CWscalloff\fP\ -\ "
This field
contains the byte offset of the
.CW scallnr
field in the
.CW ublock
data structure associated with a process.
The
.CW scallnr
field contains the number of the
last system call executed by the process.
The location of the field varies depending on
the size of the floating point register set
which precedes it in the
.CW ublock .
.RE
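.LP
The arithmetic behind
.CW ktmask
and the delta fields can be sketched in a few lines.
The structure
.CW Kparams
and the functions
.CW ktext
and
.CW kpc
are hypothetical, and the sketch assumes 32-bit addresses; only the
field names come from the descriptions above.

```c
#include <stdint.h>

/* Hypothetical subset of the kernel-related Mach fields. */
typedef struct Kparams Kparams;
struct Kparams {
	uint32_t kbase;		/* address of the kernel ublock */
	uint32_t ktmask;	/* mask relating ublock and text addresses */
	int32_t kspdelta;	/* correction added to the saved stack pointer */
	int32_t kpcdelta;	/* correction added to the saved program counter */
};

/* First page of kernel text: kbase with the ktmask bits cleared. */
uint32_t
ktext(const Kparams *k)
{
	return k->kbase & ~k->ktmask;
}

/* Apply the per-architecture bias to a saved program counter. */
uint32_t
kpc(const Kparams *k, uint32_t savedpc)
{
	return savedpc + (uint32_t)k->kpcdelta;
}
```
.LP
With the illustrative values
.CW kbase
= 0xC0000000 and
.CW ktmask
= 0x40000000,
.CW ktext
yields 0x80000000.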
.LP
.IP 3.
Add an entry to the initialization of the
.CW ExecTable
data structure at the beginning of file
.CW /sys/src/libmach/executable.c .
Most architectures
require two entries: one for
a normal executable and
one for a bootable
image. Each table entry contains:
.RS
.IP \(bu
Magic Number\ \-\
The big-endian magic number assigned to the architecture in
.CW /sys/include/a.out.h .
.IP \(bu
Name\ \-\
A string describing the executable.
.IP \(bu
Executable type code\ \-\
The executable code assigned in
.CW /sys/include/mach.h .
.IP \(bu
\f(CWMach\fP pointer\ \-\
The address of the initialized
.CW Mach
data structure constructed in Step 2.
You must also add the name of this table to the
list of
.CW Mach
table definitions immediately preceding the
.CW ExecTable
initialization.
.IP \(bu
Header size\ \-\
The number of bytes in the executable file header.
The size of a normal executable header is always
.CW sizeof(Exec) .
The size of a bootable header is
determined by the size of the structure
for the architecture defined in
.CW /sys/include/bootexec.h .
.IP \(bu
Byte-swapping function\ \-\
The address of
.CW beswal
or
.CW leswal
for big-endian and little-endian
architectures, respectively.
.IP \(bu
Decoder function\ \-\
The address of a function to decode the header.
Function
.CW adotout
decodes the common header shared by all normal
(i.e., non-bootable) executable files.
The header format of bootable
executable files is defined by the manufacturer and
a custom function is almost always
required to decode it.
Header file
.CW /sys/include/bootexec.h
contains data structures defining the bootable
headers for all architectures. If the new architecture
uses an existing format, the appropriate
decoding function should already be in
.CW executable.c .
If the header format is unique, then
a new function must be added to this file.
Usually the decoding function for an existing
architecture can be adopted with minor modifications.
.RE
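.LP
A reduced version of such a table, together with a lookup keyed on the
big-endian magic number, is sketched below.
The structure
.CW Exectab ,
the helpers
.CW beword
and
.CW leword
(which play the role described for
.CW beswal
and
.CW leswal ),
the function
.CW findexec ,
and all magic numbers, type codes, and header sizes are assumptions made
for the example; the real table carries more fields, including the
.CW Mach
pointer and the decoder function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced table entry. */
typedef struct Exectab Exectab;
struct Exectab {
	uint32_t magic;			/* big-endian magic number */
	const char *name;		/* description of the executable */
	int type;			/* executable type code */
	size_t hsize;			/* header size in bytes */
	uint32_t (*swal)(uint32_t);	/* converts a stored word to host order */
};

/* Interpret the bytes of v, as laid out in memory, as a big-endian word. */
uint32_t
beword(uint32_t v)
{
	const unsigned char *p = (const unsigned char*)&v;

	return (uint32_t)p[0]<<24 | (uint32_t)p[1]<<16 | (uint32_t)p[2]<<8 | (uint32_t)p[3];
}

/* Interpret the bytes of v, as laid out in memory, as a little-endian word. */
uint32_t
leword(uint32_t v)
{
	const unsigned char *p = (const unsigned char*)&v;

	return (uint32_t)p[3]<<24 | (uint32_t)p[2]<<16 | (uint32_t)p[1]<<8 | (uint32_t)p[0];
}

enum { FNEW = 1, FNEWB = 2 };	/* hypothetical executable type codes */

static const Exectab exectab[] = {
	{ 0x00000D2B, "newarch executable", FNEW, 32, beword },
	{ 0x0000BEEF, "newarch bootable image", FNEWB, 64, beword },
};

/* Find the entry whose magic matches the first four bytes of a file header. */
const Exectab*
findexec(const unsigned char *hdr, size_t n)
{
	uint32_t w;
	size_t i;

	if(n < 4)
		return NULL;
	memcpy(&w, hdr, 4);
	w = beword(w);		/* the magic number is always stored big-endian */
	for(i = 0; i < sizeof exectab / sizeof exectab[0]; i++)
		if(exectab[i].magic == w)
			return &exectab[i];
	return NULL;
}
```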
.LP
.IP 4.
Write an object file parser and
store it in file
.CW /sys/src/libmach/\fIm\fPobj.c
where
.I m
is the identifier character assigned to the architecture.
Two functions are required: a predicate to identify an
object file for the architecture and a function to extract
symbol references from the object code.
The object code format is obscure but
it is often possible to adopt the
code of an existing architecture
with minor modifications.
When these
functions are in hand, insert their addresses
in the jump table at the beginning of file
.CW /sys/src/libmach/obj.c .
.LP
.IP 5.
Implement the required debugger support functions and
initialize the parameters and jump table of the
.CW Machdata
data structure for the architecture.
This code is conventionally stored in
a file named
.CW /sys/src/libmach/\fIm\fPdb.c
where
.I m
is the identifier character assigned to the architecture.
The fields of the
.CW Machdata
structure are:
.RS
.IP "\f(CWbpinst\fP and \f(CWbpsize\fP\ -\ "
These fields
contain the breakpoint instruction and the size
of the instruction, respectively.
.IP "\f(CWswab\fP\ -\ "
This field
contains the address of a function to
byte-swap a 16-bit value. Choose
.CW leswab
or
.CW beswab
for little-endian or big-endian architectures, respectively.
.IP "\f(CWswal\fP\ -\ "
This field
contains the address of a function to
byte-swap a 32-bit value. Choose
.CW leswal
or
.CW beswal
for little-endian or big-endian architectures, respectively.
.IP "\f(CWctrace\fP\ -\ "
This field
contains the address of a function to perform a
C-language stack trace. Two general trace functions,
.CW risctrace
and
.CW cisctrace ,
traverse fixed-frame and relative-frame stacks,
respectively. If the compiler for the
new architecture conforms to one of
these models, select the appropriate function. If the
stack model is unique,
supply a custom stack trace function.
.IP "\f(CWfindframe\fP\ -\ "
This field
contains the address of a function to locate the stack
frame associated with a text address.
Generic functions
.CW riscframe
and
.CW ciscframe
process fixed-frame and relative-frame stack
models, respectively.
.IP "\f(CWufixup\fP\ -\ "
This field
contains the address of a function to adjust
the base address of the register save area.
Currently, only the
68020 requires this bias
to offset over the active
exception frame.
.IP "\f(CWexcep\fP\ -\ "
This field
contains the address of a function to produce a
text
string describing the
current exception.
Each architecture stores exception
information uniquely, so this code must always be supplied.
.IP "\f(CWbpfix\fP\ -\ "
This field
contains the address of a function to adjust an
address prior to laying down a breakpoint.
.IP "\f(CWsftos\fP\ -\ "
This field
contains the address of a function to convert a single
precision floating point value
to a string. Choose
.CW leieeesftos
for little-endian
or
.CW beieeesftos
for big-endian architectures.
.IP "\f(CWdftos\fP\ -\ "
This field
contains the address of a function to convert a double
precision floating point value
to a string. Choose
.CW leieeedftos
for little-endian
or
.CW beieeedftos
for big-endian architectures.
.IP "\f(CWfoll\fP, \f(CWdas\fP, \f(CWhexinst\fP, and \f(CWinstsize\fP\ -\ "
These fields point to functions that interpret machine
instructions.
They rely on disassembly of the instruction
and are unique to each architecture.
.CW Foll
calculates the follow set of an instruction.
.CW Das
disassembles a machine instruction to assembly language.
.CW Hexinst
formats a machine instruction as a text
string of
hexadecimal digits.
.CW Instsize
calculates the size, in bytes, of an instruction.
Once the disassembler is written, the other functions
can usually be implemented as trivial extensions of it.
.LP
It is possible to provide support for a new architecture
incrementally by filling the jump table entries
of the
.CW Machdata
structure as code is written. In general, if
a jump table entry contains a zero, application
programs requiring that function will issue an
error message instead of attempting to
call the function. For example,
the
.CW foll ,
.CW das ,
.CW hexinst ,
and
.CW instsize
jump table slots can be zeroed until a
disassembler is written.
Other capabilities, such as
stack trace or variable inspection,
can be supplied and will be available to
the debuggers but attempts to use the
disassembler will result in an error message.
.RE
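.LP
The incremental approach can be illustrated with a jump table whose
disassembler slots are still zero.
The three-field
.CW Machdata ,
the helper
.CW swap16 ,
the wrapper
.CW disasm ,
and the wording of the error message are hypothetical; the sketch only
shows the test for an empty slot that a caller would make before using
the table.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, much-reduced jump table. */
typedef struct Machdata Machdata;
struct Machdata {
	unsigned short (*swab)(unsigned short);			/* 16-bit byte swap */
	int (*das)(unsigned long pc, char *buf, size_t n);	/* disassemble */
	int (*instsize)(unsigned long pc);			/* instruction size */
};

/* Exchange the two bytes of a 16-bit value. */
static unsigned short
swap16(unsigned short v)
{
	return (unsigned short)((v << 8) | (v >> 8));
}

/* Early in a port: byte swapping is ready, the disassembler is not. */
static Machdata newdata = { swap16, 0, 0 };
Machdata *machdata = &newdata;

/* Disassemble through the jump table; report an error if the slot is empty. */
int
disasm(unsigned long pc, char *buf, size_t n)
{
	if(machdata->das == 0){
		snprintf(buf, n, "no disassembler for this architecture");
		return -1;
	}
	return machdata->das(pc, buf, n);
}
```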
.IP 6.
Update the table named
.CW machines
near the beginning of
.CW /sys/src/libmach/setmach.c .
This table binds the
file type code and machine name to the
.CW Mach
and
.CW Machdata
structures of an architecture.
The names of the initialized
.CW Mach
and
.CW Machdata
structures built in steps 2 and 5
must be added to the list of
structure definitions immediately
preceding the table initialization.
If both Plan 9 and
native disassembly are supported, add
an entry for each disassembler to the table. The
entry for the default disassembler (usually
Plan 9) must be first.
.IP 7.
Add an entry describing the architecture to
the table named
.CW trans
near the end of
.CW /sys/src/cmd/prof.c .
.IP 8.
Add an entry describing the architecture to
the table named
.CW objtype
near the start of
.CW /sys/src/cmd/pcc.c .
.IP 9.
Recompile and install
all application programs that include header file
.CW mach.h
and load with
.CW libmach.a .