.HTML "A Manual for the Plan 9 assembler
.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386, AMD64,
Power PC, and ARM.
The 68020 assembler,
.CW 2a
(no longer distributed),
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters
.CW SP
and thus inserts a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
	BRA	2(PC)
.P2
Labels are also allowed, as in
.P1
	BRA	return
	NOP
return:
	RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
	MOVL	$array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
	MOVL	array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
	BSR	exit(SB)
.P2
File-static variables have syntax
.P1
	local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
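.PP
As an illustrative sketch (the names
.CW count
and
.CW getcount
are hypothetical, not taken from any Plan 9 source), a file-static word might be defined and read as follows, using the
.CW DATA
and
.CW GLOBL
pseudo-instructions described under ``Laying down data'':
.P1
/* a 4-byte file-static variable, initialized to 1 */
	DATA	count<>+0(SB)/4, $1
	GLOBL	count<>(SB), $4

/* return its value in R0 */
TEXT	getcount(SB), $0
	MOVL	count<>+0(SB), R0
	RTS
.P2
Because of the
.CW <> ,
another file may define its own
.CW count
without conflict.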
.PP
When a program starts, it must execute
.P1
	MOVL	$a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a 32-bit constant
into a register in a single instruction, constants are loaded through the static base
register.  The loader recognizes code that initializes the static
base register and treats it specially.  You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
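.PP
A small sketch of these rules, assuming the hypothetical names
.CW NARGS ,
.CW ARGSIZE ,
and
.CW argbytes :
the preprocessor substitutes the macros, and the assembler evaluates the parenthesized C constant expression.
.P1
#define	NARGS	2
#define	ARGSIZE	4

/* return NARGS*ARGSIZE, that is 8, in R0 */
TEXT	argbytes(SB), $0
	MOVL	$(NARGS*ARGSIZE), R0
	RTS
.P2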
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
	LONG	$12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
to that for 64-bit values.
The 960 has only one,
.CW LONG .)
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there.  For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
	DATA    array+0(SB)/1, $'a'
	DATA    array+1(SB)/1, $'b'
	DATA    array+2(SB)/1, $'c'
	GLOBL   array(SB), $4
.P2
or
.P1
	DATA    array+0(SB)/4, $"abc\ez"
	GLOBL   array(SB), $4
.P2
The
.CW /1
gives the number of bytes to define,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
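.PP
For instance, a 12-byte string could be built from an 8-byte piece and a 4-byte piece.
This is a sketch; the symbol
.CW msg
is hypothetical.
.P1
/* "hello world" plus a terminating null: 12 bytes */
	DATA	msg+0(SB)/8, $"hello wo"
	DATA	msg+8(SB)/4, $"rld\ez"
	GLOBL	msg(SB), $12
.P2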
.PP
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
	DYNT	, ALEF_SI_5+0(SB)
	DYNT	ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.  In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement.  Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT	sum(SB), $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling of the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT	sum(SB), 1, $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
This bit was emitted only by the Alef compilers.
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
A caller is responsible for saving any registers it needs preserved across a call,
so a subroutine is free to use any registers without saving them (``caller saves'').
.CW A6
and
.CW A7
are the exceptions as described above.
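.PP
A sketch of a procedure that asks for automatic storage; the names
.CW twice ,
.CW n ,
and
.CW tmp
are hypothetical, and only constructs described above are used.
It reserves four bytes, stores its argument in the first automatic, and returns the argument doubled:
.P1
TEXT	twice(SB), $4
	MOVL	n+0(FP), R0	/* first argument */
	MOVL	R0, tmp+0(SP)	/* save it in the first automatic */
	ADDL	tmp+0(SP), R0	/* R0 = n + n */
	RTS
.P2
The result is left in
.CW R0 ,
as C callers expect.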
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc.  Instead the context
does the job.  For example,
.P1
	MOVL	$1, R1
	MOVL	A0, R2
	MOVW	SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities.  Notable examples are the bitfield
instructions, the
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
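.PP
For example, on the 68020 the hardware no-op has the encoding
.CW 0x4E71 ,
so a true no-op could be laid down as shown here.
This sketch assumes that
.CW WORD
lays down a 16-bit quantity on the 68020.
.P1
	WORD	$0x4E71	/* 68020 NOP opcode, emitted verbatim */
.P2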
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
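.PP
Under these conventions, the
.CW sum
procedure shown earlier for the 68020 might look as follows on the MIPS.
This is a sketch: the argument name is hypothetical, the use of
.CW R2
as a scratch register is arbitrary, and the return is written
.CW RET
as in Plan 9's MIPS library sources, which this manual does not itself specify.
.P1
TEXT	sum(SB), $0
	MOVW	b+4(FP), R2	/* second argument is on the stack */
	ADDU	R2, R1	/* first argument arrives in R1; result stays there */
	RET
.P2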
.PP
The loader uses
.CW R28
as a temporary.  The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries.  Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler by name.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define	STATUS	12
	MOVW	M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions.  If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
.PP
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
.CW MOVV ,
.CW MOVVL ,
.CW ADDV ,
.CW ADDVU ,
.CW SUBV ,
.CW SUBVU ,
.CW MULV ,
.CW MULVU ,
.CW DIVV ,
.CW DIVVU ,
.CW SLLV ,
.CW SRLV ,
and
.CW SRAV .
The assembler does not have any cache, load-linked, or store-conditional instructions.
.PP
Some assembler instructions are expanded into multiple instructions by the loader.
For example, the loader may convert the load of a 32-bit constant into an
.CW lui
followed by an
.CW ori .
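.PP
For instance, a move of a large constant is written as one instruction and left to the loader to expand.
The constant below is arbitrary, and the exact instructions generated are up to the loader.
.P1
	MOVW	$0x12345678, R1	/* loader may emit a lui, ori pair */
.P2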
.PP
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
The loader generates
.P1
	NOR	R0, R0, R0
.P2
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently.  Also,
.CW WORD
pseudo-ops are scheduled like no-ops.
.PP
The
.CW NOSCHED
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
.CW SCHED
re-enables it.
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
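.PP
A sketch of hand-scheduled code around a control register access, reusing the
.CW STATUS
definition shown earlier.
How many no-ops a given model needs is an assumption here; consult the processor's documentation.
.P1
#define	STATUS	12
	NOSCHED
	MOVW	M(STATUS), R1	/* read the control register */
	NOR	R0, R0, R0	/* hand-placed no-op after the access */
	SCHED
.P2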
.SH
SPARC
.PP
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
.CW R0
through
.CW R31 .
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
.CW R1
[sic] is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R7
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
.CW 0(FP) .
.CW R14
is the loader temporary.
.PP
Floating-point registers are exactly as on the MIPS.
.PP
The control registers are known by names such as
.CW FSR .
The instructions to access these registers are
.CW MOVW
instructions, for example
.P1
	MOVW	Y, R8
.P2
for the SPARC instruction
.P1
	rdy	%r8
.P2
.PP
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
.CW sethi
instructions, adds, etc.
Instructions read from left to right.  Because the arguments are
flipped to
.CW SUBCC ,
the condition codes are not inverted as on the MIPS.
.PP
An ASI is written inside the parentheses, after the register; for example, to move a word from ASI 2:
.P1
	MOVW	(R7, 2), R8
.P2
The syntax for double indexing is
.P1
	MOVW	(R7+R8), R9
.P2
.PP
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
.P1
	ORN	R0, R0, R0
.P2
.SH
i960
.PP
Registers are numbered
.CW R0
through
.CW R31 .
The stack pointer is
.CW R29 ;
the return register is
.CW R4 ;
the static base is
.CW R28 ,
which is initialized to the address of
.CW setSB(SB) .
.CW R3
must be zero; this should be done manually early in execution by
.P1
	SUBO	R3, R3
.P2
.CW R27
is the loader temporary.
.PP
There is no support for floating point.
.PP
The Intel calling convention is not supported and cannot be used; use
.CW BAL
instead.
Instructions are mostly as in the book.  The major change is that
.CW LOAD
and
.CW STORE
are both called
.CW MOV .
The extension character for
.CW MOV
is as in the manual:
.CW O
for ordinal,
.CW W
for signed, etc.
.SH
i386
.PP
The assembler assumes 32-bit protected mode.
The register names are
.CW SP ,
.CW AX ,
.CW BX ,
.CW CX ,
.CW DX ,
.CW BP ,
.CW DI ,
and
.CW SI .
The stack pointer (not a pseudo-register) is
.CW SP
and the return register is
.CW AX .
There is no physical frame pointer but, as for the MIPS,
.CW FP
is a pseudo-register that acts as
a frame pointer.
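.PP
As a sketch, the
.CW sum
procedure shown earlier for the 68020 might be written as follows for the 386.
The argument names are hypothetical, and the return mnemonic
.CW RET
is taken from Plan 9's 386 sources rather than from this manual.
.P1
TEXT	sum(SB), $0
	MOVL	a+0(FP), AX	/* first argument */
	ADDL	b+4(FP), AX	/* add the second; result is returned in AX */
	RET
.P2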
.PP
Opcode names are mostly the same as those listed in the Intel manual
with an
.CW L ,
.CW W ,
or
.CW B
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
and
.CW GS )
or memory are written
as
.P1
	MOV\f2x\fP	src,dst
.P2
where
.I x
is
.CW L ,
.CW W ,
or
.CW B .
Thus to get
.CW AL
use a
.CW MOVB
instruction.  If you need to access
.CW AH ,
you must mention it explicitly in a
.CW MOVB :
.P1
	MOVB	AH, BX
.P2
There are many examples of illegal moves, for example,
.P1
	MOVB	BP, DI
.P2
that the loader actually implements as pseudo-operations.
.PP
The names of conditions in all conditional instructions
.CW J , (
.CW SET )
follow the conventions of the 68020 instead of those of the Intel
assembler:
.CW JOS ,
.CW JOC ,
.CW JCS ,
.CW JCC ,
.CW JEQ ,
.CW JNE ,
.CW JLS ,
.CW JHI ,
.CW JMI ,
.CW JPL ,
.CW JPS ,
.CW JPC ,
.CW JLT ,
.CW JGE ,
.CW JLE ,
and
.CW JGT
instead of
.CW JO ,
.CW JNO ,
.CW JB ,
.CW JNB ,
.CW JZ ,
.CW JNZ ,
.CW JBE ,
.CW JNBE ,
.CW JS ,
.CW JNS ,
.CW JP ,
.CW JNP ,
.CW JL ,
.CW JNL ,
.CW JLE ,
and
.CW JNLE .
.PP
The addressing modes have syntax like
.CW AX ,
.CW (AX) ,
.CW (AX)(BX*4) ,
.CW 10(AX) ,
and
.CW 10(AX)(BX*4) .
The offsets from
.CW AX
can be replaced by offsets from
.CW FP
or
.CW SB
to access names, for example
.CW extern+5(SB)(AX*2) .
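.PP
A few illustrative loads using these modes; the symbol
.CW table
and the argument name
.CW p
are hypothetical.
.P1
	MOVL	(AX), CX	/* indirect through AX */
	MOVL	10(AX)(BX*4), CX	/* offset plus scaled index */
	MOVL	table+0(SB)(AX*4), DX	/* external symbol, indexed */
	MOVL	p+0(FP), SI	/* first argument */
.P2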
.PP
Other notes: Non-relative
.CW JMP
and
.CW CALL
have a
.CW *
added to the syntax.
Only
.CW LOOP ,
.CW LOOPEQ ,
and
.CW LOOPNE
are legal loop instructions.  Only
.CW REP
and
.CW REPN
are recognized repeaters.  These are not prefixes, but rather
stand-alone opcodes that precede the string instructions, for example
.P1
	CLD; REP; MOVSL
.P2
Segment override prefixes in
.CW MOD/RM
fields are not supported.
.SH
AMD64
.PP
The assembler assumes 64-bit mode unless a
.CW MODE
pseudo-operation is given:
.P1
	MODE $32
.P2
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
.CW R8
to
.CW R15 .
All registers are 64 bits, but instructions access low-order 8, 16 and 32 bits
as described in the processor handbook.
For example,
.CW MOVL
to
.CW AX
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
.CW MOVQ ,
which allows 64-bit literals.
The external registers in Plan 9's C are allocated from
.CW R15
down.
.PP
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
MMX registers are
.CW M0
to
.CW M7 ,
and
XMM registers are
.CW X0
to
.CW X15 .
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
uniformly use
.CW L
for `long word' (32 bits) and
.CW Q
for `quad word' (64 bits).
Some instructions use
.CW O
(`octword') for 128-bit values, where the processor handbook
variously uses
.CW O
or
.CW DQ .
The assembler also consistently uses
.CW PL
for `packed long' in
XMM instructions, instead of
.CW Q ,
.CW DQ
or
.CW PI .
Either
.CW MOVL
or
.CW MOVQ
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
.PP
C's
.CW long
.CW long
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW long
.CW long
and
.CW unsigned
.CW long
.CW long
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
Unlike on the 386, the first integer or pointer argument is passed in a register,
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
.CW AX
holds the return value from subroutines as before.
Floating-point results are returned in
.CW X0 ,
although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8-byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
even though bytes 4 to 7 are not initialized.
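.PP
A sketch of these conventions, with hypothetical names: a function that adds two 64-bit integers.
The first argument arrives in
.CW RARG ;
the second is in the next 8-byte slot on the stack.
The return mnemonic
.CW RET
is an assumption carried over from Plan 9's 386 sources.
.P1
TEXT	sum(SB), $0
	MOVQ	RARG, AX	/* first argument, passed in BP */
	ADDQ	b+8(FP), AX	/* second argument, 8 bytes further on */
	RET
.P2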
.SH
Power PC
.PP
The Power PC follows the Plan 9 model set by the MIPS and SPARC,
not the elaborate ABIs.
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
there is no support for the older POWER instructions.
Registers are
.CW R0
through
.CW R31 .
.CW R0
is initialized to zero; this is done by C start up code
and assumed by the compiler and loader.
.CW R1
is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R3
is the return register and also the register holding the first
argument to a C function, with space reserved at
.CW 0(FP)
as on the MIPS.
.CW R31
is the loader temporary.
The external registers in Plan 9's C are allocated from
.CW R30
down.
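.PP
As a sketch, a procedure returning the sum of its two arguments might look as follows on the Power PC.
The argument name and the scratch register
.CW R4
are hypothetical choices, and the return is written
.CW RETURN
as in Plan 9's Power PC library sources, which this manual does not itself specify.
.P1
TEXT	sum(SB), $0
	MOVW	b+4(FP), R4	/* second argument is on the stack */
	ADD	R4, R3	/* first argument arrives in R3; result stays there */
	RETURN
.P2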
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention, several registers are initialized
to specific values; this is done by the operating system.
.CW F27
must be initialized to the value
.CW 0x4330000080000000
(used by float-to-int conversion),
.CW F28
to the value 0.0,
.CW F29
to 0.5,
.CW F30
to 1.0, and
.CW F31
to 2.0.
.PP
As on the MIPS and SPARC, the assembler accepts arbitrary literals
as operands to
.CW MOVW ,
and also to
.CW ADD
and others where `immediate' variants exist,
and the loader generates sequences
of
.CW addi ,
.CW addis ,
.CW oris ,
etc. as required.
The register indirect addressing modes use the same syntax as the SPARC,
including double indexing when allowed.
.PP
The instruction names are generally derived from the Motorola ones,
subject to slight transformation:
the
.CW . ' `
marking the setting of condition codes is replaced by
.CW CC ,
and when the letter
.CW o ' `
represents `OE=1' it is replaced by
.CW V .
Thus
.CW add ,
.CW addo.
and
.CW subfzeo.
become
.CW ADD ,
.CW ADDVCC
and
.CW SUBFZEVCC .
As well as the three-operand conditional branch instruction
.CW BC ,
the assembler provides pseudo-instructions for the common cases:
.CW BEQ ,
.CW BNE ,
.CW BGT ,
.CW BGE ,
.CW BLT ,
.CW BLE ,
.CW BVC ,
and
.CW BVS .
The unconditional branch instruction is
.CW BR .
Indirect branches use
.CW "(CTR)"
or
.CW "(LR)"
as target.
.PP
Load or store operations are replaced by
.CW MOV
variants in the usual way:
.CW MOVW
(move word),
.CW MOVH
(move halfword with sign extension), and
.CW MOVB
(move byte with sign extension, a pseudo-instruction),
with unsigned variants
.CW MOVHZ
and
.CW MOVBZ ,
and byte-reversing
.CW MOVWBR
and
.CW MOVHBR .
`Load or store with update' versions are
.CW MOVWU ,
.CW MOVHU ,
and
.CW MOVBZU .
Load or store multiple is
.CW MOVMW .
The exceptions are the string instructions, which are
.CW LSW
and
.CW STSW ,
and the reservation instructions
.CW lwarx
and
.CW stwcx. ,
which are
.CW LWAR
and
.CW STWCCC ,
all with operands in the usual data-flow order.
Floating-point load or store instructions are
.CW FMOVD ,
.CW FMOVDU ,
.CW FMOVS ,
and
.CW FMOVSU .
The register to register move instructions
.CW fmr
and
.CW fmr.
are written
.CW FMOVD
and
.CW FMOVDCC .
.PP
The assembler knows the commonly used special purpose registers:
.CW CR ,
.CW CTR ,
.CW DEC ,
.CW LR ,
.CW MSR ,
and
.CW XER .
The rest, which are often architecture-dependent, are referenced as
.CW SPR(n) .
The segment registers of the 60x series are similarly
.CW SEG(n) ,
but
.I n
can also be a register name, as in
.CW SEG(R3) .
Moves between special purpose registers and general purpose ones,
when allowed by the architecture,
are written as
.CW MOVW ,
replacing
.CW mfcr ,
.CW mtcr ,
.CW mfmsr ,
.CW mtmsr ,
.CW mtspr ,
.CW mfspr ,
.CW mftb ,
and many others.
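.PP
A brief sketch of such moves follows; the register choices are arbitrary,
and the number 26 is used on the assumption that it selects
.CW SRR0 ,
as it does on the 60x series:
.P1
	MOVW	LR, R5		/* mflr */
	MOVW	R5, CTR		/* mtctr */
	MOVW	MSR, R6		/* mfmsr */
	MOVW	CR, R7		/* mfcr */
	MOVW	SPR(26), R8	/* mfspr */
.P2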
.PP
The fields of the condition register
.CW CR
are referenced as
.CW CR(0)
through
.CW CR(7) .
They are used by the
.CW MOVFL
(move field) pseudo-instruction,
which produces
.CW mcrf
or
.CW mtcrf .
For example:
.P1
	MOVFL	CR(3), CR(0)
	MOVFL	R3, CR(1)
	MOVFL	R3, $7, CR
.P2
They are also accepted in
the conditional branch instruction, for example
.P1
	BEQ	CR(7), label
.P2
Fields of the
.CW FPSCR
are accessed using
.CW MOVFL
in a similar way:
.P1
	MOVFL	FPSCR, F0
	MOVFL	F0, FPSCR
	MOVFL	F0, $7, FPSCR
	MOVFL	$0, FPSCR(3)
.P2
producing
.CW mffs ,
.CW mtfsf
or
.CW mtfsfi ,
as appropriate.
.SH
ARM
.PP
The assembler provides access to
.CW R0
through
.CW R14
and the
.CW PC .
The stack pointer is
.CW R13 ,
the link register is
.CW R14 ,
and the static base register is
.CW R12 .
.CW R0
is the return register and also the register holding
the first argument to a subroutine.
The external registers in Plan 9's C are allocated from
.CW R10
down.
.CW R11
is used by the loader as a temporary register.
The assembler supports the
.CW CPSR
and
.CW SPSR
registers.
It also knows about coprocessor registers
.CW C0
through
.CW C15 .
Floating-point registers are
.CW F0
through
.CW F7 ,
.CW FPSR
and
.CW FPCR .
.PP
As with the other architectures, loads and stores are called
.CW MOV ,
e.g.
.CW MOVW
for load word or store word, and
.CW MOVM
for
load or store multiple,
depending on the operands.
.PP
Addressing modes are supported by suffixes to the instructions:
.CW .IA
(increment after),
.CW .IB
(increment before),
.CW .DA
(decrement after), and
.CW .DB
(decrement before).
These can only be used with the
.CW MOV
instructions.
The move multiple instruction,
.CW MOVM ,
defines a range of registers using brackets, e.g.
.CW [R0-R12] .
The special
.CW MOVM
addressing mode bits
.CW W ,
.CW U ,
and
.CW P
are written in the same manner, for example,
.CW MOVM.DB.W .
A
.CW .S
suffix allows a
.CW MOVM
instruction to access user
.CW R13
and
.CW R14
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
.CW <<
(logical left shift),
.CW >>
(logical right shift),
.CW ->
(arithmetic right shift), and
.CW @>
(rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
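.PP
The following lines sketch how these suffixes and operators combine;
the registers and shift counts are arbitrary choices made for illustration:
.P1
	MOVM.DB.W	[R4-R11], (R13)	/* store R4-R11 below R13, updating R13 */
	MOVM.IA.W	(R13), [R4-R11]	/* load them back, updating R13 */
	MOVW	R3@>8, R4		/* R4 = R3 rotated right 8 bits */
	ADD	R1<<2, R0, R2		/* R2 = R0 + (R1<<2) */
.P2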
.PP
Any instruction can be followed by a suffix that makes the instruction conditional:
.CW .EQ ,
.CW .NE ,
and so on, as in the ARM manual, with synonyms
.CW .HS
(for
.CW .CS )
and
.CW .LO
(for
.CW .CC ),
for example
.CW ADD.NE .
Arithmetic
and logical instructions
can have a
.CW .S
suffix, as ARM allows, to set condition codes.
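.PP
For instance, a sketch with arbitrarily chosen registers:
.P1
	CMP	$0, R1		/* compare R1 with 0 */
	ADD.NE	$1, R0		/* R0 = R0+1 only if R1 != 0 */
	MOVW.EQ	$0, R0		/* R0 = 0 only if R1 == 0 */
	SUB.S	$1, R2		/* R2 = R2-1, setting condition codes */
.P2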
.PP
The syntax of the
.CW MCR
and
.CW MRC
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
.CW CMP ,
.CW ADD ,
.CW SUB ,
.CW MUL ,
and
.CW DIV ,
all with
.CW F
or
.CW D
suffix selecting single or double precision.
Floating-point loads and stores become
.CW MOVF
and
.CW MOVD .
Conversion instructions are also specified by moves:
.CW MOVWD ,
.CW MOVWF ,
.CW MOVDW ,
.CW MOVFW ,
.CW MOVFD ,
and
.CW MOVDF .
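.PP
A short sketch of these forms, with arbitrarily chosen registers and offsets:
.P1
	MOVWD	R0, F0		/* convert word in R0 to double in F0 */
	MOVD	0(R1), F1	/* load double from memory at R1 */
	ADDD	F1, F0		/* F0 = F0 + F1 */
	MOVDF	F0, F2		/* convert double to single */
	MOVF	F2, 8(R1)	/* store single at R1+8 */
.P2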