.FP palatino
.TM
.TL
Plan 9 on the Mikrotik RB450G Routerboard
.AU
Geoff Collyer
.AI
.MH
.NH 1
Motivation
.LP
I ported Plan 9 to the Routerboard mainly to verify
that Plan 9's MIPS-related code
(compiler, assembler, loader,
.CW libmach ,
etc.) was still in working order and would
work on newer machines than the 1993-era ones that we last owned
(MIPS Magnum, SGI Challenge, Carrera and the like).
The verdict is that,
with a few surprising exceptions, the code still works on newish machines
(the MIPS 24K CPU in the Routerboard dates to about 2003 originally;
this revision is from about 2005).
So we now have a
machine on which to test MIPS executables.
.LP
The other reason I did the port was
as an incremental step toward
running Plan 9 on a MIPS64 machine (e.g., the dual-core, dual-issue
Cavium CN5020 in the Ubiquiti Edgerouter Lite 3).
.NH 1
The new MIPS world
.LP
These newer MIPS systems are aimed at embedded applications, so they
typically lack FPUs and may also lack L2 caches or have small TLBs;
the MIPS 24K in the Atheros 7161 SoC lacks an FPU and an L2 cache, and has a
16-entry TLB.
It is a MIPS32R2 architecture system and lacks the 64-bit instructions
of the R4000.
These new MIPS systems are still big-endian,
so they provide a useful test case to expose byte-ordering bugs.
.NH 1
Plan 9 changes and additions
.NH 2
CPU Bug Workarounds
.LP
The Linux MIPS people cite MIPS 24K erratum 48:
3 consecutive stores lose data.
MIPS only distribute their errata lists under NDA and to their
corporate partners, so we have only the Linux report to go on.
The fix requires
.I both
write-through data cache and
no more than two consecutive single-word stores in all executables.
I have made a crude optional change to
.I vl
to generate a NOP before every third consecutive store.
The fix could be better, in particular the technique for
keeping stores out of branch delay slots.
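.LP
The store-splitting rule can be illustrated with a minimal sketch.
It is not the change made to
.I vl ;
the instruction kinds and the function name
.CW breakstores
are hypothetical, and branch delay slots are ignored entirely.
It only shows the counting:
a NOP goes in before every third store in a run, so that
no more than two stores are ever adjacent.

```c
#include <stddef.h>

/* Hypothetical, simplified instruction kinds; vl's real representation differs. */
enum { OTHER, STORE, NOP };

/*
 * Copy in[0..n) to out, inserting a NOP before every third
 * consecutive store so that at most two stores are adjacent.
 * out must have room for n + n/2 entries.
 * Returns the number of entries written to out.
 */
size_t
breakstores(const int *in, size_t n, int *out)
{
	size_t i, o = 0;
	int run = 0;		/* stores emitted since the last non-store */

	for(i = 0; i < n; i++){
		if(in[i] == STORE){
			if(run == 2){
				out[o++] = NOP;	/* break up the run */
				run = 0;
			}
			run++;
		}else
			run = 0;
		out[o++] = in[i];
	}
	return o;
}
```

A run of five stores thus becomes store, store, NOP, store, store, NOP, store.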
.NH 2
Driver for Undocumented Ethernet Controller
.LP
The FreeBSD Atheros
.I arge
driver
(in
.CW /usr/src/sys/mips/atheros )
provided inspiration for our Gigabit Ethernet driver, since the
hardware is otherwise largely undocumented.
I haven't got the second
Ethernet controller entirely working yet;
it's perhaps complicated by having a switch attached to it (the Atheros 8316).
At minimum, it probably needs MII or PHY initialisation.
.NH 2
Floating-point Emulation
.LP
Floating-point emulation works but is
.I very
slow:
.I astro
takes about 8 seconds.
I added an
.CW fpemudebug
command to
.CW /dev/archctl ;
it
takes a number as argument corresponding to the
.CW Dbg*
bits in
.CW fpimips.c ,
but requires the kernel to be compiled with
.CW FPEMUDEBUG
defined.
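.LP
As an illustration only, a control message of the form
.CW "fpemudebug N"
could be parsed roughly as follows.
This is a sketch in portable C, not the kernel's code:
the function name
.CW archctlwrite
and the variable holding the mask are hypothetical,
and the meaning of the individual bits is left to
.CW fpimips.c .

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the emulator's debug mask. */
static unsigned long fpemudebug;

/*
 * Handle one control message of the form "fpemudebug <number>".
 * The number may be decimal, octal (leading 0) or hex (leading 0x).
 * Returns 0 on success, -1 if the message is malformed.
 */
int
archctlwrite(const char *msg)
{
	static const char cmd[] = "fpemudebug";
	size_t len = sizeof cmd - 1;
	char *end;
	unsigned long v;

	if(strncmp(msg, cmd, len) != 0)
		return -1;
	if(msg[len] != ' ' && msg[len] != '\t')
		return -1;		/* no argument, or a longer command name */
	v = strtoul(msg + len, &end, 0);
	if(end == msg + len)
		return -1;		/* no digits found */
	while(*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if(*end != '\0')
		return -1;		/* trailing junk */
	fpemudebug = v;
	return 0;
}
```

From the shell, the command would presumably be issued by writing
a line such as
.CW "fpemudebug 3"
to
.CW /dev/archctl ;
the value 3 here is arbitrary.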
.NH 3
\&... in Locking Code
.LP
The big surprises included that
.CW /sys/src/libc/mips/lock.c
read
.CW FCR0
to
choose the locking style.
That's been broken out into
.CW c_fcr0.s
so that we can change it, but the kernel also emulates the
.CW MOVW
.CW FCR0,R1
(and via a fast code path), to keep alive the possibility of running
old binaries from the dump.
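.LP
To make the emulation concrete, here is a minimal sketch of how
a trap handler might recognise the instruction.
.CW MOVW
.CW FCR0,R1
corresponds to the MIPS instruction CFC1 with FPU control register 0
as its source.
The function names and the value returned for
.CW FCR0
are hypothetical; this is not the kernel's fast path,
and the caller would still have to advance the PC past the instruction.

```c
#include <stdint.h>

enum {
	Cop1	= 0x11,		/* major opcode: coprocessor 1 (FPU) */
	Cf	= 0x02,		/* rs field: CFC1, move from FPU control register */
};

/*
 * If insn is CFC1 rt, $0 (a read of FCR0), return the
 * destination register number rt; otherwise return -1.
 */
int
fcr0read(uint32_t insn)
{
	if((insn >> 26) != Cop1)
		return -1;
	if(((insn >> 21) & 0x1f) != Cf)
		return -1;
	if(((insn >> 11) & 0x1f) != 0)	/* fs must be control register 0 */
		return -1;
	if((insn & 0x7ff) != 0)		/* low 11 bits are zero in CFC1 */
		return -1;
	return (insn >> 16) & 0x1f;
}

/*
 * Emulate a read of FCR0 by storing fakefcr0 (a made-up value
 * supplied by the caller) in the destination register.
 * Returns 1 if insn was emulated, 0 if it was some other instruction.
 */
int
emufcr0(uint32_t insn, uint32_t regs[32], uint32_t fakefcr0)
{
	int rt;

	rt = fcr0read(insn);
	if(rt < 0)
		return 0;
	if(rt != 0)			/* R0 is hardwired to zero */
		regs[rt] = fakefcr0;
	return 1;
}
```

Under this encoding,
.CW MOVW
.CW FCR0,R1
assembles to
.CW 0x44410000 .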
.NH 2
No 64-bit Instructions
.LP
The other big surprise was that
.CW /sys/src/libmp/mips/mpdigdiv.s
used 64-bit instructions (SLLV, SRLV, ADDVU, DIVVU).
For now I've resolved the problem by pushing it into a
subdirectory (\c
.CW r4k )
and editing the
.CW mkfile s
to use the
.CW port
version
(and similarly in APE).
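.LP
For illustration, the operation in question
(dividing a two-word dividend by a one-word divisor)
can be done with 32-bit operations only.
The sketch below uses plain shift-and-subtract division;
it is not the
.CW port
version, and the name
.CW digdiv32
and the choice to saturate on overflow are assumptions.

```c
#include <stdint.h>

/*
 * Divide the 64-bit value hi:lo by d using only 32-bit operations
 * and return the 32-bit quotient.  If the quotient would not fit
 * in 32 bits (d == 0 or hi >= d), return the largest 32-bit value.
 */
uint32_t
digdiv32(uint32_t hi, uint32_t lo, uint32_t d)
{
	uint32_t q = 0, carry;
	int i;

	if(d == 0 || hi >= d)
		return 0xFFFFFFFFu;
	/* invariant: hi, the running remainder, is less than d */
	for(i = 0; i < 32; i++){
		carry = hi >> 31;		/* bit about to be shifted out */
		hi = (hi << 1) | (lo >> 31);	/* bring down the next dividend bit */
		lo <<= 1;
		q <<= 1;
		if(carry || hi >= d){
			hi -= d;		/* wraps to the right value when carry is set */
			q |= 1;
		}
	}
	return q;
}
```

One quotient bit is produced per iteration, so the loop always runs
32 times; a table-driven or digit-estimating method would be faster.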
.br
.ne 8
.NH 2
Page Size vs TLB Faults
.LP
I started out with a 4K page size and reduced the number of TLB
entries reserved for the kernel to 2, leaving 14 for user programs,
but
.CW /dev/sysstat
was reporting 6 times as many TLB faults as page
faults, and the number increased at a furious rate.
.LP
So I switched to
a 16K page size, adjusted
.CW vl
.CW -H2
accordingly and recompiled the
.CW /mips
world.
This reduced the TLB faults to just 10% more than the number of page faults.
(That number is now around 15% more, due to a better soft-TLB hash function
that makes the soft TLB more effective.)
However, 16K pages also produce consecutive (even recursive) page faults
for the same address at the same PC
and the system runs at about 10% of its normal speed,
so 4K pages are currently the only sensible choice;
we'll just live with the absurdly high number of TLB faults
(around 20k–30k per second).
It probably doesn't help that one 16K page is half of the L1 data cache
and one quarter of the L1 instruction cache.
.LP
Page size is controlled by
.CW BIGPAGES
in
.CW mem.h .
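.LP
The effect of the page size on TLB coverage is simple arithmetic,
sketched below.
The definitions are a hypothetical reconstruction, not a copy of
.CW mem.h ,
and
.CW tlbreach
is an illustrative helper.
The one architectural fact relied upon is that each MIPS TLB entry
maps an even/odd pair of pages.

```c
/* Hypothetical reconstruction; the real mem.h may differ. */
#ifdef BIGPAGES
#define PGSHIFT	14		/* 16K pages */
#else
#define PGSHIFT	12		/* 4K pages */
#endif
#define BY2PG	(1UL << PGSHIFT)	/* bytes per page */

enum {
	Ntlb	= 16,		/* TLB entries on this CPU, per the text above */
};

/*
 * Bytes of address space that nentries TLB entries can map at once,
 * given that each entry maps two adjacent pages of pgsize bytes.
 */
unsigned long
tlbreach(int nentries, unsigned long pgsize)
{
	return (unsigned long)nentries * 2 * pgsize;
}
```

With all 16 entries in use, 4K pages cover 128K of address space
and 16K pages cover 512K; the 14 entries left to user programs
cover 112K with 4K pages.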
.NH 3
Combined TLB Pool
.LP
I also changed
.CW mmu.c
to collapse the separate kernel and user TLB pools into one,
once user processes start running,
but that only helps to reduce TLB faults a little.
.
.br
.ne 8
.
.NH 1
Remaining Problems
.LP
Interrupt-driven UART output isn't quite right.
It can get stuck and then input makes it resume.
The UART is apparently connected via the APB and requires
interrupt unmasking in the APB (which we now do).
There's some kludgey stuff in
.CW uarti8250.c
that makes output work most of the time
(characters do sometimes get dropped).
.LP
The Ethernet driver currently does not
dig out the MAC addresses from the hardware,
so you'll need to edit the
.CW rb
configuration file for each Routerboard; the format should be obvious.
I don't have the stomach to dig the MAC address out of the hardware
via SPI or whatever vile interface it requires.