.TH IP 3
.SH NAME
ip, esp, gre, icmp, icmpv6, ipmux, rudp, tcp, udp \- network protocols over IP
.SH SYNOPSIS
.nf
.2C
.B bind -a #I\fIspec\fP /net
.sp 0.3v
.B /net/ipifc
.B /net/ipifc/clone
.B /net/ipifc/stats
.BI /net/ipifc/ n
.BI /net/ipifc/ n /status
.BI /net/ipifc/ n /ctl
\&...
.sp 0.3v
.B /net/arp
.B /net/bootp
.B /net/iproute
.B /net/ipselftab
.B /net/log
.B /net/ndb
.sp 0.3v
.B /net/esp
.B /net/gre
.B /net/icmp
.B /net/icmpv6
.B /net/ipmux
.B /net/rudp
.B /net/tcp
.B /net/udp
.sp 0.3v
.B /net/tcp/clone
.B /net/tcp/stats
.BI /net/tcp/ n
.BI /net/tcp/ n /data
.BI /net/tcp/ n /ctl
.BI /net/tcp/ n /local
.BI /net/tcp/ n /remote
.BI /net/tcp/ n /status
.BI /net/tcp/ n /listen
\&...
.1C
.fi
.SH DESCRIPTION
The
.I ip
device provides the interface to Internet Protocol stacks.
.I Spec
is an integer from 0 to 15 identifying a stack.
Each stack implements IPv4 and IPv6.
Each stack is independent of all others:
the only information transfer between them is via programs that
mount multiple stacks.
Normally a system uses only one stack.
However, multiple stacks can be used for debugging
new IP networks or implementing firewalls or proxy
services.
.PP
All addresses used are 16-byte IPv6 addresses.
IPv4 addresses are a subset of the IPv6 addresses and both standard
.SM ASCII
formats are accepted.
In binary representation, all v4 addresses start with these 12 bytes, in hex:
.IP
.EX
00 00 00 00 00 00 00 00 00 00 ff ff
.EE
.
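.PP
The 16-byte form of a v4 address described above can be sketched in portable C.
This is only an illustration of the byte layout, not the system's own code:
the names
map_v4_to_v6,
is_v4_mapped
and
IPv4off
are chosen for this example.

```c
#include <stdint.h>
#include <string.h>

enum { IPaddrlen = 16, IPv4off = 12 };

/* The 12-byte prefix shared by every v4 address in 16-byte form. */
static const uint8_t v4prefix[IPv4off] = {
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
};

/* Expand a 4-byte v4 address into its 16-byte representation. */
void
map_v4_to_v6(uint8_t v6[IPaddrlen], const uint8_t v4[4])
{
	memcpy(v6, v4prefix, IPv4off);
	memcpy(v6 + IPv4off, v4, 4);
}

/* Report whether a 16-byte address carries a v4 address. */
int
is_v4_mapped(const uint8_t v6[IPaddrlen])
{
	return memcmp(v6, v4prefix, IPv4off) == 0;
}
```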
.SS "Configuring interfaces
Each stack may have multiple interfaces and each interface
may have multiple addresses.
The
.B /net/ipifc
directory contains a
.B clone
file, a
.B stats
file, and numbered subdirectories for each physical interface.
.PP
Opening the
.B clone
file reserves an interface.
The file descriptor returned from the
.IR open (2)
will point to the control file,
.BR ctl ,
of the newly allocated interface.
Reading
.B ctl
returns a text string representing the number of the interface.
Writing
.B ctl
alters aspects of the interface.
The possible
.I ctl
messages are those described under
.B "Protocol directories"
below and these:
.TF "\fLbind loopback\fR"
.PD
.
.\" from devip.c
.
.TP
.BI "bind ether " path
Treat the device mounted at
.I path
as an Ethernet medium carrying IP and ARP packets
and associate it with this interface.
The kernel will
.IR dial (2)
.IR path !0x800,
.IR path !0x806
and
.IR path !0x86dd
and use the connections for IPv4, ARP and IPv6 respectively.
.TP
.B "bind pkt
Treat this interface as a packet interface.  Assume
a user program will read and write the
.I data
file to receive and transmit IP packets to the kernel.
This is used by programs such as
.IR ppp (8)
to mediate IP packet transfer between the kernel and
a PPP encoded device.
.TP
.BI "bind netdev " path
Treat this interface as a packet interface.
The kernel will open
.I path
and read and write the resulting file descriptor
to receive and transmit IP packets.
.TP
.BI "bind loopback "
Treat this interface as a local loopback.  Anything
written to it will be looped back.
.
.\" from ipifc.c
.
.TP
.B "unbind
Disassociate the physical device from an IP interface.
.TP
.BI add\  "local [ mask remote mtu " proxy " ]
.PD 0
.TP
.BI try\  "local [ mask remote mtu " proxy " ]
.PD
Add a local IP address to the interface.
.I Try
adds the
.I local
address as a tentative address
if it's an IPv6 address.
The
.IR mask ,
.IR remote ,
.IR mtu ,
and
.B proxy
arguments are all optional.
The default
.I mask
is the class mask for the local address.
The default
.I remote
address is
.I local
ANDed with
.IR mask .
The default
.I mtu
(maximum transmission unit)
is 1514 for Ethernet and 4096 for packet media.
The
.I mtu
is the size in bytes of the largest packet that this interface can send.
.IR Proxy ,
if specified, means that this machine should answer
ARP requests for the remote address.
.IR Ppp (8)
does this to make remote machines appear
to be connected to the local Ethernet.
.TP
.BI remove\  "local mask"
Remove a local IP address from an interface.
.TP
.BI mtu\  n
Set the maximum transmission unit for this device to
.IR n .
The mtu is the maximum size of the packet including any
medium-specific headers.
.TP
.BI reassemble
Reassemble IP fragments before forwarding to this interface.
.TP
.BI iprouting\  n
Allow
.RI ( n
is missing or non-zero) or disallow
.RI ( n
is 0) forwarding packets between this interface and
others.
.
.\" remainder from netif.c (thus called from devether.c),
.\" except add6 and ra6 from ipifc.c
.
.TP
.B bridge
Enable bridging (see
.IR bridge (3)).
.TP
.B promiscuous
Set the interface into promiscuous mode,
which makes it accept all incoming packets,
whether addressed to it or not.
.TP
.BI "connect " type
Mark the Ethernet packet
.I type
as being in use, if not already in use
on this interface.
A
.I type
of -1 means `all' but appears to be a no-op.
.TP
.BI addmulti\  Media-addr
Treat the multicast
.I Media-addr
on this interface as a local address.
.TP
.BI remmulti\  Media-addr
Remove the multicast address
.I Media-addr
from this interface.
.TP
.B scanbs
Make the wireless interface scan for base stations.
.TP
.B headersonly
Set the interface to pass only packet headers, not the data.
.
.\" remainder from ipifc.c; tedious, so put them last
.
.TP
.BI "add6 " "v6addr pfx-len [onlink auto validlt preflt]"
Add the local IPv6 address
.I v6addr
with prefix length
.I pfx-len
to this interface.
See RFC 2461 §6.2.1 for more detail.
The remaining arguments are optional:
.RS
.TF "\fIonlink\fR"
.TP
.I onlink
flag: address is `on-link'
.TP
.I auto
flag: autonomous
.TP
.I validlt
valid life-time in seconds
.TP
.I preflt
preferred life-time in seconds
.RE
.PD
.TP
.BI "ra6 " "keyword value ..."
Set IPv6 router advertisement (RA) parameter
.IR keyword 's
.IR value .
Known
.IR keyword s
and the meanings of their values follow.
See RFC 2461 §6.2.1 for more detail.
Flags are true iff non-zero.
.RS
.TF "\fLreachtime\fR"
.TP
.B recvra
flag: receive and process RAs.
.TP
.B sendra
flag: generate and send RAs.
.TP
.B mflag
flag: ``Managed address configuration'',
goes into RAs.
.TP
.B oflag
flag: ``Other stateful configuration'',
goes into RAs.
.TP
.B maxraint
``maximum time allowed between sending unsolicited multicast''
RAs from the interface, in ms.
.TP
.B minraint
``minimum time allowed between sending unsolicited multicast''
RAs from the interface, in ms.
.TP
.B linkmtu
``value to be placed in MTU options sent by the router.''
Zero indicates none.
.TP
.B reachtime
sets the Reachable Time field in RAs sent by the router.
``Zero means unspecified (by this router).''
.TP
.B rxmitra
sets the Retrans Timer field in RAs sent by the router.
``Zero means unspecified (by this router).''
.TP
.B ttl
default value of the Cur Hop Limit field in RAs sent by the router.
Should be set to the ``current diameter of the Internet.''
``Zero means unspecified (by this router).''
.TP
.B routerlt
sets the Router Lifetime field of RAs sent from the interface, in ms.
Zero means the router is not to be used as a default router.
.PD
.RE
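.PP
The defaults described for the add message (class mask, and remote equal to
local ANDed with mask) can be sketched for v4 addresses as follows.
This is a simplified illustration using 32-bit host-order values;
the function names are invented here, and the treatment of addresses
outside classes A to C is a choice made for the sketch only.

```c
#include <stdint.h>

/*
 * Classful default mask for a v4 address in host byte order.
 * Addresses above class C get a 24-bit mask here; that is an
 * assumption of this sketch, not a statement about the kernel.
 */
uint32_t
class_mask(uint32_t local)
{
	uint32_t first = local >> 24;

	if(first < 128)
		return 0xff000000u;	/* class A */
	if(first < 192)
		return 0xffff0000u;	/* class B */
	return 0xffffff00u;		/* class C and above */
}

/* Default remote address: local ANDed with mask. */
uint32_t
default_remote(uint32_t local, uint32_t mask)
{
	return local & mask;
}
```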
.PP
Reading the interface's
.I status
file returns information about the interface, one line for each
local address on that interface.  The first line
has 9 white-space-separated fields: device, mtu, local address,
mask, remote or network address, packets in, packets out, input errors,
output errors.  Each subsequent line contains all but the device and mtu.
See
.I readipifc
in
.IR ip (2).
.
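.PP
As a rough sketch, the nine fields of the first status line described above
could be split with sscanf.
The structure, its field names and the buffer sizes are assumptions made
for this example; programs on the system would normally use
readipifc instead.

```c
#include <stdio.h>

/* Hypothetical holder for the nine fields of the first status line. */
typedef struct Ifcstatus Ifcstatus;
struct Ifcstatus
{
	char	dev[64];
	int	mtu;
	char	local[64];
	char	mask[64];
	char	remote[64];
	unsigned long	pktin, pktout, errin, errout;
};

/* Parse the first status line; returns 1 on success, 0 otherwise. */
int
parse_status_first(const char *line, Ifcstatus *s)
{
	int n;

	n = sscanf(line, "%63s %d %63s %63s %63s %lu %lu %lu %lu",
		s->dev, &s->mtu, s->local, s->mask, s->remote,
		&s->pktin, &s->pktout, &s->errin, &s->errout);
	return n == 9;
}
```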
.SS "Routing
The file
.I iproute
controls information about IP routing.
When read, it returns one line per routing entry.
Each line contains six white-space-separated fields:
target address, target mask, address of next hop, flags,
tag, and interface number.
The entry used for routing an IP packet is the one with
the longest mask for which the destination address ANDed with
the target mask equals the target address.
The one-character flags are:
.TF m
.TP
.B 4
IPv4 route
.TP
.B 6
IPv6 route
.TP
.B i
local interface
.TP
.B b
broadcast address
.TP
.B u
local unicast address
.TP
.B m
multicast route
.TP
.B p
point-to-point route
.PD
.PP
The tag is an arbitrary string of up to 4 characters.  It is normally used to
indicate what routing protocol originated the route.
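.PP
The selection rule above (longest mask whose masked destination equals the
target) can be illustrated with a linear scan.
This is a sketch only: it uses 32-bit v4 values for brevity rather than
16-byte addresses, the Route structure is invented for the example,
and it says nothing about how the kernel stores its table.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical route entry holding host-order v4 values. */
typedef struct Route Route;
struct Route
{
	uint32_t	target;
	uint32_t	mask;
	uint32_t	nexthop;
};

/* Count the one bits in a mask; for a contiguous mask this is its length. */
static int
masklen(uint32_t m)
{
	int n;

	for(n = 0; m != 0; m &= m - 1)
		n++;
	return n;
}

/*
 * Return the matching entry with the longest mask,
 * or NULL when no entry matches dst.
 */
const Route*
route_lookup(const Route *tab, size_t n, uint32_t dst)
{
	const Route *best = NULL;
	size_t i;

	for(i = 0; i < n; i++){
		if((dst & tab[i].mask) != tab[i].target)
			continue;
		if(best == NULL || masklen(tab[i].mask) > masklen(best->mask))
			best = &tab[i];
	}
	return best;
}
```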
.PP
Writing to
.B /net/iproute
changes the route table.  The messages are:
.TF "\fLroute \fItarget\fR"
.PD
.TP
.B flush
Remove all routes.
.TP
.BI tag\  string
Associate the tag,
.IR string ,
with all subsequent routes added via this file descriptor.
.TP
.BI add\  "target mask nexthop"
Add the route to the table.  If one already exists with the
same target and mask, replace it.
.TP
.BI remove\  "target mask"
Remove a route with a matching target and mask.
.
.TP
.BI route\  target
Print on the console the route to address
.IR target ,
if any.
Primarily a debugging aid.
.
.SS "Address resolution
The file
.B /net/arp
controls information about address resolution.
The kernel automatically updates the v4 ARP and v6 Neighbour Discovery
information for Ethernet interfaces.
When read, the file returns one line per address containing the
type of medium, the status of the entry (OK, WAIT), the IP
address, and the medium address.
Writing to
.B /net/arp
administers the ARP information.
The control messages are:
.TF "\fLdel \fIIP-addr\fR"
.PD
.TP
.B flush
Remove all entries.
.TP
.BI add\  "type IP-addr Media-addr"
Add an entry or replace an existing one for the
same IP address.
.TP
.BI del\  "IP-addr"
Delete an individual entry.
.PP
ARP entries do not time out.  The ARP table is a
cache with an LRU replacement policy.  The IP stack
listens for all ARP requests and, if the requester is in
the table, the entry is updated.
Also, whenever a new address is configured onto an
Ethernet, an ARP request is sent to help
update the table on other systems.
.PP
Currently, the only medium type is
.BR ether .
.br
.ne 3
.
.SS "Debugging and stack information
If any process is holding
.B /net/log
open, the IP stack queues debugging information to it.
This is intended primarily for debugging the IP stack.
The information provided is implementation-defined;
see the source for details.  Generally, it consists of error messages
about bad packets.
.PP
Writing to
.B /net/log
controls debugging.  The control messages are:
.TF "\fLclear \fIarglist\fR"
.PD
.TP
.BI set\  arglist
.I Arglist
is a space-separated list of items for which to enable debugging.
The possible items are:
.BR ppp ,
.BR ip ,
.BR fs ,
.BR tcp ,
.BR icmp ,
.BR udp ,
.BR compress ,
.BR gre ,
.BR tcpwin ,
.BR tcprxmt ,
.BR udpmsg ,
.BR ipmsg ,
and
.BR esp .
.TP
.BI clear\  arglist
.I Arglist
is a space-separated list of items for which to disable debugging.
.TP
.BI only\  addr
If
.I addr
is non-zero, restrict debugging to only those
packets whose source or destination is that
address.
.PP
The file
.B /net/ndb
can be read or written by
programs.  It is normally used by
.IR ipconfig (8)
to leave configuration information for other programs
such as
.B dns
and
.B cs
(see
.IR ndb (8)).
.B /net/ndb
may contain up to 1024 bytes.
.PP
The file
.B /net/ipselftab
is a read-only file containing all the IP addresses
considered local.  Each line in the file contains
three white-space-separated fields: IP address, usage count,
and flags.  The usage count is the number of interfaces to which
the address applies.  The flags are the same as for routing
entries.
Note that the `IPv4 route' flag will never be set.
.br
.ne 3
.
.SS "Protocol directories
The
.I ip
device
supports IP as well as several protocols that run over it:
TCP, UDP, RUDP, ICMP, GRE, and ESP.
TCP and UDP provide the standard Internet
protocols for reliable stream and unreliable datagram
communication.
RUDP is a locally-developed reliable datagram protocol based on UDP.
ICMP is IP's catch-all control protocol used to send
low level error messages and to implement
.IR ping (8).
GRE is a general encapsulation protocol.
ESP is the encapsulation protocol for IPsec.
IL provided a reliable datagram service for communication
between Plan 9 machines over IPv4, but is no longer part of the system.
.PP
Each protocol is a subdirectory of the IP stack.
The top level directory of each protocol contains a
.B clone
file, a
.B stats
file, and subdirectories numbered from zero to the number of connections
opened for this protocol.
.PP
Opening the
.B clone
file reserves a connection.  The file descriptor returned from the
.IR open (2)
will point to the control file,
.BR ctl ,
of the newly allocated connection.
Reading
.B ctl
returns a text
string representing the number of the
connection.
Connections may be used either to listen for incoming calls
or to initiate calls to other machines.
.PP
A connection is controlled by writing text strings to the associated
.B ctl
file.
After a connection has been established, data may be read from
and written to
.BR data .
A connection can be actively established using the
.B connect
message (see also
.IR dial (2)).
A connection can be established passively by first
using an
.B announce
message (see
.IR dial (2))
to bind to a local port and then
opening the
.B listen
file (see
.IR dial (2))
to receive incoming calls.
.PP
The following control messages are supported:
.TF "\fLremmulti \fIip\fR"
.PD
.TP
.BI connect\  ip-address ! port "!r " local
Establish a connection to the remote
.I ip-address
and
.IR port .
If
.I local
is specified, it is used as the local port number.
If
.I local
is not specified but
.B !r
is, the system will allocate
a restricted port number (less than 1024) for the connection to allow communication
with Unix
.B login
and
.B exec
services.
Otherwise a free port number starting at 5000 is chosen.
The connect fails if the combination of local and remote address/port pairs
is already assigned to another port.
.TP
.BI announce\  X
.I X
is a decimal port number or
.LR * .
Set the local port
number to
.I X
and accept calls to
.IR X .
If
.I X
is
.LR * ,
accept
calls for any port that no process has explicitly announced.
The local IP address cannot be set.
.B Announce
fails if the connection is already announced or connected.
.TP
.BI bind\  X
.I X
is a decimal port number or
.LR * .
Set the local port number to
.IR X .
This exists to support emulation
of BSD sockets by the APE libraries (see
.IR pcc (1))
and is not otherwise used.
.\" this is gone
.\" .TP
.\" .BI backlog\  n
.\" Set the maximum number of unanswered (queued) incoming
.\" connections to an announced port to
.\" .IR n .
.\" By default
.\" .I n
.\" is set to five.  If more than
.\" .I n
.\" connections are pending,
.\" further requests for a service will be rejected.
.TP
.BI ttl\  n
Set the time to live IP field in outgoing packets to
.IR n .
.TP
.BI tos\  n
Set the service type IP field in outgoing packets to
.IR n .
.TP
.B ignoreadvice
Don't break (UDP) connections because of ICMP errors.
.TP
.BI addmulti\  "ifc-ip [ mcast-ip ]"
Treat
.I ifc-ip
on this multicast interface as a local address.
If
.I mcast-ip
is present,
use it as the interface's multicast address.
.TP
.BI remmulti\  ip
Remove the address
.I ip
from this multicast interface.
.PP
Port numbers must be in the range 1 to 32767.
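.PP
A program composing a connect message might build the string as sketched below,
following the synopsis of the connect message and the port range just given.
The function name fmt_connect and its argument conventions
(0 meaning no local port was given) are assumptions of this example;
the resulting text would then be written to the ctl file.

```c
#include <stdio.h>

enum { Minport = 1, Maxport = 32767 };

/*
 * Format a connect message into buf.
 * local is a local port, or 0 to let the system choose;
 * restricted selects the !r form.
 * Returns the message length, or -1 if a port is out of range
 * or buf is too small.
 */
int
fmt_connect(char *buf, size_t len, const char *ip, int port,
	int restricted, int local)
{
	const char *r = restricted ? "!r" : "";
	int n;

	if(port < Minport || port > Maxport)
		return -1;
	if(local != 0 && (local < Minport || local > Maxport))
		return -1;
	if(local != 0)
		n = snprintf(buf, len, "connect %s!%d%s %d", ip, port, r, local);
	else
		n = snprintf(buf, len, "connect %s!%d%s", ip, port, r);
	if(n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```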
.PP
Several files report the status of a
connection.
The
.B remote
and
.B local
files contain the IP address and port number for the remote and local side of the
connection.  The
.B status
file contains protocol-dependent information to help debug network connections.
On receiving an error or EOF reading or writing the
.B data
file, the
.B err
file contains the reason for the error.
.PP
A process may accept incoming connections by
.IR open (2)ing
the
.B listen
file.
The
.B open
will block until a new connection request arrives.
Then
.B open
will return an open file descriptor which points to the control file of the
newly accepted connection.
This procedure will accept all calls for the
given protocol.
See
.IR dial (2).
.
.SS TCP
TCP connections are reliable point-to-point byte streams; there are no
message delimiters.
A connection is determined by the address and port numbers of the two
ends.
TCP
.B ctl
files support the following additional messages:
.TF "\fLkeepalive\fI n\fR"
.PD
.TP
.B hangup
close down this TCP connection.
.TP
.BI keepalive \ n
turn on keep alive messages.
.IR N ,
if given, is the number of milliseconds between keepalives
(default 30000).
.TP
.BI checksum \ n
emit TCP checksums of zero if
.I n
is zero; otherwise, and by default,
TCP checksums are computed and sent normally.
.TP
.BI tcpporthogdefense \ onoff
.I onoff
of
.L on
enables the TCP port-hog defense for all TCP connections;
.I onoff
of
.L off
disables it.
The defense is a solution to hijacked systems staking out ports
as a form of denial-of-service attack.
To avoid stateless TCP conversation hogs,
.I ip
picks a TCP sequence number at random for keepalives.
If that number gets acked by the other end,
.I ip
shuts down the connection.
Some firewalls,
notably ones that perform stateful inspection,
discard such out-of-specification keepalives,
so connections through such firewalls
will be killed after five minutes
by the lack of keepalives.
.
.SS UDP
UDP connections carry unreliable and unordered datagrams.  A read from
.B data
will return the next datagram, discarding anything
that doesn't fit in the read buffer.
A write is sent as a single datagram.
.PP
By default, a UDP connection is a point-to-point link.
Either a
.B connect
establishes a local and remote address/port pair or,
after an
.BR announce ,
each datagram coming from a different remote address/port pair
establishes a new incoming connection.
However, many-to-one semantics is also possible.
.PP
If, after an
.BR announce ,
the message
.L headers
is written to
.BR ctl ,
then all messages sent to the announced port
are received on the announced connection prefixed
with the corresponding structure,
declared in
.BR <ip.h> :
.IP
.EX
typedef struct Udphdr Udphdr;
struct Udphdr
{
	uchar	raddr[16];	/* V6 remote address */
	uchar	laddr[16];	/* V6 local address */
	uchar	ifcaddr[16];	/* V6 interface address (receive only) */
	uchar	rport[2];	/* remote port */
	uchar	lport[2];	/* local port */
};
.EE
.PP
Before a write, a user must prefix a similar structure to each message.
The system overrides the user specified local port with the announced
one.  If the user specifies an address that isn't a unicast address in
.BR /net/ipselftab ,
that too is overridden.
Since the prefixed structure is the same in read and write, it is relatively
easy to write a server that responds to client requests by just copying new
data into the message body and then writing back the same buffer that was
read.
.PP
In this case (writing
.L headers
to the
.I ctl
file),
neither
.I listen
nor
.I accept
is needed;
otherwise,
the usual sequence of
.IR announce ,
.IR listen ,
.I accept
must be executed before performing I/O on the corresponding
.I data
file.
.
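.PP
The reply pattern described above for headers mode (keep the prefixed
structure, replace the body, write the same buffer back) can be sketched
in portable C.
The helper names are invented for this example, uint8_t stands in for uchar,
and the big-endian port encoding is an assumption of the sketch.

```c
#include <stdint.h>
#include <string.h>

enum { IPaddrlen = 16, Udphdrsize = 3*IPaddrlen + 2 + 2 };

/* Same layout as the structure shown above. */
typedef struct Udphdr Udphdr;
struct Udphdr
{
	uint8_t	raddr[IPaddrlen];
	uint8_t	laddr[IPaddrlen];
	uint8_t	ifcaddr[IPaddrlen];
	uint8_t	rport[2];
	uint8_t	lport[2];
};

/* Read a 2-byte port, assuming network (big-endian) byte order. */
unsigned
get_port(const uint8_t p[2])
{
	return (unsigned)p[0] << 8 | p[1];
}

/*
 * Overwrite the body of a received message in place, keeping the header.
 * Returns the number of bytes to write back, or 0 if the body does not fit.
 */
size_t
make_reply(uint8_t *msg, size_t cap, const void *body, size_t bodylen)
{
	if(cap < Udphdrsize || bodylen > cap - Udphdrsize)
		return 0;
	memcpy(msg + Udphdrsize, body, bodylen);
	return Udphdrsize + bodylen;
}
```
.PP
A server loop would read a message into a buffer, call make_reply on it,
and write the returned number of bytes to the data file.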
.SS RUDP
RUDP is a reliable datagram protocol based on UDP,
currently only for IPv4.
Packets are delivered in order.
RUDP does not support
.BR listen .
One must write either
.L connect
or
.L announce
followed immediately by
.L headers
to
.BR ctl .
.PP
Unlike TCP, the reboot of one end of a connection does
not force a closing of the connection.  Communications will
resume when the rebooted machine resumes talking.  Any unacknowledged
packets queued before the reboot will be lost.  A reboot can
be detected by reading the
.B err
file.  It will contain the message
.IP
.BI hangup\  address ! port
.PP
where
.I address
and
.I port
are those of the far side of the connection.
Retransmitting a datagram more than 10 times
is treated like a reboot:
all queued messages are dropped, an error is queued to the
.B err
file, and the conversation resumes.
.PP
RUDP
.I ctl
files accept the following messages:
.TF "\fLranddrop \fI[ percent ]\fR"
.TP
.B headers
Corresponds to the
.L headers
format of UDP.
.TP
.BI "hangup " "IP port"
Drop the connection to address
.I IP
and
.IR port .
.TP
.BI "randdrop " "[ percent ]"
Randomly drop
.I percent
of outgoing packets.
Default is 10%.
.
.SS ICMP
ICMP is a datagram protocol for IPv4 used to exchange control requests and
their responses with other machines' IP implementations.
ICMP is primarily a kernel-to-kernel protocol, but it is possible
to generate `echo request' and read `echo reply' packets from user programs.
.
.SS ICMPV6
ICMPv6 is the IPv6 equivalent of ICMP.
If, after an
.BR announce ,
the message
.L headers
is written to
.BR ctl ,
then before a write,
a user must prefix each message with a corresponding structure,
declared in
.BR <ip.h> :
.IP
.EX
/*
 *  user level icmpv6 with control message "headers"
 */
typedef struct Icmp6hdr Icmp6hdr;
struct Icmp6hdr {
	uchar	unused[8];
	uchar	laddr[IPaddrlen];	/* local address */
	uchar	raddr[IPaddrlen];	/* remote address */
};
.EE
.PP
In this case (writing
.L headers
to the
.I ctl
file),
neither
.I listen
nor
.I accept
is needed;
otherwise,
the usual sequence of
.IR announce ,
.IR listen ,
.I accept
must be executed before performing I/O on the corresponding
.I data
file.
.
.SS GRE
GRE is the encapsulation protocol used by PPTP.
The kernel implements just enough of the protocol
to multiplex it.
Our implementation encapsulates in IPv4, per RFC 1702.
.B Announce
is not allowed in GRE, only
.BR connect .
Since GRE has no port numbers, the port number in the connect
is actually the 16-bit
.B eproto
field in the GRE header.
.PP
Reads and writes transfer a
GRE datagram starting at the GRE header.
On write, the kernel fills in the
.B eproto
field with the port number specified
in the connect message.
.br
.ne 3
.
.SS ESP
ESP is the Encapsulating Security Payload (RFC 1827, obsoleted by RFC 4303)
for IPsec (RFC 4301).
We currently implement only tunnel mode, not transport mode.
It is used to set up an encrypted tunnel between machines.
Like GRE, ESP has no port numbers.  Instead, the
port number in the
.B connect
message is the SPI (Security Association Identifier (sic)).
IP packets are written to and read from
.BR data .
The kernel encrypts any packets written to
.BR data ,
appends a MAC, and prefixes an ESP header before
sending to the other end of the tunnel.
Received packets are checked against their MACs,
decrypted, and queued for reading from
.BR data .
In the following,
.I secret
is the hexadecimal encoding of a key,
without a leading
.LR 0x .
The control messages are:
.TF "\fLesp \fIalg secret\fR"
.PD
.TP
.BI esp\  "alg secret
Encrypt with the algorithm,
.IR alg ,
using
.I secret
as the key.
Possible algorithms are:
.BR null ,
.BR des_56_cbc ,
.BR des3_cbc ,
and eventually
.BR aes_128_cbc ,
and
.BR aes_ctr .
.TP
.BI ah\  "alg secret
Use the hash algorithm,
.IR alg ,
with
.I secret
as the key for generating the MAC.
Possible algorithms are:
.BR null ,
.BR hmac_sha1_96 ,
.BR hmac_md5_96 ,
and eventually
.BR aes_xcbc_mac_96 .
.TP
.B header
Turn on header mode.  Every buffer read from
.B data
starts with 4 unused bytes, and the first 4 bytes
of every buffer written to
.B data
are ignored.
.TP
.B noheader
Turn off header mode.
.
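.PP
The secret argument of the esp and ah messages above is a key written as
hexadecimal digits with no leading 0x.
A minimal sketch of producing that text from raw key bytes follows;
the function name is invented here, and the use of lower-case digits
is an assumption of the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the hexadecimal encoding of key into out, NUL-terminated.
 * Returns the number of digits written, or -1 if out is too small.
 */
int
hex_secret(char *out, size_t outlen, const uint8_t *key, size_t keylen)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	if(outlen < 2*keylen + 1)
		return -1;
	for(i = 0; i < keylen; i++){
		out[2*i] = digits[key[i] >> 4];
		out[2*i + 1] = digits[key[i] & 0xf];
	}
	out[2*keylen] = 0;
	return (int)(2*keylen);
}
```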
.SS "IP packet filter
The directory
.B /net/ipmux
looks like another protocol directory.
It is a packet filter built on top of IP.
Each numbered
subdirectory represents a different filter.
The connect messages written to the
.I ctl
file describe the filter. Packets matching the filter can be read on the
.B data
file.  Packets written to the
.B data
file are routed to an interface and transmitted.
.PP
A filter is a semicolon-separated list of
relations.  Each relation describes a portion
of a packet to match.  The possible relations are:
.TF "\fLdata[\fIn\fL:\fIm\fL]=\fIexpr\fR "
.PD
.TP
.BI proto= n
the IP protocol number must be
.IR n .
.TP
.BI data[ n : m ]= expr
bytes
.I n
through
.I m
following the IP header must match
.IR expr .
.TP
.BI iph[ n : m ]= expr
bytes
.I n
through
.I m
of the IP packet header must match
.IR expr .
.TP
.BI ifc= expr
the packet must have been received on an interface whose address
matches
.IR expr .
.TP
.BI src= expr
The source address in the packet must match
.IR expr .
.TP
.BI dst= expr
The destination address in the packet must match
.IR expr .
.PP
.I Expr
is of the form:
.TP
.I \	value
.TP
.IB \	value | value | ...
.TP
.IB \	value & mask
.TP
.IB \	value | value & mask
.PP
If a mask is given, the relevant field is first ANDed with
the mask.  The result is compared against the value or list
of values for a match.  In the case of
.BR ifc ,
.BR dst ,
and
.B src
the value is a dot-formatted IP address and the mask is a dot-formatted
IP mask.  In the case of
.BR data ,
.B iph
and
.BR proto ,
both value and mask are strings of 2 hexadecimal digits representing
8-bit values.
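.PP
The matching rule for an expression (AND the field with the mask, then
compare with each listed value) can be sketched for a single 8-bit field.
The function below is an illustration written for this page, not the
filter's own code; a caller would pass a mask of 0xff when the expression
has no mask.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Report whether an 8-bit field matches value|value|...&mask.
 * The field is masked first, then compared with each value in turn.
 */
int
expr_match(uint8_t field, const uint8_t *vals, size_t nvals, uint8_t mask)
{
	size_t i;

	field &= mask;
	for(i = 0; i < nvals; i++)
		if(field == vals[i])
			return 1;
	return 0;
}
```
.PP
For example, a relation such as proto=06|11 corresponds to the values
0x06 and 0x11 with a mask of 0xff.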
.PP
A packet is delivered to only one filter.
The filters are merged into a single comparison tree.
If two filters match the same packet, the following
rules apply in order (here '>' means `is preferred to'):
.IP 1)
protocol > data > source > destination > interface
.IP 2)
lower data offsets > higher data offsets
.IP 3)
longer matches > shorter matches
.IP 4)
older > younger
.PP
So far this has just been used to implement a version of
OSPF in Inferno
and 6to4 tunnelling.
.br
.ne 5
.
.SS Statistics
The
.B stats
files are read-only and contain statistics useful to network monitoring.
.br
.ne 12
.PP
Reading
.B /net/ipifc/stats
returns a list of 19 tagged and newline-separated fields representing:
.EX
.ft 1
.2C
.in +0.25i
forwarding status (0 and 2 mean forwarding off,
	1 means on)
default TTL
input packets
input header errors
input address errors
packets forwarded
input packets for unknown protocols
input packets discarded
input packets delivered to higher level protocols
output packets
output packets discarded
output packets with no route
timed out fragments in reassembly queue
requested reassemblies
successful reassemblies
failed reassemblies
successful fragmentations
unsuccessful fragmentations
fragments created
.in -0.25i
.1C
.ft
.EE
.br
.ne 16
.PP
Reading
.B /net/icmp/stats
returns a list of 26 tagged and newline-separated fields representing:
.EX
.ft 1
.2C
.in +0.25i
messages received
bad received messages
unreachables received
time exceededs received
input parameter problems received
source quenches received
redirects received
echo requests received
echo replies received
timestamps received
timestamp replies received
address mask requests received
address mask replies received
messages sent
transmission errors
unreachables sent
time exceededs sent
input parameter problems sent
source quenches sent
redirects sent
echo requests sent
echo replies sent
timestamps sent
timestamp replies sent
address mask requests sent
address mask replies sent
.in -0.25i
.1C
.EE
.PP
Reading
.B /net/tcp/stats
returns a list of 11 tagged and newline-separated fields representing:
.EX
.ft 1
.2C
.in +0.25i
maximum number of connections
total outgoing calls
total incoming calls
number of established connections to be reset
number of currently established connections
segments received
segments sent
segments retransmitted
retransmit timeouts
bad received segments
transmission failures
.in -0.25i
.1C
.EE
.PP
Reading
.B /net/udp/stats
returns a list of 4 tagged and newline-separated fields representing:
.EX
.ft 1
.2C
.in +0.25i
datagrams received
datagrams received for bad ports
malformed datagrams received
datagrams sent
.in -0.25i
.1C
.EE
.PP
Reading
.B /net/gre/stats
returns a list of 1 tagged number representing:
.EX
.ft 1
.in +0.25i
header length errors
.in -0.25i
.EE
.SH "SEE ALSO"
.IR dial (2),
.IR ip (2),
.IR bridge (3),
.\" .IR ike (4),
.IR ndb (6),
.IR listen (8)
.br
.PD 0
.TF "\fL/lib/rfc/rfc2822"
.TP
.B /lib/rfc/rfc2460
IPv6
.TP
.B /lib/rfc/rfc4291
IPv6 address architecture
.TP
.B /lib/rfc/rfc4443
ICMPv6
.SH SOURCE
.B /sys/src/9/ip
.SH BUGS
.I Ipmux
has not been heavily used and should be considered experimental.
It may disappear in favor of a more traditional packet filter in the future.