Subversion Repositories planix.SVN

Rev

Details | Last modification | View Log | RSS feed

Rev Author Line No. Line
2 - 1
.HTML "Lexical File Names in Plan 9 or Getting Dot-Dot Right
2
.hw re-create
3
.hw re-created
4
.TL
5
Lexical File Names in Plan 9
6
.br
7
or
8
.br
9
Getting Dot-Dot Right
10
.AU
11
Rob Pike
12
.CW rob@plan9.bell-labs.com
13
.AI
14
.MH
15
.AB
16
.LP
17
Symbolic links make the Unix file system non-hierarchical, resulting in
18
multiple valid path names for a given file.
19
This ambiguity is a source of confusion, especially since some shells
20
work overtime to present a consistent view from programs such as
21
.CW pwd ,
22
while other programs and
23
the kernel itself do nothing about the problem.
24
.LP
25
Plan 9 has no symbolic links but it does have other mechanisms that produce the same difficulty.
26
Moreover, Plan 9 is founded on the ability to control a program's environment
27
by manipulating its name space.
28
Ambiguous names muddle the result of operations such as copying a name space across
29
the network.
30
.LP
31
To address these problems,
32
the Plan 9 kernel has been modified to maintain an accurate path name for every active
33
file (open file, working directory, mount table entry) in the system.
34
The definition of `accurate' is that the path name for a file is guaranteed to be the rooted,
35
absolute name
36
the program used to acquire it.
37
These names are maintained by an efficient method that combines lexical processing\(emsuch as
38
evaluating
39
.CW ..
40
by just removing the last path name element of a directory\(emwith
41
local operations within the file system to maintain a consistently, easily understood view
42
of the name system.
43
Ambiguous situations are resolved by examining the lexically maintained names themselves.
44
.LP
45
A new kernel call,
46
.CW fd2path ,
47
returns the file name associated with an open file,
48
permitting the use of reliable names to improve system
49
services ranging from
50
.CW pwd
51
to debugging.
52
Although this work was done in Plan 9,
53
Unix systems could also benefit from the addition of
54
a method to recover the accurate name of an
55
open file or the current directory.
56
.AE
57
.SH
58
Motivation
59
.LP
60
Consider the following unedited transcript of a session running the Bourne shell on a modern
61
Unix system:
62
.P1
63
% echo $HOME
64
/home/rob
65
% cd $HOME
66
% pwd
67
/n/bopp/v7/rob
68
% cd /home/rob
69
% cd /home/ken
70
% cd ../rob
71
\&../rob: bad directory
72
% 
73
.P2
74
(The same output results from running
75
.CW tcsh ;
76
we'll discuss
77
.CW ksh
78
in a moment.)
79
To a neophyte being schooled in the delights of a hierarchical file name space,
80
this behavior must be baffling.
81
It is, of course, the consequence of a series of symbolic links intended to give users
82
the illusion they share a disk, when in fact their files are scattered over several devices:
83
.P1
84
.ps -1
85
% ls -ld /home/rob /home/ken
86
lrwxr-xr-x  1 root  sys   14 Dec 26  1998 /home/ken -> /n/bopp/v6/ken
87
lrwxr-xr-x  1 root  sys   14 Dec 23  1998 /home/rob -> /n/bopp/v7/rob
88
% 
89
.ps
90
.P2
91
The introduction of symbolic links has changed the Unix file system from a true
92
hierarchy into a directed graph, rendering
93
.CW ..
94
ambiguous and sowing confusion.
95
.LP
96
Unix popularized hierarchical naming, but the introduction of symbolic links
97
made its naming irregular.
98
Worse, the
99
.CW pwd
100
command, through the underlying
101
.CW getwd
102
library routine,
103
uses a tricky, expensive algorithm that often delivers the wrong answer.
104
Starting from the current directory,
105
.CW getwd
106
opens the parent,
107
.CW .. ,
108
and searches it for an entry whose i-number matches the current directory;
109
the matching entry is the final path element of the ultimate result.
110
Applying this process iteratively,
111
.CW getwd
112
works back towards the root.
113
Since
114
.CW getwd
115
knows nothing about symbolic links, it will recover surprising names for
116
directories reached by them,
117
as illustrated by the example;
118
the backward paths
119
.CW getwd
120
traverses will not backtrack across the links.
121
.LP
122
Partly for efficiency and partly to make
123
.CW cd
124
and
125
.CW pwd
126
more predictable, the Korn shell
127
.CW ksh
128
[Korn94]
129
implements
130
.CW pwd
131
as a builtin.
132
(The
133
.CW cd
134
command must be a builtin in any shell, since the current directory is unique to each process.)
135
.CW Ksh
136
maintains its own private view of the file system to try to disguise symbolic links;
137
in particular,
138
.CW cd
139
and
140
.CW pwd
141
involve some lexical processing (somewhat like the
142
.CW cleanname
143
function discussed later
144
in this paper), augmented by heuristics such as examining the environment
145
for names like
146
.CW $HOME
147
and
148
.CW $PWD
149
to assist initialization of the state of the private view. [Korn00]
150
.LP
151
This transcript begins with a Bourne shell running:
152
.P1
153
% cd /home/rob
154
% pwd
155
/n/bopp/v7/rob
156
% ksh
157
$ pwd
158
/home/rob
159
$ 
160
.P2
161
This result is encouraging.  Another example, again starting from a Bourne shell:
162
.P1
163
% cd /home/rob
164
% cd ../ken
165
\&../ken: bad directory
166
% ksh
167
$ pwd
168
/home/rob
169
$ cd ../ken
170
$ pwd
171
/home/ken
172
$
173
.P2
174
By doing extra work,
175
the Korn shell is providing more sensible behavior,
176
but it is easy to defeat:
177
.P1
178
% cd /home/rob
179
% pwd
180
/n/bopp/v7/rob
181
% cd bin
182
% pwd
183
/n/bopp/v7/rob/bin
184
% ksh
185
$ pwd
186
/n/bopp/v7/rob/bin
187
$ exit
188
% cd /home/ken
189
% pwd
190
/n/bopp/v6/ken
191
% ksh
192
$ pwd
193
/n/bopp/v6/ken
194
$ 
195
.P2
196
In these examples,
197
.CW ksh 's
198
built-in
199
.CW pwd
200
failed to produce the results
201
.CW /home/rob/bin "" (
202
and
203
.CW /home/ken )
204
that the previous example might have led us to expect.
205
The Korn shell is hiding the problem, not solving it, and in fact is not even hiding it very well.
206
.LP
207
A deeper question is whether the shell should even be trying to make
208
.CW pwd
209
and
210
.CW cd
211
do a better job.
212
If it does, then the
213
.CW getwd
214
library call and every program that uses it will behave differently from the shell,
215
a situation that is sure to confuse.
216
Moreover, the ability to change directory to
217
.CW ../ken
218
with the Korn shell's
219
.CW cd
220
command but not with the
221
.CW chdir
222
system call is a symptom of a diseased system, not a healthy shell.
223
.LP
224
The operating system should provide names that work and make sense.
225
Symbolic links, though, are here to stay, so we need a way to provide
226
sensible, unambiguous names in the face of a non-hierarchical name space.
227
This paper shows how the challenge was met on Plan 9, an operating system
228
with Unix-like naming.
229
.SH
230
Names in Plan 9
231
.LP
232
Except for some details involved with bootstrapping, file names in Plan 9 have the same syntax as in Unix.
233
Plan 9 has no symbolic links, but its name space construction operators,
234
.CW bind
235
and
236
.CW mount ,
237
make it possible to build the same sort of non-hierarchical structures created
238
by symbolically linking directories on Unix.
239
.LP
240
Plan 9's
241
.CW mount
242
system call takes a file descriptor
243
and attaches to the local name space the file system service it represents:
244
.P1
245
mount(fd, "/dir", flags)
246
.P2
247
Here
248
.CW fd
249
is a file descriptor to a communications port such as a pipe or network connection;
250
at the other end of the port is a service, such as file server, that talks 9P, the Plan 9 file
251
system protocol.
252
After the call succeeds, the root directory of the service will be visible at the
253
.I "mount point
254
.CW /dir ,
255
much as with the
256
.CW mount
257
call of Unix.
258
The
259
.CW flag
260
argument specifies the nature of the attachment:
261
.CW MREPL
262
says that the contents of the root directory (appear to) replace the current contents of
263
.CW /dir ;
264
.CW MAFTER
265
says that the current contents of
266
.CW dir
267
remain visible, with the mounted directory's contents appearing
268
.I after
269
any existing files;
270
and
271
.CW MBEFORE
272
says that the contents remain visible, with
273
the mounted directory's contents appearing
274
.I before
275
any existing files.
276
These multicomponent directories are called
277
.I "union directories
278
and are somewhat different from union directories in 4.4BSD-Lite [PeMc95], because
279
only the top-level directory itself is unioned, not its descendents, recursively.
280
(Plan 9's union directories are used differently from 4.4BSD-Lite's, as will become apparent.)
281
.LP
282
For example, to bootstrap a diskless computer the system builds a local name space containing
283
only the root directory,
284
.CW / ,
285
then uses the network to open a connection
286
to the main file server.
287
It then executes
288
.P1
289
mount(rootfd, "/", MREPL);
290
.P2
291
After this call, the entire file server's tree is visible, starting from the root of the local machine.
292
.LP
293
While
294
.CW mount
295
connects a new service to the local name space,
296
.CW bind
297
rearranges the existing name space:
298
.P1
299
bind("tofile", "fromfile", flags)
300
.P2
301
causes subsequent mention of the
302
.CW fromfile
303
(which may be a plain file or a directory)
304
to behave as though
305
.CW tofile
306
had been mentioned instead, somewhat like a symbolic link.
307
(Note, however, that the arguments are in the opposite order
308
compared to
309
.CW ln
310
.CW -s ).
311
The
312
.CW flags
313
argument is the same as with
314
.CW mount .
315
.LP
316
As an example, a sequence something like the following is done at bootstrap time to
317
assemble, under the single directory
318
.CW /bin ,
319
all of the binaries suitable for this architecture, represented by (say) the string
320
.CW sparc :
321
.P1
322
bind("/sparc/bin", "/bin", MREPL);
323
bind("/usr/rob/sparc/bin", "/bin", MAFTER);
324
.P2
325
This sequence of
326
.CW binds
327
causes
328
.CW /bin
329
to contain first the standard binaries, then the contents of
330
.CW rob 's
331
private SPARC binaries.
332
The ability to build such union directories
333
obviates the need for a shell
334
.CW $PATH
335
variable
336
while providing opportunities for managing heterogeneity.
337
If the system were a Power PC, the same sequence would be run with
338
.CW power
339
textually substituted for
340
.CW sparc
341
to place the Power PC binaries in
342
.CW /bin
343
rather than the SPARC binaries.
344
.LP
345
Trouble is already brewing.  After these bindings are set up,
346
where does
347
.P1
348
% cd /bin
349
% cd ..
350
.P2
351
set the current working directory, to
352
.CW /
353
or
354
.CW /sparc
355
or
356
.CW /usr/rob/sparc ?
357
We will return to this issue.
358
.LP
359
There are some important differences between
360
.CW binds
361
and symbolic links.
362
First,
363
symbolic links are a static part of the file system, while
364
Plan 9 bindings are created at run time, are stored in the kernel,
365
and endure only as long as the system maintains them;
366
they are temporary.
367
Since they are known to the kernel but not the file system, they must
368
be set up each time the kernel boots or a user logs in;
369
permanent bindings are created by editing system initialization scripts
370
and user profiles rather than by building them in the file system itself.
371
.LP
372
The Plan 9 kernel records what bindings are active for a process,
373
whereas symbolic links, being held on the Unix file server, may strike whenever the process evaluates
374
a file name.
375
Also, symbolic links apply to all processes that evaluate the affected file, whereas
376
.CW bind
377
has a local scope, applying only to the process that executes it and possibly some of its
378
peers, as discussed in the next section.
379
Symbolic links cannot construct the sort of
380
.CW /bin
381
directory built here; it is possible to have multiple directories point to
382
.CW /bin
383
but not the other way around.
384
.LP
385
Finally,
386
symbolic links are symbolic, like macros: they evaluate the associated names each time
387
they are accessed.
388
Bindings, on the other hand, are evaluated only once, when the bind is executed;
389
after the binding is set up, the kernel associates the underlying files, rather than their names.
390
In fact, the kernel's representation of a bind is identical to its representation of a mount;
391
in effect, a bind is a mount of the
392
.CW tofile
393
upon the
394
.CW fromfile .
395
The binds and mounts coexist in a single
396
.I "mount table" ,
397
the subject of the next section.
398
.SH
399
The Mount Table
400
.LP
401
Unix has a single global mount table
402
for all processes in the system, but Plan 9's mount tables are local to each process.
403
By default it is inherited when a process forks, so mounts and binds made by one
404
process affect the other, but a process may instead inherit a copy,
405
so modifications it makes will be invisible to other processes.
406
The convention is that related processes, such
407
as processes running in a single window, share a mount table, while sets of processes
408
in different windows have distinct mount tables.
409
In practice, the name spaces of the two windows will appear largely the same,
410
but the possibility for different processes to see different files (hence services) under
411
the same name is fundamental to the system,
412
affecting the design of key programs such as the
413
window system [Pike91].
414
.LP
415
The Plan 9 mount table is little more than an ordered list of pairs, mapping the
416
.CW fromfiles
417
to the
418
.CW tofiles .
419
For mounts, the
420
.CW tofile
421
will be an item called a
422
.CW Channel ,
423
similar to a Unix
424
.CW vnode ,
425
pointing to the root of the file service,
426
while for a bind it will be the
427
.CW Channel
428
pointing to the
429
.CW tofile
430
mentioned in the
431
.CW bind
432
call.
433
In both cases, the
434
.CW fromfile
435
entry in the table
436
will be a
437
.CW Channel
438
pointing to the
439
.CW fromfile
440
itself.
441
.LP
442
The evaluation of a file name proceeds as follows.
443
If the name begins with a slash, start with the
444
.CW Channel
445
for the root; otherwise start with the
446
.CW Channel
447
for the current directory of the process.
448
For each path element in the name,
449
such as
450
.CW usr
451
in
452
.CW /usr/rob ,
453
try to `walk' the
454
.CW Channel
455
to that element [Pike93].
456
If the walk succeeds, look to see if the resulting
457
.CW Channel
458
is the same as any
459
.CW fromfile
460
in the mount table, and if so, replace it by the corresponding
461
.CW tofile .
462
Advance to the next element and continue.
463
.LP
464
There are a couple of nuances.  If the directory being walked is a union directory,
465
the walk is attempted in the elements of the union, in order, until a walk succeeds.
466
If none succeed, the operation fails.
467
Also, when the destination of a walk is a directory for a purpose such as the
468
.CW chdir
469
system call or the
470
.CW fromfile
471
in a
472
.CW bind ,
473
once the final walk of the sequence has completed the operation stops;
474
the final check through the mount table is not done.
475
Among other things, this simplifies the management of union directories;
476
for example, subsequent
477
.CW bind
478
calls will append to the union associated with the underlying
479
.CW fromfile
480
instead of what is bound upon it.
481
.SH
482
A Definition of Dot-Dot
483
.LP
484
The ability to construct union directories and other intricate naming structures
485
introduces some thorny problems: as with symbolic links,
486
the name space is no longer hierarchical, files and directories can have multiple
487
names, and the meaning of
488
.CW .. ,
489
the parent directory, can be ambiguous.
490
.LP
491
The meaning of
492
.CW ..
493
is straightforward if the directory is in a locally hierarchical part of the name space,
494
but if we ask what
495
.CW ..
496
should identify when the current directory is a mount point or union directory or
497
multiply symlinked spot (which we will henceforth call just a mount point, for brevity),
498
there is no obvious answer.
499
Name spaces have been part of Plan 9 from the beginning, but the definition of
500
.CW ..
501
has changed several times as we grappled with this issue.
502
In fact, several attempts to clarify the meaning of
503
.CW ..
504
by clever coding
505
resulted in definitions that could charitably be summarized as `what the implementation gives.'
506
.LP
507
Frustrated by this situation, and eager to have better-defined names for some of the
508
applications described later in this paper, we recently proposed the following definition
509
for
510
.CW .. :
511
.IP
512
The parent of a directory
513
.I X ,
514
.I X\f(CW/..\f1,
515
is the same directory that would obtain if
516
we instead accessed the directory named by stripping away the last
517
path name element of
518
.I X .
519
.LP
520
For example, if we are in the directory
521
.CW /a/b/c
522
and
523
.CW chdir
524
to
525
.CW .. ,
526
the result is
527
.I exactly
528
as if we had executed a
529
.CW chdir
530
to
531
.CW /a/b .
532
.LP
533
This definition is easy to understand and seems natural.
534
It is, however, a purely
535
.I lexical
536
definition that flatly ignores evaluated file names, mount tables, and
537
other kernel-resident data structures.
538
Our challenge is to implement it efficiently.
539
One obvious (and correct)
540
implementation is to rewrite path names lexically to fold out
541
.CW .. ,
542
and then evaluate the file name forward from the root,
543
but this is expensive and unappealing.
544
We want to be able to use local operations to evaluate file names,
545
but maintain the global, lexical definition of dot-dot.
546
It isn't too hard.
547
.SH
548
The Implementation
549
.LP
550
To operate lexically on file names, we associate a name with each open file in the kernel, that
551
is, with each 
552
.CW Channel
553
data structure.
554
The first step is therefore to store a
555
.CW char*
556
with each
557
.CW Channel
558
in the system, called its
559
.CW Cname ,
560
that records the
561
.I absolute
562
rooted
563
file name for the
564
.CW Channel .
565
.CW Cnames
566
are stored as full text strings, shared copy-on-write for efficiency.
567
The task is to maintain each
568
.CW Cname
569
as an accurate absolute name using only local operations.
570
.LP
571
When a file is opened, the file name argument in the
572
.CW open
573
(or
574
.CW chdir
575
or
576
.CW bind
577
or ...) call is recorded in the
578
.CW Cname
579
of the resulting
580
.CW Channel .
581
When the file name begins with a slash, the name is stored as is,
582
subject to a cleanup pass described in the next section.
583
Otherwise, it is a local name, and the file name must be made
584
absolute by prefixing it with the
585
.CW Cname
586
of the current directory, followed by a slash.
587
For example, if we are in
588
.CW /home/rob
589
and
590
.CW chdir
591
to
592
.CW bin ,
593
the
594
.CW Cname
595
of the resulting
596
.CW Channel
597
will be the string
598
.CW /home/rob/bin .
599
.LP
600
This assumes, of course, that the local file name contains no
601
.CW ..
602
elements.
603
If it does, instead of storing for example
604
.CW /home/rob/..
605
we delete the last element of the existing name and set the
606
.CW Cname
607
to
608
.CW /home .
609
To maintain the lexical naming property we must guarantee that the resulting
610
.CW Cname ,
611
if it were to be evaluated, would yield the identical directory to the one
612
we actually do get by the local
613
.CW ..
614
operation.
615
.LP
616
If the current directory is not a mount point, it is easy to maintain the lexical property.
617
If it is a mount point, though, it is still possible to maintain it on Plan 9
618
because the mount table, a kernel-resident data structure, contains all the
619
information about the non-hierarchical connectivity of the name space.
620
(On Unix, by contrast, symbolic links are stored on the file server rather than in the kernel.)
621
Moreover, the presence of a full file name for each
622
.CW Channel
623
in the mount table provides the information necessary to resolve ambiguities.
624
.LP
625
The mount table is examined in the
626
.CW from\f1\(->\fPto
627
direction when evaluating a name, but
628
.CW ..
629
points backwards in the hierarchy, so to evaluate
630
.CW ..
631
the table must be examined in the
632
.CW to\f1\(->\fPfrom
633
direction.
634
(``How did we get here?'')
635
.LP
636
The value of
637
.CW ..
638
is ambiguous when there are multiple bindings (mount points) that point to
639
the directories involved in the evaluation of
640
.CW .. .
641
For example, return to our original script with
642
.CW /n/bopp/v6
643
(containing a home directory for
644
.CW ken )
645
and
646
.CW /n/bopp/v7
647
(containing a home directory for
648
.CW rob )
649
unioned into
650
.CW /home .
651
This is represented by two entries in the mount table,
652
.CW from=/home ,
653
.CW to=/n/bopp/v6
654
and
655
.CW from=/home ,
656
.CW to=/n/bopp/v7 .
657
If we have set our current directory to
658
.CW /home/rob
659
(which has landed us in the physical location
660
.CW /n/bopp/v7/rob )
661
our current directory is not a mount point but its parent is.
662
The value of
663
.CW ..
664
is ambiguous: it could be
665
.CW /home ,
666
.CW /n/bopp/v7 ,
667
or maybe even
668
.CW /n/bopp/v6 ,
669
and the ambiguity is caused by two
670
.CW tofiles
671
bound to the same
672
.CW fromfile .
673
By our definition, if we now evaluate
674
.CW .. ,
675
we should acquire the directory
676
.CW /home ;
677
otherwise
678
.CW ../ken
679
could not possibly result in
680
.CW ken 's
681
home directory, which it should.
682
On the other hand, if we had originally gone to
683
.CW /n/bopp/v7/rob ,
684
the name
685
.CW ../ken
686
should
687
.I not
688
evaluate to
689
.CW ken 's
690
home directory because there is no directory
691
.CW /n/bopp/v7/ken
692
.CW ken 's (
693
home directory is on
694
.CW v6 ).
695
The problem is that by using local file operations, it is impossible
696
to distinguish these cases: regardless of whether we got here using the name
697
.CW /home/rob
698
or
699
.CW /n/bopp/v7/rob ,
700
the resulting directory is the same.
701
Moreover, the mount table does not itself have enough information
702
to disambiguate: when we do a local operation to evaluate
703
.CW ..
704
and land in
705
.CW /n/bopp/v7 ,
706
we discover that the directory is a
707
.CW tofile
708
in the mount table; should we step back through the table to
709
.CW /home
710
or not?
711
.LP
712
The solution comes from the
713
.CW Cnames
714
themselves.
715
Whether to step back through the mount point
716
.CW from=/home ,
717
.CW to=/n/bopp/v7
718
when evaluating
719
.CW ..
720
in
721
.CW rob 's
722
directory is trivially resolved by asking the question,
723
Does the
724
.CW Cname
725
for the directory begin
726
.CW /home ?
727
If it does, then the path that was evaluated to get us to the current
728
directory must have gone through this mount point, and we should
729
back up through it to evaluate
730
.CW .. ;
731
if not, then this mount table entry is irrelevant.
732
.LP
733
More precisely,
734
both
735
.I before
736
and
737
.I after
738
each
739
.CW ..
740
element in the path name is evaluated,
741
if the directory is a
742
.CW tofile
743
in the mount table, the corresponding
744
.CW fromfile
745
is taken instead, provided the
746
.CW Cname
747
of the corresponding
748
.CW fromfile
749
is the prefix of the
750
.CW Cname
751
of the original directory.
752
Since we always know the full name of the directory
753
we are evaluating, we can always compare it against all the entries in the mount table that point
754
to it, thereby resolving ambiguous situations
755
and maintaining the
756
lexical property of
757
.CW .. .
758
This check also guarantees we don't follow a misleading mount point, such as the entry pointing to
759
.CW /home
760
when we are really in
761
.CW /n/bopp/v7/rob .
762
Keeping the full names with the
763
.CW Channels
764
makes it easy to use the mount table to decide how we got here and, therefore,
765
how to get back.
766
.LP
767
In summary, the algorithm is as follows.
768
Use the usual file system operations to walk to
769
.CW .. ;
770
call the resulting directory
771
.I d .
772
Lexically remove
773
the last element of the initial file name.
774
Examine all entries in the mount table whose
775
.CW tofile
776
is
777
.I d
778
and whose
779
.CW fromfile
780
has a
781
.CW Cname
782
identical to the truncated name.
783
If one exists, that
784
.CW fromfile
785
is the correct result; by construction, it also has the right
786
.CW Cname .
787
In our example, evaluating
788
.CW ..
789
in
790
.CW /home/rob
791
(really
792
.CW /n/bopp/v7/rob )
793
will set
794
.I d
795
to
796
.CW /n/bopp/v7 ;
797
that is a
798
.CW tofile
799
whose
800
.CW fromfile
801
is
802
.CW /home .
803
Removing the
804
.CW /rob
805
from the original
806
.CW Cname ,
807
we find the name
808
.CW /home ,
809
which matches that of the
810
.CW fromfile ,
811
so the result is the
812
.CW fromfile ,
813
.CW /home .
814
.LP
815
Since this implementation uses only local operations to maintain its names,
816
it is possible to confuse it by external changes to the file system.
817
Deleting or renaming directories and files that are part of a
818
.CW Cname ,
819
or modifying the mount table, can introduce errors.
820
With more implementation work, such mistakes could probably be caught,
821
but in a networked environment, with machines sharing a remote file server, renamings
822
and deletions made by one machine may go unnoticed by others.
823
These problems, however, are minor, uncommon and, most important, easy to understand.
824
The method maintains the lexical property of file names unless an external
825
agent changes the name surreptitiously;
826
within a stable file system, it is always maintained and
827
.CW pwd
828
is always right.
829
.LP
830
To recapitulate, maintaining the
831
.CW Channel 's
832
absolute file names lexically and using the names to disambiguate the
833
mount table entries when evaluating
834
.CW ..
835
at a mount point
836
combine to maintain the lexical definition of
837
.CW ..
838
efficiently.
839
.SH
840
Cleaning names
841
.LP
842
The lexical processing can generate names that are messy or redundant,
843
ones with extra slashes or embedded
844
.CW ../
845
or
846
.CW ./
847
elements and other extraneous artifacts.
848
As part of the kernel's implementation, we wrote a procedure,
849
.CW cleanname ,
850
that rewrites a name in place to canonicalize its appearance.
851
The procedure is useful enough that it is now part of the Plan 9 C
852
library and is employed by many programs to make sure they always
853
present clean file names.
854
.LP
855
.CW Cleanname
856
is analogous to the URL-cleaning rules defined in RFC 1808 [Field95], although
857
the rules are slightly different.
858
.CW Cleanname
859
iteratively does the following until no further processing can be done:
860
.IP
861
1. Reduce multiple slashes to a single slash.
862
.IP
863
2. Eliminate
864
.CW .
865
path name elements
866
(the current directory).
867
.IP
868
3. Eliminate
869
.CW ..
870
path name elements (the parent directory) and the
871
.CW . "" non-
872
.CW .., "" non-
873
element that precedes them.
874
.IP
875
4. Eliminate
876
.CW ..
877
elements that begin a rooted path, that is, replace
878
.CW /..
879
by
880
.CW /
881
at the beginning of a path.
882
.IP
883
5. Leave intact
884
.CW ..
885
elements that begin a non-rooted path.
886
.LP
887
If the result of this process is a null string,
888
.CW cleanname
889
returns the string
890
.CW \&"." ,
891
representing the current directory.
892
.SH
893
The fd2path system call
894
.LP
895
Plan 9 has a new system call,
896
.CW fd2path ,
897
to enable programs to extract the
898
.CW Cname
899
associated with an open file descriptor.
900
It takes three arguments: a file descriptor, a buffer, and the size of the buffer:
901
.P1
902
int fd2path(int fd, char *buf, int nbuf)
903
.P2
904
It returns an error if the file descriptor is invalid; otherwise it fills the buffer with the name
905
associated with
906
.CW fd .
907
(If the name is too long, it is truncated; perhaps this condition should also draw an error.)
908
The
909
.CW fd2path
910
system call is very cheap, since all it does is copy the
911
.CW Cname
912
string to user space.
913
.LP
914
The Plan 9 implementation of
915
.CW getwd
916
uses
917
.CW fd2path
918
rather than the tricky algorithm necessary in Unix:
919
.P1
920
char*
921
getwd(char *buf, int nbuf)
922
{
923
	int n, fd;
924
 
925
	fd = open(".", OREAD);
926
	if(fd < 0)
927
		return NULL;
928
	n = fd2path(fd, buf, nbuf);
929
	close(fd);
930
	if(n < 0)
931
		return NULL;
932
	return buf;
933
}
934
.P2
935
(The Unix specification of
936
.CW getwd
937
does not include a count argument.)
938
This version of
939
.CW getwd
940
is not only straightforward, it is very efficient, reducing the performance
941
advantage of a built-in
942
.CW pwd
943
command while guaranteeing that all commands, not just
944
.CW pwd ,
945
see sensible directory names.
946
.LP
947
Here is a routine that prints the file name associated
948
with each of its open file descriptors; it is useful for tracking down file descriptors
949
left open by network listeners, text editors that spawn commands, and the like:
950
.P1
951
void
952
openfiles(void)
953
{
954
	int i;
955
	char buf[256];
956
 
957
	for(i=0; i<NFD; i++)
958
		if(fd2path(i, buf, sizeof buf) >= 0)
959
			print("%d: %s\en", i, buf);
960
}
961
.P2
962
.SH
963
Uses of good names
964
.LP
965
Although
966
.CW pwd
967
was the motivation for getting names right, good file names are useful in many contexts
968
and have become a key part of the Plan 9 programming environment.
969
The compilers record in the symbol table the full name of the source file, which makes
970
it easy to track down the source of buggy, old software and also permits the
971
implementation of a program,
972
.CW src ,
973
to automate tracking it down.
974
Given the name of a program,
975
.CW src
976
reads its symbol table, extracts the file information,
977
and triggers the editor to open a window on the program's
978
source for its
979
.CW main
980
routine.
981
No guesswork, no heuristics.
982
.LP
983
The
984
.CW openfiles
985
routine was the inspiration for a new file in the
986
.CW /proc
987
file system [Kill84].
988
For process
989
.I n ,
990
the file
991
.CW /proc/\f2n\fP/fd
992
is a list of all its open files, including its working directory,
993
with associated information including its open status,
994
I/O offset, unique id (analogous to i-number)
995
and file name.
996
Here is the contents of the
997
.CW fd
998
file for a process in the window system on the machine being used to write this paper:
999
.P1
1000
% cat /proc/125099/fd 
1001
/usr/rob
1002
 
1003
  1 w  M 5141 00000001.00000000       51 /mnt/term/dev/cons
1004
  2 w  M 5141 00000001.00000000       51 /mnt/term/dev/cons
1005
  3 r  M 5141 0000000b.00000000     1166 /dev/snarf
1006
  4 rw M 5141 0ffffffc.00000000      288 /dev/draw/new
1007
  5 rw M 5141 00000036.00000000  4266337 /dev/draw/3/data
1008
  6 r  M 5141 00000037.00000000        0 /dev/draw/3/refresh
1009
  7 r  c    0 00000004.00000000  6199848 /dev/bintime
1010
% 
1011
.P2
1012
(The Linux implementation of
1013
.CW /proc
1014
provides a related service by giving a directory in which each file-descriptor-numbered file is
1015
a symbolic link to the file itself.)
1016
When debugging errant systems software, such information can be valuable.
1017
.LP
1018
Another motivation for getting names right was the need to extract from the system
1019
an accurate description of the mount table, so that a process's name space could be
1020
recreated on another machine, in order to move (or simulate) a computing environment
1021
across the network.
1022
One program that does this is Plan 9's
1023
.CW cpu
1024
command, which recreates the local name space on a remote machine, typically a large
1025
fast multiprocessor.
1026
Without accurate names, it was impossible to do the job right; now
1027
.CW /proc
1028
provides a description of the name space of each process,
1029
.CW /proc/\f2n\fP/ns :
1030
.P1
1031
% cat /proc/125099/ns
1032
bind  / /
1033
mount -aC #s/boot / 
1034
bind  #c /dev
1035
bind  #d /fd
1036
bind -c #e /env
1037
bind  #p /proc
1038
bind -c #s /srv
1039
bind  /386/bin /bin
1040
bind -a /rc/bin /bin
1041
bind  /net /net
1042
bind -a #l /net
1043
mount -a #s/cs /net 
1044
mount -a #s/dns /net 
1045
bind -a #D /net
1046
mount -c #s/boot /n/emelie 
1047
bind -c /n/emelie/mail /mail
1048
mount -c /net/il/134/data /mnt/term 
1049
bind -a /usr/rob/bin/rc /bin
1050
bind -a /usr/rob/bin/386 /bin
1051
mount  #s/boot /n/emelieother other
1052
bind -c /n/emelieother/rob /tmp
1053
mount  #s/boot /n/dump dump
1054
bind  /mnt/term/dev/cons /dev/cons
1055
\&...
1056
cd /usr/rob
1057
% 
1058
.P2
1059
(The
1060
.CW #
1061
notation identifies raw device drivers so they may be attached to the name space.)
1062
The last line of the file gives the working directory of the process.
1063
The format of this file is that used by a library routine,
1064
.CW newns ,
1065
which reads a textual description like this and reconstructs a name space.
1066
Except for the need to quote
1067
.CW #
1068
characters, the output is also a shell script that invokes the user-level commands
1069
.CW bind
1070
and
1071
.CW mount ,
1072
which are just interfaces to the underlying system calls.
1073
However,
1074
files like
1075
.CW /net/il/134/data
1076
represent network connections; to find out where they point, so that the corresponding
1077
calls can be reestablished for another process,
1078
they must be examined in more detail using the network device files [PrWi93].  Another program,
1079
.CW ns ,
1080
does this; it reads the
1081
.CW /proc/\f2n\fP/ns
1082
file, decodes the information, and interprets it, translating the network
1083
addresses and quoting the names when required:
1084
.P1
1085
\&...
1086
mount -a '#s/dns' /net 
1087
\&...
1088
mount -c il!135.104.3.100!12884 /mnt/term 
1089
\&...
1090
.P2
1091
These tools make it possible to capture an accurate description of a process's
1092
name space and recreate it elsewhere.
1093
And like the open file descriptor table,
1094
they are a boon to debugging; it is always helpful to know
1095
exactly what resources a program is using.
1096
.SH
1097
Adapting to Unix
1098
.LP
1099
This work was done for the Plan 9 operating system, which has the advantage that
1100
the non-hierarchical aspects of the name space are all known to the kernel.
1101
It should be possible, though, to adapt it to a Unix system.
1102
The problem is that Unix has nothing corresponding precisely to a
1103
.CW Channel ,
1104
which in Plan 9 represents the unique result of evaluating a name.
1105
The
1106
.CW vnode
1107
structure is a shared structure that may represent a file
1108
known by several names, while the
1109
.CW file
1110
structure refers only to open files, but for example the current working
1111
directory of a process is not open.
1112
Possibilities to address this discrepancy include
1113
introducing a
1114
.CW Channel -like
1115
structure that connects a name and a
1116
.CW vnode ,
1117
or maintaining a separate per-process table that maps names to
1118
.CW vnodes ,
1119
disambiguating using the techniques described here.
1120
If it could be done
1121
the result would be an implementation of
1122
.CW ..
1123
that reduces the need for a built-in
1124
.CW pwd
1125
in the shell and offers a consistent, sensible interpretation of the `parent directory'.
1126
.LP
1127
We have not done this adaptation, but we recommend that the Unix community try it.
1128
.SH
1129
Conclusions
1130
.LP
1131
It should be easy to discover a well-defined, absolute path name for every open file and
1132
directory in the system, even in the face of symbolic links and other non-hierarchical
1133
elements of the file name space.
1134
In earlier versions of Plan 9, and all current versions of Unix,
1135
names can instead be inconsistent and confusing.
1136
.LP
1137
The Plan 9 operating system now maintains an accurate name for each file,
1138
using inexpensive lexical operations coupled with local file system actions.
1139
Ambiguities are resolved by examining the names themselves;
1140
since they reflect the path that was used to reach the file, they also reflect the path back,
1141
permitting a dependable answer to be recovered even when stepping backwards through
1142
a multiply-named directory.
1143
.LP
1144
Names make sense again: they are sensible and consistent.
1145
Now that dependable names are available, system services can depend on them,
1146
and recent work in Plan 9 is doing just that.
1147
We\(emthe community of Unix and Unix-like systems\(emshould have done this work a long time ago.
1148
.SH
1149
Acknowledgements
1150
.LP
1151
Phil Winterbottom devised the
1152
.CW ns
1153
command and the
1154
.CW fd
1155
and
1156
.CW ns
1157
files in
1158
.CW /proc ,
1159
based on an earlier implementation of path name management that
1160
the work in this paper replaces.
1161
Russ Cox wrote the final version of
1162
.CW cleanname
1163
and helped debug the code for reversing the mount table.
1164
Ken Thompson, Dave Presotto, and Jim McKie offered encouragement and consultation.
1165
.SH
1166
References
1167
.LP
1168
[Field95]
1169
R. Fielding,
1170
``Relative Uniform Resource Locators'',
1171
.I "Network Working Group Request for Comments: 1808" ,
1172
June, 1995.
1173
.LP
1174
[Kill84]
1175
T. J. Killian,
1176
``Processes as Files'',
1177
.I "Proceedings of the Summer 1984 USENIX Conference" ,
1178
Salt Lake City, 1984, pp. 203-207.
1179
.LP
1180
[Korn94]
1181
David G. Korn,
1182
``ksh: An Extensible High Level Language'',
1183
.I "Proceedings of the USENIX Very High Level Languages Symposium" ,
1184
Santa Fe, 1994, pp. 129-146.
1185
.LP
1186
[Korn00]
1187
David G. Korn,
1188
personal communication.
1189
.LP
1190
[PeMc95]
1191
Jan-Simon Pendry and Marshall Kirk McKusick,
1192
``Union Mounts in 4.4BSD-Lite'',
1193
.I "Proceedings of the 1995 USENIX Conference" ,
1194
New Orleans, 1995.
1195
.LP
1196
[Pike91]
1197
Rob Pike,
1198
``8½, the Plan 9 Window System'',
1199
.I "Proceedings of the Summer 1991 USENIX Conference" ,
1200
Nashville, 1991, pp. 257-265.
1201
.LP
1202
[Pike93]
1203
Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, and Phil Winterbottom,
1204
``The Use of Name Spaces in Plan 9'',
1205
.I "Operating Systems Review" ,
1206
.B 27 ,
1207
2, April 1993, pp. 72-76.
1208
.LP
1209
[PrWi93]
1210
Dave Presotto and Phil Winterbottom,
1211
``The Organization of Networks in Plan 9'',
1212
.I "Proceedings of the Winter 1993 USENIX Conference" ,
1213
San Diego, 1993, pp. 43-50.