WebSVN – planix.SVN – Blame – /os/trunk/sys/doc/prog4.ms

Rev	Author	Line No.	Line
2	-	1	`.HTML "Changes to the Programming Environment in the Fourth Release of Plan 9`
		2	`.FP lucidasans`
		3	`.TL`
		4	`Changes to the Programming Environment`
		5	`.br`
		6	`in the`
		7	`.br`
		8	`Fourth Release of Plan 9`
		9	`.AU`
		10	`Rob Pike`
		11	`.sp`
		12	`rob@plan9.bell-labs.com`
		13	`.SH`
		14	`Introduction`
		15	`.PP`
		16	`The fourth release of Plan 9 includes changes at many levels of the system,`
		17	`with repercussions in the libraries and program interfaces.`
		18	`This document summarizes the changes and describes how`
		19	`existing programs must be modified to run in the new release.`
		20	`It is not exhaustive, of course; for further detail about any of the`
		21	`topics refer to the manual pages, as always.`
		22	`.PP`
		23	`Programmers new to Plan 9 may find valuable tidbits here, but the`
		24	`real audience for this paper is those with a need to update applications`
		25	`and servers written in C for earlier releases of the Plan 9 operating system.`
		26	`.SH`
		27	`9P, NAMELEN, and strings`
		28	`.PP`
		29	`The underlying file service protocol for Plan 9, 9P, retains its basic form`
		30	`but has had a number of adjustments to handle longer file names and error strings,`
		31	`to support new authentication mechanisms, and to make it more efficient at`
		32	`evaluating file names.`
		33	`The change to file names affects a number of system interfaces;`
		34	`because file name elements are no longer of fixed size, they can`
		35	`no longer be stored as arrays.`
		36	`.PP`
		37	`9P used to be a fixed-format protocol with`
		38	`.CW NAMELEN -sized`
		39	`byte arrays representing file name elements.`
		40	`Now, it is a variable-format protocol, as described in`
		41	`.I intro (5),`
		42	`in which strings are represented by a count followed by that many bytes.`
		43	`Thus, the string`
		44	`.CW ken`
		45	`would previously have occupied 28`
		46	`.CW NAMELEN ) (`
		47	`bytes in the message; now it occupies 5: a two-byte count followed by the three bytes of`
		48	`.CW ken`
		49	`and no terminal zero.`
		50	`(And of course, a name could now be much longer.)`
		51	`A similar format change has been made to`
		52	`.CW stat`
		53	`buffers: they are no longer`
		54	`.CW DIRLEN`
		55	`bytes long but instead have variable size prefixed by a two-byte count.`
		56	`And in fact the entire 9P message syntax has changed: every message`
		57	`now begins with a message length field that makes it trivial to break the`
		58	`byte stream into messages without parsing them, so`
		59	`.CW aux/fcall`
		60	`is gone.`
		61	`A new library entry point,`
		62	`.CW read9pmsg ,`
		63	`makes it easy for user-level servers to break the client data stream into 9P messages.`
		64	`All servers should switch from using`
		65	`.CW read`
		66	`(or the now gone`
		67	`.CW getS )`
		68	`to using`
		69	`.CW read9pmsg .`
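The two framing rules described above can be sketched in portable C. This is an illustration only, not the library's code: the names putstring and msgsize are hypothetical, and the sketch assumes the layout given in intro(5) as we understand it, namely little-endian integers and a four-byte message length that counts itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Store s as a 9P string: a two-byte little-endian count followed by
 * the bytes of s, with no terminating zero. Returns the number of
 * bytes stored, or 0 if s is too long for a 16-bit count or does not
 * fit in room bytes. (Hypothetical helper, not a library routine.)
 */
size_t
putstring(uint8_t *p, size_t room, const char *s)
{
	size_t n;

	n = strlen(s);
	if(n > 0xFFFF || room < 2 + n)
		return 0;
	p[0] = n & 0xFF;
	p[1] = n >> 8;
	memcpy(p + 2, s, n);
	return 2 + n;
}

/*
 * Inspect the front of a byte stream holding have bytes. Returns the
 * size of the first message if it is complete, 0 if more data must be
 * read first, or -1 if the length field is malformed. Assumes the
 * length field is four bytes and includes itself.
 */
long
msgsize(const uint8_t *buf, size_t have)
{
	uint32_t size;

	if(have < 4)
		return 0;
	size = buf[0] | buf[1]<<8 | buf[2]<<16 | (uint32_t)buf[3]<<24;
	if(size < 4)
		return -1;
	if(size > have)
		return 0;
	return (long)size;
}
```

With these, putstring(buf, sizeof buf, "ken") stores five bytes, and a reader can split a stream into messages by looking only at the leading length, never at the message body.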
		70	`.PP`
		71	`This change to 9P affects the way strings are handled by the kernel and throughout`
		72	`the system.`
		73	`The consequences are primarily that fixed-size arrays have been replaced`
		74	`by pointers and counts in a variety of system interfaces.`
		75	`Most programs will need at least some adjustment to the new style.`
		76	`In summary:`
		77	`.CW NAMELEN`
		78	`is gone, except as a vestige in the authentication libraries, where it has been`
		79	`rechristened`
		80	`.CW ANAMELEN .`
		81	`.CW DIRLEN`
		82	`and`
		83	`.CW ERRLEN`
		84	`are also gone.`
		85	`All programs that mention`
		86	`these constants`
		87	`will need to be fixed.`
		88	`.PP`
		89	`The simplest place to see this change is in the`
		90	`.CW errstr`
		91	`system call, which no longer assumes a buffer of length`
		92	`.CW ERRLEN`
		93	`but now requires a byte-count argument:`
		94	`.P1`
		95	`char buf[...];`
		96
		97	`errstr(buf, sizeof buf);`
		98	`.P2`
		99	`The buffer can be any size you like.`
		100	`For convenience, the kernel stores error strings internally as 256-byte arrays,`
		101	`so if you like \(em but it's not required \(em you can use the defined constant`
		102	`.CW ERRMAX= 256`
		103	`as a good buffer size.`
		104	`Unlike the old`
		105	`.CW ERRLEN`
		106	`(which had value 64),`
		107	`.CW ERRMAX`
		108	`is advisory, not mandatory, and is not part of the 9P specification.`
		109	`.PP`
		110	`With names, stat buffers, and directories, there isn't even an echo of a fixed-size array any more.`
		111	`.SH`
		112	`Directories and wait messages`
		113	`.PP`
		114	`With strings now variable-length, a number of system calls needed to change:`
		115	`.CW errstr ,`
		116	`.CW stat ,`
		117	`.CW fstat ,`
		118	`.CW wstat ,`
		119	`.CW fwstat ,`
		120	`and`
		121	`.CW wait`
		122	`are all affected, as is`
		123	`.CW read`
		124	`when applied to directories.`
		125	`.PP`
		126	`As far as directories are concerned, most programs don't use the system calls`
		127	`directly anyway, since they operate on the machine-independent form, but`
		128	`instead call the machine-dependent`
		129	`.CW Dir`
		130	`routines`
		131	`.CW dirstat ,`
		132	`.CW dirread ,`
		133	`etc.`
		134	`These used to fill user-provided fixed-size buffers; now they return objects allocated`
		135	`by`
		136	`.CW malloc`
		137	`(which must therefore be freed after use).`
		138	To `stat' a file:
		139	`.P1`
		140	`Dir *d;`
		141
		142	`d = dirstat(filename);`
		143	`if(d == nil){`
		144	`	fprint(2, "can't stat %s: %r\en", filename);`
		145	`	exits("stat");`
		146	`}`
		147	`use(d);`
		148	`free(d);`
		149	`.P2`
		150	`A common new bug is to forget to free a`
		151	`.CW Dir`
		152	`returned by`
		153	`.CW dirstat .`
		154	`.PP`
		155	`.CW Dirfstat`
		156	`and`
		157	`.CW Dirfwstat`
		158	`work pretty much as before, but changes to 9P make`
		159	`it possible to exercise finer-grained control on what fields`
		160	`of the`
		161	`.CW Dir`
		162	`are to be changed; see`
		163	`.I stat (2)`
		164	`and`
		165	`.I stat (5)`
		166	`for details.`
		167	`.PP`
		168	`Reading a directory works in a similar way to`
		169	`.CW dirstat ,`
		170	`with`
		171	`.CW dirread`
		172	`allocating and filling in an array of`
		173	`.CW Dir`
		174	`structures.`
		175	`The return value is the number of elements of the array.`
		176	`The arguments to`
		177	`.CW dirread`
		178	`now include a pointer to a`
		179	`.CW Dir*`
		180	`to be filled in with the address of the allocated array:`
		181	`.P1`
		182	`Dir *d;`
		183	`int i, n;`
		184
		185	`while((n = dirread(fd, &d)) > 0){`
		186	`	for(i=0; i<n; i++)`
		187	`		use(&d[i]);`
		188	`	free(d);`
		189	`}`
		190	`.P2`
		191	`A new library function,`
		192	`.CW dirreadall ,`
		193	`has the same form as`
		194	`.CW dirread`
		195	`but returns the entire directory in one call:`
		196	`.P1`
		197	`n = dirreadall(fd, &d);`
		198	`for(i=0; i<n; i++)`
		199	`	use(&d[i]);`
		200	`free(d);`
		201	`.P2`
		202	`If your program insists on using the underlying`
		203	`.CW stat`
		204	`system call or its relatives, or wants to operate directly on the`
		205	`machine-independent format returned by`
		206	`.CW stat`
		207	`or`
		208	`.CW read ,`
		209	`it will need to be modified.`
		210	`Such programs are rare enough that we'll not discuss them here beyond referring to`
		211	`the man page`
		212	`.I stat (2)`
		213	`for details.`
		214	`Be aware, though, that it used to be possible to regard the buffer returned by`
		215	`.CW stat`
		216	`as a byte array that began with the zero-terminated`
		217	`name of the file; this is no longer true.`
		218	`With very rare exceptions, programs that call`
		219	`.CW stat`
		220	`would be better recast to use the`
		221	`.CW dir`
		222	`routines or, if their goal is just to test the existence of a file,`
		223	`.CW access .`
		224	`.PP`
		225	`Similar changes have affected the`
		226	`.CW wait`
		227	`system call. In fact,`
		228	`.CW wait`
		229	`is no longer a system call but a library routine that calls the new`
		230	`.CW await`
		231	`system call and returns a newly allocated machine-dependent`
		232	`.CW Waitmsg`
		233	`structure:`
		234	`.P1`
		235	`Waitmsg *w;`
		236
		237	`w = wait();`
		238	`if(w == nil)`
		239	`	error("wait: %r");`
		240	`print("pid is %d; exit string %s\en", w->pid, w->msg);`
		241	`free(w);`
		242	`.P2`
		243	`The exit string`
		244	`.CW w->msg`
		245	`may be empty but it will never be a nil pointer.`
		246	`Again, don't forget to free the structure returned by`
		247	`.CW wait .`
		248	`If all you need is the pid, you can call`
		249	`.CW waitpid ,`
		250	`which reports just the pid and doesn't return an allocated structure:`
		251	`.P1`
		252	`int pid;`
		253
		254	`pid = waitpid();`
		255	`if(pid < 0)`
		256	`	error("wait: %r");`
		257	`print("pid is %d\en", pid);`
		258	`.P2`
		259	`.SH`
		260	`Quoted strings and tokenize`
		261	`.PP`
		262	`.CW Wait`
		263	`gives us a good opportunity to describe how the system copes with all this`
		264	`free-format data.`
		265	`Consider the text returned by the`
		266	`.CW await`
		267	`system call, which includes a set of integers (pids and times) and a string (the exit status).`
		268	`This information is formatted free-form; here is the statement in the kernel that`
		269	`generates the message:`
		270	`.P1`
		271	`n = snprint(a, n, "%d %lud %lud %lud %q",`
		272	`	wq->w.pid,`
		273	`	wq->w.time[TUser], wq->w.time[TSys], wq->w.time[TReal],`
		274	`	wq->w.msg);`
		275	`.P2`
		276	`Note the use of`
		277	`.CW %q`
		278	`to produce a quoted-string representation of the exit status.`
		279	`The`
		280	`.CW %q`
		281	`format is like %s but will wrap`
		282	`.CW rc -style`
		283	`single quotes around the string if it contains white space or is otherwise ambiguous.`
		284	`The library routine`
		285	`.CW tokenize`
		286	`can be used to parse data formatted this way: it splits white-space-separated`
		287	`fields but understands the`
		288	`.CW %q`
		289	`quoting conventions.`
		290	`Here is how the`
		291	`.CW wait`
		292	`library routine builds its`
		293	`.CW Waitmsg`
		294	`from the data returned by`
		295	`.CW await :`
		296	`.P1`
		297	`Waitmsg*`
		298	`wait(void)`
		299	`{`
		300	`	int n, l;`
		301	`	char buf[512], *fld[5];`
		302	`	Waitmsg *w;`
		303
		304	`	n = await(buf, sizeof buf-1);`
		305	`	if(n < 0)`
		306	`		return nil;`
		307	`	buf[n] = '\0';`
		308	`	if(tokenize(buf, fld, nelem(fld)) != nelem(fld)){`
		309	`		werrstr("couldn't parse wait message");`
		310	`		return nil;`
		311	`	}`
		312	`	l = strlen(fld[4])+1;`
		313	`	w = malloc(sizeof(Waitmsg)+l);`
		314	`	if(w == nil)`
		315	`		return nil;`
		316	`	w->pid = atoi(fld[0]);`
		317	`	w->time[0] = atoi(fld[1]);`
		318	`	w->time[1] = atoi(fld[2]);`
		319	`	w->time[2] = atoi(fld[3]);`
		320	`	w->msg = (char*)&w[1];`
		321	`	memmove(w->msg, fld[4], l);`
		322	`	return w;`
		323	`}`
		324	`.P2`
		325	`.PP`
		326	`This style of quoted-string and`
		327	`.CW tokenize`
		328	`is used all through the system now.`
		329	`In particular, devices now`
		330	`.CW tokenize`
		331	`the messages written to their`
		332	`.CW ctl`
		333	`files, which means that you can send messages that contain white space, by quoting them,`
		334	`and that you no longer need to worry about whether or not the device accepts a newline.`
		335	`In other words, you can say`
		336	`.P1`
		337	`echo message > /dev/xx/ctl`
		338	`.P2`
		339	`instead of`
		340	`.CW echo`
		341	`.CW -n`
		342	`because`
		343	`.CW tokenize`
		344	`treats the newline character as white space and discards it.`
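To make the splitting rule concrete, here is a minimal sketch in portable C of a tokenizer that follows the conventions just described: fields are separated by white space, a single-quoted section may contain white space, and a doubled quote inside a quoted section stands for one quote. The name qtokenize is hypothetical and the code only approximates the library routine; see the manual for the real interface.

```c
#include <string.h>

static int
isspacech(char c)
{
	return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Split s in place into at most max fields, storing a pointer to each
 * in fld. Quotes are removed and '' inside a quoted section becomes a
 * single quote. Returns the number of fields found.
 * (Hypothetical stand-in, not the library's tokenize.)
 */
int
qtokenize(char *s, char **fld, int max)
{
	int n, quoting;
	char *r, *w;	/* read and write positions */

	n = 0;
	r = s;
	while(n < max){
		while(isspacech(*r))
			r++;
		if(*r == '\0')
			break;
		fld[n++] = w = r;
		quoting = 0;
		while(*r != '\0' && (quoting || !isspacech(*r))){
			if(*r != '\''){
				*w++ = *r++;
			}else if(!quoting){
				quoting = 1;	/* opening quote */
				r++;
			}else if(r[1] == '\''){
				*w++ = '\'';	/* doubled quote */
				r += 2;
			}else{
				quoting = 0;	/* closing quote */
				r++;
			}
		}
		if(*r != '\0')
			r++;	/* step past the separator */
		*w = '\0';
	}
	return n;
}
```

Applied to a line such as "42 10 20 30 'out of memory'" followed by a newline, this yields five fields, the last being the unquoted text out of memory; the trailing newline is simply skipped as white space.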
		345	`.PP`
		346	`While we're on the subject of quotes and strings, note that the implementation of`
		347	`.CW await`
		348	`used`
		349	`.CW snprint`
		350	`rather than`
		351	`.CW sprint .`
		352	`We now deprecate`
		353	`.CW sprint`
		354	`because it has no protection against buffer overflow.`
		355	`We prefer`
		356	`.CW snprint`
		357	`or`
		358	`.CW seprint ,`
		359	`to constrain the output.`
		360	`The`
		361	`.CW %q`
		362	`format is cleverer than most in this regard:`
		363	`if the string is too long to be represented in full,`
		364	`.CW %q`
		365	`is smart enough to produce a truncated but correctly quoted`
		366	`string within the available space.`
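The quoting side can be sketched the same way. The following portable C is an approximation under stated assumptions, not the implementation of the format verb: it assumes a string needs quoting when it is empty or contains white space or a single quote, and it truncates by dropping trailing characters while keeping the quotes balanced. The names needsquote and rcquote are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed rule: quote empty strings and strings with white space or quotes. */
int
needsquote(const char *s)
{
	if(*s == '\0')
		return 1;
	for(; *s != '\0'; s++)
		if(*s == ' ' || *s == '\t' || *s == '\n' || *s == '\'')
			return 1;
	return 0;
}

/*
 * Write s into dst (n bytes), quoted rc-style if needed. The result is
 * always zero-terminated when n > 0; if it does not fit, it is cut
 * short but still ends with a closing quote. Returns the length
 * written, not counting the terminating zero.
 */
size_t
rcquote(char *dst, size_t n, const char *s)
{
	size_t w, need;

	if(n == 0)
		return 0;
	w = 0;
	if(!needsquote(s)){
		while(*s != '\0' && w + 1 < n)
			dst[w++] = *s++;
		dst[w] = '\0';
		return w;
	}
	if(n < 3){		/* no room even for '' */
		dst[0] = '\0';
		return 0;
	}
	dst[w++] = '\'';
	for(; *s != '\0'; s++){
		need = (*s == '\'') ? 2 : 1;
		if(w + need + 2 > n)	/* keep room for closing quote and zero */
			break;
		if(*s == '\'')
			dst[w++] = '\'';	/* double an embedded quote */
		dst[w++] = *s;
	}
	dst[w++] = '\'';
	dst[w] = '\0';
	return w;
}
```

The point of the design is the reservation inside the loop: space for the closing quote is set aside before each character is copied, so a truncated result can still be parsed as a single field.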
		367	`.SH`
		368	`Mount`
		369	`.PP`
		370	`Although strings in 9P are now variable-length and not zero-terminated,`
		371	`this has little direct effect in most of the system interfaces.`
		372	`File and user names are still zero-terminated strings as always;`
		373	`the kernel does the work of translating them as necessary for`
		374	`transport.`
		375	`And of course, they are now free to be as long as you might want;`
		376	`the only hard limit is that their length must be represented in 16 bits.`
		377	`.PP`
		378	`One example where this matters is that the file system specification in the`
		379	`.CW mount`
		380	`system call can now be much longer.`
		381	`Programs like`
		382	`.CW rio`
		383	`that used the specification string in creative ways were limited by the`
		384	`.CW NAMELEN`
		385	`restriction; now they can use the string more freely.`
		386	`.CW Rio`
		387	`now accepts a simple but less cryptic specification language for the window`
		388	`to be created by the`
		389	`.CW mount`
		390	`call, e.g.:`
		391	`.P1`
		392	`% mount $wsys /mnt/wsys 'new -dx 250 -dy 250 -pid 1234'`
		393	`.P2`
		394	`In the old system, this sort of control was impossible through the`
		395	`.CW mount`
		396	`interface.`
		397	`.PP`
		398	`While we're on the subject of`
		399	`.CW mount ,`
		400	`note that with the new security architecture`
		401	`(see`
		402	`.I factotum (4)),`
		403	`9P has moved its authentication outside the protocol proper.`
		404	`(For a full description of this change to 9P, see`
		405	`.I fauth (2),`
		406	`.I attach (5),`
		407	`and the paper`
		408	`.I "Security in Plan 9\f1.)`
		409	`The most explicit effect of this change is that`
		410	`.CW mount`
		411	`now takes another argument,`
		412	`.CW afd ,`
		413	`a file descriptor for the`
		414	`authentication file through which the authentication will be made.`
		415	`For most user-level file servers, which do not require authentication, it is`
		416	`sufficient to provide`
		417	`.CW -1`
		418	`as the value of`
		419	`.CW afd :`
		420	`.P1`
		421	`if(mount(fd, -1, "/mnt/wsys", MREPL,`
		422	`	"new -dx 250 -dy 250 -pid 1234") < 0)`
		423	`	error("mount failed: %r");`
		424	`.P2`
		425	`To connect to servers that require authentication, use the new`
		426	`.CW fauth`
		427	`system call or the reimplemented`
		428	`.CW amount`
		429	`(authenticated mount) library call.`
		430	`In fact, since`
		431	`.CW amount`
		432	`handles both authenticating and non-authenticating servers, it is often`
		433	`easiest just to replace calls to`
		434	`.CW mount`
		435	`by calls to`
		436	`.CW amount ;`
		437	`see`
		438	`.I auth (2)`
		439	`for details.`
		440	`.SH`
		441	`Print`
		442	`.PP`
		443	`The C library has been heavily reworked in places.`
		444	`Besides the changes mentioned above, it`
		445	`now has a much more complete set of routines for handling`
		446	`.CW Rune`
		447	`strings (that is, zero-terminated arrays of 16-bit character values).`
		448	`The most sweeping changes, however, are in the way formatted I/O is performed.`
		449	`.PP`
		450	`The`
		451	`.CW print`
		452	`routine and all its relatives have been reimplemented to offer a number`
		453	`of improvements:`
		454	`.IP (1)`
		455	`Better buffer management, including the provision of an internal flush`
		456	`routine, makes it unnecessary to provide large buffers.`
		457	`For example,`
		458	`.CW print`
		459	`uses a much smaller buffer now (reducing stack load) while simultaneously`
		460	`removing the need to truncate the output string if it doesn't fit in the buffer.`
		461	`.IP (2)`
		462	`Global variables have been eliminated so no locking is necessary.`
		463	`.IP (3)`
		464	`The combination of (1) and (2) means that the standard implementation of`
		465	`.CW print`
		466	`now works fine in threaded programs, and`
		467	`.CW threadprint`
		468	`is gone.`
		469	`.IP (4)`
		470	`The new routine`
		471	`.CW smprint`
		472	`prints into, and returns, storage allocated on demand by`
		473	`.CW malloc .`
		474	`.IP (5)`
		475	`It is now possible to print into a`
		476	`.CW Rune`
		477	`string; for instance,`
		478	`.CW runesmprint`
		479	`is the`
		480	`.CW Rune`
		481	`analog of`
		482	`.CW smprint .`
		483	`.IP (6)`
		484	`There is improved support for custom`
		485	`print verbs and custom output routines such as error handlers.`
		486	`The routine`
		487	`.CW doprint`
		488	`is gone, but`
		489	`.CW vseprint`
		490	`can always be used instead.`
		491	`However, the new routines`
		492	`.CW fmtfdinit ,`
		493	`.CW fmtstrinit ,`
		494	`.CW fmtprint ,`
		495	`and friends`
		496	`are often a better replacement.`
		497	`The details are too long for exposition here;`
		498	`.I fmtinstall (2)`
		499	`explains the new interface and provides examples.`
		500	`.IP (7)`
		501	`Two new format flags, space and comma, close somewhat the gap between`
		502	`Plan 9 and ANSI C.`
		503	`.PP`
		504	`Despite these changes, most programs will be unaffected;`
		505	`.CW print`
		506	`is still`
		507	`.CW print .`
		508	`Don't forget, though, that`
		509	`you should eliminate calls to`
		510	`.CW sprint`
		511	`and use the`
		512	`.CW %q`
		513	`format when appropriate.`
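For readers who want to see why a bounded, chainable print routine is safer than an unbounded one, here is a small portable C sketch in the spirit of seprint. It is built on the standard vsnprintf and is not the Plan 9 implementation; the name xseprint is hypothetical. It takes the start of the free space and the end of the buffer, and returns a pointer to the terminating zero so that calls can be chained without recomputing lengths.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format into the space from s up to (not including) e. Returns a
 * pointer to the terminating zero, which the next call can use as its
 * starting point. Output that does not fit is truncated, never
 * written past e. (Hypothetical sketch built on vsnprintf.)
 */
char*
xseprint(char *s, char *e, const char *fmt, ...)
{
	va_list ap;
	int n;

	if(s >= e)
		return s;
	va_start(ap, fmt);
	n = vsnprintf(s, (size_t)(e - s), fmt, ap);
	va_end(ap);
	if(n < 0)
		return s;	/* formatting error: nothing usable written */
	if(n >= e - s)
		return e - 1;	/* truncated: the zero sits in the last byte */
	return s + n;
}
```

A typical use keeps one moving pointer: set p to the start of the buffer and e to its end, then write p = xseprint(p, e, ...) repeatedly. Once the buffer is full, further calls leave it unchanged.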
		514	`.SH`
		515	`Binary compatibility`
		516	`.PP`
		517	`The discussion so far has been about changes at the source level.`
		518	`Existing binaries will probably run without change in the new`
		519	`environment, since the kernel provides backward-compatible`
		520	`system calls for`
		521	`.CW errstr ,`
		522	`.CW stat ,`
		523	`.CW wait ,`
		524	`etc.`
		525	`The only exceptions are programs that do either a`
		526	`.CW mount`
		527	`system call, because of the security changes and because`
		528	`the file descriptor in`
		529	`.CW mount`
		530	`must point to a new 9P connection; or a`
		531	`.CW read`
		532	`system call on a directory, since the returned data will`
		533	`be in the new format.`
		534	`A moment's reflection will discover that this means old`
		535	`user-level file servers will need to be fixed to run on the new system.`
		536	`.SH`
		537	`File servers`
		538	`.PP`
		539	`A full description of what user-level servers must do to provide service with`
		540	`the new 9P is beyond the scope of this paper.`
		541	`Your best source of information is section 5 of the manual,`
		542	`combined with study of a few examples.`
		543	`.CW /sys/src/cmd/ramfs.c`
		544	`is a simple example; it has a counterpart`
		545	`.CW /sys/src/lib9p/ramfs.c`
		546	`that implements the same service using the new`
		547	`.I 9p (2)`
		548	`library.`
		549	`.PP`
		550	`That said, it's worth summarizing what to watch for when converting a file server.`
		551	`The`
		552	`.CW session`
		553	`message is gone, and there is now a`
		554	`.CW version`
		555	`message that is exchanged at the start of a connection to establish`
		556	`the version of the protocol to use (there's only one at the moment, identified by`
		557	`the string`
		558	`.CW 9P2000 )`
		559	`and what the maximum message size will be.`
		560	`This negotiation makes it easier to handle 9P encapsulation, such as with`
		561	`.CW exportfs ,`
		562	`and also permits larger message sizes when appropriate.`
		563	`.PP`
		564	`If your server wants to authenticate, it will need to implement an authentication file`
		565	`and implement the`
		566	`.CW auth`
		567	`message; otherwise it should return a helpful error string to the`
		568	`.CW Tauth`
		569	`request to signal that authentication is not required.`
		570	`.PP`
		571	`The handling of`
		572	`.CW stat`
		573	`and directory reads will require some changes but they should not be fundamental.`
		574	`Be aware that seeking on directories is forbidden, so it is fine if you disregard the`
		575	`file offset when implementing directory reads; this makes it a little easier to handle`
		576	`the variable-length entries.`
		577	`You should still never return a partial directory entry; if the I/O count is too small`
		578	`to return even one entry, you should return two bytes containing the byte count`
		579	`required to represent the next entry in the directory.`
		580	`User code can use this value to formulate a retry if it desires.`
		581	`See the`
		582	`DIAGNOSTICS section of`
		583	`.I stat (2)`
		584	`for a description of this process.`
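The rule for directory reads can be sketched as follows. This is a simplified illustration in portable C, not code from any server: it assumes each entry has already been marshaled with its own two-byte count in front, that the server remembers which entry comes next instead of using the file offset, and that the two bytes returned for a too-small request are that entry's count field. The names Entry and dirfill are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * One directory entry, already marshaled: a two-byte count followed by
 * that many bytes. len covers the count field as well.
 */
typedef struct Entry Entry;
struct Entry {
	const unsigned char *data;
	size_t len;
};

/*
 * Answer a directory read of count bytes. Copies as many whole entries
 * as fit, starting at e[*next], and advances *next past them. Never
 * copies a partial entry. If not even one entry fits, copies just the
 * two-byte count of the next entry so the caller can size a retry.
 * Returns the number of bytes placed in buf (0 at end of directory).
 */
size_t
dirfill(unsigned char *buf, size_t count, const Entry *e, size_t ne, size_t *next)
{
	size_t n;

	n = 0;
	while(*next < ne && n + e[*next].len <= count){
		memcpy(buf + n, e[*next].data, e[*next].len);
		n += e[*next].len;
		(*next)++;
	}
	if(n == 0 && *next < ne && count >= 2){
		memcpy(buf, e[*next].data, 2);	/* count field only */
		return 2;
	}
	return n;
}
```

Because the position is kept as an entry index rather than a byte offset, the variable length of the entries never has to be reconciled with the offset in the read request.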
		585	`.PP`
		586	`The trickiest part of updating a file server is that the`
		587	`.CW clone`
		588	`and`
		589	`.CW walk`
		590	messages have been merged into a single message, a sort of `clone-multiwalk'.
		591	`The new message, still called`
		592	`.CW walk ,`
		593	`proposes a sequence of file name elements to be evaluated using a possibly`
		594	`cloned fid.`
		595	`The return message contains the qids of the files reached by`
		596	`walking to the sequential elements.`
		597	`If all the elements can be walked, the fid will be cloned if requested.`
		598	`If a non-zero number of elements are requested, but none`
		599	`can be walked, an error should be returned.`
		600	`If only some can be walked, the fid is not cloned, the original fid is left`
		601	`where it was, and the returned`
		602	`.CW Rwalk`
		603	`message should contain the partial list of successfully reached qids.`
		604	`See`
		605	`.I walk (5)`
		606	`for a full description.`
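The three outcomes described above can be captured in a few lines of portable C. This is a sketch of the decision logic only, over a toy in-memory tree; Node, lookup, and srvwalk are hypothetical names, an integer stands in for a real qid, and a real server must also manage fids and build the reply message.

```c
#include <stddef.h>
#include <string.h>

typedef struct Node Node;
struct Node {
	const char *name;
	int qid;	/* stand-in for a real qid */
	Node *child;	/* first entry if this is a directory */
	Node *next;	/* next entry in the same directory */
};

static Node*
lookup(Node *dir, const char *name)
{
	Node *n;

	for(n = dir->child; n != NULL; n = n->next)
		if(strcmp(n->name, name) == 0)
			return n;
	return NULL;
}

/*
 * Walk from fid through names[0..nname-1], recording in wqid the qid
 * of each file reached. Returns the number of elements walked, or -1
 * if elements were requested and the first could not be walked (the
 * server would send an error). *newfid is set to the destination only
 * when every element was walked; on a partial walk it is untouched.
 */
int
srvwalk(Node *fid, Node **newfid, const char **names, int nname, int *wqid)
{
	Node *cur, *nx;
	int i;

	cur = fid;
	for(i = 0; i < nname; i++){
		nx = lookup(cur, names[i]);
		if(nx == NULL)
			break;
		wqid[i] = nx->qid;
		cur = nx;
	}
	if(nname > 0 && i == 0)
		return -1;
	if(i == nname)
		*newfid = cur;	/* full success, including a zero-element clone */
	return i;
}
```

Note that the original position is never modified: the walk works on a local cursor and commits to the new fid only at the end, which is what leaves the original fid where it was after a partial walk.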
