WebSVN – planix.SVN – Blame – /os/branches/feature_tlsv12/sys/doc/sam/sam.ms

Rev	Author	Line No.	Line
2	-	1	`.HTML "The Text Editor sam`
		2	`.Vx 17 11 November 87 1 32 "ROB PIKE" "THE TEXT EDITOR SAM"`
		3	`.ds DY "31 May 1987`
		4	`.ds DR "Revised 1 July 1987`
		5	`.de CW \" puts first arg in CW font, same as UL; maintains font`
		6	`\%\&\\$3\f(CW\\$1\fP\&\\$2`
		7	`..`
		8	`.de Cs`
		9	`.br`
		10	`.fi`
		11	`.ft 2`
		12	`.ps -2`
		13	`.vs -2`
		14	`..`
		15	`.de Ce`
		16	`.br`
		17	`.nf`
		18	`.ft 1`
		19	`.ps`
		20	`.vs`
		21	`.sp`
		22	`..`
		23	`.de XP`
		24	`.ie h .html - <center><img src="\\$1.gif" /></center>`
		25	`.el .BP \\$1.ps \\$2`
		26	`..`
		27	`.TL`
		28	`The Text Editor \&\f(CWsam\fP`
		29	`.AU`
		30	`Rob Pike`
		31	`rob@plan9.bell-labs.com`
		32	`.AB`
		33	`.LP`
		34	`.CW Sam`
		35	`is an interactive multi-file text editor intended for`
		36	`bitmap displays.`
		37	`A textual command language`
		38	`supplements the mouse-driven, cut-and-paste interface`
		39	`to make complex or`
		40	`repetitive editing tasks easy to specify.`
		41	`The language is characterized by the composition of regular expressions`
		42	`to describe the structure of the text being modified.`
		43	`The treatment of files as a database, with changes logged`
		44	`as atomic transactions, guides the implementation and`
		45	makes a general `undo' mechanism straightforward.
		46	`.PP`
		47	`.CW Sam`
		48	`is implemented as two processes connected by a low-bandwidth stream,`
		49	`one process handling the display and the other the editing`
		50	`algorithms. Therefore it can run with the display process`
		51	`in a bitmap terminal and the editor on a local host,`
		52	`with both processes on a bitmap-equipped host, or with`
		53	`the display process in the terminal and the editor in a`
		54	`remote host.`
		55	`By suppressing the display process,`
		56	`it can even run without a bitmap terminal.`
		57	`.PP`
		58	`This paper is reprinted from Software\(emPractice and Experience,`
		59	`Vol 17, number 11, pp. 813-845, November 1987.`
		60	`The paper has not been updated for the Plan 9 manuals. Although`
		61	`.CW Sam`
		62	`has not changed much since the paper was written, the system around it certainly has.`
		63	`Nonetheless, the description here still stands as the best introduction to the editor.`
		64	`.AE`
		65	`.SH`
		66	`Introduction`
		67	`.LP`
		68	`.CW Sam`
		69	`is an interactive text editor that combines cut-and-paste interactive editing with`
		70	`an unusual command language based on the composition of regular expressions.`
		71	It is written as two programs: one, the `host part,' runs on a UNIX system
		72	`and implements the command language and provides file access; the other, the`
		73	`terminal part,' runs asynchronously
		74	`on a machine with a mouse and bitmap display`
		75	`and supports the display and interactive editing.`
		76	`The host part may be even run in isolation on an ordinary terminal`
		77	`to edit text using the command`
		78	`language, much like a traditional line editor,`
		79	`without assistance from a mouse or display.`
		80	`Most often,`
		81	`the terminal part runs on a Blit\u\s-4\&1\s+4\d terminal`
		82	`(actually on a Teletype DMD 5620, the production version of the Blit), whose`
		83	`host connection is an ordinary 9600 bps RS232 link;`
		84	`on the SUN computer the host and display processes run on a single machine,`
		85	`connected by a pipe.`
		86	`.PP`
		87	`.CW Sam`
		88	`edits uninterpreted`
		89	`ASCII text.`
		90	`It has no facilities for multiple fonts, graphics or tables,`
		91	`unlike MacWrite,\u\s-4\&2\s+4\d Bravo,\u\s-4\&3\s+4\d Tioga\u\s-4\&4\s+4\d`
		92	`or Lara.\u\s-4\&5\s+4\d`
		93	`Also unlike them, it has a rich command language.`
		94	`(Throughout this paper, the phrase`
		95	`.I`
		96	`command language`
		97	`.R`
		98	`refers to`
		99	`textual commands; commands activated from the mouse form the`
		100	`.I mouse`
		101	`.I language. )`
		102	`.CW Sam`
		103	`developed as an editor for use by programmers, and tries to join`
		104	`the styles of the UNIX text editor`
		105	`.CW ed \u\s-4\&6,7\s+4\d`
		106	`with that of interactive cut-and-paste editors by`
		107	`providing a comfortable mouse-driven interface`
		108	`to a program with a solid command language driven by regular expressions.`
		109	`The command language developed more than the mouse language, and`
		110	`acquired a notation for describing the structure of files`
		111	`more richly than as a sequence of lines,`
		112	`using a dataflow-like syntax for specifying changes.`
		113	`.PP`
		114	`The interactive style was influenced by`
		115	`.CW jim ,\u\s-4\&1\s+4\d`
		116	`an early cut-and-paste editor for the Blit, and by`
		117	`.CW mux ,\u\s-4\&8\s+4\d`
		118	`the Blit window system.`
		119	`.CW Mux`
		120	`merges the original Blit window system,`
		121	`.CW mpx ,\u\s-4\&1\s+4\d`
		122	`with cut-and-paste editing, forming something like a`
		123	`multiplexed version of`
		124	`.CW jim`
		125	`that edits the output of (and input to) command sessions rather than files.`
		126	`.PP`
		127	`The first part of this paper describes the command language, then the mouse`
		128	`language, and explains how they interact.`
		129	`That is followed by a description of the implementation,`
		130	`first of the host part, then of the terminal part.`
		131	`A principle that influenced the design of`
		132	`.CW sam`
		133	`is that it should have no explicit limits, such as upper limits on`
		134	`file size or line length.`
		135	`A secondary consideration is that it be efficient.`
		136	`To honor these two goals together requires a method for efficiently`
		137	`manipulating`
		138	`huge strings (files) without breaking them into lines,`
		139	`perhaps while making thousands of changes`
		140	`under control of the command language.`
		141	`.CW Sam 's`
		142	`method is to`
		143	`treat the file as a transaction database, implementing changes as atomic`
		144	updates. These updates may be unwound easily to `undo' changes.
		145	`Efficiency is achieved through a collection of caches that minimizes`
		146	`disc traffic and data motion, both within the two parts of the program`
		147	`and between them.`
		148	`.PP`
		149	`The terminal part of`
		150	`.CW sam`
		151	`is fairly straightforward.`
		152	`More interesting is how the two halves of the editor stay`
		153	`synchronized when either half may initiate a change.`
		154	`This is achieved through a data structure that organizes the`
		155	`communications and is maintained in parallel by both halves.`
		156	`.PP`
		157	`The last part of the paper chronicles the writing of`
		158	`.CW sam`
		159	`and discusses the lessons that were learned through its development and use.`
		160	`.PP`
		161	`The paper is long, but is composed largely of two papers of reasonable length:`
		162	`a description of the user interface of`
		163	`.CW sam`
		164	`and a discussion of its implementation.`
		165	`They are combined because the implementation is strongly influenced by`
		166	`the user interface, and vice versa.`
		167	`.SH`
		168	`The Interface`
		169	`.LP`
		170	`.CW Sam`
		171	`is a text editor for multiple files.`
		172	`File names may be provided when it is invoked:`
		173	`.P1`
		174	`sam file1 file2 ...`
		175	`.P2`
		176	`and there are commands`
		177	`to add new files and discard unneeded ones.`
		178	`Files are not read until necessary`
		179	`to complete some command.`
		180	`Editing operations apply to an internal copy`
		181	`made when the file is read; the UNIX file associated with the copy`
		182	`is changed only by an explicit command.`
		183	`To simplify the discussion, the internal copy is here called a`
		184	`.I file ,`
		185	`while the disc-resident original is called a`
		186	`.I`
		187	`disc file.`
		188	`.R`
		189	`.PP`
		190	`.CW Sam`
		191	`is usually connected to a bitmap display that presents a cut-and-paste`
		192	`editor driven by the mouse.`
		193	`In this mode, the command language is still available:`
		194	`text typed in a special window, called the`
		195	`.CW sam`
		196	`.I window,`
		197	`is interpreted`
		198	`as commands to be executed in the current file.`
		199	`Cut-and-paste editing may be used in any window \(em even in the`
		200	`.CW sam`
		201	`window to construct commands.`
		202	`The other mode of operation, invoked by starting`
		203	`.CW sam`
		204	`with the option`
		205	`.CW -d`
		206	(for `no download'),
		207	`does not use the mouse or bitmap display, but still permits`
		208	`editing using the textual command language, even on an ordinary terminal,`
		209	`interactively or from a script.`
		210	`.PP`
		211	`The following sections describe first the command language (under`
		212	`.CW sam\ -d`
		213	`and in the`
		214	`.CW sam`
		215	`window), and then the mouse interface.`
		216	`These two languages are nearly independent, but connect through the`
		217	`.I current`
		218	`.I text,`
		219	`described below.`
		220	`.SH 2`
		221	`The Command Language`
		222	`.LP`
		223	`A file consists of its contents, which are an array of characters`
		224	`(that is, a string); the`
		225	`.I name`
		226	`of the associated disc file; the`
		227	`.I`
		228	`modified bit`
		229	`.R`
		230	`that states whether the contents match those of`
		231	`the disc file;`
		232	`and a substring of the contents, called the`
		233	`.I`
		234	`current text`
		235	`.R`
		236	`or`
		237	`.I dot`
		238	`(see Figures 1 and 2).`
		239	`If the current text is a null string, dot falls between characters.`
		240	`The`
		241	`.I value`
		242	`of dot is the location of the current text; the`
		243	`.I contents`
		244	`of dot are the characters it contains.`
		245	`.CW Sam`
		246	`imparts to the text no two-dimensional interpretation such as columns`
		247	`or fields; text is always one-dimensional.`
		248	Even the idea of a `line' of text as understood by most UNIX programs
		249	`\(em a sequence of characters terminated by a newline character \(em`
		250	`is only weakly supported.`
		251	`.PP`
		252	`The`
		253	`.I`
		254	`current file`
		255	`.R`
		256	`is the file to which editing commands refer.`
		257	`The current text is therefore dot in the current file.`
		258	`If a command doesn't explicitly name a particular file or piece of text,`
		259	`the command is assumed to apply to the current text.`
		260	`For the moment, ignore the presence of multiple files and consider`
		261	`editing a single file.`
		262	`.KF L`
		263	`.XP fig1 3.5i`
		264	`.Cs`
		265	`Figure 1. A typical`
		266	`.CW sam`
		267	`screen, with the editing menu presented.`
		268	`The`
		269	`.CW sam`
		270	`(command language) window is in the middle, with file windows above and below.`
		271	`(The user interface makes it easy to create these abutting windows.)`
		272	`The partially obscured window is a third file window.`
		273	`The uppermost window is that to which typing and mouse operations apply,`
		274	`as indicated by its heavy border.`
		275	`Each window has its current text highlighted in reverse video.`
		276	`The`
		277	`.CW sam`
		278	`window's current text is the null string on the last visible line,`
		279	`indicated by a vertical bar.`
		280	`See also Figure 2.`
		281	`.Ce`
		282	`.KE`
		283	`.PP`
		284	`Commands have one-letter names.`
		285	`Except for non-editing commands such as writing`
		286	`the file to disc, most commands make some change`
		287	`to the text in dot and leave dot set to the text resulting from the change.`
		288	`For example, the delete command,`
		289	`.CW d ,`
		290	`deletes the text in dot, replacing it by the null string and setting dot`
		291	`to the result.`
		292	`The change command,`
		293	`.CW c ,`
		294	`replaces dot by text delimited by an arbitrary punctuation character,`
		295	`conventionally`
		296	`a slash. Thus,`
		297	`.P1`
		298	`c/Peter/`
		299	`.P2`
		300	`replaces the text in dot by the string`
		301	`.CW Peter .`
		302	`Similarly,`
		303	`.P1`
		304	`a/Peter/`
		305	`.P2`
		306	`(append) adds the string after dot, and`
		307	`.P1`
		308	`i/Peter/`
		309	`.P2`
		310	`(insert) inserts before dot.`
		311	`All three leave dot set to the new text,`
		312	`.CW Peter .`
		313	`.PP`
		314	`Newlines are part of the syntax of commands:`
		315	`the newline character lexically terminates a command.`
		316	`Within the inserted text, however, newlines are never implicit.`
		317	`But since it is often convenient to insert multiple lines of text,`
		318	`.CW sam`
		319	`has a special`
		320	`syntax for that case:`
		321	`.P1`
		322	`a`
		323	`some lines of text`
		324	`to be inserted in the file,`
		325	`terminated by a period`
		326	`on a line by itself`
		327	`\&.`
		328	`.P2`
		329	`In the one-line syntax, a newline character may be specified by a C-like`
		330	`escape, so`
		331	`.P1`
		332	`c/\en/`
		333	`.P2`
		334	`replaces dot by a single newline character.`
		335	`.PP`
		336	`.CW Sam`
		337	`also has a substitute command,`
		338	`.CW s :`
		339	`.P1`
		340	`s/\f2expression\fP/\f2replacement\fP/`
		341	`.P2`
		342	`substitutes the replacement text for the first match, in dot,`
		343	`of the regular expression.`
		344	`Thus, if dot is the string`
		345	`.CW Peter ,`
		346	`the command`
		347	`.P1`
		348	`s/t/st/`
		349	`.P2`
		350	`changes it to`
		351	`.CW Pester .`
		352	`In general,`
		353	`.CW s`
		354	`is unnecessary, but it was inherited from`
		355	`.CW ed`
		356	`and it has some convenient variations.`
		357	`For instance, the replacement text may include the matched text,`
		358	`specified by`
		359	`.CW & :`
		360	`.P1`
		361	`s/Peter/Oh, &, &, &, &!/`
		362	`.P2`
		363	`.PP`
		364	`There are also three commands that apply programs`
		365	`to text:`
		366	`.P1`
		367	`< \f2UNIX program\fP`
		368	`.P2`
		369	`replaces dot by the output of the UNIX program.`
		370	`Similarly, the`
		371	`.CW >`
		372	`command`
		373	`runs the program with dot as its standard input, and`
		374	`.CW \|`
		375	`does both. For example,`
		376	`.P1`
		377	`\| sort`
		378	`.P2`
		379	`replaces dot by the result of applying the standard sorting utility to it.`
		380	`Again, newlines have no special significance for these`
		381	`.CW sam`
		382	`commands.`
		383	`The text acted upon and resulting from these commands is not necessarily`
		384	`bounded by newlines, although for connection with UNIX programs,`
		385	`newlines may be necessary to obey conventions.`
		386	`.PP`
		387	`One more command:`
		388	`.CW p`
		389	`prints the contents of dot.`
		390	`Table I summarizes`
		391	`.CW sam 's`
		392	`commands.`
		393	`.KF`
		394	`.TS`
		395	`center;`
		396	`c s`
		397	`lfCW l.`
		398	`Table I. \f(CWSam\fP commands`
		399	`.sp .4`
		400	`.ft CW`
		401	`_`
		402	`.ft`
		403	`.sp .4`
		404	`\f1Text commands\fP`
		405	`.sp .4`
		406	`_`
		407	`.sp .4`
		408	`a/\f2text\fP/ Append text after dot`
		409	`c/\f2text\fP/ Change text in dot`
		410	`i/\f2text\fP/ Insert text before dot`
		411	`d Delete text in dot`
		412	`s/\f2regexp\fP/\f2text\fP/ Substitute text for match of regular expression in dot`
		413	`m \f2address\fP Move text in dot after address`
		414	`t \f2address\fP Copy text in dot after address`
		415	`.sp .4`
		416	`_`
		417	`.sp .4`
		418	`\f1Display commands\fP`
		419	`.sp .4`
		420	`_`
		421	`.sp .2`
		422	`p Print contents of dot`
		423	`\&= Print value (line numbers and character numbers) of dot`
		424	`.sp .4`
		425	`_`
		426	`.sp .4`
		427	`\f1File commands\fP`
		428	`.sp .4`
		429	`_`
		430	`.sp .2`
		431	`b \f2file-list\fP Set current file to first file in list that \f(CWsam\fP has in menu`
		432	`B \f2file-list\fP Same as \f(CWb\fP, but load new files`
		433	`n Print menu lines of all files`
		434	`D \f2file-list\fP Delete named files from \f(CWsam\fP`
		435	`.sp .4`
		436	`_`
		437	`.sp .4`
		438	`\f1I/O commands\fP`
		439	`.sp .4`
		440	`_`
		441	`.sp .2`
		442	`e \f2filename\fP Replace file with named disc file`
		443	`r \f2filename\fP Replace dot by contents of named disc file`
		444	`w \f2filename\fP Write file to named disc file`
		445	`f \f2filename\fP Set file name and print new menu line`
		446	`< \f2UNIX-command\fP Replace dot by standard output of command`
		447	`> \f2UNIX-command\fP Send dot to standard input of command`
		448	`\| \f2UNIX-command\fP Replace dot by result of command applied to dot`
		449	`! \f2UNIX-command\fP Run the command`
		450	`.sp .4`
		451	`_`
		452	`.sp .4`
		453	`\f1Loops and conditionals\fP`
		454	`.sp .4`
		455	`_`
		456	`.sp .2`
		457	`x/\f2regexp\fP/ \f2command\fP For each match of regexp, set dot and run command`
		458	`y/\f2regexp\fP/ \f2command\fP Between adjacent matches of regexp, set dot and run command`
		459	`X/\f2regexp\fP/ \f2command\fP Run command in each file whose menu line matches regexp`
		460	`Y/\f2regexp\fP/ \f2command\fP Run command in each file whose menu line does not match`
		461	`g/\f2regexp\fP/ \f2command\fP If dot contains a match of regexp, run command`
		462	`v/\f2regexp\fP/ \f2command\fP If dot does not contain a match of regexp, run command`
		463	`.sp .4`
		464	`_`
		465	`.sp .4`
		466	`\f1Miscellany\fP`
		467	`.sp .4`
		468	`_`
		469	`.sp .2`
		470	`k Set address mark to value of dot`
		471	`q Quit`
		472	`u \f2n\fP Undo last \f2n\fP (default 1) changes`
		473	`{ } Braces group commands`
		474	`.sp .3`
		475	`.ft CW`
		476	`_`
		477	`.ft`
		478	`.TE`
		479	`.sp`
		480	`.KE`
		481	`.PP`
		482	`The value of dot may be changed by`
		483	`specifying an`
		484	`.I address`
		485	`for the command.`
		486	`The simplest address is a line number:`
		487	`.P1`
		488	`3`
		489	`.P2`
		490	`refers to the third line of the file, so`
		491	`.P1`
		492	`3d`
		493	`.P2`
		494	`deletes the third line of the file, and implicitly renumbers`
		495	`the lines so the old line 4 is now numbered 3.`
		496	`(This is one of the few places where`
		497	`.CW sam`
		498	`deals with lines directly.)`
		499	`Line`
		500	`.CW 0`
		501	`is the null string at the beginning of the file.`
		502	`If a command consists of only an address, a`
		503	`.CW p`
		504	`command is assumed, so typing an unadorned`
		505	`.CW 3`
		506	`prints line 3 on the terminal.`
		507	`There are a couple of other basic addresses:`
		508	`a period addresses dot itself; and`
		509	`a dollar sign`
		510	`.CW $ ) (`
		511	`addresses the null string at the end of the file.`
		512	`.PP`
		513	`An address is always a single substring of the file.`
		514	`Thus, the address`
		515	`.CW 3`
		516	`addresses the characters`
		517	`after the second newline of`
		518	`the file through the third newline of the file.`
		519	`A`
		520	`.I`
		521	`compound address`
		522	`.R`
		523	`is constructed by the comma operator`
		524	`.P1`
		525	`\f2address1\fP,\f2address2\fP`
		526	`.P2`
		527	`and addresses the substring of the file from the beginning of`
		528	`.I address1`
		529	`to the end of`
		530	`.I address2 .`
		531	`For example, the command`
		532	`.CW 3,5p`
		533	`prints the third through fifth lines of the file and`
		534	`.CW .,$d`
		535	`deletes the text from the beginning of dot to the end of the file.`
		536	`.PP`
		537	`These addresses are all absolute positions in the file, but`
		538	`.CW sam`
		539	`also has relative addresses, indicated by`
		540	`.CW +`
		541	`or`
		542	`.CW - .`
		543	`For example,`
		544	`.P1`
		545	`$-3`
		546	`.P2`
		547	`is the third line before the end of the file and`
		548	`.P1`
		549	`\&.+1`
		550	`.P2`
		551	`is the line after dot.`
		552	`If no address appears to the left of the`
		553	`.CW +`
		554	`or`
		555	`.CW - ,`
		556	`dot is assumed;`
		557	`if nothing appears to the right,`
		558	`.CW 1`
		559	`is assumed.`
		560	`Therefore,`
		561	`.CW .+1`
		562	`may be abbreviated to just a plus sign.`
		563	`.PP`
		564	`The`
		565	`.CW +`
		566	`operator acts relative to the end of its first argument, while the`
		567	`.CW -`
		568	`operator acts relative to the beginning. Thus`
		569	`.CW .+1`
		570	`addresses the first line after dot,`
		571	`.CW .-`
		572	`addresses the first line before dot, and`
		573	`.CW +-`
		574	`refers to the line containing the end of dot. (Dot may span multiple lines, and`
		575	`.CW +`
		576	`selects the line after the end of dot, then`
		577	`.CW -`
		578	`backs up one line.)`
		579	`.PP`
		580	`The final type of address is a regular expression, which addresses the`
		581	`text matched by the expression. The expression is enclosed in slashes, as in`
		582	`.P1`
		583	`/\f2expression\fP/`
		584	`.P2`
		585	`The expressions are the same as those in the UNIX program`
		586	`.CW egrep ,\u\s-4\&6,7\s+4\d`
		587	`and include closures, alternations, and so on.`
		588	`They find the`
		589	`.I`
		590	`leftmost longest`
		591	`.R`
		592	`string that matches the expression, that is,`
		593	`the first match after the point where the search is started,`
		594	`and if more than one match begins at the same spot, the longest such match.`
		595	`(I assume familiarity with the syntax for regular expressions in UNIX programs.\u\s-4\&9\s+4\d)`
		596	`For example,`
		597	`.P1`
		598	`/x/`
		599	`.P2`
		600	`matches the next`
		601	`.CW x`
		602	`character in the file,`
		603	`.P1`
		604	`/xx*/`
		605	`.P2`
		606	`matches the next run of one or more`
		607	`.CW x 's,`
		608	`and`
		609	`.P1`
		610	`/x\|Peter/`
		611	`.P2`
		612	`matches the next`
		613	`.CW x`
		614	`or`
		615	`.CW Peter .`
		616	For compatibility with other UNIX programs, the `any character' operator,
		617	`a period,`
		618	`does not match a newline, so`
		619	`.P1`
		620	`/.*/`
		621	`.P2`
		622	`matches the text from dot to the end of the line, but excludes the newline`
		623	`and so will not match across`
		624	`the line boundary.`
		625	`.PP`
		626	`Regular expressions are always relative addresses.`
		627	`The direction is forwards by default,`
		628	`so`
		629	`.CW /Peter/`
		630	`is really an abbreviation for`
		631	`.CW +/Peter/ .`
		632	`The search can be reversed with a minus sign, so`
		633	`.P1`
		634	`.CW -/Peter/`
		635	`.P2`
		636	`finds the first`
		637	`.CW Peter`
		638	`before dot.`
		639	`Regular expressions may be used with other address forms, so`
		640	`.CW 0+/Peter/`
		641	`finds the first`
		642	`.CW Peter`
		643	`in the file and`
		644	`.CW $-/Peter/`
		645	`finds the last.`
		646	`Table II summarizes`
		647	`.CW sam 's`
		648	`addresses.`
		649	`.KF`
		650	`.TS`
		651	`center;`
		652	`c s`
		653	`lfCW l.`
		654	`Table II. \f(CWSam\fP addresses`
		655	`.sp .4`
		656	`.ft CW`
		657	`_`
		658	`.ft`
		659	`.sp .4`
		660	`\f1Simple addresses\fP`
		661	`.sp .4`
		662	`_`
		663	`.sp .2`
		664	`#\f2n\fP The empty string after character \f2n\fP`
		665	`\f2n\fP Line \f2n\fP.`
		666	`/\f2regexp\fP/ The first following match of the regular expression`
		667	`-/\f2regexp\fP/ The first previous match of the regular expression`
		668	`$ The null string at the end of the file`
		669	`\&. Dot`
		670	`\&' The address mark, set by \f(CWk\fP command`
		671	`"\f2regexp\fP" Dot in the file whose menu line matches regexp`
		672	`.sp .4`
		673	`_`
		674	`.sp .4`
		675	`\f1Compound addresses\fP`
		676	`.sp .4`
		677	`_`
		678	`.sp .2`
		679	`\f2a1\fP+\f2a2\fP The address \f2a2\fP evaluated starting at right of \f2a1\fP`
		680	`\f2a1\fP-\f2a2\fP \f2a2\fP evaluated in the reverse direction starting at left of \f2a1\fP`
		681	`\f2a1\fP,\f2a2\fP From the left of \f2a1\fP to the right of \f2a2\fP (default \f(CW0,$\fP)`
		682	`\f2a1\fP;\f2a2\fP Like \f(CW,\fP but sets dot after evaluating \f2a1\fP`
		683	`.sp .4`
		684	`_`
		685	`.sp .4`
		686	`.T&`
		687	`c s.`
		688	`T{`
		689	`The operators`
		690	`.CW +`
		691	`and`
		692	`.CW -`
		693	`are high precedence, while`
		694	`.CW ,`
		695	`and`
		696	`.CW ;`
		697	`are low precedence.`
		698	`In both`
		699	`.CW +`
		700	`and`
		701	`.CW -`
		702	`forms,`
		703	`.I a2`
		704	`defaults to 1 and`
		705	`.I a1`
		706	`defaults to dot.`
		707	`If both`
		708	`.I a1`
		709	`and`
		710	`.I a2`
		711	`are present,`
		712	`.CW +`
		713	`may be elided.`
		714	`T}`
		715	`.sp .5`
		716	`.ft CW`
		717	`_`
		718	`.ft`
		719	`.TE`
		720	`.sp`
		721	`.KE`
		722	`.PP`
		723	`The language discussed so far will not seem novel`
		724	`to people who use UNIX text editors`
		725	`such as`
		726	`.CW ed`
		727	`or`
		728	`.CW vi .\u\s-4\&9\s+4\d`
		729	`Moreover, the kinds of editing operations these commands allow, with the exception`
		730	`of regular expressions and line numbers,`
		731	`are clearly more conveniently handled by a mouse-based interface.`
		732	`Indeed,`
		733	`.CW sam 's`
		734	`mouse language (discussed at length below) is the means by which`
		735	`simple changes are usually made.`
		736	`For large or repetitive changes, however, a textual language`
		737	`outperforms a manual interface.`
		738	`.PP`
		739	`Imagine that, instead of deleting just one occurrence of the string`
		740	`.CW Peter ,`
		741	`we wanted to eliminate every`
		742	`.CW Peter .`
		743	`What's needed is an iterator that runs a command for each occurrence of some`
		744	`text.`
		745	`.CW Sam 's`
		746	`iterator is called`
		747	`.CW x ,`
		748	`for extract:`
		749	`.P1`
		750	`x/\f2expression\fP/ \f2command\fP`
		751	`.P2`
		752	`finds all matches in dot of the specified expression, and for each`
		753	`such match, sets dot to the text matched and runs the command.`
		754	`So to delete all the`
		755	`.CW Peters:`
		756	`.P1`
		757	`0,$ x/Peter/ d`
		758	`.P2`
		759	`(Blanks in these examples are to improve readability;`
		760	`.CW sam`
		761	`neither requires nor interprets them.)`
		762	`This searches the entire file`
		763	`.CW 0,$ ) (`
		764	`for occurrences of the string`
		765	`.CW Peter ,`
		766	`and runs the`
		767	`.CW d`
		768	`command with dot set to each such occurrence.`
		769	`(By contrast, the comparable`
		770	`.CW ed`
		771	`command would delete all`
		772	`.I lines`
		773	`containing`
		774	`.CW Peter ;`
		775	`.CW sam`
		776	`deletes only the`
		777	`.CW Peters .)`
		778	`The address`
		779	`.CW 0,$`
		780	`is commonly used, and may be abbreviated to just a comma.`
		781	`As another example,`
		782	`.P1`
		783	`, x/Peter/ p`
		784	`.P2`
		785	`prints a list of`
		786	`.CW Peters,`
		787	`one for each appearance in the file, with no intervening text (not even newlines`
		788	`to separate the instances).`
		789	`.PP`
		790	`Of course, the text extracted by`
		791	`.CW x`
		792	`may be selected by a regular expression,`
		793	`which complicates deciding what set of matches is chosen \(em`
		794	`matches may overlap. This is resolved by generating the matches`
		795	`starting from the beginning of dot using the leftmost-longest rule,`
		796	`and searching for each match starting from the end of the previous one.`
		797	`Regular expressions may also match null strings, but a null match`
		798	`adjacent to a non-null match is never selected; at least one character`
		799	`must intervene.`
		800	`For example,`
		801	`.P1`
		802	`, c/AAA/`
		803	`x/B*/ c/-/`
		804	`, p`
		805	`.P2`
		806	`produces as output`
		807	`.P1`
		808	`-A-A-A-`
		809	`.P2`
		810	`because the pattern`
		811	`.CW B*`
		812	`matches the null strings separating the`
		813	`.CW A 's.`
		814	`.PP`
		815	`The`
		816	`.CW x`
		817	`command has a complement,`
		818	`.CW y ,`
		819	`with similar syntax, that executes the command with dot set to the text`
		820	`.I between`
		821	`the matches of the expression.`
		822	`For example,`
		823	`.P1`
		824	`, c/AAA/`
		825	`y/A/ c/-/`
		826	`, p`
		827	`.P2`
		828	`produces the same result as the example above.`
		829	`.PP`
		830	`The`
		831	`.CW x`
		832	`and`
		833	`.CW y`
		834	`commands are looping constructs, and`
		835	`.CW sam`
		836	`has a pair of conditional commands to go with them.`
		837	`They have similar syntax:`
		838	`.P1`
		839	`g/\f2expression\fP/ \f2command\fP`
		840	`.P2`
		841	`(guard)`
		842	`runs the command exactly once if dot contains a match of the expression.`
		843	`This is different from`
		844	`.CW x ,`
		845	`which runs the command for`
		846	`.I each`
		847	`match:`
		848	`.CW x`
		849	`loops;`
		850	`.CW g`
		851	`merely tests, without changing the value of dot.`
		852	`Thus,`
		853	`.P1`
		854	`, x/Peter/ d`
		855	`.P2`
		856	`deletes all occurrences of`
		857	`.CW Peter ,`
		858	`but`
		859	`.P1`
		860	`, g/Peter/ d`
		861	`.P2`
		862	`deletes the whole file (reduces it to a null string) if`
		863	`.CW Peter`
		864	`occurs anywhere in the text.`
		865	`The complementary conditional is`
		866	`.CW v ,`
		867	`which runs the command if there is`
		868	`.I no`
		869	`match of the expression.`
		870	`.PP`
		871	`These control-structure-like commands may be composed to construct more`
		872	`involved operations. For example, to print those lines of text that`
		873	`contain the string`
		874	`.CW Peter :`
		875	`.P1`
		876	`, x/.*\en/ g/Peter/ p`
		877	`.P2`
		878	`The`
		879	`.CW x`
		880	`breaks the file into lines, the`
		881	`.CW g`
		882	`selects those lines containing`
		883	`.CW Peter ,`
		884	`and the`
		885	`.CW p`
		886	`prints them.`
		887	`This command gives an address for the`
		888	`.CW x`
		889	`command (the whole file), but because`
		890	`.CW g`
		891	`does not have an explicit address, it applies to the value of`
		892	`dot produced by the`
		893	`.CW x`
		894	`command, that is, to each line.`
		895	`All commands in`
		896	`.CW sam`
		897	`except for the command to write a file to disc use dot for the`
		898	`default address.`
		899	`.PP`
		900	`Composition may be continued indefinitely.`
		901	`.P1`
		902	`, x/.*\en/ g/Peter/ v/SaltPeter/ p`
		903	`.P2`
		904	`prints those lines containing`
		905	`.CW Peter`
		906	`but`
		907	`.I not`
		908	`those containing`
		909	`.CW SaltPeter .`
		910	`.SH 2`
		911	`Structural Regular Expressions`
		912	`.LP`
		913	`Unlike other UNIX text editors,`
		914	`including the non-interactive ones such as`
		915	`.CW sed`
		916	`and`
		917	`.CW awk ,\u\s-4\&7\s+4\d`
		918	`.CW sam`
		919	is good for manipulating files with multi-line `records.'
		920	`An example is an on-line phone book composed of records,`
		921	`separated by blank lines, of the form`
		922	`.P1`
		923	`Herbert Tic`
		924	`44 Turnip Ave., Endive, NJ`
		925	`201-5555642`
		926
		927	`Norbert Twinge`
		928	`16 Potato St., Cabbagetown, NJ`
		929	`201-5553145`
		930
		931	`\&...`
		932	`.P2`
		933	`The format may be encoded as a regular expression:`
		934	`.P1`
		935	`(.+\en)+`
		936	`.P2`
		937	`that is, a sequence of one or more non-blank lines.`
		938	`The command to print Mr. Tic's entire record is then`
		939	`.P1`
		940	`, x/(.+\en)+/ g/^Herbert Tic$/ p`
		941	`.P2`
		942	`and that to extract just the phone number is`
		943	`.P1`
		944	`, x/(.+\en)+/ g/^Herbert Tic$/ x/^[0-9]-[0-9]\en/ p`
		945	`.P2`
		946	`The latter command breaks the file into records,`
		947	`chooses Mr. Tic's record,`
		948	`extracts the phone number from the record,`
		949	`and finally prints the number.`
		950	`.PP`
		951	`A more involved problem is that of`
		952	`renaming a particular variable, say`
		953	`.CW n ,`
		954	`to`
		955	`.CW num`
		956	`in a C program.`
		957	`The obvious first attempt,`
		958	`.P1`
		959	`, x/n/ c/num/`
		960	`.P2`
		961	`is badly flawed: it changes not only the variable`
		962	`.CW n`
		963	`but any letter`
		964	`.CW n`
		965	`that appears.`
		966	`We need to extract all the variables, and select those that match`
		967	`.CW n`
		968	`and only`
		969	`.CW n :`
		970	`.P1`
		971	`, x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/`
		972	`.P2`
		973	`The pattern`
		974	`.CW [A-Za-z_][A-Za-z_0-9]*`
		975	`matches C identifiers.`
		976	`Next`
		977	`.CW g/n/`
		978	`selects those containing an`
		979	`.CW n .`
		980	`Then`
		981	`.CW v/../`
		982	`rejects those containing two (or more) characters, and finally`
		983	`.CW c/num/`
		984	`changes the remainder (identifiers`
		985	`.CW n )`
		986	`to`
		987	`.CW num .`
		988	`This version clearly works much better, but there may still be problems.`
		989	`For example, in C character and string constants, the sequence`
		990	`.CW \en`
		991	`is interpreted as a newline character, and we don't want to change it to`
		992	`.CW \enum.`
		993	`This problem can be forestalled with a`
		994	`.CW y`
		995	`command:`
		996	`.P1`
		997	`, y/\e\en/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/`
		998	`.P2`
		999	`(the second`
		1000	`.CW \e`
		1001	`is necessary because of lexical conventions in regular expressions),`
		1002	`or we could even reject character constants and strings outright:`
		1003	`.P1 0`
		1004	`,y/'[^']'/ y/"[^"]"/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/`
		1005	`.P2`
		1006	`The`
		1007	`.CW y`
		1008	`commands in this version exclude from consideration all character constants`
		1009	`and strings.`
		1010	`The only remaining problem is to deal with the possible occurrence of`
		1011	`.CW \e'`
		1012	`or`
		1013	`.CW \e"`
		1014	`within these sequences, but it's easy to see how to resolve this difficulty.`
		1015	`.PP`
		1016	`The point of these composed commands is successive refinement.`
		1017	`A simple version of the command is tried, and if it's not good enough,`
		1018	`it can be honed by adding a clause or two.`
		1019	`(Mistakes can be undone; see below.`
		1020	`Also, the mouse language makes it unnecessary to retype the command each time.)`
		1021	`The resulting chains of commands are somewhat reminiscent of`
		1022	`shell pipelines.\u\s-4\&7\s+4\d`
		1023	`Unlike pipelines, though, which pass along modified`
		1024	`.I data ,`
		1025	`.CW sam`
		1026	`commands pass a`
		1027	`.I view`
		1028	`of the data.`
		1029	`The text at each step of the command is the same, but which pieces`
		1030	`are selected is refined step by step until the correct piece is`
		1031	`available to the final step of the command line, which ultimately makes the change.`
		1032	`.PP`
		1033	`In other UNIX programs, regular expressions are used only for selection,`
		1034	`as in the`
		1035	`.CW sam`
		1036	`.CW g`
		1037	`command, never for extraction as in the`
		1038	`.CW x`
		1039	`or`
		1040	`.CW y`
		1041	`command.`
		1042	`For example, patterns in`
		1043	`.CW awk \u\s-4\&7\s+4\d`
		1044	`are used to select lines to be operated on, but cannot be used`
		1045	`to describe the format of the input text, or to handle newline-free text.`
		1046	`The use of regular expressions to describe the structure of a piece`
		1047	`of text rather than its contents, as in the`
		1048	`.CW x`
		1049	`command,`
		1050	`has been given a name:`
		1051	`.I`
		1052	`structural regular expressions.`
		1053	`.R`
		1054	`When they are composed, as in the above example,`
		1055	`they are pleasantly expressive.`
		1056	`Their use is discussed at greater length elsewhere.\u\s-4\&10\s+4\d`
		1057	`.PP`
		1058	`.SH 2`
		1059	`Multiple files`
		1060	`.LP`
		1061	`.CW Sam`
		1062	`has a few other commands, mostly relating to input and output.`
		1063	`.P1`
		1064	`e discfilename`
		1065	`.P2`
		1066	`replaces the contents and name of the current file with those of the named`
		1067	`disc file;`
		1068	`.P1`
		1069	`w discfilename`
		1070	`.P2`
		1071	`writes the contents to the named disc file; and`
		1072	`.P1`
		1073	`r discfilename`
		1074	`.P2`
		1075	`replaces dot with the contents of the named disc file.`
		1076	`All these commands use the current file's name if none is specified.`
		1077	`Finally,`
		1078	`.P1`
		1079	`f discfilename`
		1080	`.P2`
		1081	`changes the name associated with the file and displays the result:`
		1082	`.P1`
		1083	`\&'-. discfilename`
		1084	`.P2`
		1085	`This output is called the file's`
		1086	`.I`
		1087	`menu line,`
		1088	`.R`
		1089	`because it is the contents of the file's line in the button 3 menu (described`
		1090	`in the`
		1091	`next section).`
		1092	`The first three characters are a concise notation for the state of the file.`
		1093	`The apostrophe signifies that the file is modified.`
		1094	`The minus sign indicates the number of windows`
		1095	`open on the file (see the next section):`
		1096	`.CW -`
		1097	`means none,`
		1098	`.CW +`
		1099	`means one, and`
		1100	`.CW *`
		1101	`means more than one.`
		1102	`Finally, the period indicates that this is the current file.`
		1103	`These characters are useful for controlling the`
		1104	`.CW X`
		1105	`command, described shortly.`
		1106	`.PP`
		1107	`.CW Sam`
		1108	`may be started with a set of disc files (such as all the source for`
		1109	`a program) by invoking it with a list of file names as arguments, and`
		1110	`more may be added or deleted on demand.`
		1111	`.P1`
		1112	`B discfile1 discfile2 ...`
		1113	`.P2`
		1114	`adds the named files to`
		1115	`.CW sam 's`
		1116	`list, and`
		1117	`.P1`
		1118	`D discfile1 discfile2 ...`
		1119	`.P2`
		1120	`removes them from`
		1121	`.CW sam 's`
		1122	`memory (without effect on associated disc files).`
		1123	`Both these commands have a syntax for using the shell\u\s-4\&7\s+4\d`
		1124	`(the UNIX command interpreter) to generate the lists:`
		1125	`.P1`
		1126	`B <echo *.c`
		1127	`.P2`
		1128	`will add all C source files, and`
		1129	`.P1`
		1130	`B <grep -l variable *.c`
		1131	`.P2`
		1132	`will add all C source files referencing a particular variable`
		1133	`(the UNIX command`
		1134	`.CW grep\ -l`
		1135	`lists all files in its arguments that contain matches of`
		1136	`the specified regular expression).`
		1137	`Finally,`
		1138	`.CW D`
		1139	`without arguments deletes the current file.`
		1140	`.PP`
		1141	`There are two ways to change which file is current:`
		1142	`.P1`
		1143	`b filename`
		1144	`.P2`
		1145	`makes the named file current.`
		1146	`The`
		1147	`.CW B`
		1148	`command`
		1149	`does the same, but also adds any new files to`
		1150	`.CW sam 's`
		1151	`list.`
		1152	`(In practice, of course, the current file`
		1153	`is usually chosen by mouse actions, not by textual commands.)`
		1154	`The other way is to use a form of address that refers to files:`
		1155	`.P1`
		1156	`"\f2expression\fP" \f2address\fP`
		1157	`.P2`
		1158	`refers to the address evaluated in the file whose menu line`
		1159	`matches the expression (there must be exactly one match).`
		1160	`For example,`
		1161	`.P1`
		1162	`"peter.c" 3`
		1163	`.P2`
		1164	`refers to the third line of the file whose name matches`
		1165	`.CW peter.c .`
		1166	`This is most useful in the move`
		1167	`.CW m ) (`
		1168	`and copy`
		1169	`.CW t ) (`
		1170	`commands:`
		1171	`.P1`
		1172	`0,$ t "peter.c" 0`
		1173	`.P2`
		1174	`makes a copy of the current file at the beginning of`
		1175	`.CW peter.c .`
		1176	`.PP`
		1177	`The`
		1178	`.CW X`
		1179	`command`
		1180	`is a looping construct, like`
		1181	`.CW x ,`
		1182	`that refers to files instead of strings:`
		1183	`.P1`
		1184	`X/\f2expression\fP/ \f2command\fP`
		1185	`.P2`
		1186	`runs the command in all`
		1187	`files whose menu lines match the expression. The best example is`
		1188	`.P1`
		1189	`X/'/ w`
		1190	`.P2`
		1191	`which writes to disc all modified files.`
		1192	`.CW Y`
		1193	`is the complement of`
		1194	`.CW X :`
		1195	`it runs the command on all files whose menu lines don't match the expression:`
		1196	`.P1`
		1197	`Y/\e.c/ D`
		1198	`.P2`
		1199	`deletes all files that don't have`
		1200	`.CW \&.c`
		1201	`in their names, that is, it keeps all C source files and deletes the rest.`
		1202	`.PP`
		1203	`Braces allow commands to be grouped, so`
		1204	`.P1`
		1205	`{`
		1206	`\f2command1\fP`
		1207	`\f2command2\fP`
		1208	`}`
		1209	`.P2`
		1210	`is syntactically a single command that runs two commands.`
		1211	`Thus,`
		1212	`.P1`
		1213	`X/\e.c/ ,g/variable/ {`
		1214	`f`
		1215	`, x/.*\en/ g/variable/ p`
		1216	`}`
		1217	`.P2`
		1218	`finds all occurrences of`
		1219	`.CW variable`
		1220	`in C source files, and prints`
		1221	`out the file names and lines of each match.`
		1222	`The precise semantics of compound operations is discussed in the implementation`
		1223	`sections below.`
		1224	`.PP`
		1225	`Finally,`
		1226	`the undo command,`
		1227	`.CW u ,`
		1228	`undoes the last command,`
		1229	`no matter how many files were affected.`
		1230	`Multiple undo operations move further back in time, so`
		1231	`.P1`
		1232	`u`
		1233	`u`
		1234	`.P2`
		1235	`(which may be abbreviated`
		1236	`.CW u2 )`
		1237	`undoes the last two commands. An undo may not be undone, however, nor`
		1238	`may any command that adds or deletes files.`
		1239	`Everything else is undoable, though, including for example`
		1240	`.CW e`
		1241	`commands:`
		1242	`.P1`
		1243	`e filename`
		1244	`u`
		1245	`.P2`
		1246	`restores the state of the file completely, including its name, dot,`
		1247	`and modified bit. Because of the undo, potentially dangerous commands`
		1248	`are not guarded by confirmations. Only`
		1249	`.CW D ,`
		1250	`which destroys the information necessary to restore itself, is protected.`
		1251	`It will not delete a modified file, but a second`
		1252	`.CW D`
		1253	`of the same file will succeed regardless.`
		1254	`The`
		1255	`.CW q`
		1256	`command, which exits`
		1257	`.CW sam ,`
		1258	`is similarly guarded.`
		1259	`.SH 2`
		1260	`Mouse Interface`
		1261	`.LP`
		1262	`.CW Sam`
		1263	`is most commonly run`
		1264	`connected to a bitmap display and mouse for interactive editing.`
		1265	`The only difference in the command language`
		1266	`between regular, mouse-driven`
		1267	`.CW sam`
		1268	`and`
		1269	`.CW sam\ -d`
		1270	`is that if an address`
		1271	`is provided without a command,`
		1272	`.CW sam\ -d`
		1273	`will print the text referenced by the address, but`
		1274	`regular`
		1275	`.CW sam`
		1276	`will highlight it on the screen \(em in fact,`
		1277	`dot is always highlighted (see Figure 2).`
		1278	`.WS 1`
		1279	`.KF`
		1280	`.XP fig3 2.04i`
		1281	`.Cs`
		1282	`Figure 2. A`
		1283	`.CW sam`
		1284	`window. The scroll bar down the left`
		1285	`represents the file, with the bubble showing the fraction`
		1286	`visible in the window.`
		1287	`The scroll bar may be manipulated by the mouse for convenient browsing.`
		1288	`The current text,`
		1289	`which is highlighted, need not fit on a line. Here it consists of one partial`
		1290	`line, one complete line, and final partial line.`
		1291	`.Ce`
		1292	`.KE`
		1293	`.PP`
		1294	`Each file may have zero or more windows open on the display.`
		1295	`At any time, only one window in all of`
		1296	`.CW sam`
		1297	`is the`
		1298	`.I`
		1299	`current window,`
		1300	`.R`
		1301	`that is, the window to which typing and mouse actions refer;`
		1302	`this may be the`
		1303	`.CW sam`
		1304	`window (that in which commands may be typed)`
		1305	`or one of the file windows.`
		1306	`When a file has multiple windows, the image of the file in each window`
		1307	`is always kept up to date.`
		1308	`The current file is the last file affected by a command,`
		1309	`so if the`
		1310	`.CW sam`
		1311	`window is current,`
		1312	`the current window is not a window on the current file.`
		1313	`However, each window on a file has its own value of dot,`
		1314	`and when switching between windows on a single file,`
		1315	`the file's value of dot is changed to that of the window.`
		1316	`Thus, flipping between windows behaves in the obvious, convenient way.`
		1317	`.PP`
		1318	`The mouse on the Blit has three buttons, numbered left to right.`
		1319	`Button 3 has a list of commands to manipulate windows,`
		1320	followed by a list of `menu lines' exactly as printed by the
		1321	`.CW f`
		1322	`command, one per file (not one per window).`
		1323	`These menu lines are sorted by file name.`
		1324	`If the list is long, the Blit menu software will make it more manageable`
		1325	`by generating a scrolling menu instead of an unwieldy long list.`
		1326	`Using the menu to select a file from the list makes that file the current`
		1327	`file, and the most recently current window in that file the current window.`
		1328	`But if that file is already current, selecting it in the menu cycles through`
		1329	`the windows on the file; this simple trick avoids a special menu to`
		1330	`choose windows on a file.`
		1331	`If there is no window open on the file,`
		1332	`.CW sam`
		1333	`changes the mouse cursor to prompt the user to create one.`
		1334	`.PP`
		1335	`The commands on the button 3 menu are straightforward (see Figure 3), and`
		1336	`are like the commands to manipulate windows in`
		1337	`.CW mux ,\u\s-4\&8\s+4\d`
		1338	`the Blit's window system.`
		1339	`.CW New`
		1340	`makes a new file, and gives it one empty window, whose size is determined`
		1341	`by a rectangle swept by the mouse.`
		1342	`.CW Zerox`
		1343	`prompts for a window to be selected, and`
		1344	`makes a clone of that window; this is how multiple windows are created on one file.`
		1345	`.CW Reshape`
		1346	`changes the size of the indicated window, and`
		1347	`.CW close`
		1348	`deletes it. If that is the last window open on the file,`
		1349	`.CW close`
		1350	`first does a`
		1351	`.CW D`
		1352	`command on the file.`
		1353	`.CW Write`
		1354	`is identical to a`
		1355	`.CW w`
		1356	`command on the file; it is in the menu purely for convenience.`
		1357	`Finally,`
		1358	`.CW ~~sam~~`
		1359	`is a menu item that appears between the commands and the file names.`
		1360	`Selecting it makes the`
		1361	`.CW sam`
		1362	`window the current window,`
		1363	`causing subsequent typing to be interpreted as commands.`
		1364	`.KF`
		1365	`.XP fig2 2.74i`
		1366	`.Cs`
		1367	`Figure 3. The menu on button 3.`
		1368	`The black rectangle on the left is a scroll bar; the menu is limited to`
		1369	`the length shown to prevent its becoming unwieldy.`
		1370	`Above the`
		1371	`.CW ~~sam~~`
		1372	`line is a list of commands;`
		1373	`beneath it is a list of files, presented exactly as with the`
		1374	`.CW f`
		1375	`command.`
		1376	`.Ce`
		1377	`.KE`
		1378	`.PP`
		1379	`When`
		1380	`.CW sam`
		1381	`requests that a window be swept, in response to`
		1382	`.CW new ,`
		1383	`.CW zerox`
		1384	`or`
		1385	`.CW reshape ,`
		1386	`it changes the mouse cursor from the usual arrow to a box with`
		1387	`a small arrow.`
		1388	`In this state, the mouse may be used to indicate an arbitrary rectangle by`
		1389	`pressing button 3 at one corner and releasing it at the opposite corner.`
		1390	`More conveniently,`
		1391	`button 3 may simply be clicked,`
		1392	`whereupon`
		1393	`.CW sam`
		1394	`creates the maximal rectangle that contains the cursor`
		1395	`and abuts the`
		1396	`.CW sam`
		1397	`window.`
		1398	`By placing the`
		1399	`.CW sam`
		1400	`window in the middle of the screen, the user can define two regions (one above,`
		1401	`one below) in which stacked fully-overlapping`
		1402	`windows can be created with minimal fuss (see Figure 1).`
		1403	`This simple user interface trick makes window creation noticeably easier.`
		1404	`.PP`
		1405	`The cut-and-paste editor is essentially the same as that in Smalltalk-80.\u\s-4\&11\s+4\d`
		1406	`The text in dot is always highlighted on the screen.`
		1407	`When a character is typed it replaces dot, and sets dot to the null`
		1408	`string after the character. Thus, ordinary typing inserts text.`
		1409	`Button 1 is used for selection:`
		1410	`pressing the button, moving the mouse, and lifting the button`
		1411	`selects (sets dot to) the text between the points where the`
		1412	`button was pressed and released.`
		1413	`Pressing and releasing at the same point selects a null string; this`
		1414	`is called clicking. Clicking twice quickly, or`
		1415	`.I`
		1416	`double clicking,`
		1417	`.R`
		1418	`selects larger objects;`
		1419	`for example, double clicking in a word selects the word,`
		1420	`double clicking just inside an opening bracket selects the text`
		1421	`contained in the brackets (handling nested brackets correctly),`
		1422	`and similarly for`
		1423	`parentheses, quotes, and so on.`
		1424	`The double-clicking rules reflect a bias toward`
		1425	`programmers.`
		1426	`If`
		1427	`.CW sam`
		1428	`were intended more for word processing, double-clicks would probably`
		1429	`select linguistic structures such as sentences.`
		1430	`.PP`
		1431	`If button 1 is pressed outside the current window, it makes the indicated`
		1432	`window current.`
		1433	`This is the easiest way to switch between windows and files.`
		1434	`.PP`
		1435	`Pressing button 2 brings up a menu of editing functions (see Figure 4).`
		1436	`These mostly apply to the selected text:`
		1437	`.CW cut`
		1438	`deletes the selected text, and remembers it in a hidden buffer called the`
		1439	`.I`
		1440	`snarf buffer,`
		1441	`.R`
		1442	`.CW paste`
		1443	`replaces the selected text by the contents of the snarf buffer,`
		1444	`.CW snarf`
		1445	`just copies the selected text to the snarf buffer,`
		1446	`.CW look`
		1447	`searches forward for the next literal occurrence of the selected text, and`
		1448	`.CW <mux>`
		1449	`exchanges snarf buffers with the window system in which`
		1450	`.CW sam`
		1451	`is running.`
		1452	`Finally, the last regular expression used appears as a menu entry`
		1453	`to search`
		1454	`forward for the next occurrence of a match for the expression.`
		1455	`.WS 1`
		1456	`.KF`
		1457	`.XP fig4 1.20i`
		1458	`.Cs`
		1459	`Figure 4. The menu on button 2.`
		1460	`The bottom entry tracks the most recently used regular expression, which may`
		1461	`be literal text.`
		1462	`.Ce`
		1463	`.KE`
		1464	`.PP`
		1465	`The relationship between the command language and the mouse language is`
		1466	`entirely due to the equality of dot and the selected text chosen`
		1467	`with button 1 on the mouse.`
		1468	`For example, to make a set of changes in a C subroutine, dot can be`
		1469	`set by double clicking on the left brace that begins the subroutine,`
		1470	`which sets dot for the command language.`
		1471	`An address-free command then typed in the`
		1472	`.CW sam`
		1473	`window will apply only to the text between the opening and closing`
		1474	`braces of the function.`
		1475	`The idea is to select what you want, and then say what you want`
		1476	`to do with it, whether invoked by a menu selection or by a typed command.`
		1477	`And of course, the value of dot is highlighted on`
		1478	`the display after the command completes.`
		1479	`This relationship between mouse interface and command language`
		1480	`is clumsy to explain, but comfortable, even natural, in practice.`
		1481	`.SH`
		1482	`The Implementation`
		1483	`.LP`
		1484	`The next few sections describe how`
		1485	`.CW sam`
		1486	`is put together, first the host part,`
		1487	`then the inter-component communication,`
		1488	`then the terminal part.`
		1489	`After explaining how the command language is implemented,`
		1490	`the discussion follows (roughly) the path of a character`
		1491	`from the temporary file on disc to the screen.`
		1492	`The presentation centers on the data structures,`
		1493	`because that is how the program was designed and because`
		1494	`the algorithms are easy to provide, given the right data`
		1495	`structures.`
		1496	`.SH 2`
		1497	`Parsing and execution`
		1498	`.LP`
		1499	`The command language is interpreted by parsing each command with a`
		1500	`table-driven recursive`
		1501	`descent parser, and when a complete command is assembled, invoking a top-down`
		1502	`executor.`
		1503	`Most editors instead employ a simple character-at-a-time`
		1504	`lexical scanner.`
		1505	`Use of a parser makes it`
		1506	`easy and unambiguous to detect when a command is complete,`
		1507	`which has two advantages.`
		1508	`First, escape conventions such as backslashes to quote`
		1509	`multiple-line commands are unnecessary; if the command isn't finished,`
		1510	`the parser keeps reading. For example, a multiple-line append driven by an`
		1511	`.CW x`
		1512	`command is straightforward:`
		1513	`.P1`
		1514	`x/.*\en/ g/Peter/ a`
		1515	`one line about Peter`
		1516	`another line about Peter`
		1517	`\&.`
		1518	`.P2`
		1519	`Other UNIX editors would require a backslash after all but the last line.`
		1520	`.PP`
		1521	`The other advantage is specific to the two-process structure of`
		1522	`.CW sam .`
		1523	`The host process must decide when a command is completed so the`
		1524	`command interpreter can be called. This problem is easily resolved`
		1525	`by having the lexical analyzer read the single stream of events from the`
		1526	`terminal, directly executing all typing and mouse commands,`
		1527	`but passing to the parser characters typed to the`
		1528	`.CW sam`
		1529	`command window.`
		1530	`This scheme is slightly complicated by the availability of cut-and-paste`
		1531	`editing in the`
		1532	`.CW sam`
		1533	`window, but that difficulty is resolved by applying the rules`
		1534	`used in`
		1535	`.CW mux :`
		1536	`when a newline is typed to the`
		1537	`.CW sam`
		1538	`window, all text between the newline and the previously typed newline`
		1539	`is made available to the parser.`
		1540	`This permits arbitrary editing to be done to a command before`
		1541	`typing newline and thereby requesting execution.`
		1542	`.PP`
		1543	`The parser is driven by a table because the syntax of addresses`
		1544	`and commands is regular enough`
		1545	`to be encoded compactly. There are few special cases, such as the`
		1546	`replacement text in a substitution, so the syntax of almost all commands`
		1547	`can be encoded with a few flags.`
		1548	`These include whether the command allows an address (for example,`
		1549	`.CW e`
		1550	`does not), whether it takes a regular expression (as in`
		1551	`.CW x`
		1552	`and`
		1553	`.CW s ),`
		1554	`whether it takes replacement text (as in`
		1555	`.CW c`
		1556	`or`
		1557	`.CW i ),`
		1558	`which may be multi-line, and so on.`
		1559	`The internal syntax of regular expressions is handled by a separate`
		1560	`parser; a regular expression is a leaf of the command parse tree.`
		1561	`Regular expressions are discussed fully in the next section.`
		1562	`.PP`
		1563	`The parser table also has information about defaults, so the interpreter`
		1564	`is always called with a complete tree. For example, the parser fills in`
		1565	`the implicit`
		1566	`.CW 0`
		1567	`and`
		1568	`.CW $`
		1569	`in the abbreviated address`
		1570	`.CW ,`
		1571	`(comma),`
		1572	`inserts a`
		1573	`.CW +`
		1574	`to the left of an unadorned regular expression in an address,`
		1575	`and provides the usual default address`
		1576	`.CW .`
		1577	`(dot) for commands that expect an address but are not given one.`
		1578	`.PP`
		1579	`Once a complete command is parsed, the evaluation is easy.`
		1580	`The address is evaluated left-to-right starting from the value of dot,`
		1581	`with a mostly ordinary expression evaluator.`
		1582	`Addresses, like many of the data structures in`
		1583	`.CW sam ,`
		1584	`are held in a C structure and passed around by value:`
		1585	`.P1`
		1586	`typedef long Posn; /* Position in a file */`
		1587	`typedef struct Range{`
		1588	`Posn p1, p2;`
		1589	`}Range;`
		1590	`typedef struct Address{`
		1591	`Range r;`
		1592	`File *f;`
		1593	`}Address;`
		1594	`.P2`
		1595	`An address is encoded as a substring (character positions`
		1596	`.CW p1`
		1597	`to`
		1598	`.CW p2 )`
		1599	`in a file`
		1600	`.CW f .`
		1601	`(The data type`
		1602	`.CW File`
		1603	`is described in detail below.)`
		1604	`.PP`
		1605	`The address interpreter is an`
		1606	`.CW Address -valued`
		1607	`function that traverses the parse tree describing an address (the`
		1608	`parse tree for the address has type`
		1609	`.CW Addrtree ):`
		1610	`.P1`
		1611	`Address`
		1612	`address(ap, a, sign)`
		1613	`Addrtree *ap;`
		1614	`Address a;`
		1615	`int sign;`
		1616	`{`
		1617	`Address a2;`
		1618	`do`
		1619	`switch(ap->type){`
		1620	`case '.':`
		1621	`a=a.f->dot;`
		1622	`break;`
		1623	`case '$':`
		1624	`a.r.p1=a.r.p2=a.f->nbytes;`
		1625	`break;`
		1626	`case '"':`
		1627	`a=matchfile(a, ap->aregexp)->dot;`
		1628	`break;`
		1629	`case ',':`
		1630	`a2=address(ap->right, a, 0);`
		1631	`a=address(ap->left, a, 0);`
		1632	`if(a.f!=a2.f \|\| a2.r.p2<a.r.p1)`
		1633	`error(Eorder);`
		1634	`a.r.p2=a2.r.p2;`
		1635	`return a;`
		1636	`/* and so on */`
		1637	`}`
		1638	`while((ap=ap->right)!=0);`
		1639	`return a;`
		1640	`}`
		1641	`.P2`
		1642	`.PP`
		1643	`Throughout, errors are handled by a non-local`
		1644	`.CW goto`
		1645	`(a`
		1646	`.CW setjmp/longjmp`
		1647	`in C terminology)`
		1648	`hidden in a routine called`
		1649	`.CW error`
		1650	`that immediately aborts the execution, retracts any`
		1651	partially made changes (see the section below on `undoing'), and
		1652	`returns to the top level of the parser.`
		1653	`The argument to`
		1654	`.CW error`
		1655	`is an enumeration type that`
		1656	`is translated to a terse but possibly helpful`
		1657	message such as `?addresses out of order.'
		1658	`Very common messages are kept short; for example the message for`
		1659	a failed regular expression search is `?search.'
		1660	`.PP`
		1661	`Character addresses such as`
		1662	`.CW #3`
		1663	`are trivial to implement, as the`
		1664	`.CW File`
		1665	`data structure is accessible by character number.`
		1666	`However,`
		1667	`.CW sam`
		1668	`keeps no information about the position of newlines \(em it is too`
		1669	`expensive to track dynamically \(em so line addresses are computed by reading`
		1670	`the file, counting newlines. Except in very large files, this has proven`
		1671	`acceptable: file access is fast enough to make the technique practical,`
		1672	`and lines are not central to the structure of the command language.`
		1673	`.PP`
		1674	`The command interpreter, called`
		1675	`.CW cmdexec ,`
		1676	`is also straightforward. The parse table includes a`
		1677	`function to call to interpret a particular command. That function`
		1678	`receives as arguments`
		1679	`the calculated address`
		1680	`for the command`
		1681	`and the command tree (of type`
		1682	`.CW Cmdtree ),`
		1683	`which may contain information such as the subtree for compound commands.`
		1684	`Here, for example, is the function for the`
		1685	`.CW g`
		1686	`and`
		1687	`.CW v`
		1688	`commands:`
		1689	`.P1`
		1690	`int`
		1691	`g_cmd(a, cp)`
		1692	`Address a;`
		1693	`Cmdtree *cp;`
		1694	`{`
		1695	`compile(cp->regexp);`
		1696	`if(execute(a.f, a.r.p1, a.r.p2)!=(cp->cmdchar=='v')){`
		1697	`a.f->dot=a;`
		1698	`return cmdexec(a, cp->subcmd);`
		1699	`}`
		1700	`return TRUE; /* cause execution to continue */`
		1701	`}`
		1702	`.P2`
		1703	`.CW Compile "" (`
		1704	`and`
		1705	`.CW execute`
		1706	`are part of the regular expression code, described in the next section.)`
		1707	`Because the parser and the`
		1708	`.CW File`
		1709	`data structure do most of the work, most commands`
		1710	`are similarly brief.`
		1711	`.SH 2`
		1712	`Regular expressions`
		1713	`.LP`
		1714	`The regular expression code in`
		1715	`.CW sam`
		1716	`is an interpreted, rather than compiled on-the-fly, implementation of Thompson's`
		1717	`non-deterministic finite automaton algorithm.\u\s-4\&12\s+4\d`
		1718	`The syntax and semantics of the expressions are as in the UNIX program`
		1719	`.CW egrep ,`
		1720	`including alternation, closures, character classes, and so on.`
		1721	`The only changes in the notation are two additions:`
		1722	`.CW \en`
		1723	`is translated to, and matches, a newline character, and`
		1724	`.CW @`
		1725	`matches any character. In`
		1726	`.CW egrep ,`
		1727	`the character`
		1728	`.CW \&.`
		1729	`matches any character except newline, and in`
		1730	`.CW sam`
		1731	`the same rule seemed safest, to prevent idioms like`
		1732	`.CW \&.*`
		1733	`from spanning newlines.`
		1734	`.CW Egrep`
		1735	`expressions are arguably too complicated for an interactive editor \(em`
		1736	`certainly it would make sense if all the special characters were two-character`
		1737	`sequences, so that most of the punctuation characters wouldn't have`
		1738	`peculiar meanings \(em but for an interesting command language, full`
		1739	`regular expressions are necessary, and`
		1740	`.CW egrep`
		1741	`defines the full regular expression syntax for UNIX programs.`
		1742	`Also, it seemed superfluous to define a new syntax, since various UNIX programs`
		1743	`.CW ed , (`
		1744	`.CW egrep`
		1745	`and`
		1746	`.CW vi )`
		1747	`define too many already.`
		1748	`.PP`
		1749	`The expressions are compiled by a routine,`
		1750	`.CW compile ,`
		1751	`that generates the description of the non-deterministic finite state machine.`
		1752	`A second routine,`
		1753	`.CW execute ,`
		1754	`interprets the machine to generate the leftmost-longest match of the`
		1755	`expression in a substring of the file.`
		1756	`The algorithm is described elsewhere.\u\s-4\&12,13\s+4\d`
		1757	`.CW Execute`
		1758	`reports`
		1759	`whether a match was found, and sets a global variable,`
		1760	`of type`
		1761	`.CW Range ,`
		1762	`to the substring matched.`
		1763	`.PP`
		1764	`A trick is required to evaluate the expression in reverse, such as when`
		1765	`searching backwards for an expression.`
		1766	`For example,`
		1767	`.P1`
		1768	`-/P.*r/`
		1769	`.P2`
		1770	`looks backwards through the file for a match of the expression.`
		1771	`The expression, however, is defined for a forward search.`
		1772	`The solution is to construct a machine identical to the machine`
		1773	`for a forward search except for a reversal of all the concatenation`
		1774	`operators (the other operators are symmetric under direction reversal),`
		1775	`to exchange the meaning of the operators`
		1776	`.CW ^`
		1777	`and`
		1778	`.CW $ ,`
		1779	`and then to read the file backwards, looking for the`
		1780	`usual earliest longest match.`
		1781	`.PP`
		1782	`.CW Execute`
		1783	`generates only one match each time it is called.`
		1784	`To interpret looping constructs such as the`
		1785	`.CW x`
		1786	`command,`
		1787	`.CW sam`
		1788	`must therefore synchronize between`
		1789	`calls of`
		1790	`.CW execute`
		1791	`to avoid`
		1792	`problems with null matches.`
		1793	`For example, even given the leftmost-longest rule,`
		1794	`the expression`
		1795	`.CW a*`
		1796	`matches three times in the string`
		1797	`.CW ab`
		1798	`(the character`
		1799	`.CW a ,`
		1800	`the null string between the`
		1801	`.CW a`
		1802	`and`
		1803	`.CW b ,`
		1804	`and the final null string).`
		1805	`After returning a match for the`
		1806	`.CW a ,`
		1807	`.CW sam`
		1808	`must not match the null string before the`
		1809	`.CW b .`
		1810	`The algorithm starts`
		1811	`.CW execute`
		1812	`at the end of its previous match, and`
		1813	`if the match it returns`
		1814	`is null and abuts the previous match, rejects the match and advances`
		1815	`the initial position one character.`
		1816	`.SH 2`
		1817	`Memory allocation`
		1818	`.LP`
		1819	`The C language has no memory allocation primitives, although a standard`
		1820	`library routine,`
		1821	`.CW malloc ,`
		1822	`provides adequate service for simple programs.`
		1823	`For specific uses, however,`
		1824	`it can be better to write a custom allocator.`
		1825	`The allocator (or rather, pair of allocators) described here`
		1826	`work in both the terminal and host parts of`
		1827	`.CW sam .`
		1828	`They are designed for efficient manipulation of strings,`
		1829	`which are allocated and freed frequently and vary in length from essentially`
		1830	`zero to 32 Kbytes (very large strings are written to disc).`
		1831	`More important, strings may be large and change size often,`
		1832	`so to minimize memory usage it is helpful to reclaim and to coalesce the`
		1833	`unused portions of strings when they are truncated.`
		1834	`.PP`
		1835	`Objects to be allocated in`
		1836	`.CW sam`
		1837	`are of two flavors:`
		1838	`the first is C`
		1839	`.CW structs ,`
		1840	`which are small and often addressed by pointer variables;`
		1841	`the second is variable-sized arrays of characters`
		1842	`or integers whose`
		1843	`base pointer is always used to access them.`
		1844	`The memory allocator in`
		1845	`.CW sam`
		1846	`is therefore in two parts:`
		1847	`first, a traditional first-fit allocator that provides fixed storage for`
		1848	`.CW structs ;`
		1849	`and second, a garbage-compacting allocator that reduces storage`
		1850	`overhead for variable-sized objects, at the cost of some bookkeeping.`
		1851	`The two types of objects are allocated from adjoining arenas, with`
		1852	`the garbage-compacting allocator controlling the arena with higher addresses.`
		1853	`Separating into two arenas simplifies compaction and prevents fragmentation due`
		1854	`to immovable objects.`
		1855	`The access rules for garbage-compactable objects`
		1856	`(discussed in the next paragraph) allow them to be relocated, so when`
		1857	`the first-fit arena needs space, it moves the garbage-compacted arena`
		1858	`to higher addresses to make room. Storage is therefore created only`
		1859	`at successively higher addresses, either when more garbage-compacted`
		1860	`space is needed or when the first-fit arena pushes up the other arena.`
		1861	`.PP`
		1862	`Objects that may be compacted declare to the`
		1863	`allocator a cell that is guaranteed to be the sole repository of the`
		1864	`address of the object whenever a compaction can occur.`
		1865	`The compactor can then update the address when the object is moved.`
		1866	`For example, the implementation of type`
		1867	`.CW List`
		1868	`(really a variable-length array)`
		1869	`is:`
		1870	`.P1`
		1871	`typedef struct List{`
		1872	`int nused;`
		1873	`long *ptr;`
		1874	`}List;`
		1875	`.P2`
		1876	`The`
		1877	`.CW ptr`
		1878	`cell must always be used directly, and never copied. When a`
		1879	`.CW List`
		1880	`is to be created the`
		1881	`.CW List`
		1882	`structure is allocated in the ordinary first-fit arena`
		1883	`and its`
		1884	`.CW ptr`
		1885	`is allocated in the garbage-compacted arena.`
		1886	`A similar data type for strings, called`
		1887	`.CW String ,`
		1888	`stores variable-length character arrays of up to 32767 elements.`
		1889	`.PP`
		1890	`A related matter of programming style:`
		1891	`.CW sam`
		1892	`frequently passes structures by value, which`
		1893	`simplifies the code.`
		1894	`Traditionally, C programs have`
		1895	`passed structures by reference, but implicit allocation on`
		1896	`the stack is easier to use.`
		1897	`Structure passing is a relatively new feature of C`
		1898	`(it is not in the`
		1899	`standard reference manual for C\u\s-4\&14\s+4\d), and is poorly supported in most`
		1900	`commercial C compilers.`
		1901	`It's convenient and expressive, though,`
		1902	`and simplifies memory management by`
		1903	`avoiding the allocator altogether`
		1904	`and eliminating pointer aliases.`
		1905	`.SH 2`
		1906	`Data structures for manipulating files`
		1907	`.LP`
		1908	`Experience with`
		1909	`.CW jim`
		1910	`showed that the requirements`
		1911	`of the file data structure were few, but strict.`
		1912	`First, files need to be read and written quickly;`
		1913	`adding a fresh file must be painless.`
		1914	`Second, the implementation must place no arbitrary upper limit on`
		1915	`the number or sizes of files. (It should be practical to edit many files,`
		1916	`and files up to megabytes in length should be handled gracefully.)`
		1917	`This implies that files be stored on disc, not in main memory.`
		1918	`(Aficionados of virtual memory may argue otherwise, but the`
		1919	`implementation of virtual`
		1920	`memory in our system is not something to depend on`
		1921	`for good performance.)`
		1922	`Third, changes to files need be made by only two primitives:`
		1923	`deletion and insertion.`
		1924	`These are inverses of each other,`
		1925	`which simplifies the implementation of the undo operation.`
		1926	`Finally,`
		1927	`it must be easy and efficient to access the file, either`
		1928	`forwards or backwards, a byte at a time.`
		1929	`.PP`
		1930	`The`
		1931	`.CW File`
		1932	`data type is constructed from three simpler data structures that hold arrays`
		1933	`of characters.`
		1934	`Each of these types has an insertion and deletion operator, and the`
		1935	`insertion and deletion operators of the`
		1936	`.CW File`
		1937	`type itself are constructed from them.`
		1938	`.PP`
		1939	`The simplest type is the`
		1940	`.CW String ,`
		1941	`which is used to hold strings in main memory.`
		1942	`The code that manages`
		1943	`.CW Strings`
		1944	`guarantees that they will never be longer`
		1945	`than some moderate size, and in practice they are rarely larger than 8 Kbytes.`
		1946	`.CW Strings`
		1947	`have two purposes: they hold short strings like file names with little overhead,`
		1948	`and because they are deliberately small, they are efficient to modify.`
		1949	`They are therefore used as the data structure for in-memory caches.`
		1950	`.PP`
		1951	`The disc copy of the file is managed by a data structure called a`
		1952	`.CW Disc ,`
		1953	`which corresponds to a temporary file. A`
		1954	`.CW Disc`
		1955	`has no storage in main memory other than bookkeeping information;`
		1956	`the actual data being held is all on the disc.`
		1957	`To reduce the number of open files needed,`
		1958	`.CW sam`
		1959	`opens a dozen temporary UNIX files and multiplexes the`
		1960	`.CW Discs`
		1961	`upon them.`
		1962	`This permits many files to`
		1963	`be edited; the entire`
		1964	`.CW sam`
		1965	`source (48 files) may be edited comfortably with a single`
		1966	`instance of`
		1967	`.CW sam .`
		1968	`Allocating one temporary file per`
		1969	`.CW Disc`
		1970	`would strain the operating system's limit on the number of open files.`
		1971	`Also, spreading the traffic among temporary files keeps the files shorter,`
		1972	`and shorter files are more efficiently implemented by the UNIX`
		1973	`I/O subsystem.`
		1974	`.PP`
		1975	`A`
		1976	`.CW Disc`
		1977	`is an array of fixed-length blocks, each of which contains`
		1978	`between 1 and 4096 characters of active data.`
		1979	`(The block size of our UNIX file system is 4096 bytes.)`
		1980	`The block addresses within the temporary file and the length of each`
		1981	`block are stored in a`
		1982	`.CW List .`
		1983	`When changes are made the live part of blocks may change size.`
		1984	`Blocks are created and coalesced when necessary to try to keep the sizes`
		1985	`between 2048 and 4096 bytes.`
		1986	`An actively changing part of the`
		1987	`.CW Disc`
		1988	`therefore typically has about a kilobyte of slop that can be`
		1989	`inserted or deleted`
		1990	`without changing more than one block or affecting the block order.`
		1991	`When an insertion would overflow a block, the block is split, a new one`
		1992	`is allocated to receive the overflow, and the memory-resident list of blocks`
		1993	`is rearranged to reflect the insertion of the new block.`
		1994	`.PP`
		1995	`Obviously, going to the disc for every modification to the file is`
		1996	`prohibitively expensive.`
		1997	`The data type`
		1998	`.CW Buffer`
		1999	`consists of a`
		2000	`.CW Disc`
		2001	`to hold the data and a`
		2002	`.CW String`
		2003	`that acts as a cache.`
		2004	`This is the first of a series of caches throughout the data structures in`
		2005	`.CW sam.`
		2006	`The caches not only improve performance, they provide a way to organize`
		2007	`the flow of data, particularly in the communication between the host`
		2008	`and terminal.`
		2009	`This idea is developed below, in the section on communications.`
		2010	`.PP`
		2011	`To reduce disc traffic, changes to a`
		2012	`.CW Buffer`
		2013	`are mediated by a variable-length string, in memory, that acts as a cache.`
		2014	`When an insertion or deletion is made to a`
		2015	`.CW Buffer ,`
		2016	`if the change can be accommodated by the cache, it is done there.`
		2017	`If the cache becomes bigger than a block because of an insertion,`
		2018	`some of it is written to the`
		2019	`.CW Disc`
		2020	`and deleted from the cache.`
		2021	`If the change does not intersect the cache, the cache is flushed.`
		2022	`The cache is only loaded at the new position if the change is smaller than a block;`
		2023	`otherwise, it is sent directly to the`
		2024	`.CW Disc .`
		2025	`This is because`
		2026	`large changes are typically sequential,`
		2027	`whereupon the next change is unlikely to overlap the current one.`
		2028	`.PP`
		2029	`A`
		2030	`.CW File`
		2031	`comprises a`
		2032	`.CW String`
		2033	`to hold the file name and some ancillary data such as dot and the modified bit.`
		2034	`The most important components, though, are a pair of`
		2035	`.CW Buffers ,`
		2036	`one called the transcript and the other the contents.`
		2037	`Their use is described in the next section.`
		2038	`.PP`
		2039	`The overall structure is shown in Figure 5.`
		2040	`Although it may seem that the data is touched many times on its`
		2041	`way from the`
		2042	`.CW Disc ,`
		2043	`it is read (by one UNIX system call) directly into the cache of the`
		2044	`associated`
		2045	`.CW Buffer ;`
		2046	`no extra copy is done.`
		2047	`Similarly, when flushing the cache, the text is written`
		2048	`directly from the cache to disc.`
		2049	`Most operations act directly on the text in the cache.`
		2050	`A principle applied throughout`
		2051	`.CW sam`
		2052	`is that the fewer times the data is copied, the faster the program will run`
		2053	`(see also the paper by Waite\u\s-4\&15\s+4\d).`
		2054	`.KF`
		2055	`.PS`
		2056	`copy "fig5.pic"`
		2057	`.PE`
		2058	`.Cs`
		2059	`Figure 5. File data structures.`
		2060	`The temporary files are stored in the standard repository for such files`
		2061	`on the host system.`
		2062	`.Ce`
		2063	`.KE`
		2064	`.PP`
		2065	`The contents of a`
		2066	`.CW File`
		2067	`are accessed by a routine that`
		2068	`copies to a buffer a substring of a file starting at a specified offset.`
		2069	`To read a byte at a time, a`
		2070	`.CW File "" per-`
		2071	`array is loaded starting from a specified initial position,`
		2072	`and bytes may then be read from the array.`
		2073	`The implementation is done by a macro similar to the C standard I/O`
		2074	`.CW getc`
		2075	`macro.\u\s-4\&14\s+4\d`
		2076	`Because the reading may be done at any address, a minor change to the`
		2077	`macro allows the file to be read backwards.`
		2078	`This array is read-only; there is no`
		2079	`.CW putc .`
		2080	`.SH 2`
		2081	`Doing and undoing`
		2082	`.LP`
		2083	`.CW Sam`
		2084	`has an unusual method for managing changes to files.`
		2085	`The command language makes it easy to specify multiple variable-length changes`
		2086	`to a file millions of bytes long, and such changes`
		2087	`must be made efficiently if the editor is to be practical.`
		2088	`The usual techniques for inserting and deleting strings`
		2089	`are inadequate under these conditions.`
		2090	`The`
		2091	`.CW Buffer`
		2092	`and`
		2093	`.CW Disc`
		2094	`data structures are designed for efficient random access to long strings,`
		2095	`but care must be taken to avoid super-linear behavior when making`
		2096	`many changes simultaneously.`
		2097	`.PP`
		2098	`.CW Sam`
		2099	`uses a two-pass algorithm for making changes, and treats each file as a database`
		2100	`against which transactions are registered.`
		2101	`Changes are not made directly to the contents.`
		2102	Instead, when a command is started, a `mark' containing
		2103	`a sequence number is placed in the transcript`
		2104	`.CW Buffer ,`
		2105	`and each change made to the file, either an insertion or deletion`
		2106	`or a change to the file name,`
		2107	`is appended to the end of the transcript.`
		2108	`When the command is complete, the transcript is rewound to the`
		2109	`mark and applied to the contents.`
		2110	`.PP`
		2111	`One reason for separating evaluation from`
		2112	`application in this way is to simplify tracking the addresses of changes`
		2113	`made in the middle of a long sequence.`
		2114	`The two-pass algorithm also allows all changes to apply to the`
		2115	`.I original`
		2116	`data: no change can affect another change made in the same command.`
		2117	`This is particularly important when evaluating an`
		2118	`.CW x`
		2119	`command because it prevents regular expression matches`
		2120	`from stumbling over changes made earlier in the execution.`
		2121	`Also, the two-pass`
		2122	`algorithm is cleaner than the way other UNIX editors allow changes to`
		2123	`affect each other;`
		2124	`for example,`
		2125	`.CW ed 's`
		2126	`idioms to do things like delete every other line`
		2127	`depend critically on the implementation.`
		2128	`Instead,`
		2129	`.CW sam 's`
		2130	`simple model, in which all changes in a command occur effectively`
		2131	`simultaneously, is easy to explain and to understand.`
		2132	`.PP`
		2133	The records in the transcript are of the form ``delete substring from
		2134	`locations`
		2135	123 to 456'' and ``insert 11 characters `hello there' at location 789.''
		2136	`(It is an error if the changes are not at monotonically greater`
		2137	`positions through the file.)`
		2138	`While the update is occurring, these numbers must be`
		2139	`offset by earlier changes, but that is straightforward and`
		2140	`local to the update routine;`
		2141	`moreover, all the numbers have been computed`
		2142	`before the first is examined.`
		2143	`.PP`
		2144	`Treating the file as a transaction system has another advantage:`
		2145	`undo is trivial.`
		2146	`All it takes is to invert the transcript after it has been`
		2147	`implemented, converting insertions`
		2148	`into deletions and vice versa, and saving them in a holding`
		2149	`.CW Buffer .`
		2150	The `do' transcript can then be deleted from
		2151	`the transcript`
		2152	`.CW Buffer`
		2153	and replaced by the `undo' transcript.
		2154	`If an undo is requested, the transcript is rewound and the undo transcript`
		2155	`executed.`
		2156	`Because the transcript`
		2157	`.CW Buffer`
		2158	`is not truncated after each command, it accumulates`
		2159	`successive changes.`
		2160	`A sequence of undo commands`
		2161	`can therefore back up the file arbitrarily,`
		2162	`which is more helpful than the more commonly implemented self-inverse form of undo.`
		2163	`.CW Sam "" (`
		2164	`provides no way to undo an undo, but if it were desired,`
		2165	it would be easy to provide by re-interpreting the `do' transcript.)
		2166	`Each mark in the transcript contains a sequence number and the offset into`
		2167	`the transcript of the previous mark, to aid in unwinding the transcript.`
		2168	`Marks also contain the value of dot and the modified bit so these can be`
		2169	`restored easily.`
		2170	`Undoing multiple files is easy; it merely demands undoing all files whose`
		2171	`latest change has the same sequence number as the current file.`
		2172	`.PP`
		2173	`Another benefit of having a transcript is that errors encountered in the middle`
		2174	`of a complicated command need not leave the files in an intermediate state.`
		2175	`By rewinding the transcript to the mark beginning the command,`
		2176	`the partial command can be trivially undone.`
		2177	`.PP`
		2178	`When the update algorithm was first implemented, it was unacceptably slow,`
		2179	`so a cache was added to coalesce nearby changes,`
		2180	`replacing multiple small changes by a single larger one.`
		2181	`This reduced the number`
		2182	`of insertions into the transaction`
		2183	`.CW Buffer ,`
		2184	`and made a dramatic improvement in performance,`
		2185	`but made it impossible`
		2186	`to handle changes in non-monotonic order in the file; the caching method`
		2187	`only works if changes don't overlap.`
		2188	`Before the cache was added, the transaction could in principle be sorted`
		2189	`if the changes were out of order, although`
		2190	`this was never done.`
		2191	`The current status is therefore acceptable performance with a minor`
		2192	`restriction on global changes, which is sometimes, but rarely, an annoyance.`
		2193	`.PP`
		2194	`The update algorithm obviously paws the data more than simpler`
		2195	`algorithms, but it is not prohibitively expensive;`
		2196	`the caches help.`
		2197	`(The principle of avoiding copying the data is still honored here,`
		2198	`although not as piously:`
		2199	`the data is moved from contents' cache to`
		2200	`the transcript's all at once and through only one internal buffer.)`
		2201	`Performance figures confirm the efficiency.`
		2202	`To read from a dead start a hundred kilobyte file on a VAX-11/750`
		2203	`takes 1.4 seconds of user time, 2.5 seconds of system time,`
		2204	`and 5 seconds of real time.`
		2205	`Reading the same file in`
		2206	`.CW ed`
		2207	`takes 6.0 seconds of user time, 1.7 seconds of system time,`
		2208	`and 8 seconds of real time.`
		2209	`.CW Sam`
		2210	`uses about half the CPU time.`
		2211	`A more interesting example is the one stated above:`
		2212	`inserting a character between every pair of characters in the file.`
		2213	`The`
		2214	`.CW sam`
		2215	`command is`
		2216	`.P1`
		2217	`,y/@/ a/x/`
		2218	`.P2`
		2219	`and takes 3 CPU seconds per kilobyte of input file, of which`
		2220	`about a third is spent in the regular expression code.`
		2221	`This translates to about 500 changes per second.`
		2222	`.CW Ed`
		2223	`takes 1.5 seconds per kilobyte to make a similar change (ignoring newlines),`
		2224	`but cannot undo it.`
		2225	`The same example in`
		2226	`.CW ex ,\u\s-4\&9\s+4\d`
		2227	`a variant of`
		2228	`.CW ed`
		2229	`done at the University of California at Berkeley,`
		2230	`which allows one level of undoing, again takes 3 seconds.`
		2231	`In summary,`
		2232	`.CW sam 's`
		2233	`performance is comparable to that of other UNIX editors, although it solves`
		2234	`a harder problem.`
		2235	`.SH 2`
		2236	`Communications`
		2237	`.LP`
		2238	`The discussion so far has described the implementation of the host part of`
		2239	`.CW sam ;`
		2240	`the next few sections explain how a machine with mouse and bitmap display`
		2241	`can be engaged to improve interaction.`
		2242	`.CW Sam`
		2243	`is not the first editor to be written as two processes,\u\s-4\&16\s+4\d`
		2244	`but its implementation`
		2245	`has some unusual aspects.`
		2246	`.PP`
		2247	`There are several ways`
		2248	`.CW sam 's`
		2249	`host and terminal parts may be connected.`
		2250	`The first and simplest is to forgo the terminal part and use the host`
		2251	`part's command language to edit text on an ordinary terminal.`
		2252	`This mode is invoked by starting`
		2253	`.CW sam`
		2254	`with the`
		2255	`.CW -d`
		2256	`option.`
		2257	`With no options,`
		2258	`.CW sam`
		2259	`runs separate host and terminal programs,`
		2260	`communicating with a message protocol over the physical`
		2261	`connection that joins them.`
		2262	`Typically, the connection is an RS-232 link between a Blit`
		2263	`(the prototypical display for`
		2264	`.CW sam )`
		2265	`and a host running`
		2266	`the Ninth Edition of the UNIX operating system.\u\s-4\&8\s+4\d`
		2267	`(This is the version of the system used in the Computing Sciences Research`
		2268	`Center at AT&T Bell Laboratories [now Lucent Technologies, Bell Labs], where I work. Its relevant`
		2269	`aspects are discussed in the Blit paper.\u\s-4\&1\s+4\d)`
		2270	`The implementation of`
		2271	`.CW sam`
		2272	`for the SUN computer runs both processes on the same machine and`
		2273	`connects them by a pipe.`
		2274	`.PP`
		2275	`The low bandwidth of an RS-232 link`
		2276	`necessitated the split between`
		2277	`the two programs.`
		2278	`The division is a mixed blessing:`
		2279	`a program in two parts is much harder to write and to debug`
		2280	`than a self-contained one,`
		2281	`but the split makes several unusual configurations possible.`
		2282	`The terminal may be physically separated from the host, allowing the conveniences`
		2283	`of a mouse and bitmap display to be taken home while leaving the files at work.`
		2284	`It is also possible to run the host part on a remote machine:`
		2285	`.P1`
		2286	`sam -r host`
		2287	`.P2`
		2288	`connects to the terminal in the usual way, and then makes a call`
		2289	`across the network to establish the host part of`
		2290	`.CW sam`
		2291	`on the named machine.`
		2292	`Finally, it cross-connects the I/O to join the two parts.`
		2293	`This allows`
		2294	`.CW sam`
		2295	`to be run on machines that do not support bitmap displays;`
		2296	`for example,`
		2297	`.CW sam`
		2298	`is the editor of choice on our Cray X-MP/24.`
		2299	`.CW Sam`
		2300	`.CW -r`
		2301	`involves`
		2302	`.I three`
		2303	`machines: the remote host, the terminal, and the local host.`
		2304	`The local host's job is simple but vital: it passes the data`
		2305	`between the remote host and terminal.`
		2306	`.PP`
		2307	`The host and terminal exchange messages asynchronously`
		2308	`(rather than, say, as remote procedure calls) but there is no`
		2309	`error detection or correction`
		2310	`because, whatever the configuration, the connection is reliable.`
		2311	`Because the terminal handles mundane interaction tasks such as`
		2312	`popping up menus and interpreting the responses, the messages are about`
		2313	`data, not actions.`
		2314	`For example, the host knows nothing about what is displayed on the screen,`
		2315	`and when the user types a character, the message sent to the host says`
		2316	``insert a one-byte string at location 123 in file 7,'' not ``a character
		2317	`was typed at the current position in the current file.''`
		2318	`In other words, the messages look very much like the transaction records`
		2319	`in the transcripts.`
		2320	`.PP`
		2321	`Either the host or terminal part of`
		2322	`.CW sam`
		2323	`may initiate a change to a file.`
		2324	`The command language operates on the host, while typing and some`
		2325	`mouse operations are executed directly in the terminal to optimize response.`
		2326	`Changes initiated by the host program must be transmitted to the terminal,`
		2327	`and`
		2328	`vice versa.`
		2329	`(A token is exchanged to determine which end is in control,`
		2330	`which means that characters typed while a time-consuming command runs`
		2331	`must be buffered and do not appear until the command is complete.)`
		2332	`To maintain consistent information,`
		2333	`the host and terminal track changes through a per-file`
		2334	`data structure that records what portions of the file`
		2335	`the terminal has received.`
		2336	`The data structure, called a`
		2337	`.CW Rasp`
		2338	`(a weak pun: it's a file with holes)`
		2339	`is held and updated by both the host and terminal.`
		2340	`A`
		2341	`.CW Rasp`
		2342	`is a list of`
		2343	`.CW Strings`
		2344	`holding those parts of the file known to the terminal,`
		2345	`separated by counts of the number of bytes in the interstices.`
		2346	`Of course, the host doesn't keep a separate copy of the data (it only needs`
		2347	`the lengths of the various pieces),`
		2348	`but the structure is the same on both ends.`
		2349	`.PP`
		2350	`The`
		2351	`.CW Rasp`
		2352	`in the terminal doubles as a cache.`
		2353	`Since the terminal keeps the text for portions of the file it has displayed,`
		2354	`it need not request data from the host when revisiting old parts of the file`
		2355	`or redrawing obscured windows, which speeds things up considerably`
		2356	`over low-speed links.`
		2357	`.PP`
		2358	`It's trivial for the terminal to maintain its`
		2359	`.CW Rasp ,`
		2360	`because all changes made on the terminal apply to parts of the file`
		2361	`already loaded there.`
		2362	`Changes made by the host are compared against the`
		2363	`.CW Rasp`
		2364	`during the update sequence after each command.`
		2365	`Small changes to pieces of the file loaded in the terminal`
		2366	`are sent in their entirety.`
		2367	`Larger changes, and changes that fall entirely in the holes,`
		2368	`are transmitted as messages without literal data:`
		2369	`only the lengths of the deleted and inserted strings are transmitted.`
		2370	`When a command is completed, the terminal examines its visible`
		2371	`windows to see if any holes in their`
		2372	`.CW Rasps`
		2373	`intersect the visible portion of the file.`
		2374	`It then requests the missing data from the host,`
		2375	`along with up to 512 bytes of surrounding data, to minimize`
		2376	`the number of messages when visiting a new portion of the file.`
		2377	`This technique provides a kind of two-level lazy evaluation for the terminal.`
		2378	`The first level sends a minimum of information about`
		2379	`parts of the file not being edited interactively;`
		2380	`the second level waits until a change is displayed before`
		2381	`transmitting the new data.`
		2382	`Of course,`
		2383	`performance is also helped by having the terminal respond immediately to typing`
		2384	`and simple mouse requests.`
		2385	`Except for small changes to active pieces of the file, which are`
		2386	`transmitted to the terminal without negotiation,`
		2387	`the terminal is wholly responsible for deciding what is displayed;`
		2388	`the host uses the`
		2389	`.CW Rasp`
		2390	`only to tell the terminal what might be relevant.`
		2391	`.PP`
		2392	`When a change is initiated by the host,`
		2393	`the messages to the terminal describing the change`
		2394	`are generated by the routine that applies the transcript of the changes`
		2395	`to the contents of the`
		2396	`.CW File .`
		2397	`Since changes are undone by the same update routine,`
		2398	`undoing requires`
		2399	`no extra code in the communications;`
		2400	`the usual messages describing changes to the file are sufficient`
		2401	`to back up the screen image.`
		2402	`.PP`
		2403	`The`
		2404	`.CW Rasp`
		2405	`is a particularly good example of the way caches are used in`
		2406	`.CW sam .`
		2407	`First, it facilitates access to the active portion of the text by placing`
		2408	`the busy text in main memory.`
		2409	`In so doing, it provides efficient access`
		2410	`to a large data structure that does not fit in memory.`
		2411	`Since the form of data is to be imposed by the user, not by the program,`
		2412	`and because characters will frequently be scanned sequentially,`
		2413	`files are stored as flat objects.`
		2414	`Caches help keep performance good and linear when working with such`
		2415	`data.`
		2416	`.PP`
		2417	`Second, the`
		2418	`.CW Rasp`
		2419	`and several of the other caches have some`
		2420	`.I read-ahead;`
		2421	`that is, the cache is loaded with more information than is needed for`
		2422	`the job immediately at hand.`
		2423	`When manipulating linear structures, the accesses are usually sequential,`
		2424	`and read-ahead can significantly reduce the average time to access the`
		2425	`next element of the object.`
		2426	`Sequential access is a common mode for people as well as programs;`
		2427	`consider scrolling through a document while looking for something.`
		2428	`.PP`
		2429	`Finally, like any good data structure,`
		2430	`the cache guides the algorithm, or at least the implementation.`
		2431	`The`
		2432	`.CW Rasp`
		2433	`was actually invented to control the communications between the host and`
		2434	`terminal parts, but I realized very early that it was also a form of`
		2435	`cache. Other caches were more explicitly intended to serve a double`
		2436	`purpose: for example, the caches in`
		2437	`.CW Files`
		2438	`that coalesce updates not only reduce traffic to the`
		2439	`transcript and contents`
		2440	`.CW Buffers ,`
		2441	`they also clump screen updates so that complicated changes to the`
		2442	`screen are achieved in`
		2443	`just a few messages to the terminal.`
		2444	`This saved me considerable work: I did not need to write special`
		2445	`code to optimize the message traffic to the`
		2446	`terminal.`
		2447	`Caches pay off in surprising ways.`
		2448	`Also, they tend to be independent, so their performance improvements`
		2449	`are multiplicative.`
		2450	`.SH 2`
		2451	`Data structures in the terminal`
		2452	`.LP`
		2453	`The terminal's job is to display and to maintain a consistent image of`
		2454	`pieces of the files being edited.`
		2455	`Because the text is always in memory, the data structures are`
		2456	`considerably simpler than those in the host part.`
		2457	`.PP`
		2458	`.CW Sam`
		2459	`typically has far more windows than does`
		2460	`.CW mux ,`
		2461	`the window system within which its Blit implementation runs.`
		2462	`.CW Mux`
		2463	`has a fairly small number of asynchronously updated windows;`
		2464	`.CW sam`
		2465	`needs a large number of synchronously updated windows that are`
		2466	`usually static and often fully obscured.`
		2467	`The different tradeoffs guided`
		2468	`.CW sam`
		2469	`away from the memory-intensive implementation of windows, called`
		2470	`.CW Layers ,\u\s-4\&17\s+4\d`
		2471	`used in`
		2472	`.CW mux.`
		2473	`Rather than depending on a complete bitmap image of the display for each window,`
		2474	`.CW sam`
		2475	`regenerates the image from its in-memory text`
		2476	`(stored in the`
		2477	`.CW Rasp )`
		2478	`when necessary, although it will use such an image if it is available.`
		2479	`Like`
		2480	`.CW Layers ,`
		2481	`though,`
		2482	`.CW sam`
		2483	`uses the screen bitmap as active storage in which to update the image using`
		2484	`.CW bitblt .\u\s-4\&18,19\s+4\d`
		2485	`The resulting organization, pictured in Figure 6,`
		2486	`has a global array of windows, called`
		2487	`.CW Flayers ,`
		2488	`each of which holds an image of a piece of text held in a data structure`
		2489	`called a`
		2490	`.CW Frame ,`
		2491	`which in turn represents`
		2492	`a rectangular window full of text displayed in some`
		2493	`.CW Bitmap .`
		2494	`Each`
		2495	`.CW Flayer`
		2496	`appears in a global list that orders them all front-to-back`
		2497	`on the display, and simultaneously as an element of a per-file array`
		2498	`that holds all the open windows for that file.`
		2499	`The complement in the terminal of the`
		2500	`.CW File`
		2501	`on the host is called a`
		2502	`.CW Text ;`
		2503	`each connects its`
		2504	`.CW Flayers`
		2505	`to the associated`
		2506	`.CW Rasp .`
		2507	`.KF`
		2508	`.PS`
		2509	`copy "fig6.pic"`
		2510	`.PE`
		2511	`.Cs`
		2512	`Figure 6. Data structures in the terminal.`
		2513	`.CW Flayers`
		2514	`are also linked together into a front-to-back list.`
		2515	`.CW Boxes`
		2516	`are discussed in the next section.`
		2517	`.Ce`
		2518	`.KE`
		2519	`.PP`
		2520	`The`
		2521	`.CW Bitmap`
		2522	`for a`
		2523	`.CW Frame`
		2524	`contains the image of the text.`
		2525	`For a fully visible window, the`
		2526	`.CW Bitmap`
		2527	`will be the screen (or at least the`
		2528	`.CW Layer`
		2529	`in which`
		2530	`.CW sam`
		2531	`is being run),`
		2532	`while for partially obscured windows the`
		2533	`.CW Bitmap`
		2534	`will be off-screen.`
		2535	`If the window is fully obscured, the`
		2536	`.CW Bitmap`
		2537	`will be null.`
		2538	`.PP`
		2539	`The`
		2540	`.CW Bitmap`
		2541	`is a kind of cache.`
		2542	`When making changes to the display, most of the original image will`
		2543	`look the same in the final image, and the update algorithms exploit this.`
		2544	`The`
		2545	`.CW Frame`
		2546	`software updates the image in the`
		2547	`.CW Bitmap`
		2548	`incrementally; the`
		2549	`.CW Bitmap`
		2550	`is not just an image, it is a data structure.\u\s-4\&18,19\s+4\d`
		2551	`The job of the software that updates the display is therefore`
		2552	`to use as much as possible of the existing image (converting the`
		2553	`text from ASCII characters to pixels is expensive) in a sort of two-dimensional`
		2554	`string insertion algorithm.`
		2555	`The details of this process are described in the next section.`
		2556	`.PP`
		2557	`The`
		2558	`.CW Frame`
		2559	`software has no code to support overlapping windows;`
		2560	`its job is to keep a single`
		2561	`.CW Bitmap`
		2562	`up to date.`
		2563	`It falls to the`
		2564	`.CW Flayer`
		2565	`software to multiplex the various`
		2566	`.CW Bitmaps`
		2567	`onto the screen.`
		2568	`The problem of maintaining overlapping`
		2569	`.CW Flayers`
		2570	`is easier than for`
		2571	`.CW Layers \u\s-4\&17\s+4\d`
		2572	`because changes are made synchronously and because the contents of the window`
		2573	`can be reconstructed from the data stored in the`
		2574	`.CW Frame ;`
		2575	`the`
		2576	`.CW Layers`
		2577	`software`
		2578	`makes no such assumptions.`
		2579	`In`
		2580	`.CW sam ,`
		2581	`the window being changed is almost always fully visible, because the current`
		2582	`window is always fully visible, by construction.`
		2583	`However, when multi-file changes are being made, or when`
		2584	`more than one window is open on a file,`
		2585	`it may be necessary to update partially obscured windows.`
		2586	`.PP`
		2587	`There are three cases: the window is`
		2588	`fully visible, invisible (fully obscured), or partially visible.`
		2589	`If fully visible, the`
		2590	`.CW Bitmap`
		2591	`is part of the screen, so when the`
		2592	`.CW Flayer`
		2593	`update routine calls the`
		2594	`.CW Frame`
		2595	`update routine, the screen will be updated directly.`
		2596	`If the window is invisible,`
		2597	`there is no associated`
		2598	`.CW Bitmap ,`
		2599	`and all that is necessary is to update the`
		2600	`.CW Frame`
		2601	`data structure, not the image.`
		2602	`If the window is partially visible, the`
		2603	`.CW Frame`
		2604	`routine is called to update the image in the off-screen`
		2605	`.CW Bitmap ,`
		2606	`which may require regenerating it from the text of the window.`
		2607	`The`
		2608	`.CW Flayer`
		2609	`code then clips this`
		2610	`.CW Bitmap`
		2611	`against the`
		2612	`.CW Bitmaps`
		2613	`of all`
		2614	`.CW Frames`
		2615	`in front of the`
		2616	`.CW Frame`
		2617	`being modified, and the remainder is copied to the display.`
		2618	`.PP`
		2619	`This is much faster than recreating the image off-screen`
		2620	`for every change, or clipping all the changes made to the image`
		2621	`during its update.`
		2622	`Unfortunately, these caches can also consume prohibitive amounts of`
		2623	`memory, so they are freed fairly liberally \(em after every change to the`
		2624	`front-to-back order of the`
		2625	`.CW Flayers .`
		2626	`The result is that`
		2627	`the off-screen`
		2628	`.CW Bitmaps`
		2629	`exist only while multi-window changes are occurring,`
		2630	`which is the only time the performance improvement they provide is needed.`
		2631	`Also, the user interface causes fully-obscured windows to be the`
		2632	`easiest to make \(em`
		2633	`creating a canonically sized and placed window requires only a button click`
		2634	`\(em which reduces the need for caching still further.`
		2635	`.PP`
		2636	`.SH 2`
		2637	`Screen update`
		2638	`.LP`
		2639	`Only two low-level primitives are needed for incremental update:`
		2640	`.CW bitblt ,`
		2641	`which copies rectangles of pixels, and`
		2642	`.CW string`
		2643	`(which in turn calls`
		2644	`.CW bitblt ),`
		2645	`which draws a null-terminated character string in a`
		2646	`.CW Bitmap .`
		2647	`A`
		2648	`.CW Frame`
		2649	`contains a list of`
		2650	`.CW Boxes ,`
		2651	`each of which defines a horizontal strip of text in the window`
		2652	`(see Figure 7).`
		2653	`A`
		2654	`.CW Box`
		2655	`has a character string`
		2656	`.CW str ,`
		2657	`and a`
		2658	`.CW Rectangle`
		2659	`.CW rect`
		2660	`that defines the location of the strip in the window.`
		2661	`(The text in`
		2662	`.CW str`
		2663	`is stored in the`
		2664	`.CW Box`
		2665	`separately from the`
		2666	`.CW Rasp`
		2667	`associated with the window's file, so`
		2668	`.CW Boxes`
		2669	`are self-contained.)`
		2670	`The invariant is that`
		2671	`the image of the`
		2672	`.CW Box`
		2673	`can be reproduced by calling`
		2674	`.CW string`
		2675	`with argument`
		2676	`.CW str`
		2677	`to draw the string in`
		2678	`.CW rect ,`
		2679	`and the resulting picture fits perfectly within`
		2680	`.CW rect .`
		2681	`In other words, the`
		2682	`.CW Boxes`
		2683	`define the tiling of the window.`
		2684	`The tiling may be complicated by long lines of text, which`
		2685	`are folded onto the next line.`
		2686	`Some editors use horizontal scrolling to avoid this complication,`
		2687	`but to be comfortable this technique requires that lines not be`
		2688	`.I too`
		2689	`long;`
		2690	`.CW sam`
		2691	`has no such restriction.`
		2692	`Also, and perhaps more importantly, UNIX programs and terminals traditionally fold`
		2693	`long lines to make their contents fully visible.`
		2694	`.PP`
		2695	`Two special kinds of`
		2696	`.CW Boxes`
		2697	`contain a single`
		2698	`character: either a newline or a tab.`
		2699	`Newlines and tabs are white space.`
		2700	`A newline`
		2701	`.CW Box`
		2702	`always extends to the right edge of the window,`
		2703	`forcing the following`
		2704	`.CW Box`
		2705	`to the next line.`
		2706	`The width of a tab depends on where it is located:`
		2707	`it forces the next`
		2708	`.CW Box`
		2709	`to begin at a tab location.`
		2710	`Tabs also`
		2711	`have a minimum width equivalent to a blank (blanks are`
		2712	`drawn by`
		2713	`.CW string`
		2714	`and are not treated specially); newlines have a minimum width of zero.`
		2715	`.KF`
		2716	`.PS`
		2717	`copy "fig7.pic"`
		2718	`.PE`
		2719	`.sp .5`
		2720	`.Cs`
		2721	`Figure 7. A line of text showing its`
		2722	`.CW Boxes .`
		2723	`The first two blank`
		2724	`.CW Boxes`
		2725	`contain tabs; the last contains a newline.`
		2726	`Spaces are handled as ordinary characters.`
		2727	`.Ce`
		2728	`.KE`
		2729	`.PP`
		2730	`The update algorithms always use the`
		2731	`.CW Bitmap`
		2732	`image of the text (either the display or cache`
		2733	`.CW Bitmap );`
		2734	`they never examine the characters within a`
		2735	`.CW Box`
		2736	`except when the`
		2737	`.CW Box`
		2738	`needs to be split in two.`
		2739	`Before a change, the window consists of a tiling of`
		2740	`.CW Boxes ;`
		2741	`after the change the window is tiled differently.`
		2742	`The update algorithms rearrange the tiles in place, without`
		2743	`backup storage.`
		2744	`The algorithms are not strictly optimal \(em for example, they can`
		2745	`clear a pixel that is later going to be written upon \(em`
		2746	`but they never move a tile that doesn't need to be moved,`
		2747	`and they move each tile at most once.`
		2748	`.CW Frinsert`
		2749	`on a Blit can absorb over a thousand characters a second if the strings`
		2750	`being inserted are a few tens of characters long.`
		2751	`.PP`
		2752	`Consider`
		2753	`.CW frdelete .`
		2754	`Its job is to delete a substring from a`
		2755	`.CW Frame`
		2756	`and restore the image of the`
		2757	`.CW Frame .`
		2758	`The image of a substring has a peculiar shape (see Figure 2) comprising`
		2759	`possibly a partial line,`
		2760	`zero or more full lines,`
		2761	`and possibly a final partial line.`
		2762	`For reference, call this the`
		2763	`.I`
		2764	`Z-shape.`
		2765	`.R`
		2766	`.CW Frdelete`
		2767	`begins by splitting, if necessary, the`
		2768	`.CW Boxes`
		2769	`containing the ends of`
		2770	`the substring so the substring begins and ends on`
		2771	`.CW Box`
		2772	`boundaries.`
		2773	`Because the substring is being deleted, its image is not needed,`
		2774	`so the Z-shape is then cleared.`
		2775	`Then, tiles (that is, the images of`
		2776	`.CW Boxes )`
		2777	`are copied, using`
		2778	`.CW bitblt ,`
		2779	`from immediately after the Z-shape to`
		2780	`the beginning of the Z-shape,`
		2781	`resulting in a new Z-shape.`
		2782	`.CW Boxes "" (`
		2783	`whose contents would span two lines in the new position must first be split.)`
		2784	`.PP`
		2785	`Copying the remainder of the`
		2786	`.CW Frame`
		2787	`tile by tile`
		2788	`this way will clearly accomplish the deletion but eventually,`
		2789	`typically when the copying algorithm encounters a tab or newline,`
		2790	`the old and new`
		2791	`.CW x`
		2792	`coordinates of the tile`
		2793	`to be copied are the same.`
		2794	`This correspondence implies`
		2795	`that the Z-shape has its beginning and ending edges aligned`
		2796	`vertically, and a sequence of at most two`
		2797	`.CW bitblts`
		2798	`can be used to copy the remaining tiles.`
		2799	`The last step is to clear out the resulting empty space at the bottom`
		2800	`of the window;`
		2801	`the number of lines to be cleared is the number of complete lines in the`
		2802	`Z-shape closed by the final`
		2803	`.CW bitblts.`
		2804	`The final step is to merge horizontally adjacent`
		2805	`.CW Boxes`
		2806	`of plain text.`
		2807	`The complete source to`
		2808	`.CW frdelete`
		2809	`is less than 100 lines of C.`
		2810	`.PP`
		2811	`.CW frinsert`
		2812	`is more complicated because it must do four passes:`
		2813	`one to construct the`
		2814	`.CW Box`
		2815	`list for the inserted string,`
		2816	`one to reconnoitre,`
		2817	`one to copy (in opposite order to`
		2818	`.CW frdelete )`
		2819	`the`
		2820	`.CW Boxes`
		2821	`to make the hole for the new text,`
		2822	`and finally one to copy the new text into place.`
		2823	`Overall, though,`
		2824	`.CW frinsert`
		2825	`has a similar flavor to`
		2826	`.CW frdelete ,`
		2827	`and needn't be described further.`
		2828	`.CW Frinsert`
		2829	`and its subsidiary routines comprise 211 lines of C.`
		2830	`.PP`
		2831	`The terminal source code is 3024 lines of C,`
		2832	`and the host source is 5797 lines.`
		2833	`.SH`
		2834	`Discussion`
		2835	`.SH 2`
		2836	`History`
		2837	`.LP`
		2838	`The immediate ancestor of`
		2839	`.CW sam`
		2840	`was the original text editor for the Blit, called`
		2841	`.CW jim .`
		2842	`.CW Sam`
		2843	`inherited`
		2844	`.CW jim 's`
		2845	`two-process structure and mouse language almost unchanged, but`
		2846	`.CW jim`
		2847	`suffered from several drawbacks that were addressed in the design of`
		2848	`.CW sam .`
		2849	`The most important of these was the lack of a command language.`
		2850	`Although`
		2851	`.CW jim`
		2852	`was easy to use for simple editing, it provided no direct help with`
		2853	`large or repetitive editing tasks. Instead, it provided a command to pass`
		2854	`selected text through a shell pipeline,`
		2855	`but this was no more satisfactory than could be expected of a stopgap measure.`
		2856	`.PP`
		2857	`.CW Jim`
		2858	`was written primarily as a vehicle for experimenting with a mouse-based`
		2859	`interface to text, and the experiment was successful.`
		2860	`.CW Jim`
		2861	`had some spin-offs:`
		2862	`.CW mux ,`
		2863	`the second window system for the Blit, is essentially a multiplexed`
		2864	`version of the terminal part of`
		2865	`.CW jim ;`
		2866	`and the debugger`
		2867	`.CW pi 's`
		2868	`user interface\u\s-4\&20\s+4\d was closely modeled on`
		2869	`.CW jim 's.`
		2870	`But after a couple of years,`
		2871	`.CW jim`
		2872	`had become difficult to maintain and limiting to use,`
		2873	`and its replacement was overdue.`
		2874	`.PP`
		2875	`I began the design of`
		2876	`.CW sam`
		2877	`by asking`
		2878	`.CW jim`
		2879	`customers what they wanted.`
		2880	`This was probably a mistake; the answers were essentially a list of features`
		2881	`to be found in other editors, which did not provide any of the`
		2882	`guiding principles I was seeking.`
		2883	For instance, one common request was for a ``global substitute,''
		2884	`but no one suggested how to provide it within a cut-and-paste editor.`
		2885	`I was looking for a scheme that would`
		2886	`support such specialized features comfortably in the context of some`
		2887	`general command language.`
		2888	`Ideas were not forthcoming, though, particularly given my insistence`
		2889	`on removing all limits on file sizes, line lengths and so on.`
		2890	`Even worse, I recognized that, since the mouse could easily`
		2891	`indicate a region of the screen that was not an integral number of lines,`
		2892	`the command language would best forget about newlines altogether,`
		2893	`and that meant the command language had to treat the file as a single`
		2894	`string, not an array of lines.`
		2895	`.PP`
		2896	`Eventually, I decided that thinking was not getting me very far and it was`
		2897	`time to try building.`
		2898	`I knew that the terminal part could be built easily \(em`
		2899	`that part of`
		2900	`.CW jim`
		2901	`behaved acceptably well \(em and that most of the hard work was going`
		2902	`to be in the host part: the file interface, command interpreter and so on.`
		2903	`Moreover, I had some ideas about how the architecture of`
		2904	`.CW jim`
		2905	`could be improved without destroying its basic structure, which I liked`
		2906	`in principle but which hadn't worked out as well as I had hoped.`
		2907	`So I began by designing the file data structure,`
		2908	`starting with the way`
		2909	`.CW jim`
		2910	`worked \(em comparable to a single structure merging`
		2911	`.CW Disc`
		2912	`and`
		2913	`.CW Buffer ,`
		2914	`which I split to make the cache more general`
		2915	`\(em and thinking about how global substitute could be implemented.`
		2916	`The answer was clearly that it had to be done in two passes,`
		2917	`and the transcript-oriented implementation fell out naturally.`
		2918	`.PP`
		2919	`.CW Sam`
		2920	`was written bottom-up,`
		2921	`starting from the data structures and algorithms for manipulating text,`
		2922	`through the command language and up to the code for maintaining`
		2923	`the display.`
		2924	`In retrospect, it turned out well, but this implementation method is`
		2925	`not recommended in general.`
		2926	`There were several times when I had a large body of interesting code`
		2927	`assembled and no clue how to proceed with it.`
		2928	`The command language, in particular, took almost a year to figure out,`
		2929	`but can be implemented (given what was there at the beginning of that year)`
		2930	`in a day or two. Similarly, inventing the`
		2931	`.CW Rasp`
		2932	`data structure delayed the`
		2933	`connection of the host and terminal pieces by another few months.`
		2934	`.CW Sam`
		2935	`took about two years to write, although only about four months were`
		2936	`spent actually working on it.`
		2937	`.PP`
		2938	`Part of the design process was unusual:`
		2939	`the subset of the protocol that maintains the`
		2940	`.CW Rasp`
		2941	`was simulated, debugged`
		2942	`and verified by an automatic protocol analyzer,\u\s-4\&21\s+4\d and was bug-free`
		2943	`from the start.`
		2944	`The rest of the protocol, concerned mostly`
		2945	`with keeping menus up to date,`
		2946	`was unfortunately too unwieldy for such analysis,`
		2947	`and was debugged by more traditional methods, primarily`
		2948	`by logging in a file all messages in and out of the host.`
		2949	`.SH 2`
		2950	`Reflections`
		2951	`.LP`
		2952	`.CW Sam`
		2953	`is essentially the only interactive editor used by the sixty or so members of`
		2954	`the computing science research center in which I work.`
		2955	`The same could not be said of`
		2956	`.CW jim ;`
		2957	`the lack of a command language kept some people from adopting it.`
		2958	`The union of a user interface as comfortable as`
		2959	`.CW jim 's`
		2960	`with a command language as powerful as`
		2961	`.CW ed 's†`
		2962	`.FS`
		2963	`.vs 9`
		2964	`†The people who criticize`
		2965	`.CW ed`
		2966	`as an interactive program often forget that it and its close relative`
		2967	`.CW sed \u\s-4\&7\s+4\d`
		2968	`still thrive as programmable editors. The strength of these programs is`
		2969	`independent of their convenience for interactive editing.`
		2970	`.br`
		2971	`.vs`
		2972	`.FE`
		2973	`is essential to`
		2974	`.CW sam 's`
		2975	`success.`
		2976	`When`
		2977	`.CW sam`
		2978	`was first made available to the`
		2979	`.CW jim`
		2980	`community,`
		2981	`almost everyone switched to it within two or three days.`
		2982	`In the months that followed, even people who had never adopted`
		2983	`.CW jim`
		2984	`started using`
		2985	`.CW sam`
		2986	`exclusively.`
		2987	`.PP`
		2988	`To be honest,`
		2989	`.CW ed`
		2990	`still gets occasional use, but usually when`
		2991	`something quick needs to be done and the overhead of`
		2992	`downloading the terminal part of`
		2993	`.CW sam`
		2994	`isn't worth the trouble.`
		2995	Also, as a `line' editor,
		2996	`.CW sam`
		2997	`.CW -d`
		2998	`is a bit odd;`
		2999	`when using a good old ASCII terminal, it's comforting to have`
		3000	`a true line editor.`
		3001	`But it is fair to say that`
		3002	`.CW sam 's`
		3003	`command language has displaced`
		3004	`.CW ed 's`
		3005	`for most of the complicated editing that has kept line editors`
		3006	`(that is, command-driven editors) with us.`
		3007	`.PP`
		3008	`.CW Sam 's`
		3009	`command language is even fancier than`
		3010	`.CW ed 's,`
		3011	`and most`
		3012	`.CW sam`
		3013	`customers don't come near to using all its capabilities.`
		3014	`Does it need to be so sophisticated?`
		3015	`I think the answer is yes, for two reasons.`
		3016	`.PP`
		3017	`First, the`
		3018	`.I model`
		3019	`for`
		3020	`.CW sam 's`
		3021	`command language is really relatively simple, and certainly simpler than that of`
		3022	`.CW ed .`
		3023	`For instance, there is only one kind of textual loop in`
		3024	`.CW sam`
		3025	`\(em the`
		3026	`.CW x`
		3027	`command \(em`
		3028	`while`
		3029	`.CW ed`
		3030	`has three (the`
		3031	`.CW g`
		3032	`command, the global flag on substitutions, and the implicit loop over`
		3033	`lines in multi-line substitutions).`
		3034	`Also,`
		3035	`.CW ed 's`
		3036	`substitute command is necessary to make changes within lines, but in`
		3037	`.CW sam`
		3038	`the`
		3039	`.CW s`
		3040	`command is more of a familiar convenience than a necessity;`
		3041	`.CW c`
		3042	`and`
		3043	`.CW t`
		3044	`can do all the work.`
		3045	`.PP`
		3046	`Second,`
		3047	`given a community that expects an editor to be about as powerful as`
		3048	`.CW ed ,`
		3049	`it's hard to see how`
		3050	`.CW sam`
		3051	`could really be much simpler and still satisfy that expectation.`
		3052	People want to do ``global substitutes,'' and most are content
		3053	`to have the recipe for that and a few other fancy changes.`
		3054	`The sophistication of the command language is really just a veneer`
		3055	`over a design that makes it possible to do global substitutes`
		3056	`in a screen editor.`
		3057	`Some people will always want something more, however, and it's gratifying to`
		3058	`be able to provide it.`
		3059	`The real power of`
		3060	`.CW sam 's`
		3061	`command language comes from composability of the operators, which is by`
		3062	`nature orthogonal to the underlying model.`
		3063	`In other words,`
		3064	`.CW sam`
		3065	`is not itself complex, but it makes complex things possible.`
		3066	`If you don't want to do anything complex, you can ignore the`
		3067	`complexity altogether, and many people do so.`
		3068	`.PP`
		3069	`Sometimes I am asked the opposite question: why didn't I just make`
		3070	`.CW sam`
		3071	`a real programmable editor, with macros and variables and so on?`
		3072	`The main reason is a matter of taste: I like the editor`
		3073	`to be the same every time I use it.`
		3074	`There is one technical reason, though:`
		3075	`programmability in editors is largely a workaround for insufficient`
		3076	`interactivity.`
		3077	`Programmable editors are used to make particular, usually short-term,`
		3078	`things easy to do, such as by providing shorthands for common actions.`
		3079	`If things are generally easy to do in the first place,`
		3080	`shorthands are not as helpful.`
		3081	`.CW Sam`
		3082	`makes common editing operations very easy, and the solutions to`
		3083	`complex editing problems seem commensurate with the problems themselves.`
		3084	`Also, the ability to edit the`
		3085	`.CW sam`
		3086	`window makes it easy to repeat commands \(em it only takes a mouse button click`
		3087	`to execute a command again.`
		3088	`.SH 2`
		3089	`Pros and cons`
		3090	`.LP`
		3091	`.CW Sam`
		3092	`has several other good points,`
		3093	`and its share of problems.`
		3094	`Among the good things is the idea of`
		3095	`structural regular expressions,`
		3096	`whose usefulness has only begun to be explored.`
		3097	`They were arrived at serendipitously when I attempted to distill the essence of`
		3098	`.CW ed 's`
		3099	`way of doing global substitution and recognized that the looping command in`
		3100	`.CW ed`
		3101	`was implicitly imposing a structure (an array of lines) on the file.`
		3102	`.PP`
		3103	`Another of`
		3104	`.CW sam 's`
		3105	`good things is its undo capability.`
		3106	`I had never before used an editor with a true undo,`
		3107	`but I would never go back now.`
		3108	`Undo`
		3109	`.I must`
		3110	`be done well, but if it is, it can be relied on.`
		3111	`For example,`
		3112	`it's safe to experiment if you're not sure how to write some intricate command,`
		3113	`because if you make a mistake, it can be fixed simply and reliably.`
		3114	`I learned two things about undo from writing`
		3115	`.CW sam :`
		3116	`first, it's easy to provide if you design it in from the beginning, and`
		3117	`second, it's necessary, particularly if the system has some subtle`
		3118	`properties that may be unfamiliar or error-prone for users.`
		3119	`.PP`
		3120	`.CW Sam 's`
		3121	`lack of internal limits and sizes is a virtue.`
		3122	`Because it avoids all fixed-size tables and data structures,`
		3123	`.CW sam`
		3124	`is able to make global changes to files that some of our other`
		3125	`tools cannot even read.`
		3126	`Moreover, the design keeps the performance linear when doing such`
		3127	`operations, although I must admit`
		3128	`.CW sam`
		3129	`does get slow when editing a huge file.`
		3130	`.PP`
		3131	`Now, the problems.`
		3132	`Externally, the most obvious is that it is poorly integrated into the`
		3133	`surrounding window system.`
		3134	`By design, the user interface in`
		3135	`.CW sam`
		3136	`feels almost identical to that of`
		3137	`.CW mux ,`
		3138	`but a thick wall separates text in`
		3139	`.CW sam`
		3140	`from the programs running in`
		3141	`.CW mux .`
		3142	For instance, the `snarf buffer' in
		3143	`.CW sam`
		3144	`must be maintained separately from that in`
		3145	`.CW mux .`
		3146	`This is regrettable, but probably necessary given the unusual configuration`
		3147	`of the system, with a programmable terminal on the far end of an RS-232 link.`
		3148	`.PP`
		3149	`.CW Sam`
		3150	`is reliable; otherwise, people wouldn't use it.`
		3151	`But it was written over such a long time, and has so many new (to me)`
		3152	`ideas in it, that I would like to see it done over again to clean`
		3153	`up the code and remove many of the lingering problems in the implementation.`
		3154	`The worst part is in the interconnection of the host and terminal parts,`
		3155	`which might even be able to go away in a redesign for a more`
		3156	`conventional window system.`
		3157	`The program must be split in two to use the terminal effectively,`
		3158	`but the low bandwidth of the connection forces the separation to`
		3159	`occur in an inconvenient part of the design if performance is to be acceptable.`
		3160	`A simple remote procedure call`
		3161	`protocol driven by the host, emitting only graphics`
		3162	`commands, would be easy to write but wouldn't have nearly the`
		3163	`necessary responsiveness. On the other hand, if the terminal were in control`
		3164	`and requested much simpler file services from the host, regular expression`
		3165	`searches would require that the terminal read the entire file over its RS-232`
		3166	`link, which would be unreasonably slow.`
		3167	`A compromise in which either end can take control is necessary.`
		3168	`In retrospect, the communications protocol should have been`
		3169	`designed and verified formally, although I do not know of any tool`
		3170	`that can adequately relate the protocol to`
		3171	`its implementation.`
		3172	`.PP`
		3173	`Not all of`
		3174	`.CW sam 's`
		3175	`users are comfortable with its command language, and few are adept.`
		3176	`Some (venerable) people use a sort of`
		3177	.CW ed \& ``
		3178	`subset'' of`
		3179	`.CW sam 's`
		3180	`command language,`
		3181	`and even ask why`
		3182	`.CW sam 's`
		3183	`command language is not exactly`
		3184	`.CW ed 's.`
		3185	`(The reason, of course, is that`
		3186	`.CW sam 's`
		3187	`model for text does not include newlines, which are central to`
		3188	`.CW ed .`
		3189	`Making the text an array of newlines to the command language would`
		3190	`be too much of a break from the seamless model provided by the mouse.`
		3191	`Some editors, such as`
		3192	`.CW vi ,`
		3193	`are willing to make this break, though.)`
		3194	`The difficulty is that`
		3195	`.CW sam 's`
		3196	`syntax is so close to`
		3197	`.CW ed 's`
		3198	`that people believe it`
		3199	`.I should`
		3200	`be the same.`
		3201	`I thought, with some justification in hindsight,`
		3202	`that making`
		3203	`.CW sam`
		3204	`similar to`
		3205	`.CW ed`
		3206	`would make it easier to learn and to accept.`
		3207	`But I may have overstepped and raised the users'`
		3208	`expectations too much.`
		3209	`It's hard to decide which way to resolve this problem.`
		3210	`.PP`
		3211	`Finally, there is a tradeoff in`
		3212	`.CW sam`
		3213	`that was decided by the environment in which it runs:`
		3214	`.CW sam`
		3215	`is a multi-file editor, although in a different system there might instead be`
		3216	`multiple single-file editors.`
		3217	`The decision was made primarily because starting a new program in a Blit is`
		3218	`time-consuming.`
		3219	`If the choice could be made freely, however, I would`
		3220	`still choose the multi-file architecture, because it allows`
		3221	`groups of files to be handled as a unit;`
		3222	`the usefulness of the multi-file commands is incontrovertible.`
		3223	`It is delightful to have the source to an entire program`
		3224	`available at your fingertips.`
		3225	`.SH`
		3226	`Acknowledgements`
		3227	`.LP`
		3228	`Tom Cargill suggested the idea behind the`
		3229	`.CW Rasp`
		3230	`data structure.`
		3231	`Norman Wilson and Ken Thompson influenced the command language.`
		3232	`This paper was improved by comments from`
		3233	`Al Aho,`
		3234	`Jon Bentley,`
		3235	`Chris Fraser,`
		3236	`Gerard Holzmann,`
		3237	`Brian Kernighan,`
		3238	`Ted Kowalski,`
		3239	`Doug McIlroy`
		3240	`and`
		3241	`Dennis Ritchie.`

Subversion Repositories planix.SVN

(root)/os/branches/feature_tlsv12/sys/doc/sam/sam.ms – Rev 2