WebSVN – tendra.SVN – Blame – /branches/tendra4/doc/tdf/spec11.html

Rev	Author	Line No.	Line
2	7u83	1	`<!-- Crown Copyright (c) 1998 -->`
		2	`<HTML>`
		3	`<HEAD>`
		4	`<TITLE>The bit encoding of TDF</TITLE>`
		5	`</HEAD>`
		6	`<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">`
		7	`<H1><A NAME=S417>TDF Specification, Issue 4.0</A></H1>`
		8	`<H3>January 1998</H3>`
		9	`<A HREF="spec12.html"><IMG SRC="../images/next.gif" ALT="next section"></A>`
		10	`<A HREF="spec10.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>`
		11	`<A HREF="spec1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>`
		12	`<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">`
		13	`</A>`
		14	`<A HREF="spec12.html"><IMG SRC="../images/index.gif" ALT="document index"></A>`
		15	`<P>`
		16	`<HR>`
		17	`<DL>`
		18	`<DT><A HREF="#S418"><B>8.1</B> - The Basic Encoding</A><DD>`
		19	`<DT><A HREF="#S419"><B>8.2</B> - Fundamental encodings</A><DD>`
		20	`<DL>`
		21	`<DT><A HREF="#S420"><B>8.2.1</B> - TDFINT</A><DD>`
		22	`<DT><A HREF="#S421"><B>8.2.2</B> - TDFBOOL</A><DD>`
		23	`<DT><A HREF="#S422"><B>8.2.3</B> - TDFSTRING</A><DD>`
		24	`<DT><A HREF="#S423"><B>8.2.4</B> - TDFIDENT</A><DD>`
		25	`</DL>`
		26	`<DT><A HREF="#S424"><B>8.3</B> - BITSTREAM</A><DD>`
		27	`<DL>`
		28	`<DT><A HREF="#S425"><B>8.3.1</B> - BYTESTREAM</A><DD>`
		29	`<DT><A HREF="#S426"><B>8.3.2</B> - BYTE_ALIGN</A><DD>`
		30	`<DT><A HREF="#S427"><B>8.3.3</B> - Extendable integer encoding</A><DD>`
		31	`</DL>`
		32	`<DT><A HREF="#S428"><B>8.4</B> - The TDF encoding</A><DD>`
		33	`<DT><A HREF="#S429"><B>8.5</B> - File Formats</A><DD>`
		34	`</DL>`
		35	`<HR>`
		36	`<H1>8. <A NAME=0>The bit encoding of TDF</A></H1>`
		37	`This is a description of the encoding used for TDF.`
		38	`<P>`
		39	`<A HREF="#1">Section 8.1</A> defines the basic level of encoding,`
		40	`in which integers consisting of a specified number of bits are appended`
		41	`to the sequence of bytes. <A HREF="#4">Section 8.2</A> defines the`
		42	`second level of encoding, in which fundamental kinds of value are`
		43	`encoded in terms of integers of specified numbers of bits. <A HREF="#17">Section`
		44	`8.4</A> defines the third level, in which TDF is encoded using the`
		45	`previously defined concepts.`
		46	`<P>`
		47	`<A NAME=S418>`
		48
		49	`<HR>`
		50	`<H2>8.1. <A NAME=1>The Basic Encoding</A></H2>`
		51	`<A NAME=M2><A NAME=M3>TDF consists of a sequence of 8-bit bytes used`
		52	`to encode integers of a varying number of bits, from 1 to 32. These`
		53	`integers will be called basic integers.`
		54	`<P>`
		55	`TDF is encoded into bytes in increasing byte index, and within the`
		56	`byte the most significant end is filled before the least significant.`
		57	`Let the bits within a byte be numbered from 0 to 7, 0 denoting the`
		58	`least significant bit and 7 the most significant. Suppose that the`
		59	`bytes up to <I>n</I>-1 have been filled and that the next free bit`
		60	`in byte <I>n</I> is bit <I>k</I>. Then bits <I>k</I>+1 to 7 are full`
		61	`and bits 0 to <I>k</I> remain to be used. Now an integer of <I>d</I>`
		62	`bits is to be appended.`
		63	`<P>`
		64	`If <I>d</I> is less than or equal to <I>k</I>, the <I>d</I> bits will`
		65	`occupy bits <I>k</I>-<I>d</I>+1 to <I>k</I> of byte <I>n</I>, and`
		66	`the next free bit will be at bit <I>k-d</I>. Bit 0 of the integer`
		67	`will be at bit <I>k</I>-<I>d</I>+1 of the byte, and bit <I>d</I>-1`
		68	`of the integer will be at bit <I>k</I>.`
		69	`<P>`
		70	`If <I>d</I> is equal to <I>k</I>+1, the <I>d</I> bits will occupy`
		71	`bits 0 to <I>k</I> of byte <I>n</I> and the next free bit will be`
		72	`bit 7 of byte <I>n</I>+1. Bit <I>d</I>-1 of the integer will be at`
		73	`bit <I>k</I> of the byte.`
		74	`<P>`
		75	`If <I>d</I> is greater than <I>k</I>+1, the most significant <I>k</I>+1`
		76	`bits of the integer will be in byte <I>n</I>, with bit <I>d</I>-1`
		77	`at bit <I>k</I> of the byte. The remaining <I>d</I>-<I>k</I>-1 least`
		78	`significant bits are then encoded into the bytes, starting at byte`
		79	`<I>n</I>+1, bit 7, using the same algorithm (i.e. recursively).`
		80	`<P>`
		81	`<A NAME=S419>`
		82
		83	`<HR>`
		84	`<H2>8.2. <A NAME=4>Fundamental encodings</A></H2>`
		85	`This section describes the encoding of <CODE>TDFINT</CODE>,`
		86	`<CODE>TDFBOOL</CODE>, <CODE>TDFSTRING</CODE>, <CODE>TDFIDENT</CODE>,`
		87	`<CODE>BITSTREAM</CODE>, <CODE>BYTESTREAM</CODE>, <CODE>BYTE_ALIGN</CODE>`
		88	`and extendable integers.`
		89	`<P>`
		90	`<A NAME=S420>`
		91	`<H3>8.2.1. TDFINT</H3>`
		92	`<A NAME=M5><CODE>TDFINT</CODE> encodes non-negative integers of unbounded`
		93	`size. The encoding uses octal digits encoded in 4-bit basic integers.`
		94	`The most significant octal digit is encoded first, the least significant`
		95	`last. For all digits except the last the 4-bit integer is the value`
		96	`of the octal digit. For the last digit the 4-bit integer is the value`
		97	`of the octal digit plus 8.`
		98	`<P>`
		99	`<A NAME=S421>`
		100	`<H3>8.2.2. TDFBOOL</H3>`
		101	`<A NAME=M6><CODE>TDFBOOL</CODE> encodes a boolean, true or false.`
		102	`The encoding uses a 1-bit basic integer, with 1 encoding true and`
		103
		104	`<P>`
		105	`<A NAME=S422>`
		106	`<H3>8.2.3. TDFSTRING</H3>`
		107	`<CODE>TDFSTRING</CODE> encodes a sequence containing <I>n</I> non-negative`
		108	`integers, each of <I>k</I> bits. The encoding consists of, first a`
		109	`<CODE>TDFINT</CODE> giving the number of bits, second a <CODE>TDFINT</CODE>`
		110	`giving the number of integers, which may be zero. Thirdly it contains`
		111	`<I>n</I> <I>k</I>-bit basic integers, giving the sequence of integers`
		112	`required, the first integer being first in this sequence.`
		113	`<P>`
		114	`<A NAME=S423>`
		115	`<H3>8.2.4. TDFIDENT</H3>`
		116	`<A NAME=M7><CODE>TDFIDENT</CODE> also encodes a sequence containing`
		117	`<I>n</I> non-negative integers. These integers will all consist of`
		118	`the same number of bits, which will be a multiple of 8. It is a property`
		119	`of the encoding of the other constructions that TDFIDENTS will start`
		120	`on either bit 7 or bit 3 of a byte and end on bit 7 or bit 3 of a`
		121	`byte. It thus has some alignment properties which are useful to permit`
		122	`fast copying of sections of TDF.`
		123	`<P>`
		124	`The encoding consists of, first a <CODE>TDFINT</CODE> giving the number`
		125	`of bits, second a <CODE>TDFINT</CODE> giving the number of integers,`
		126	`which may be zero. If the next free bit is not bit 7 of some byte,`
		127	`it is moved on to bit 7 of the next byte.`
		128	`<P>`
		129	`Thirdly it contains <I>n</I> <I>k</I>-bit integers.`
		130	`<P>`
		131	`If the next free bit is not bit 7 of some byte, it is moved on to`
		132	`bit 7 of the next byte.`
		133	`<P>`
		134	`<A NAME=S424>`
		135
		136	`<HR>`
		137	`<H2>8.3. <A NAME=8>BITSTREAM</A></H2>`
		138	`<A NAME=M9>It can be useful to be able to skip a TDF construction`
		139	`without reading through it. <CODE>BITSTREAM</CODE> provides a means`
		140	`of doing this.`
		141	`<P>`
		142	`A <CODE>BITSTREAM</CODE> encoding of <I>X</I> consists of a <CODE>TDFINT</CODE>`
		143	`giving the number of bits of encoding which are occupied by the <I>X</I>.`
		144	`Hence to skip over a <CODE>BITSTREAM</CODE> while decoding, one should`
		145	`read the <CODE>TDFINT</CODE> and then advance the bit index by that`
		146	`number of bits. To read the contents of a <CODE>BITSTREAM</CODE> encoding`
		147	`of <I>X</I>, one should read and ignore a <CODE>TDFINT</CODE> and`
		148	`then decode an <I>X</I>. There will be no spare bits at the end of`
		149	`the <I>X</I>, so reading can continue directly.`
		150	`<P>`
		151	`<A NAME=S425>`
		152	`<H3>8.3.1. BYTESTREAM</H3>`
		153	`<A NAME=M10>It can be useful to be able to skip a TDF construction`
		154	`without reading through it. <CODE>BYTESTREAM</CODE> provides a means`
		155	`of doing this while remaining byte aligned, so facilitating copying`
		156	`the TDF. A <CODE>BYTESTREAM</CODE> will always start when the bit`
		157	`position is 3 or 7.`
		158	`<P>`
		159	`A <CODE>BYTESTREAM</CODE> encoding of <I>X</I> starts with a <CODE>TDFINT</CODE>`
		160	`giving a number, <I>n</I>. After this, if the current bit position`
		161	`is not bit 7 of some byte, it is moved to bit 7 of the next byte.`
		162	`The next <I>n</I> bytes are an encoding of <I>X</I>. There may be`
		163	`some spare bits left over at the end of <I>X</I>.`
		164	`<P>`
		165	`Hence to skip over a <CODE>BYTESTREAM</CODE> while decoding one should`
		166	`read a <CODE>TDFINT</CODE>, <I>n</I>, move to the next byte alignment`
		167	`(if the bit position is not 7) and advance the bit index over <I>n</I>`
		168	`bytes. To read a <CODE>BYTESTREAM</CODE> encoding of <I>X</I> one`
		169	`should read a <CODE>TDFINT</CODE>, <I>n</I>, and move to the next`
		170	`byte, <I>b</I> (if the bit position is not 7), and then decode an`
		171	`<I>X</I>. Finally the bit position should be moved to <I>n</I> bytes`
		172	`after <I>b</I>.`
		173	`<P>`
		174	`<A NAME=S426>`
		175	`<H3>8.3.2. BYTE_ALIGN</H3>`
		176	`<A NAME=M11><A NAME=M12><CODE>BYTE_ALIGN</CODE> leaves the bit position`
		177	`alone if it is 7, and otherwise moves to bit 7 of the next byte.`
		178	`<P>`
		179	`<A NAME=S427>`
		180	`<H3>8.3.3. <A NAME=13>Extendable integer encoding</A></H3>`
		181	`<A NAME=M14><A NAME=M15><A NAME=M16>A <I>d</I>-bit extendable integer`
		182	`encoding enables an integer greater than zero to be encoded given`
		183	`<I>d</I>, a number of bits.`
		184	`<P>`
		185	`If the integer is between 1 and 2<SUP><I>d</I></SUP> - 1 inclusive,`
		186	`a`
		187	`<I>d</I>-bit basic integer is encoded.`
		188	`<P>`
		189	`If the integer, <I>i</I>, is greater than or equal to 2<SUP><I>d</I></SUP>,`
		190	`a <I>d</I>-bit basic integer encoding of zero is inserted and then`
		191	`<I>i</I> - 2<SUP><I>d</I></SUP> + 1 is encoded as a <I>d</I>-bit extendable`
		192	`encodin, and so on, recursively.`
		193	`<P>`
		194	`<A NAME=S428>`
		195
		196	`<HR>`
		197	`<H2>8.4. <A NAME=17>The TDF encoding</A></H2>`
		198	`The descriptions of SORTS and constructors contain encoding information`
		199	`which is interpreted as follows to define the TDF encoding.`
		200	`<P>`
		201	`<UL>`
		202	`<LI>A TDF CAPSULE is an encoding of the <CODE>SORT CAPSULE</CODE>.<P>`
		203	`<LI>For each <CODE>SORT</CODE> a number of encoding bits, <I>b,</I>`
		204	`is specified. If this is zero, there will only be one construction`
		205	`for the class, and its encoding will consist of the encodings of its`
		206	`components, in the given order.<P>`
		207	`<LI><A NAME=M18>If the number of encoding bits, <I>b</I>, is not zero`
		208	`the <CODE>SORT</CODE> is described as extendable or as not extendable.`
		209	`For each construction there is an encoding number given. If the <CODE>SORT`
		210	`</CODE> is extendable, this number is output as an extendable integer.`
		211	`If the <CODE>SORT</CODE> is described as not extendable, the number`
		212	`is output as a basic integer. This is followed by the encodings of`
		213	`the components of the construction in the order given in the description`
		214	`of the construct.<P>`
		215	`<LI><A NAME=M21>For the classes which are named <CODE>SLIST</CODE>(<I>x</I>)`
		216	`- e.g. <CODE>SLIST</CODE>(<CODE>UNIT</CODE>) - the encoding consists`
		217	`of a <CODE>TDFINT</CODE>, <I>n</I>, followed by <I>n</I> encodings`
		218	`of <I>x</I>.<P>`
		219	`<LI><A NAME=M22>For the classes which are named <CODE>LIST</CODE>(<I>x</I>)`
		220	`- e.g. <CODE>LIST</CODE>(<CODE>EXP</CODE>) - the encoding consists`
		221	`of a 1-bit integer which will be 0, followed by an <CODE>SLIST</CODE>(<I>x</I>).`
		222	`The 1-bit integer is to allow for extensions to other representations`
		223	`of`
		224	`<CODE>LIST</CODE>s.<P>`
		225	`<LI><A NAME=M23><A NAME=M24>For the classes which are named <CODE>OPTION</CODE>(`
		226	`<I>x</I>) the encoding consists of a 1-bit basic integer. If this`
		227	`is zero, the option is absent and there is no more encoding. If the`
		228	`integer is 1, the option is present and an encoding of <I>x</I> follows.<P>`
		229	`<LI><CODE>BITSTREAMS</CODE> occur in only two kinds of place. One`
		230	`is the constructions with the form <I>x_cond</I>, which are the install-time`
		231	`conditionals. For each of these the class encoded in the <CODE>BITSTREAM</CODE>`
		232	`is the same as the class which is the result of the <I>x_cond</I>`
		233	`construction. The other kind of place is as the <I>token_args</I>`
		234	`component of a construction with the form <I>x_apply_token</I>. This`
		235	`component always gives the parameters of the <CODE>TOKEN</CODE>. It`
		236	`can only be decoded if there is a token definition or a token declaration`
		237	`for the particular token being applied, i.e. for the <I>token_value</I>`
		238	`component of the construction. In this case the <CODE>SORTS</CODE>`
		239	`and hence the classes of the actual token arguments are given by the`
		240	`declaration or definition, and encodings of these classes are placed`
		241	`in sequence after the number of bits. If the declaration or definition`
		242	`are not available, the <CODE>BITSTREAM</CODE> can only be skipped.<P>`
		243	`<LI><CODE>BYTESTREAM</CODE> <I>X</I> occurs in only one place, the`
		244	`encoding of the <CODE>SORT UNIT</CODE>. The <CODE>SORT</CODE>`
		245	`<I>X</I> is determined by the <CODE>UNIT</CODE> identification which`
		246	`is given for each of the relevant <CODE>SORTS</CODE>.<P>`
		247	`<LI><A NAME=M25>The <I>tld</I> <CODE>UNIT</CODE> is encoded specially.`
		248	`It is always the first <CODE>UNIT</CODE> in a Capsule and is used`
		249	`to pass information to the TDF linker. The first entry in a <I>tld</I>`
		250	`<CODE>UNIT</CODE> is a <CODE>TDFINT</CODE> giving the format of the`
		251	`remainder of the <CODE>UNIT</CODE>. Currently, the linker supports`
		252	`formats 0 and 1, but others may be added to give greater functionality`
		253	`while retaining compatibility. With format 0, the remainder of <CODE>UNIT</CODE>`
		254	`is identical to a now obsolete`
		255	`<I>tld2</I> <CODE>UNIT</CODE>. With format 1, the remainder of the`
		256	`<CODE>UNIT</CODE> is as follows: If <I>n</I> is the number of <CODE>EXTERN_LINK`
		257	`</CODE>s in the <I>external_linkage</I>`
		258	`argument of <I>make_capsule</I>, the remainder consists of <I>n</I>`
		259	`sequences. These sequences are in the order given by <I>external_linkage</I>.`
		260	`Each element of a sequence consist of one <CODE>TDFINT</CODE> per`
		261	`linkable entity in the corresponding <I>el</I> of the <I>make_extern_link</I>`
		262	`in the same order. These integers will describe properties of the`
		263	`correponding external links. (In format 0, there are only two sequences,`
		264	`the first describing the token external links and the second describing`
		265	`the tag external links.)`
		266	`<P>`
		267	`<UL>`
		268	`<LI>Bit 0: 1 means "used in this capsule", 0 means "not`
		269	`used in this capsule".`
		270	`<LI>Bit 1: 1 means "declared in this capsule", 0 means "not`
		271	`declared in this capsule".`
		272	`<LI>Bit 2: 0 means "not defined in this capsule, 1 means "defined`
		273	`in this capsule".`
		274	`<LI>Bit 3: 0 means "is uniquely defined", 1 means "`
		275	`may be multiply defined".`
		276	`</UL>`
		277	`</UL>`
		278	`<P>`
		279	`<A NAME=S429>`
		280
		281	`<HR>`
		282	`<H2>8.5. File Formats</H2>`
		283	`There may be various kinds of files which contain TDF bitstream information.`
		284	`Each will start with a 4-byte "magic-number" identifying`
		285	`the kind of file followed by two <CODE>TDFINT</CODE>s giving the major`
		286	`and minor version numbers of the TDF involved.`
		287	`<P>`
		288	`A CAPSULE file will have a magic-number <I>"TDFC"</I>. The`
		289	`encoding of the CAPSULE will be byte-aligned following the version`
		290	`numbers.`
		291	`<P>`
		292	`A TDF library file will have a magic-number <I>"TDFL"</I>.`
		293	`These files are constructed by the TDF linker.`
		294	`<P>`
		295	`A TDF archive file will have a magic-number <I>"TDFA"</I>.`
		296	`<P>`
		297	`Other file formats introduced should follow a similar pattern.`
		298	`<P>`
		299	`<I>The TDF linker will refuse to link TDF files with different major`
		300	`version numbers. The resulting minor version number is the maximum`
		301	`of component minor version numbers.</I>`
		302	`<P>`
		303	`<P>`
		304	`<HR>`
		305	`<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown`
		306	`Copyright © 1998.</I></P>`
		307	`</BODY>`
		308	`</HTML>`

Subversion Repositories tendra.SVN

(root)/branches/tendra4/doc/tdf/spec11.html – Rev 3