Subversion Repositories planix.SVN

Rev

Details | Last modification | View Log | RSS feed

Rev Author Line No. Line
2 - 1
all data is big-endian on disk.
2
 
3
arena layout:
4
 
5
ArenaPart (first at offset PartBlank = 256kB in the disk file)
6
	magic[4] 0xA9E4A5E7
7
	version[4] 3
8
	blockSize[4]
9
	arenaBase[4] offset of first ArenaHead structure in the disk file
10
 
11
the ArenaMap starts at the first block at offset >= PartBlank+512 bytes.
12
it is a sequence of text lines
13
/*
14
 * amap: n '\n' amapelem * n
15
 * n: u32int
16
 * amapelem: name '\t' astart '\t' asize '\n'
17
 * astart, asize: u64int
18
 */
19
 
20
the astart and astop are byte offsets in the disk file.
21
they are the offsets to the ArenaHead and the end of the Arena block.
22
 
23
ArenaHead 
24
[base points here in the C code]
25
size bytes
26
	Clumps
27
	ClumpInfo blocks
28
Arena
29
 
30
Arena
31
	magic[4] 0xF2A14EAD
32
	version[4] 4
33
	name[64]
34
	clumps[4]
35
	cclumps[4]
36
	ctime[4]
37
	wtime[4]
38
	used[8]
39
	uncsize[8]
40
	sealed[1]
41
	optional score[20]
42
 
43
once sealed, the sha1 hash of every block from the
44
ArenaHead to the Arena is checksummed, as though
45
the final score in Arena were the zeroScore.  strangely,
46
the tail of the Arena block (the last one) is not included in the checksum
47
(i.e., the unused data after the score).
48
 
49
clumpMax = blocksize/ClumpInfoSize = blocksize/25
50
dirsize = ((clumps/clumpMax)+1) * blocksize
51
want used+dirsize <= size
52
want cclumps <= clumps
53
want uncsize+clumps*ClumpSize+blocksize < used
54
want ctime <= wtime
55
 
56
clump info is stored packed into blocks in order.
57
clump info moves forward through a block but the
58
blocks themselves move backwards.  so if cm=clumpMax
59
and there are two blocks worth of clumpinfo, the blocks
60
look like;
61
 
62
	[cm..2*cm-1] [0..cm-1] [Arena]
63
 
64
with the blocks pushed right up against the Arena trailer.
65
 
66
ArenaHead
67
	magic[4] 0xD15C4EAD
68
	version[4] = Arena.version
69
	name[64]
70
	blockSize[4]
71
	size[8]
72
 
73
Clump
74
	magic[4] 0xD15CB10C (0 for an unused clump)
75
	type[1]
76
	size[2]
77
	uncsize[2]
78
	score[20]
79
	encoding[1] raw=1, compress=2
80
	creator[4]
81
	time[4]
82
 
83
ClumpInfo
84
	type[1]
85
	size[2]
86
	uncsize[2]
87
	score[20]
88
 
89
the arenas are mapped into a single address space corresponding
90
to the index that brings them together.  if each arena has 100M bytes
91
excluding the headers and there are 4 arenas, then there's 400M of
92
index address space between them.  index address space starts at 1M
93
instead of 0, so the index addresses assigned to the first arena are
94
1M up to 101M, then 101M to 201M, etc.
95
 
96
of course, the assignment of addresses has nothing to do with the index,
97
but that's what they're called.
98
 
99
 
100
the index is split into index sections, which are put on different disks
101
to get parallelism of disk heads.  each index section holds some number
102
of hash buckets, each in its own disk block.  collectively the index sections
103
hold ix->buckets between them. 
104
 
105
the top 32-bits of the score is used to assign scores to buckets.
106
div = ceil(2³² / ix->buckets) is the amount of 32-bit score space per bucket.
107
 
108
to look up a block, take the top 32 bits of score and divide by div
109
to get the bucket number.  then look through the index section headers
110
to figure out which index section has that bucket.
111
 
112
then load that block from the index section.  it's an IBucket.
113
 
114
the IBucket has ib.n IEntry structures in it, sorted by score and then by type.
115
do the lookup and get an IEntry.  the ia.addr will be a logical address
116
that you then use to get the 
117
 
118
ISect
119
	magic[4] 0xD15C5EC7
120
	version[4]
121
	name[64]
122
	index[64]
123
	blockSize[4]
124
	blockBase[4]	address in partition where bucket blocks start
125
	blocks[4]
126
	start[4]
127
	stop[4]	stop - start <= blocks, but not necessarily ==
128
 
129
IEntry
130
	score[20]
131
	wtime[4]
132
	train[2]
133
	ia.addr[8]		index address (see note above)
134
	ia.size[2]		size of uncompressed block data
135
	ia.type[1]
136
	ia.blocks[1]	number of blocks of clump on disk
137
 
138
IBucket
139
	n[2]
140
	next[4]	not sure; either 0 or inside [start,stop) for the ISect
141
	data[n*IEntrySize]
142
 
143
final piece: all the disk partitions start with PartBlank=256kB of unused disk
144
(presumably to avoid problems with boot sectors and layout tables
145
and the like).
146
 
147
actually the last 8k of the 256k (that is, at offset 248kB) can hold
148
a venti config file to help during bootstrap of the venti file server.
149