前文提到了一个adlist,它的注释是这样写的“A generic doubly linked list implementation”,这次要说的ziplist,它的注释是这样写的
-
/* The ziplist is a specially encoded dually linked list that is designed
-
* to be very memory efficient. It stores both strings and integer values,
-
* where integers are encoded as actual integers instead of a series of
-
* characters. It allows push and pop operations on either side of the list
-
* in O(1) time. However, because every operation requires a reallocation of
-
* the memory used by the ziplist, the actual complexity is related to the
-
* amount of memory used by the ziplist.*/
ziplist本身并不会像adlist那样能够存储多种数据,而是存储string和integer values,就像上面的注释一样"encoded dually linked list"。但是其能够更加有效的利用内存空间,如何利用的呢,就让我们一探究竟。
注意:如果没有特殊声明,所有字段均按照little endian存储。
首先,和adlist一样会用宏定义头和尾
-
#define ZIPLIST_HEAD 0
-
#define ZIPLIST_TAIL 1
其在内存当中的整体结构如下所示
-
<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
其中各个字段的描述如下
名称
|
描述
|
zlbytes(uint32_t)
|
整个链表所占的字节数
|
zltail(uint32_t)
|
到链表当中最后一个元素的偏移量,便于pop操作
|
zllen(uint16_t)
|
entry个数,当元素个数超过216-2时,这个值就被赋值为216-1,这个时候,如果想知道实际有多少个entry的话,就需要自己遍历整个链表了
|
zlend(uint8_t)
|
表示链表尾部,被设置为255
|
链表中的entry在内存当中的表示如下:
-
<prevlen> <encoding> <entry-data>
其中各个字段的描述如下:
名称
|
描述
|
prevlen
|
前一个entry的长度(便于从后向前遍历)。如果长度小于254,则只占1个字节,如果大于254,则占用5个字节,其中第一个字节为254,后面4个字节表示实际长度。
|
encoding
|
表明entry type。integer或者string。如果是小整数,有时会省略掉entry-data部分,由encoding直接表示数据,当entry-data为string时,encoding的第一个字节的前2bits用来保存整个字符串长度的encoding type,后面紧接着就是实际的长度:当entry-data时一个integer,第一个字节的前2bits均被置为1,借着它的2位用来表示integer的种类
|
encoding字段具体表示如下:
encoding字节表示
|
说明
|
00pppppp(1 byte)
|
string value with length less than or equal to 63 bytes
|
01pppppp|qqqqqqqq(2 bytes)
|
string value with length less than or equal to 16383 bytes. The 14 bit number is stored in big endian
|
10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt(5 bytes)
|
string balue with length greater or equal to 16384 bytes. Only the 4 bytes following the first byte represents the length up to 322-1. The 32 bit number is stored in big endian.
|
11000000| - 3 bytes total
|
integer encoded as int16_t(2 bytes)
|
11010000| - 5 bytes total
|
integer encoded as int32_t(4 bytes)
|
11100000| - 9 bytes total
|
integer encoded as int64_t(8 bytes)
|
11110000| - 4 bytes total
|
integer encoded as 24 bit signed(3 bytes)
|
11111110| - 2 bytes total
|
integer encoded as 8 bit signed(1 byte)
|
1111xxxx| - with xxxx between 0000 and 1101
|
immediate 4 bit integer. unsigned integer from 0 to 12. The encoded value is actually from 1 to 13 because 0000 and 1111 can not be used, so 1 should be subtracted from the encoded 4 bit value to obtain the right value.
|
11111111
|
End of ziplist special entry
|
它是通过定义一些mask来实现上述的各种判断,如下
-
#define ZIP_END 255 /* Special "end of ziplist" entry. */
-
#define ZIP_BIG_PREVLEN 254 /* Max number of bytes of the previous entry*/
-
/* Different encoding/length possibilities */
-
#define ZIP_STR_MASK 0xc0
-
#define ZIP_INT_MASK 0x30
-
#define ZIP_STR_06B (0 << 6)
-
#define ZIP_STR_14B (1 << 6)
-
#define ZIP_STR_32B (2 << 6)
-
#define ZIP_INT_16B (0xc0 | 0<<4)
-
#define ZIP_INT_32B (0xc0 | 1<<4)
-
#define ZIP_INT_64B (0xc0 | 2<<4)
-
#define ZIP_INT_24B (0xc0 | 3<<4)
-
#define ZIP_INT_8B 0xfe
-
-
/* 4 bit integer immediate encoding |1111xxxx| with xxxx between
-
* 0001 and 1101. */
-
#define ZIP_INT_IMM_MASK 0x0f /* Mask to extract the 4 bits value. To add
-
one is needed to reconstruct the value. */
-
#define ZIP_INT_IMM_MIN 0xf1 /* 11110001 */
-
#define ZIP_INT_IMM_MAX 0xfd /* 11111101 */
-
-
#define INT24_MAX 0x7fffff
-
#define INT24_MIN (-INT24_MAX - 1)
entry的话是通过下面的结构体来进行
读取的,实际上在内存当中并不是按照结构体存储的,这点要记住,可以偷偷的记录在小本本上面~
-
typedef struct zlentry {
-
unsigned int prevrawlensize; /* Bytes used to encode the previos entry len*/
-
unsigned int prevrawlen; /* Previous entry len. */
-
unsigned int lensize; /* Bytes used to encode this entry type/len.
-
For example strings have a 1, 2 or 5 bytes
-
header. Integers always use a single byte.*/
-
unsigned int len; /* Bytes used to represent the actual entry.
-
For strings this is just the string length
-
while for integers it is 1, 2, 3, 4, 8 or
-
0 (for 4 bit immediate) depending on the
-
number range. */
-
unsigned int headersize; /* prevrawlensize + lensize. */
-
unsigned char encoding; /* Set to ZIP_STR_* or ZIP_INT_* depending on
-
the entry encoding. However for 4 bits
-
immediate integers this can assume a range
-
of values and must be range-checked. */
-
unsigned char *p; /* Pointer to the very start of the entry, that
-
is, this points to prev-entry-len field. */
-
} zlentry;
之后的话就是一些聊表的常规操作了,这里不再赘述了~~~
阅读(9504) | 评论(0) | 转发(0) |