tyltr技术窝

本文的源码分析基于:版本:version go 1.13

map(字典、映射)在不同语言中都有这类 kv的数据结构。每种语言的实现也不尽相同。那么go是怎么实现的呢?
Go 语言采用的是哈希查找表,并且使用链表解决哈希冲突。注:python采用的是二次寻址的方式解决哈希冲突的。

1.底层数据结构#

map的内部实现示意图:

map的内部实现示意图

map的定义:runtime/map.go

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// A header for a Go map.
// map的定义 hashmap缩写
type hmap struct {
// Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
// Make sure this stays in sync with the compiler's definition.
// len()==count,map中元素的数量
count int // # live cells == size of map. Must be first (used by len() builtin)

// 这是一个标识,标注map当前处于的状态
flags uint8

// bmap数组的长度为2^B
B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items)

// 溢出的数量
noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
hash0 uint32 // hash seed 哈希种子

// 指向当前使用的桶数组,这个数组的长度2^B
buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.

// 扩容时,用得到这个。指向原来的桶的数组
// 扩容的时候,buckets 长度会是 oldbuckets 的两倍
// 即B+1
oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
// 扩容是渐进式的,扩容的进度
nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated)

extra *mapextra // optional fields
}

一直说桶数组,桶于是什么鬼?
桶的示意图

源码:

1
2
3
4
5
6
7
8
9
10
11
12
13
// A bucket for a Go map.
// bucket桶的定义
type bmap struct {
// tophash generally contains the top byte of the hash value
// for each key in this bucket. If tophash[0] < minTopHash,
// tophash[0] is a bucket evacuation state instead.
tophash [bucketCnt]uint8
// Followed by bucketCnt keys and then bucketCnt elems.
// NOTE: packing all the keys together and then all the elems together makes the
// code a bit more complicated than alternating key/elem/key/elem/... but it allows
// us to eliminate padding which would be needed for, e.g., map[int64]int8.
// Followed by an overflow pointer.
}

上面的bmap结构体只是初步的,在编译的时候会进行添加属性。会变成动态的创建一个新的

1
2
3
4
5
6
7
type bmap struct {
topbits [8]uint8
keys [8]keytype
values [8]valuetype
pad uintptr
overflow uintptr
}