Go Map Hash
maphash #
Map implementation is one of the interesting features in Go, it accepts anything as key, from string to numbers or even complex structs. And the magic thing is it is able to calculate the hash code of the given key effectively without the need to provide any hash code function. I feel this is really interesting and it stimulates my curiosity every single time I use map.
I would like to know how it's implemented and whether I can use the same implementation for some other contexts. But unfortunately, Go doesn't expose the hash code implementation for us to use directly. If you have enough knowledge about Go compiler and Go runtime you still can investigate and copy the implementation. I actually tried but found it's quite challenging and requires a lot of effort. It requires a lot of understanding about the compiler and runtime and also requires understanding of assembly code - which I feel is a little bit scary [1].
Luckily, Go have another package that helps to simplify the implementation of hash code for map. It is the package maphash. With the maphash package, we can easily implement the hash code the primitive types like string or numbers:
var h maphash.Hash
h.SetSeed(maphash.MakeSeed())
h.WriteString("hello")
hashCode := h.Sum64() 
Playground
number := 12.3
var h maphash.Hash
h.SetSeed(maphash.MakeSeed())
binary.Write(&h, binary.BigEndian, number)
hashCode := h.Sum64() // 4236663090134902550
Playground
But what if we have a key of the map is a struct that includes fields of different types like below:
type Employee struct {
	Name      string
	Age       int
	Address   string
	BirthDate time.Time
}
How can we implement the hash code? You might think that we just need to convert everything into string and contact them together to form a key? But it will cause some problems, the 2 following employees will have the same key and hence the same hash code:
e1 := Employee{
	Name: "Jack",
	Age:  15,
}
e2 := Employee{
	Name: "Jack1",
	Age:  5,
}
Playground
You can argue that we can add a separator between the fields... but then we need to identify which ones are safe to use. And if the separator collides with the value of the key?
Because of those complexities, we need another simple mechanism to apply. But unfortunately there is no guideline from Go on how to implement the hash code. Hence the best way I can imagine is to use the similar mechanism that I used to use while working with Java. This mechanism uses 2 primes numbers, one as the init of the hash code and another one as a multiply factor for the hash code. It is quite popular in the Java community. The steps can be described as below:
- Let call the hash code we want to calculate is sum
- Init sumwith value is a prime number. i.e. 7:sum = 7
- For every single field nof the given struct, calculate its hash codesum_n.- If type of the field is kind of byte or string, write directly to the maphash.Hash
- If type of the field is boolean, number, or slice values of them, use binary.Writeto write to themaphash.Hash
- If it's time, get its binary value by calling MarshalBinary()and write to themaphash.Hash. Or you can simply get theUnixNano()and usesbinary.Writefor it.
- If you have a slice or map of types, iterate over it and apply the same principle to calculate its sum.
 
- Combine the sum_nwith the total sum by multiple it with the multiply factor (which is also another prime number), i.e 31:sum = sum * 31 + sum_n
- Perform step #3 and #4 for all the fields of the struct that should be used for comparing the struct.
Applying the above rules, we can have the hash code for the struct Employee as below:
var seed = maphash.MakeSeed()
func hashEmployee(e Employee) uint64 {
	var h maphash.Hash
	h.SetSeed(seed)
	sum := uint64(7)
	h.WriteString(e.Name)
	sum = sum*31 + h.Sum64()
	h.Reset()
	binary.Write(&h, binary.LittleEndian, int64(e.Age))
	sum = sum*31 + h.Sum64()
	h.Reset()
	h.WriteString(e.Address)
	sum = sum*31 + h.Sum64()
	h.Reset()
	b, _ := e.BirthDate.MarshalBinary()
	h.Write(b)
	sum = sum*31 + h.Sum64()
	return sum
}
hashEmployee(e1) 
hashEmployee(e2) 
Playground
I think this mechanism is quite easy to use and its performance is acceptable for production code.
Unsafe Hack #
If you are not happy with the performance of the above hash mechanism and still want to use the Go internal hash implementation, below is an unsafe hack that you might want to consider:
func UnsafeHashOf[T comparable]() func(seed maphash.Seed, v T) uint64 {
	h := getHashFunc[T](map[T]struct{}{})
	return func(seed maphash.Seed, v T) uint64 {
		return h.Sum64(seed, v)
	}
}
type abiType struct {
	Size_       uintptr
	PtrBytes    uintptr 
	Hash        uint32  
	TFlag       uint8   
	Align_      uint8   
	FieldAlign_ uint8   
	Kind_       uint8   
	
	
	Equal func(unsafe.Pointer, unsafe.Pointer) bool
	
	
	
	GCData    *byte
	Str       int32 
	PtrToThis int32 
}
type abiMapType[T comparable] struct {
	abiType
	Key    *abiType
	Elem   *abiType
	Bucket *abiType 
	
	Hasher     hashFunc[T]
	KeySize    uint8  
	ValueSize  uint8  
	BucketSize uint16 
	Flags      uint32
}
type hashFunc[T comparable] func(value *T, seed maphash.Seed) uintptr
func (h hashFunc[T]) Sum64(seed maphash.Seed, v T) uint64 { return uint64(h(&v, seed)) }
type emptyInterface struct{ typ *abiType }
func getHashFunc[T comparable](v interface{}) hashFunc[T] {
	return (*abiMapType[T])(unsafe.Pointer((*emptyInterface)(unsafe.Pointer(&v)).typ)).Hasher
}
Playground
Please note that the above hack is not safe and will be broken if Go change its internal types.
Performance of using maphash is much worse compared to the hack method but it's safe to use. Hence you can base on your use-case to decide which one is suitable.
goos: linux
goarch: amd64
pkg: github.com/pthethanh/cache
cpu: 12th Gen Intel(R) Core(TM) i7-1255U
BenchmarkMapHashNormal-12        9833695               110.4 ns/op           184 B/op          3 allocs/op
BenchmarkMapHashHack-12         29772643                38.80 ns/op           64 B/op          1 allocs/op
PASS
ok      github.com/pthethanh/cache      2.411s
Playground
[1] You can take a look at the implementation of map in Go at map.go, map_fast32.go, map_fast64.go, map_faststr.go, alg.go and some relevant assembly code files can be found in the same run time package.