Working with strings in Go

Thanh Pham / Tue 24 Sep 2019

A string in Go is a read-only slice of bytes that can hold arbitrary data. It can be created from a string literal (interpreted or raw) or by converting a slice of bytes:

str1 := "This string is in 1 line."
str2 := `This string is in multiple lines.
This is the second line.`
str3 := string([]byte{'H', 'e', 'l', 'l', 'o'})

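Strings in Go are immutable: once a string is created, its contents cannot be changed. A string also cannot be nil; its zero value is the empty string. Assigning to a byte of a string is rejected by the compiler:

str := "Hello, Go!"
str[0] = 'X' // compile error: cannot assign to str[0]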
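Because a string cannot be modified in place, the usual workaround is to copy it into a byte slice, change the copy, and convert it back. The sketch below illustrates this; the helper name replaceFirstByte is hypothetical and not part of any library.

```go
package main

import "fmt"

// replaceFirstByte (hypothetical helper) returns a copy of s whose first
// byte is b. The original string is untouched because []byte(s) copies
// the data.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // copy of the string's bytes
	buf[0] = b
	return string(buf) // converts the modified copy back to a string
}

func main() {
	str := "Hello, Go!"
	changed := replaceFirstByte(str, 'X')
	fmt.Println(str)     // Hello, Go!
	fmt.Println(changed) // Xello, Go!
}
```

Note that this works on bytes, so it is only safe for the first byte when the string starts with a single-byte (ASCII) character.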

Go source code is UTF-8, so string literals are UTF-8 encoded by default. This means you can write text in any language without external dependencies for UTF-8 processing.

str := "Xin Chào Việt Nam"
str1 := "สวัสดีประเทศไทย"
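String literals are valid UTF-8, but a string built from arbitrary bytes may not be. As a minimal sketch, the standard unicode/utf8 and strings packages can check and repair such data; cleanText is a hypothetical helper name introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// cleanText (hypothetical helper) returns s unchanged when it is valid
// UTF-8; otherwise each run of invalid bytes is replaced by "?".
func cleanText(s string) string {
	if utf8.ValidString(s) {
		return s
	}
	return strings.ToValidUTF8(s, "?")
}

func main() {
	fmt.Println(utf8.ValidString("สวัสดี")) // true

	raw := string([]byte{'o', 'k', 0xff, 0xfe}) // 0xff and 0xfe never occur in UTF-8
	fmt.Println(utf8.ValidString(raw)) // false
	fmt.Println(cleanText(raw))        // ok?
}
```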

Go uses the rune type (an alias for int32) to represent a Unicode code point. In UTF-8, a code point is encoded in 1 to 4 bytes, so the i-th byte of a string is not necessarily its i-th character.

msg := "你好"

// The i-th byte is not necessarily the i-th character.
fmt.Printf("value of byte 0-th: %c\n", msg[0]) // "ä"

// Ranging over a string decodes it rune by rune:
// i is the byte offset at which the character starts,
// c is the character (a rune).
// The loop below prints:
// 	byte 0-th, value: 你
// 	byte 3-th, value: 好
for i, c := range msg {
	fmt.Printf("byte %d-th, value: %c\n", i, c)
}

// Another way to access each character is to convert the string to a slice of runes.
// The loop below prints:
// 	character 0-th, value: 你
// 	character 1-th, value: 好
for i, c := range []rune(msg) {
	fmt.Printf("character %d-th, value: %c\n", i, c)
}
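To see the variable width of the encoding directly, a string can also be decoded by hand with utf8.DecodeRuneInString, which returns a rune and the number of bytes it occupies. The following is a small sketch; runeSizes is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeSizes (hypothetical helper) returns the encoded size in bytes of
// each character in s, in order.
func runeSizes(s string) []int {
	var sizes []int
	for i := 0; i < len(s); {
		// Decode the rune that starts at byte offset i.
		_, size := utf8.DecodeRuneInString(s[i:])
		sizes = append(sizes, size)
		i += size // invalid bytes decode with size 1, so the loop always advances
	}
	return sizes
}

func main() {
	fmt.Println(runeSizes("aß你😀")) // [1 2 3 4]
}
```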

len returns the number of bytes in a string, not the number of characters. To count characters, convert the string to a slice of runes:

str := "Xin Chào Việt Nam"
fmt.Printf("number of bytes: %7d\n", len(str)) // 20
fmt.Printf("number of characters: %d", len([]rune(str))) // 17
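Converting to a slice of runes allocates a new slice just to count its elements. As an alternative sketch, utf8.RuneCountInString from the standard library counts the runes without that conversion; the counts helper below is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts (hypothetical helper) returns the number of bytes and the number
// of runes (Unicode code points) in s.
func counts(s string) (nbytes, nrunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	nbytes, nrunes := counts("你好, Go!")
	fmt.Println(nbytes, nrunes) // 11 7
}
```

Both approaches count code points. Text that uses combining marks can have more code points than characters as a reader perceives them.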

Strings are comparable; == compares their contents byte by byte:

str1 := "Xin Chào Việt Nam"
str2 := "Xin Chào Việt Nam"
	
fmt.Println(str1 == str2) // true
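Strings can also be ordered with <, <=, > and >=, which compare byte by byte, so uppercase ASCII letters sort before lowercase ones. The sketch below shows this, along with strings.EqualFold for case-insensitive equality; sortedCopy is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortedCopy (hypothetical helper) returns the words in ascending
// byte-wise order without modifying the input slice.
func sortedCopy(words []string) []string {
	out := append([]string(nil), words...)
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println("apple" < "banana") // true
	fmt.Println("Zebra" < "apple")  // true: 'Z' (0x5A) is smaller than 'a' (0x61)

	fmt.Println(sortedCopy([]string{"banana", "Zebra", "apple"})) // [Zebra apple banana]

	// Case-insensitive equality.
	fmt.Println(strings.EqualFold("Go", "GO")) // true
}
```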

The strings, unicode/utf8, and strconv packages provide a lot of useful functionality for working with strings:

var b strings.Builder
for i := 3; i >= 1; i-- {
	fmt.Fprintf(&b, "%d...", i)
}
b.WriteString("ignition")
fmt.Println(b.String()) // 3...2...1...ignition

fmt.Println(strings.Contains("seafood", "foo")) // true
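The snippet above only exercises the strings package. As a complementary sketch, here are a few calls from strconv and unicode/utf8; parsePort is a hypothetical helper written for this example, and its port range check is an assumption, not something from the packages themselves.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf8"
)

// parsePort (hypothetical helper) converts s to an int with strconv.Atoi
// and rejects values outside the range 1..65535.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

func main() {
	port, err := parsePort("8080")
	fmt.Println(port, err) // 8080 <nil>

	_, err = parsePort("80a")
	fmt.Println(err != nil) // true

	fmt.Println(strconv.Itoa(42) + "!")     // 42!
	fmt.Println(strconv.FormatInt(255, 16)) // ff
	fmt.Println(utf8.RuneLen('好'))          // 3
}
```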

Advanced

At runtime, a string is represented by a two-word structure, which the reflect package describes as StringHeader: Data is a pointer to the underlying byte array, and Len is the number of bytes in the string.

type StringHeader struct {
    Data uintptr
    Len  int
}

We can get the StringHeader of a string by using the reflect and unsafe packages:

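s := "Hello, string!"

// Reinterpret the string variable as its header (unsafe; for illustration only).
sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
// View the 14 underlying bytes as an array; they must not be modified.
arr := (*[14]byte)(unsafe.Pointer(sh.Data))

fmt.Printf("Data: %d, Len: %d; %s\n", sh.Data, sh.Len, arr[:]) // e.g. Data: 4812010, Len: 14; Hello, string!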
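Newer Go releases deprecate reflect.StringHeader in favor of unsafe.StringData and unsafe.String, both added in Go 1.20. The sketch below assumes Go 1.20 or later and reads the same pointer-plus-length view through those functions; prefixNoCopy is a hypothetical helper, equivalent to s[:n], written only to show the idea.

```go
package main

import (
	"fmt"
	"unsafe"
)

// prefixNoCopy (hypothetical helper) returns the first n bytes of s as a
// string that shares memory with s. It behaves like s[:n] and exists only
// to illustrate the pointer-plus-length representation.
func prefixNoCopy(s string, n int) string {
	if n <= 0 || n > len(s) {
		return ""
	}
	data := unsafe.StringData(s) // *byte pointing at the first byte of s
	return unsafe.String(data, n)
}

func main() {
	s := "Hello, string!"
	data := unsafe.StringData(s)

	fmt.Printf("%c %d\n", *data, len(s)) // H 14
	fmt.Println(prefixNoCopy(s, 5))      // Hello
}
```

The bytes reached through the pointer must never be modified, since strings are immutable and may be stored in read-only memory.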