What is concurrency in Go?
- Authors
  - Loi Tran (@PrimeTimeTrann)
Modern personal computers have multiple cores, and servers on cloud providers such as AWS or GCP can have hundreds. 1
Go was designed with concurrency in mind, in a way that many older programming languages were not.
Tutorial
Inspiration drawn from this more academic blog post! 2
When we start our app, we spin up a goroutine.
The main function/entrypoint of every Go application runs as a goroutine.
package main

func main() {
    // ...
}
Processing items
Whether it's a list of movies a user might want to watch, conversations a user might want to continue, or websites a user might want to visit, each application we build will probably have a list of items we'll need to process.
As with all applications, the faster we process these items, the better.
package main

func main() {
    links := []string{
        "http://www.facebook.com",
        "http://www.amazon.com",
        "http://www.apple.com",
        "http://www.netflix.com",
        "http://www.google.com",
    }
}
We can process the items using a loop.
If we run the code now, we'll see that each item is printed to the screen almost immediately, one after another.
// ...

func main() {
    // ...
    for _, link := range links {
        fmt.Println(link)
    }
}
Blocking Operations
We usually do more than print to the screen. The processing will take some time, in this case due to an HTTP request inside the body of checkLink.
import (
    "fmt"
    "net/http"
)

func main() {
    // ...
    for _, link := range links {
        checkLink(link)
    }
}

func checkLink(l string) {
    _, err := http.Get(l)
    if err != nil {
        fmt.Println(l, "looks to be down!")
        return
    }
    fmt.Println(l, "is up!")
}
The processing of each item is handled synchronously, meaning we wait for the completion of one call to checkLink before calling it again with a different URL.
Ideally, we should be able to process all elements at the same time, not needing to wait for one element to finish processing before starting the next.
In other words, we should be able to begin processing each item in our list regardless of whether the previous item is done, i.e. whether the previous API call has returned.
We should be able to start another request before we get back the response from the previous checkLink call.
This is the concurrency we want.
Processing concurrently
We can process the items concurrently by adding the go keyword before each invocation of checkLink, telling Go to run that function call in a new goroutine.
This spins up 5 goroutines, one for each element.
func main() {
    // ...
    for _, link := range links {
        go checkLink(link)
    }
}

func checkLink(l string) {
    // ...
}
We now see that the program exits without anything printing to the screen.
This is because our main routine started, spun up 5 goroutines, saw nothing else left to do inside the body of the main function, and immediately exited.
This all happened within fractions of a second.
func main() {
    for _, link := range links {
        go checkLink(link)
    }
    fmt.Println("Main routine done!")
}

func checkLink(l string) {
    // ...
}
We can prove this by adding a fmt.Println after the loop.
Speed bump the main routine
We want our goroutines to complete their processing before the program finishes.
If we add a fmt.Scanln(&input), which waits for user input, we accomplish this.
The main function only exits after the user has interacted with the program.
func main() {
    for _, link := range links {
        go checkLink(link)
    }
    var input string
    fmt.Scanln(&input)
    fmt.Println("Main routine done!")
}

func checkLink(l string) {
    // ...
}
This approach will not work in production environments, because we cannot depend on a user or admin being available to provide input before the program can continue processing.
Channels
By using channels, we block the main routine from exiting: the receive <-c waits until a message arrives.
Inside checkLink, sending on the channel (c <- l) returns data from the goroutine to the main routine.
func main() {
    c := make(chan string)
    for _, link := range links {
        go checkLink(link, c)
    }
    fmt.Println(<-c)
    fmt.Println("Main routine done!")
}

func checkLink(l string, c chan string) {
    _, err := http.Get(l)
    if err != nil {
        fmt.Println(l, "looks to be down!")
        c <- l
        return
    }
    fmt.Println(l, "is up!")
    c <- l
}
If we run the program now, we'll see that the app exits after receiving a message from just one of the goroutines.
We need a way to make sure that every goroutine broadcasts its message to the channel before the main routine exits.
Processing each subroutine
If we use one fmt.Println(<-c) for each item, we'll process each item.
func main() {
    c := make(chan string)
    for _, link := range links {
        go checkLink(link, c)
    }
    for i := 0; i < len(links); i++ {
        fmt.Println(<-c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
Receiving in a loop
Although that works, we can also keep receiving messages with an infinite loop. Note that once all 5 messages have been received, the next <-c blocks forever with no senders left, and the Go runtime will detect the deadlock and exit:
func main() {
    c := make(chan string)
    for _, link := range links {
        go checkLink(link, c)
    }
    for {
        fmt.Println(<-c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
Refactor
Clean up the <- syntax by using the range keyword: ranging over a channel receives each message as it arrives, and each received value becomes the loop variable.
Inside the loop we spin up another checkLink for each link we receive, so every site is checked again and again, forever.
func main() {
    c := make(chan string)
    for _, link := range links {
        go checkLink(link, c)
    }
    for l := range c {
        go checkLink(l, c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
Slow down execution with a sleep
If we add a sleep inside the loop in our main function, the whole pipeline slows down: we only receive, and re-spawn, one link every 5 seconds.
This is because we're blocking inside our main routine.
import (
    "fmt"
    "net/http"
    "time"
)

func main() {
    // ...
    for l := range c {
        time.Sleep(time.Second * 5)
        go checkLink(l, c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
Using a function literal
We can use an immediately invoked anonymous function (a function literal) to unblock our main function.
If we wrap Sleep in the immediately invoked function, the sleeping happens inside each new goroutine instead of in main.
When we do this, we see that our routines execute concurrently.
Thus, our main routine is no longer blocked.
func main() {
    // ...
    for l := range c {
        go func() {
            time.Sleep(time.Second * 5)
            checkLink(l, c)
        }()
    }
}

func checkLink(l string, c chan string) {
    // ...
}
Unfortunately, however, when we look at our output, we'll see that every goroutine is now processing the same item, the same URL, the same l.
This occurs because the closure captures the variable l itself, not its value at the moment the goroutine was created.
In other words, our goroutine reads the latest value of l, which the loop may have already overwritten, instead of the value l had when the goroutine was spawned.
(As an aside, Go 1.22 changed for-loop semantics so that each iteration gets a fresh loop variable, which avoids this particular bug; the fix below is still the idiomatic pattern.)
Pass by value to the anonymous function
In order to capture the value of l at the time we spawn the goroutine, we refactor to pass l as an argument to our immediately invoked anonymous function.
// ...

func main() {
    // ...
    for l := range c {
        go func(link string) {
            time.Sleep(time.Second * 5)
            checkLink(link, c)
        }(l)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
We update the function to receive a link and the call to checkLink to consume it.
Conclusion
We've seen how to achieve multitasking in Go. By using goroutines we can:
- Concurrently process items in our application using lightweight goroutines.
- Leverage the multi-core processors of modern computers.
- Speed up the overall run time of our application.
package main

import (
    "fmt"
    "net/http"
    "time"
)

func main() {
    links := []string{
        "http://www.facebook.com",
        "http://www.amazon.com",
        "http://www.apple.com",
        "http://www.netflix.com",
        "http://www.google.com",
    }
    c := make(chan string)
    for _, link := range links {
        go checkLink(link, c)
    }
    for l := range c {
        go func(link string) {
            time.Sleep(time.Second * 5)
            checkLink(link, c)
        }(l)
    }
}

func checkLink(l string, c chan string) {
    _, err := http.Get(l)
    if err != nil {
        fmt.Println(l, "looks to be down!")
        c <- l
        return
    }
    fmt.Println(l, "is up!")
    c <- l
}
Footnotes
Nearly 200 cores on AWS instances https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-options-supported-instances-values.html ↩
Concurrency in Go https://www.golang-book.com/books/intro/10 ↩