What is concurrency in Go?

Modern personal computers have multiple cores, and servers on cloud providers such as AWS or GCP can have close to two hundred. 1

Go was designed with concurrency in mind in a way that many older programming languages were not.

Tutorial

Inspiration drawn from this more academic blog post! 2

When we start our app, we spin up a goroutine

The main function, the entrypoint of every Go application, runs in a goroutine of its own: the main routine.

package main

func main() {
  // ...
}
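To see that main already runs in a goroutine, we can ask the runtime how many goroutines exist. The sketch below uses runtime.NumGoroutine from the standard library; the helper extraGoroutines is a made-up name for illustration and is not part of the tutorial's program.

```go
package main

import (
	"fmt"
	"runtime"
)

// extraGoroutines is a hypothetical helper: it starts n goroutines that wait
// on a channel and reports how many additional goroutines the runtime counts
// while they are parked.
func extraGoroutines(n int) int {
	before := runtime.NumGoroutine()
	release := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-release }() // parked until release is closed
	}
	during := runtime.NumGoroutine()
	close(release) // let the parked goroutines finish
	return during - before
}

func main() {
	// The count is never zero here, because main itself runs in a goroutine.
	fmt.Println("goroutines at start:", runtime.NumGoroutine())
	fmt.Println("extra goroutines after spawning 3:", extraGoroutines(3))
}
```

The first number counts the main routine itself; the second shows that every go statement adds one more goroutine to the total.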

A list of items to process

Whether it's a list of movies a user might want to watch, conversations a user might want to continue, or a list of websites a user might want to visit, each application we build will probably have a list of items we'll need to process.

As with all applications, the faster we process items the better.

package main

func main() {
  links := []string {
    "http://www.facebook.com",
    "http://www.amazon.com",
    "http://www.apple.com",
    "http://www.netflix.com",
    "http://www.google.com",
  }
}

We can process the items using a loop

If we run the code now, we'll see that each print is handled synchronously, one after another. Because printing is so fast, each item still appears on the screen almost immediately.

// ...

func main() {
  // ...

  for _, link := range links {
    fmt.Println(link)
  }
}

Blocking Operations

We usually do more than print to the screen. The processing will take some time, in this case due to an HTTP request inside the body of checkLink.

import (
	"fmt"
	"net/http"
)

func main() {
    // ...

    for _, link := range links {
        checkLink(link)
    }
}

func checkLink(l string) {
    resp, err := http.Get(l)

    if err != nil {
        fmt.Println(l, "looks to be down!")
        return
    }
    // We only care that the request succeeded, so release the connection.
    resp.Body.Close()

    fmt.Println(l, "is up!")
}

The processing of each item is handled synchronously, meaning we wait for the completion of one call to checkLink before calling it again with a different URL.

Ideally, we should be able to process all elements at the same time, not needing to wait for one element to finish processing before starting the next.

In other words, we should be able to begin processing each item in our list regardless of whether the previous item is done, that is, whether the previous HTTP request has returned.

We should be able to start another request before we get back the response from the previous checkLink call.

This is the concurrency we want.
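To make the cost of waiting concrete, here is a minimal sketch that times a sequential loop. It assumes a stand-in function, simulateCheck, that sleeps instead of making an HTTP request, so it runs without network access; both helper names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// simulateCheck stands in for checkLink: it only waits for delay, the way an
// HTTP request waits for a response. It is a hypothetical helper.
func simulateCheck(link string, delay time.Duration) {
	time.Sleep(delay)
}

// checkSequentially processes the links one after another and returns how
// long the whole loop took.
func checkSequentially(links []string, delay time.Duration) time.Duration {
	start := time.Now()
	for _, link := range links {
		simulateCheck(link, delay) // the next call cannot start until this returns
	}
	return time.Since(start)
}

func main() {
	links := []string{"a", "b", "c", "d", "e"}
	elapsed := checkSequentially(links, 100*time.Millisecond)
	fmt.Println("sequential run took", elapsed)
}
```

With five links and a 100 ms delay each, the total is at least half a second, because the waits add up.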

Processing concurrently

We can process the items concurrently by adding the go keyword before each invocation of checkLink, telling Go to run that function call in a new goroutine (a subroutine of our main routine).

This is us spinning up one routine for each of the 5 elements.

func main() {
    // ...

    for _, link := range links {
        go checkLink(link)
    }
}

func checkLink(l string) {
    // ...
}

We now see that the program exits without anything printing to the screen.

This is because our main routine started, spun up 5 subroutines, saw nothing else left to do inside the body of the main function, and immediately exited. When the main routine exits, the program ends, whether or not the subroutines have finished.

This happened within fractions of a second.

func main() {
    for _, link := range links {
        go checkLink(link)
    }

    fmt.Println("Main routine done!")
}

func checkLink(l string) {
    // ...
}

The fmt.Println after the loop in the code above proves this: "Main routine done!" is printed, and nothing from checkLink appears.
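The same effect can be measured rather than observed. In this sketch, each goroutine sleeps (a stand-in for the HTTP request) and then bumps a counter; reading the counter right after the loop shows that none of them has finished yet. The helper name finishedRightAfterLaunch is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// finishedRightAfterLaunch starts one goroutine per link, each of which
// pretends to work for delay, and immediately reports how many have finished.
func finishedRightAfterLaunch(links []string, delay time.Duration) int64 {
	var finished int64
	for range links {
		go func() {
			time.Sleep(delay) // stand-in for the HTTP request
			atomic.AddInt64(&finished, 1)
		}()
	}
	// No waiting here, just like the main function above.
	return atomic.LoadInt64(&finished)
}

func main() {
	links := []string{"a", "b", "c", "d", "e"}
	done := finishedRightAfterLaunch(links, 200*time.Millisecond)
	fmt.Println("finished when main reaches the end:", done)
}
```

Launching a goroutine returns immediately, so the loop is over long before any of the simulated requests completes.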

Speed bump main routine

We want our subroutines to complete their processing before we finish our program.

If we add a call to fmt.Scanln(&input), which waits for user input, we accomplish this.

The main function only exits after the user has interacted with the program.

func main() {
    for _, link := range links {
        go checkLink(link)
    }

    var input string
    fmt.Scanln(&input)

    fmt.Println("Main routine done!")
}

func checkLink(l string) {
    // ...
}

This approach will not work in production environments because we cannot depend on a user or admin being available to input to the program in order to continue processing.
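One standard-library way to wait without user input is sync.WaitGroup. This tutorial goes on to use channels instead, so treat the following only as a sketch of an alternative; the helper waitForAll and the simulated delay are assumptions, not part of the original program.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitForAll launches one goroutine per link and blocks until all of them
// have finished. Each goroutine writes only to its own slot in results.
func waitForAll(links []string, delay time.Duration) []string {
	results := make([]string, len(links))
	var wg sync.WaitGroup
	for i, link := range links {
		wg.Add(1)
		go func(i int, l string) {
			defer wg.Done()
			time.Sleep(delay) // stand-in for the HTTP request
			results[i] = l + " checked"
		}(i, link)
	}
	wg.Wait() // blocks until every goroutine has called Done
	return results
}

func main() {
	for _, r := range waitForAll([]string{"a", "b", "c"}, 50*time.Millisecond) {
		fmt.Println(r)
	}
	fmt.Println("Main routine done!")
}
```

A WaitGroup only tells the main routine that the work is finished. Channels, covered next, also let the subroutines hand data back.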

Channels

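By using a channel we block the app from exiting: the receive in fmt.Println(<-c) makes the main routine wait until a value arrives.

The two c <- l sends in checkLink return data from our routine to the main routine.

func main() {
    c := make(chan string)

    for _, link := range links {
        go checkLink(link, c)
    }

    // Receiving blocks until one subroutine sends a value.
    fmt.Println(<-c)
    fmt.Println("Main routine done!")
}

func checkLink(l string, c chan string) {
    resp, err := http.Get(l)

    if err != nil {
        fmt.Println(l, "looks to be down!")
        c <- l
        return // without this we would fall through and send a second time
    }
    resp.Body.Close()

    fmt.Println(l, "is up!")
    c <- l
}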

If we run the program now, we'll see that the app exits after hearing back from just one of the subroutines.

We need a way to make sure that every subroutine sends its message to the channel before the main routine exits.
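The behaviour of a single receive can be reproduced without the network. In this sketch, each goroutine sleeps for a different amount of time before sending, and the function returns as soon as the first message arrives. The helper firstToReport is hypothetical and assumes a non-empty list. The channel is buffered here only so the slower goroutines can still send and exit after we stop listening; the tutorial's channel is unbuffered.

```go
package main

import (
	"fmt"
	"time"
)

// firstToReport starts one goroutine per link. Link i waits (i+1)*step
// before sending, so the first link in the slice always reports first.
// It assumes links is not empty.
func firstToReport(links []string, step time.Duration) string {
	c := make(chan string, len(links))
	for i, link := range links {
		go func(l string, wait time.Duration) {
			time.Sleep(wait) // stand-in for the HTTP request
			c <- l
		}(link, time.Duration(i+1)*step)
	}
	return <-c // blocks until the first goroutine sends, then returns
}

func main() {
	links := []string{"fast", "medium", "slow"}
	fmt.Println("first to report:", firstToReport(links, 50*time.Millisecond))
}
```

One receive consumes exactly one message, which is why the program above exits after a single link has been reported.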

Processing each subroutine

If we receive from the channel once per item, with one fmt.Println(<- c) for each link, the main routine waits until every subroutine has reported.

func main() {
    c := make(chan string)

    for _, link := range links {
        go checkLink(link, c)
    }

    for i := 0; i < len(links); i++ {
        fmt.Println(<- c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
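Here is a self-contained sketch of the counted receive, again with a sleep in place of the HTTP request. The helper collectAll is a hypothetical name. It also times the run to show that the waits overlap instead of adding up.

```go
package main

import (
	"fmt"
	"time"
)

// collectAll starts one goroutine per link and receives exactly len(links)
// messages, so it returns only after every goroutine has reported.
func collectAll(links []string, delay time.Duration) []string {
	c := make(chan string)
	for _, link := range links {
		go func(l string) {
			time.Sleep(delay) // stand-in for the HTTP request
			c <- l
		}(link)
	}
	results := make([]string, 0, len(links))
	for i := 0; i < len(links); i++ {
		results = append(results, <-c) // one receive per goroutine
	}
	return results
}

func main() {
	start := time.Now()
	got := collectAll([]string{"a", "b", "c", "d", "e"}, 100*time.Millisecond)
	fmt.Println("received", len(got), "results in", time.Since(start))
}
```

The results arrive in whatever order the goroutines finish, so the slice is not guaranteed to match the order of the input.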

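Receive in a loop

Although that works, we can avoid counting the items by receiving in an infinite loop. Be aware that this loop never ends on its own: once all five subroutines have reported, nothing is left to send on c, and the main routine waits forever for a message that never arrives. The loop becomes useful in the next step, where every link that reports back is checked again.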

func main() {
    c := make(chan string)

    for _, link := range links {
        go checkLink(link, c)
    }

    for {
        fmt.Println(<- c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}

Refactor

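Clean up the <- syntax by using the range keyword. Ranging over a channel receives values one at a time until the channel is closed, much like iterating over a generator in Python.

Each channel message becomes the loop variable for one iteration. Note that the loop body now launches checkLink again for every link that reports back, so the program keeps re-checking the sites until we stop it.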

func main() {
    c := make(chan string)

    for _, link := range links {
        go checkLink(link, c)
    }

    for l := range c {
        go checkLink(l, c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
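Because c is never closed in the program above, its range loop runs forever, which is what a status checker wants. When the work is finite, the loop can be made to end by closing the channel once every sender is done. The sketch below shows that pattern; the helper checkOnce and the use of sync.WaitGroup are assumptions added for illustration, not part of the tutorial's program.

```go
package main

import (
	"fmt"
	"sync"
)

// checkOnce sends each link through the channel exactly once and returns
// the links in the order they were received.
func checkOnce(links []string) []string {
	c := make(chan string)
	var wg sync.WaitGroup
	for _, link := range links {
		wg.Add(1)
		go func(l string) {
			defer wg.Done()
			c <- l // stand-in for checking the link and reporting back
		}(link)
	}
	// Close the channel once every sender has finished so range can stop.
	go func() {
		wg.Wait()
		close(c)
	}()
	var seen []string
	for l := range c { // ends when c is closed and drained
		seen = append(seen, l)
	}
	return seen
}

func main() {
	fmt.Println("received:", len(checkOnce([]string{"a", "b", "c"})), "links")
}
```

Closing is done by a separate goroutine because the main routine is busy receiving; only the sending side should ever close a channel.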

Slow down execution by using a sleep method

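If we add a sleep to the loop in our main function, we'll see that our goroutines become blocked again.

This is because the sleep runs in the main routine. While it sleeps it cannot receive from c, so every subroutine waiting to send is stuck, and only one link is re-checked every five seconds.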

import (
    "fmt"
    "net/http"
    "time"
)

func main() {
    // ...

    for l := range c {
        time.Sleep(time.Second * 5)
        go checkLink(l, c)
    }
}

func checkLink(l string, c chan string) {
    // ...
}
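The slowdown is easy to measure. In this sketch, every sender is ready immediately, but the receiving loop sleeps before each receive, so the total time is at least the pause multiplied by the number of messages. The helper drainSlowly is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// drainSlowly starts n goroutines that are ready to send right away, then
// receives from them in a loop that sleeps before every receive. It returns
// how long it took to receive all n messages.
func drainSlowly(n int, pause time.Duration) time.Duration {
	c := make(chan int)
	for i := 0; i < n; i++ {
		go func(id int) { c <- id }(i) // blocks until the receiver is ready
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		time.Sleep(pause) // the receiving routine is asleep, senders wait
		<-c
	}
	return time.Since(start)
}

func main() {
	fmt.Println("draining 5 messages took", drainSlowly(5, 100*time.Millisecond))
}
```

The senders did nothing wrong; they simply cannot hand over their value while the only receiver is asleep.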

Using a function literal

We can use an immediately invoked anonymous function to unblock our main function.

If we wrap Sleep and checkLink in the immediately invoked anonymous function and launch it with go, the sleep happens inside the new routine instead of the main routine.

When we do this, we see that our routines execute concurrently. Our individual routines are no longer blocked.

func main() {
    // ...

    for l := range c {
        go func() {
          time.Sleep(time.Second * 5)
          checkLink(l, c)
        }()
    }
}

func checkLink(l string, c chan string) {
    // ...
}

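Unfortunately, on Go versions before 1.22, when we look at our output we may see that the subroutines are processing the same item, the same URL, the same l.

This occurs because the anonymous function captures the variable l itself rather than a copy of its value, and before Go 1.22 a single l was shared by every iteration of the loop.

In other words, our subroutine is looking at the latest value of l instead of the value of l when it was invoked. Since Go 1.22, for modules that declare go 1.22 or later, each iteration gets its own l, so this particular problem no longer appears. Passing the value explicitly works on every version.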

Pass by value to anon function

In order to store the value of l at the time we invoked the routine, we need to refactor to pass l to our immediately invoked anonymous function.

// ...

func main() {
    // ...

    for l := range c {
        go func(link string) {
          time.Sleep(time.Second * 5)
          checkLink(link, c)
        }(l)
    }
}

func checkLink(l string, c chan string) {
    // ...
}

We update the function to receive a link and the call to checkLink to consume it.
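The difference between capturing a variable and receiving a copy can be shown without goroutines, which keeps the result deterministic on any Go version. In this sketch, sharedVariable deliberately declares one variable outside the loop so every closure reads the same one, while ownCopy passes the value as an argument. All three function names are made up for this example.

```go
package main

import "fmt"

// sharedVariable builds one closure per item, all reading the same variable.
func sharedVariable(items []string) []string {
	var current string // one variable shared by every closure
	var readers []func() string
	for _, item := range items {
		current = item
		readers = append(readers, func() string { return current })
	}
	return callAll(readers)
}

// ownCopy passes each item as an argument, so every closure keeps its own copy.
func ownCopy(items []string) []string {
	var readers []func() string
	for _, item := range items {
		readers = append(readers, func(v string) func() string {
			return func() string { return v }
		}(item))
	}
	return callAll(readers)
}

// callAll runs every closure after the loops above have finished.
func callAll(readers []func() string) []string {
	out := make([]string, 0, len(readers))
	for _, r := range readers {
		out = append(out, r())
	}
	return out
}

func main() {
	items := []string{"a", "b", "c"}
	fmt.Println(sharedVariable(items)) // [c c c]
	fmt.Println(ownCopy(items))        // [a b c]
}
```

Function arguments are copied at the moment of the call, which is why passing l to the anonymous function pins the value each routine works with.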

Conclusion

We've seen how to achieve multitasking in Go. By using subroutines (goroutines) we can:

  • Concurrently process items in our application by using multiple goroutines, which the Go runtime schedules onto operating system threads.
  • Leverage the multi-core processors of modern computers.
  • Speed up the overall run time of our application.

The complete program:

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	links := []string{
		"http://www.facebook.com",
		"http://www.amazon.com",
		"http://www.apple.com",
		"http://www.netflix.com",
		"http://www.google.com",
	}

	c := make(chan string)

	for _, link := range links {
		go checkLink(link, c)
	}

	// Re-check every link five seconds after it reports back.
	for l := range c {
		go func(link string) {
			time.Sleep(time.Second * 5)
			checkLink(link, c)
		}(l)
	}
}

func checkLink(l string, c chan string) {
	resp, err := http.Get(l)

	if err != nil {
		fmt.Println(l, "looks to be down!")
		c <- l
		return // avoid falling through and reporting the link twice
	}
	resp.Body.Close()

	fmt.Println(l, "is up!")
	c <- l
}

Footnotes

  1. Nearly 200 cores on AWS instances https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-options-supported-instances-values.html

  2. Concurrency in Go https://www.golang-book.com/books/intro/10