Below are some highlights from the "Go at Google: Language Design in the Service of Software Engineering" talk/paper.

The paper is well worth a read. The highlights below capture some of its key points, but the paper itself offers much more context and detail. I highly recommend reading it if any of the quotes below pique your interest.

Highlights

Introduction

  • "Go is a compiled, concurrent, garbage-collected, statically typed language developed at Google. It is an open source project: Google imports the public repository rather than the other way around."

Go at Google

  • "The goals of the Go project were to eliminate the slowness and clumsiness of software development at Google, and thereby to make the process more productive and scalable."
  • "Go is more about software engineering than programming language research."

Pain Points

  • ... the properties Go *does* have address the issues that make large-scale software development difficult[:]

    • "slow builds"
    • "uncontrolled dependencies"
    • "each programmer using a different subset of the language"
    • "poor program understanding (code hard to read, poorly documented, and so on)"
    • "duplication of effort"
    • "cost of updates"
    • "version skew"
    • "difficulty of writing automatic tools"
    • "cross-language builds"
  • "It is better to forgo convenience for safety and dependability, so Go has brace-bounded blocks."

Dependencies in C and C++

Basically, in ANSI C a header file can be included multiple times in a given program without causing an error, as long as it guards its contents with this preprocessor conditional:

#ifndef _SYS_STAT_H_
#define _SYS_STAT_H_
...
#endif

The contents are processed only if _SYS_STAT_H_ is not yet defined, so the second and later inclusions are skipped. This works, but the preprocessor still has to open and scan the file on every #include, which, as the paper points out, bloats the compiler's input and slows compilation.

This problem is exacerbated in C++.

As a result, the compilation of a single Google binary built from about 2000 files, which totaled 4.2 megabytes when concatenated, ended up delivering over 8 gigabytes of input to the compiler.

Plan 9 avoided these issues because "header files were forbidden from containing further #include clauses; all #includes were required to be in the top-level C file."

Dependencies in Go

  • "... after the package clause (the subject of the next section), each source file may have one or more import statements, comprising the import keyword and a string constant identifying the package to be imported into this source file (only):"
import "encoding/json"
  • "The first step to making Go scale, dependency-wise, is that the language defines that unused dependencies are a compile-time error (not a warning, an error)."

  • Consider a Go program with three packages and this dependency graph:

    • package A imports package B;
    • package B imports package C;
    • package A does not import package C

    ... When A is compiled, the compiler reads the object file for B, not its source code. ... In other words, when B is compiled, the generated object file includes type information for all dependencies of B that affect the public interface of B.

    ... This design has the important effect that when the compiler executes an import clause, it opens exactly one file, the object file identified by the string in the import clause. This is, of course, reminiscent of the Plan 9 C (as opposed to ANSI C) approach to dependency management, except that, in effect, the compiler writes the header file when the Go source file is compiled. The process is more automatic and even more efficient than in Plan 9 C, though: the data being read when evaluating the import is just "exported" data, not general program source code.

  • "The language defines that there can be no circular imports in the graph, and the compiler and linker both check that they do not exist."

  • "It can be better to copy a little code than to pull in a big library for one function. (A test in the system build complains if new core dependencies arise.) Dependency hygiene trumps code reuse."

Packages

  • "It's important to recognize that package paths are unique, but there is no such requirement for package names. The path must uniquely identify the package to be imported, while the name is just a convention for how clients of the package can refer to its contents. The package name need not be unique and can be overridden in each importing source file by providing a local identifier in the import clause."
  • "Every company might have its own log package but there is no need to make the package name unique. Quite the opposite: Go style suggests keeping package names short and clear and obvious in preference to worrying about collisions."

Remote Packages

  • "It's worth noting that the go get command downloads dependencies recursively, a property made possible only because the dependencies are explicit. Also, the allocation of the space of import paths is delegated to URLs, which makes the naming of packages decentralized and therefore scalable, in contrast to centralized registries used by other languages."

Syntax

  • "... if the language is hard to parse, automated tools are hard to write. Go was therefore designed with clarity and tooling in mind, and has a clean syntax."
  • "A method is just a function with a special parameter, its receiver, which can be passed to the function using the standard "dot" notation"
  • "One mitigating factor for the lack of default arguments is that Go has easy-to-use, type-safe support for variadic functions."
  • ... for initializing declarations, one can drop the var keyword and just take the type of the variable from that of the expression. These two declarations are equivalent; the second is shorter and idiomatic:
var buf *bytes.Buffer = bytes.NewBuffer(x) // explicit
buf := bytes.NewBuffer(x)                  // derived
  • There is a blog post at [golang.org/s/decl-syntax](http://golang.org/s/decl-syntax) with more detail about the syntax of declarations in Go and why it is so different from C.
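The syntax points above (receivers, variadic functions, and the short := declaration) fit in one small program. This is a sketch of my own; Celsius, Fahrenheit, and join are hypothetical names, not taken from the paper.

```go
package main

import (
	"bytes"
	"fmt"
)

// Celsius is a named type; methods can be declared on types like this one,
// not only on structs.
type Celsius float64

// Fahrenheit is a method: a function whose receiver c is a special parameter.
func (c Celsius) Fahrenheit() float64 {
	return float64(c)*9/5 + 32
}

// join is variadic: callers may pass zero or more strings after sep.
func join(sep string, parts ...string) string {
	buf := bytes.NewBuffer(nil) // short declaration; the type *bytes.Buffer is derived
	for i, p := range parts {
		if i > 0 {
			buf.WriteString(sep)
		}
		buf.WriteString(p)
	}
	return buf.String()
}

func main() {
	c := Celsius(100)
	fmt.Println(c.Fahrenheit())        // dot notation passes c as the receiver
	fmt.Println(Celsius.Fahrenheit(c)) // method expression: receiver as an ordinary argument

	fmt.Println(join(", "))                // no variadic arguments
	fmt.Println(join(", ", "a", "b", "c")) // three variadic arguments
	words := []string{"x", "y"}
	fmt.Println(join("-", words...)) // an existing slice can be expanded
}
```

The two Fahrenheit calls are the same call written two ways, which is one way to see that a method really is "just a function with a special parameter".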

Naming

  • "If the initial character is an upper case letter, the identifier is exported (public); otherwise it is not:"
    • "upper case initial letter: Name is visible to clients of package"
    • "otherwise: name (or _Name) is not visible to clients of package"
  • "... the program source text expresses the programmer's meaning simply."
  • Another simplification is that Go has a very compact scope hierarchy:

    • universe (predeclared identifiers such as int and string)
    • package (all the source files of a package live at the same scope)
    • file (for package import renames only; not very important in practice)
    • function (the usual)
    • block (the usual)

    There is no scope for name space or class or other wrapping construct. Names come from very few places in Go, and all names follow the same scope hierarchy: at any given location in the source, an identifier denotes exactly one language object, independent of how it is used. (The only exception is statement labels, the targets of break statements and the like; they always have function scope.)

  • "... method lookup is always by name only, not by signature (type) of the method. In other words, a single type can never have two methods with the same name. Given a method x.M, there's only ever one M associated with x"
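The export rule is easy to observe from inside a single program, because standard library packages are themselves clients of your package. In this sketch (my own; Account, secret, and encode are hypothetical names), encoding/json can only see the field whose name starts with an upper case letter.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Account has one exported field and one unexported field.
type Account struct {
	Name   string // upper case initial letter: visible to other packages
	secret string // lower case initial letter: visible only inside this package
}

// encode marshals a to JSON. encoding/json is a client of this package,
// so it only sees the exported field.
func encode(a Account) (string, error) {
	b, err := json.Marshal(a)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	a := Account{Name: "gopher", secret: "hunter2"}
	out, err := encode(a)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)      // the unexported field is absent
	fmt.Println(a.secret) // still accessible here, inside the defining package
}
```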

Semantics

  • Go makes many small changes to C semantics, mostly in the service of robustness. These include:

    • there is no pointer arithmetic
    • there are no implicit numeric conversions
    • array bounds are always checked
    • there are no type aliases (after type X int, X and int are distinct types not aliases)
    • ++ and -- are statements not expressions
    • assignment is not an expression
    • it is legal (encouraged even) to take the address of a stack variable
    • and many more

    There are some much bigger changes too, stepping far from the traditional C, C++, and even Java models. These include linguistic support for:

    • concurrency
    • garbage collection
    • interface types
    • reflection
    • type switches
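A few of the semantic changes listed above can be seen side by side in a short program. This is my own sketch, not code from the paper; Meters, newCounter, and describe are hypothetical names, and the lines that would not compile are shown only in comments.

```go
package main

import "fmt"

// Meters is a distinct type from float64, not another name for it.
type Meters float64

// newCounter returns the address of a local variable. This is legal in Go:
// the compiler arranges for the variable to outlive the function call.
func newCounter() *int {
	n := 0
	return &n
}

// describe uses a type switch to branch on the dynamic type of v.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case Meters:
		return fmt.Sprintf("Meters %.1f", float64(x))
	case string:
		return fmt.Sprintf("string %q", x)
	default:
		return "something else"
	}
}

func main() {
	var i int = 3
	var f float64 = float64(i) // conversion must be explicit; "f = i" does not compile
	m := Meters(f)             // likewise between float64 and Meters

	p := newCounter()
	*p++ // ++ is a statement; "j := *p++" does not compile

	fmt.Println(f, m, *p)
	fmt.Println(describe(i), describe(m), describe("go"))
}
```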

Concurrency

This section of the paper refers frequently to CSP, an initialism for Communicating Sequential Processes.

  • "Go embodies a variant of CSP with first-class channels."
  • "... CSP has the property that it is easy to add to a procedural programming model without profound changes to that model."
  • "The approach is thus the composition of independently executing functions of otherwise regular procedural code."
  • "There is one important caveat: Go is not purely memory safe in the presence of concurrency. Sharing is legal and passing a pointer over a channel is idiomatic (and efficient)."
  • "Go enables simple, safe concurrent programming but does not forbid bad programming. We compensate by convention, training programmers to think about message passing as a version of ownership control. The motto is, 'Don't communicate by sharing memory, share memory by communicating.'"
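Here is a minimal sketch of what "share memory by communicating" can look like in practice. It is my own example rather than one from the paper, and the names square and sumSquares are hypothetical. Each value is handed from one goroutine to the next over a channel, so only one goroutine works on it at a time.

```go
package main

import "fmt"

// square reads numbers from in, sends their squares on out, and closes out
// once in is exhausted.
func square(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
	close(out)
}

// sumSquares wires a producer, the square stage, and a consumer together.
func sumSquares(nums []int) int {
	in := make(chan int)
	out := make(chan int)

	go square(in, out) // an ordinary function, executing independently

	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in)
	}()

	total := 0
	for sq := range out {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}))
}
```

Note that square is plain procedural code; the only concurrent parts are the go statements and the channels, which is the "easy to add to a procedural programming model" property quoted above.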

Garbage Collection

  • "In C and C++, too much programming effort is spent on memory allocation and freeing."
  • "... in a concurrent object-oriented language it's almost essential to have automatic memory management because the ownership of a piece of memory can be tricky to manage as it is passed around among concurrent executions. It's important to separate behavior from resource management."
  • "The language is much easier to use because of garbage collection."
  • Of course, garbage collection brings significant costs: general overhead, latency, and complexity of the implementation. Nonetheless, we believe that the benefits, which are mostly felt by the programmer, outweigh the costs, which are largely borne by the language implementer.

    Experience with Java in particular as a server language has made some people nervous about garbage collection in a user-facing system. The overheads are uncontrollable, latencies can be large, and much parameter tuning is required for good performance. Go, however, is different. Properties of the language mitigate some of these concerns. Not all of them of course, but some.

    The key point is that Go gives the programmer tools to limit allocation by controlling the layout of data structures. Consider this simple type definition of a data structure containing a buffer (array) of bytes:

type X struct {
    a, b, c int
    buf [256]byte
}
  • In Java, the buf field would require a second allocation and accesses to it a second level of indirection. In Go, however, the buffer is allocated in a single block of memory along with the containing struct and no indirection is required. For systems programming, this design can have a better performance as well as reducing the number of items known to the collector. At scale it can make a significant difference.
  • "... To give the programmer this flexibility, Go must support what we call interior pointers to objects allocated in the heap. The X.buf field in the example above lives within the struct but it is legal to capture the address of this inner field, ..."

Composition not inheritance

  • Go takes an unusual approach to object-oriented programming, allowing methods on any type, not just classes, but without any form of type-based inheritance like subclassing. This means there is no type hierarchy.

    ... Instead, Go has interfaces

    ... In Go an interface is just a set of methods. For instance, here is the definition of the Hash interface from the standard library.

type Hash interface {
    Write(p []byte) (n int, err error)
    Sum(b []byte) []byte
    Reset()
    Size() int
    BlockSize() int
}
  • All data types that implement these methods satisfy this interface implicitly; there is no implements declaration. That said, interface satisfaction is statically checked at compile time so despite this decoupling interfaces are type-safe.
  • Type hierarchies result in brittle code. The hierarchy must be designed early, often as the first step of designing the program, and early decisions can be difficult to change once the program is written. As a consequence, the model encourages early overdesign as the programmer tries to predict every possible use the software might require, adding layers of type and abstraction just in case. This is upside down. The way pieces of a system interact should adapt as it grows, not be fixed at the dawn of time.

    Go therefore encourages composition over inheritance, using simple, often one-method interfaces to define trivial behaviors that serve as clean, comprehensible boundaries between components.

  • "Go's interfaces have a major effect on program design. One place we see this is in the use of functions that take interface arguments. These are not methods, they are functions."
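The implicit satisfaction and the "functions that take interface arguments" idea can be sketched together using io.Writer, a one-method interface from the standard library. The type byteCounter and the function greet are hypothetical names of mine, not from the paper.

```go
package main

import (
	"fmt"
	"io"
)

// byteCounter counts the bytes written to it. It never mentions io.Writer,
// yet it satisfies that interface because it has the right Write method.
type byteCounter struct {
	n int
}

func (c *byteCounter) Write(p []byte) (int, error) {
	c.n += len(p)
	return len(p), nil
}

// greet is a function, not a method, and accepts any io.Writer.
func greet(w io.Writer, name string) error {
	_, err := fmt.Fprintf(w, "hello, %s\n", name)
	return err
}

func main() {
	var c byteCounter
	if err := greet(&c, "gopher"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.n) // number of bytes greet wrote
}
```

greet would work unchanged with os.Stdout, a bytes.Buffer, or a network connection, and if byteCounter lacked a matching Write method the call greet(&c, ...) would be rejected at compile time.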

Errors

  • "Go does not have an exception facility in the conventional sense, that is, there is no control structure associated with error handling."
  • "Libraries use the error type to return a description of the error. Combined with the ability for functions to return multiple values, it's easy to return the computed result along with an error value, if any. For instance, the equivalent to C's getchar does not return an out-of-band value at EOF, nor does it throw an exception; it just returns an error value alongside the character, with a nil error value signifying success."
  • "... if errors use special control structures, error handling distorts the control flow for a program that handles errors. The Java-like style of try-catch-finally blocks interlaces multiple overlapping flows of control that interact in complex ways. Although in contrast Go makes it more verbose to check errors, the explicit design keeps the flow of control straightforward—literally."

Tools

  • "Tools to manipulate Go programs are so easy to write that many such tools have been created, some with interesting consequences for software engineering."
  • Gofmt is run on all Go programs we write, and most of the open source community uses it too. It is run as a "presubmit" check for the code repositories to make sure that all checked-in Go programs are formatted the same.

    Gofmt is often cited by users as one of Go's best features even though it is not part of the language. The existence and use of gofmt means that from the beginning, the community has always seen Go code as gofmt formats it, so Go programs have a single style that is now familiar to everyone.

  • "Another important tool is gofix, which runs tree-rewriting modules written in Go itself that are therefore capable of more advanced refactorings."

  • Note that these tools allow us to update code even if the old code still works. As a result, Go repositories are easy to keep up to date as libraries evolve. Old APIs can be deprecated quickly and automatically so only one version of the API needs to be maintained. For example, we recently changed Go's protocol buffer implementation to use "getter" functions, which were not in the interface before. We ran gofix on all of Google's Go code to update all programs that use protocol buffers, and now there is only one version of the API in use. Similar sweeping changes to the C++ or Java libraries are almost infeasible at the scale of Google's code base.
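To give a feel for how little code a Go tool needs, here is a sketch of my own that uses the standard library's go/parser and go/ast packages to list the functions declared in a piece of source text. It is nowhere near gofmt or gofix in scope; funcNames and the sample source are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcNames parses Go source text and returns the names of its function
// and method declarations in source order.
func funcNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	const src = `package demo

func Add(a, b int) int { return a + b }

func helper() {}
`
	names, err := funcNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```

The parser and syntax tree types ship with the standard library, so a tool like this needs no external dependencies.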