An example of "a little copying is better than a little dependency"
The gist of the Go proverb “a little copying is better than a little dependency” is to be careful when bringing new dependencies into our programs. Of course, taking either DRY or this proverb to the extreme will lead to poor outcomes. Still, I think being vigilant about dependencies and going through a checklist before adding one is good advice.
In this post, we’re going to list why we removed an existing dependency (moby/buildkit) from aws/copilot-cli, and explore the mechanics of the process.
Rationale
First of all, buildkit is a perfectly fine module to depend on! But for us, the library only provided nice-to-have functionality that was low effort to rewrite, and removing it reduced our binary size.
optional We use buildkit to parse a Dockerfile, and extract certain instructions like the ports exposed or container health check settings. If the parsing fails, it’s no big deal. We print a warning and the user can re-enter that information later.
size Today, the darwin/amd64 executable is roughly 45MB.
$ ls -l bin/local/copilot-darwin-amd64
... 47662160 Sep 17 10:10 bin/local/copilot-darwin-amd64
Doing a quick prototype by commenting out the use of the library shows that we can save ~3MB of space.
$ ls -l bin/local/copilot-darwin-amd64
... 44355104 Sep 17 10:13 bin/local/copilot-darwin-amd64
Not as big of a saving as I had hoped for but not bad either.
cheap Since we use only a small surface area of the library, it’s also relatively easy to replace the functionality. And because buildkit is open source, we can dive into the codebase and get a sense of how Dockerfiles are parsed.
Mechanics
Ensure compatibility in unit tests
Just like Rob Pike’s strconv.IsPrint test example in the Go proverbs talk, we don’t depend on buildkit in the source files. Instead, we only import the library in our unit tests and ensure that our parsed outputs match buildkit’s!
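The pattern can be sketched with the standard library alone. In this minimal example, isPrintASCII is a hypothetical hand-written helper (it is not code from Copilot or from the Go source), and unicode.IsPrint plays the role that buildkit plays in our tests: a trusted reference that only the check consults.

```go
package main

import (
	"fmt"
	"unicode"
)

// isPrintASCII is a hypothetical hand-written replacement: it reports
// whether r is a printable ASCII character without calling package unicode.
func isPrintASCII(r rune) bool {
	return r >= 0x20 && r <= 0x7e
}

// mismatches plays the role of the unit test: it compares the copied logic
// against the reference implementation for every ASCII code point.
// In a real project this would live in a _test.go file, so that only the
// tests import the reference package.
func mismatches() []rune {
	var bad []rune
	for r := rune(0); r < 0x80; r++ {
		if isPrintASCII(r) != unicode.IsPrint(r) {
			bad = append(bad, r)
		}
	}
	return bad
}

func main() {
	fmt.Println("mismatches:", len(mismatches()))
}
```

The production code stays free of the dependency, while the test fails as soon as the copy drifts from the reference.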
For example, in the tests for parsing HEALTHCHECK [OPTIONS] CMD command instructions, we ensure that the command values match exactly. However, we don’t compare the [OPTIONS] because AWS Copilot uses different defaults than Docker.
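To compare the command while leaving the options aside, the two parts first have to be separated. Here is a rough sketch of that split; splitHealthCheck is a hypothetical helper written for this post, not Copilot’s actual parser, and it assumes the shell form of CMD on a single line.

```go
package main

import (
	"fmt"
	"strings"
)

// splitHealthCheck splits a HEALTHCHECK instruction into its raw option
// flags and the command that follows the CMD keyword.
// Simplifications (assumptions of this sketch): it ignores the JSON-array
// form of CMD and "HEALTHCHECK NONE", and it collapses runs of whitespace
// inside the command.
func splitHealthCheck(line string) (opts []string, cmd string, ok bool) {
	fields := strings.Fields(line)
	if len(fields) == 0 || !strings.EqualFold(fields[0], "HEALTHCHECK") {
		return nil, "", false
	}
	for i, f := range fields[1:] {
		if strings.EqualFold(f, "CMD") {
			// fields[1:i+1] are the options, fields[i+2:] is the command.
			return fields[1 : i+1], strings.Join(fields[i+2:], " "), true
		}
	}
	return nil, "", false
}

func main() {
	opts, cmd, ok := splitHealthCheck(
		"HEALTHCHECK --interval=5s --timeout=3s CMD curl -f http://localhost/")
	fmt.Println(opts, cmd, ok)
}
```

A test can then assert on cmd alone and apply its own defaults to whatever appears in opts.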
Similarly, for the exposed ports we ensure that Copilot detects the same ports in a Dockerfile as buildkit.
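As an illustration of how small this surface area is, here is a minimal sketch of port detection. The exposedPorts function is hypothetical and much simpler than a real parser: it assumes one instruction per line and skips anything that is not a plain number, such as $PORT or a port range.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// exposedPorts returns the numeric ports named by EXPOSE instructions.
// Assumptions of this sketch: no line continuations, and entries that are
// not plain numbers (variables, ranges) are skipped rather than reported.
func exposedPorts(dockerfile string) []int {
	var ports []int
	sc := bufio.NewScanner(strings.NewReader(dockerfile))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || !strings.EqualFold(fields[0], "EXPOSE") {
			continue
		}
		for _, f := range fields[1:] {
			// Drop an optional protocol suffix such as "/tcp" or "/udp".
			num := strings.SplitN(f, "/", 2)[0]
			p, err := strconv.Atoi(num)
			if err != nil {
				continue
			}
			ports = append(ports, p)
		}
	}
	return ports
}

func main() {
	df := "FROM nginx\nEXPOSE 80 443/tcp\nEXPOSE $PORT\n"
	fmt.Println(exposedPorts(df))
}
```

In the compatibility test, the slice returned by a function like this is what gets compared against the ports reported by buildkit for the same Dockerfile.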
Swap the logic
Finally, we have the fun part of replacing the parser. For Copilot, all we had to do was replace the Dockerfile scanner with a custom version. Luckily, there is another great Go talk from Rob Pike on the topic, Lexical Scanning in Go, that we can pair with buildkit’s implementation. Replacing the library added roughly 240 lines of code.
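To give a flavor of the state-function idea from that talk, here is a heavily simplified sketch. It is not the scanner we shipped: every name in it is hypothetical, it works on whole lines rather than individual runes, and it does not handle comments inside a continued instruction.

```go
package main

import (
	"fmt"
	"strings"
)

type itemKind int

const (
	itemInstruction itemKind = iota // e.g. EXPOSE, HEALTHCHECK
	itemArgs                        // the rest of the logical line
)

type item struct {
	kind itemKind
	val  string
}

// stateFn is a lexer state that returns the next state, as in the talk.
type stateFn func(*lexer) stateFn

type lexer struct {
	lines []string // physical lines not yet consumed
	items []item   // tokens emitted so far
}

// lex runs the state machine until a state returns nil.
func lex(input string) []item {
	l := &lexer{lines: strings.Split(input, "\n")}
	for state := stateFn(lexLineStart); state != nil; {
		state = state(l)
	}
	return l.items
}

// lexLineStart skips blank lines and comments.
func lexLineStart(l *lexer) stateFn {
	for len(l.lines) > 0 {
		line := strings.TrimSpace(l.lines[0])
		if line != "" && !strings.HasPrefix(line, "#") {
			return lexInstruction
		}
		l.lines = l.lines[1:]
	}
	return nil
}

// lexInstruction emits the keyword and leaves the remainder for lexArgs.
func lexInstruction(l *lexer) stateFn {
	line := strings.TrimSpace(l.lines[0])
	name, rest := line, ""
	if i := strings.IndexAny(line, " \t"); i >= 0 {
		name, rest = line[:i], line[i+1:]
	}
	l.items = append(l.items, item{itemInstruction, strings.ToUpper(name)})
	l.lines[0] = rest
	return lexArgs
}

// lexArgs joins lines ending in a backslash into one argument string.
func lexArgs(l *lexer) stateFn {
	var parts []string
	for len(l.lines) > 0 {
		line := strings.TrimSpace(l.lines[0])
		l.lines = l.lines[1:]
		if !strings.HasSuffix(line, "\\") {
			parts = append(parts, line)
			break
		}
		parts = append(parts, strings.TrimSpace(strings.TrimSuffix(line, "\\")))
	}
	l.items = append(l.items, item{itemArgs, strings.Join(parts, " ")})
	return lexLineStart
}

func main() {
	for _, it := range lex("# web server\nFROM nginx\nEXPOSE 80 \\\n  443\n") {
		fmt.Printf("%d %q\n", it.kind, it.val)
	}
}
```

Each state does one small job and hands control to the next, which keeps the scanner easy to extend when a new instruction needs special treatment.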
Takeaways
There can be some quick wins out there by duplicating a little bit of code. Some further reading material:
- Russ Cox, “Our Software Dependency Problem” January 2019.
- Niklaus Wirth, “A plea for lean software.” Computer 28.2 (1995): 64-68.