Zig

Efficient and portable as C, without the "footguns"

Jun 12, 2023

Welcome to new subs + thanks a million for the Godot love! This thing has grown a whole lot faster than my inner-pessimist could have imagined. So, grazie!

If you enjoy these primers, I’d appreciate it if you favourite this post’s tweet to help spread the good (?) word of Why Now!

As always, I’m at alex@tapestry.vc should you ever wish to say hello. If you are building with Zig, I most definitely want to chat.

Not every technology is capable of direct value capture. This can be perilous as an infra investor, as it’s easy to become enamoured by something that’s innovative, yet devoid of a business model or meaningful market. Guilty.

Programming languages are a good example here. How Rust handles concurrency is a thing of beauty, Roc’s VM-less memory allocation is pretty cool.. and, well, there’s a lot to like about Zig. However, these languages aren’t licensed out, and their value capture certainly isn’t tied to usage/utility.

What helps me sleep at night however is that these technologies (think languages, runtimes, protocols, etc) can serve as proxies for sophisticated & opinionated technologists. Dissenters capable of being wrong.

Now this is something that history would tell me capital is worth putting towards.

Take Zig for example. Bun is outpacing Node.js and its anagrammatic counterpart, Deno. How so? Well, Bun does a lot of things well, but it does tout Zig’s low-level control of memory and lack of hidden control flow (so good!) as a key unlock.

Continuing down this proxy, we find Stephen Gutekanst / Mach Engine working 24/7 on a game engine to help “upend the gaming industry”. Building a game engine ~from scratch is no small feat.

++ finally, we have TigerBeetle. Again, one doesn’t simply build a financial database from the ground up over a leisurely weekend. Zig has served as a beacon, attracting those who’re thinking orthogonally. I’m fond of folk like this.

As you may have guessed from, I don’t know, the title (!) we’re going to dig into Zig.

Selfishly, I want to truly get my head around why companies are opting to build with this nascent language as well as what we should expect next. I figured I’d bring you all along for the ride.

If you are building with Zig, I’d love to say hello! alex@tapestry.vc

Finally, I’d ask that, if possible, you consider supporting the Zig Software Foundation.

Zig is many things, but at its core it’s:

A general-purpose programming language (think Python or JavaScript).
A “toolchain”.

The programming language part here is familiar, and hence, easy to grok. However, it’s worth lingering on. My favourite quality of Zig’s is its simplicity. What does this mean?

Well, firstly, the language is tiny.

It’s specified with a 500-line PEG grammar file. For context, a PEG (parsing expression grammar) file is typically used to define the structure and syntax of a programming language.

The tl;dr on this benefit is that a “small” language ultimately means that you have fewer language-specific keywords, etc., to remember. Hence, the language is “simple”.

** As someone that somehow still struggles to recall git flags, I’m --happy about this! **

Fewer keywords also means that there’s ideally “only one obvious way to do things.” Why is this good? Well, it becomes much easier to read your/someone else’s code when you know that a specific keyword is typically used for a handful of (vs. infinite) things.

Fun Fact: Zig's developers took this mandate so much to heart that for a time, Zig had no for loop, which was deemed an unnecessary syntactic elaboration upon the already adequate while loop. “What you do is who you are”.

This goal stands in stark contrast with decisions made within languages like Python, where I’ve truly lost track of the number of keywords that essentially say: “if this happens then do that”. Sorry, I’m ranting.

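Back to Zig. The language doesn’t stop here though, oh no. Zig also has “no hidden control flow”. This essentially means that your code does what it appears to do on the page: if a line doesn’t look like it calls a function, it doesn’t, and nothing (no exceptions, no operator overloading, no destructors) reroutes execution behind your back. Each line runs in the order you wrote it, as you would expect.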
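To make “no hidden control flow” a little more tangible, here’s a minimal sketch. Note that `parseAge` is a hypothetical helper I’ve made up for illustration; the standard library call inside it (`std.fmt.parseInt`) is real.

```zig
const std = @import("std");

// Hypothetical helper for illustration: parsing can fail, and the
// return type (`!u8`) admits as much up front.
fn parseAge(text: []const u8) !u8 {
    return std.fmt.parseInt(u8, text, 10);
}

pub fn main() void {
    // `catch` is the only place this call can divert on failure, and it
    // is written right here on the line.
    const age = parseAge("28") catch 0;

    // `+` on integers is always plain addition. Zig has no operator
    // overloading, so this line cannot secretly call a function.
    const next_age = age + 1;

    std.debug.print("next year: {d}\n", .{next_age});
}
```

Every point where execution could jump elsewhere (the call, the `catch`) is visible in the source.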

If you haven’t written code before this is likely confusing: doesn’t all code execute.. line-by-line (ie sequentially)? Nope. Oftentimes languages (e.g., JavaScript) will take ~helpful but “hidden” steps for the developer to ~fix/improve the execution order (ie “control flow”) of their code.

Technical Detail: Check out variable/function “hoisting” in JavaScript as an example of such hidden magic.

This action is often helpful, as intended. However, because it’s hidden from the developer, it can make their code more difficult to reason about (especially for other developers).

Make sure this point clicks as it’s important. Think about it, a developer could forget about the hidden control flow rules of a given language; all of a sudden, their code isn’t executing sequentially as expected? This happens, often in subtle ways, and is confusing for all involved.

As Zig eloquently puts it, you should “focus on debugging your application, not your programming language knowledge”.

Bun provides an equally glowing endorsement: “low-level control over memory and lack of hidden control flow makes it much simpler to write fast software.”

Alas, we have introduced a new value prop here: “Low-level control over memory”. Que?

Much like Fall Out Boy (sorry), our computers have “memory” or “mmrs” (sorry again), ie spaces (e.g., RAM) within the system where they store data or instructions that will ultimately be used/manipulated again.

For example, in my JavaScript program: `example.js` I might create a variable: `let age = 28`. JavaScript will then pull some Houdini-work once more and dynamically allocate enough space in memory at “runtime” to store the variable `age` for me. Helpful.

JavaScript is recognising on its own that `age` is a number and allocating memory accordingly.

Zig is less.. presumptuous. Within Zig, a variable’s type is fixed at compile-time, and for a mutable integer like ours we have to spell it out, which reveals a tad more information about how much memory our variable requires.

** If you want to visit / revisit “types” check out my primer on Deno (+ TypeScript)! **

We must explicitly state how much memory we want a variable to take up. Hence the “i32” in a declaration like `var age: i32 = 28;`, which asks for a signed 32-bit integer (ie 4 bytes).
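Here’s a small, runnable sketch of that declaration. The only thing I’ve added is the birthday increment, so that `age` actually changes (newer Zig compilers, in my experience, complain about a `var` that is never mutated).

```zig
const std = @import("std");

pub fn main() void {
    // `i32` is a signed 32-bit integer, so `age` occupies exactly 4 bytes.
    var age: i32 = 28;
    age += 1; // declared with `var`, so it is allowed to change

    // `@sizeOf` reports the size of a type in bytes, known at compile-time.
    std.debug.print("age = {d}, stored in {d} bytes\n", .{ age, @sizeOf(i32) });
}
```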

Zig doesn’t stop here though, defining your types “statically” is the easy part.

Zig, much like C and C++, enables developers to allocate memory (remember, a space) manually. This means that Zig developers can ~precisely state how much memory they require for a given variable, function, etc., as well as when this memory should be freed, and hence, used elsewhere.

Again, to make the comparison to Brendan Eich’s darling — JavaScript handles this freeing of memory automatically. This process is known as “garbage collection”.

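Technical Detail: In C, memory is allocated via the standard library’s `malloc` function (C++ adds the `new` keyword), both of which reach for a default, global allocator. Zig doesn’t expose memory allocation in the language’s syntax, nor does it have a default allocator.
Instead, memory allocation is exposed via “allocators” in Zig’s “standard library”, and code that needs memory is handed one explicitly.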

A standard library is a set of pre-installed functions/objects that are ready to be used with a given language. For example, Python’s standard library comes with the `datetime` module.
This, again, makes memory allocation in Zig more explicit / clear.
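A rough sketch of what this looks like in practice. `greet` is a hypothetical function of my own; `std.heap.page_allocator` and `std.fmt.allocPrint` come from the standard library. The thing to notice is that the allocator is passed in as an ordinary argument, so a reader can see exactly which code is able to allocate.

```zig
const std = @import("std");

// Hypothetical helper: it can only allocate because the caller hands it
// an allocator. The returned slice is owned by the caller.
fn greet(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(allocator, "Hello, {s}!", .{name});
}

pub fn main() !void {
    // Allocators are plain values provided by the standard library.
    const allocator = std.heap.page_allocator;

    // `try` surfaces a possible out-of-memory error instead of hiding it.
    const greeting = try greet(allocator, "Zig");
    // `defer` schedules the matching free for when this scope exits.
    defer allocator.free(greeting);

    std.debug.print("{s}\n", .{greeting});
}
```

Swapping `page_allocator` for a different allocator changes where the memory comes from without touching `greet` at all.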

At this point you’re likely thinking, “so what”. Fair. What’s important to know — and what relates back to Bun’s proclamation in our intro — is that this granular control of memory leads to performance gains. Why?

Well, there are a few reasons. I’ll point-er (sorry) out two:

1. Fragmentation: As memory is allocated and deallocated dynamically, free memory blocks become scattered across the “heap” (place especially for dynamic memory). This can result in fragmented memory, where there are small gaps between allocated blocks.

I’ve “drawn” a diagram that hopefully depicts this issue better than the wall of text above does. The main point being that these fragments, because they’re small, end up being a literal waste of space. (harsh, I know)

2. Garbage Collection: Yep, y’boy garbage collection (GC) packs a punch. GC introduces additional overhead. Why? Well, because it’s ultimately another “program” running in the background of your own.

Andrew Kelley, the creator of Zig, goes as far as saying that GC can result in “stop the world latency glitches”. When it comes to building critical systems (think aviation software), “latency” doesn’t cut it.

GC also makes memory deallocation “non-deterministic”. ie, you can’t say exactly when a given piece of memory will be freed (or when the collector will pause your program to do so). This is another example of a “hidden” action that other languages take.

To reiterate, whilst potentially perilous, software written in C, C++, Zig, etc., can be more performant than software written in languages that manage dynamically allocated memory (DAM.. Daniel) for you, a la Python or JavaScript.

Once again, Zig’s explicitness (in this case, explicit memory allocation) is what makes it simple. You, and your rag-tag crew of developers, don’t have to figure out how memory’s allocated/freed in your application, you literally state this in your code.
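To illustrate “literally stating” your memory story in code, here’s a sketch using the standard library’s `FixedBufferAllocator`, which hands out memory from a buffer you provide. The 64-byte budget and the `fitsInBudget` helper are my own assumptions for the example.

```zig
const std = @import("std");

// Hypothetical helper: tries to allocate `bytes`, frees them again, and
// reports whether the request fit.
fn fitsInBudget(allocator: std.mem.Allocator, bytes: usize) bool {
    const block = allocator.alloc(u8, bytes) catch return false;
    allocator.free(block);
    return true;
}

pub fn main() void {
    // A fixed 64-byte budget living on the stack: no heap, no collector.
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    std.debug.print("16 bytes fit: {}\n", .{fitsInBudget(allocator, 16)});
    std.debug.print("1024 bytes fit: {}\n", .{fitsInBudget(allocator, 1024)});
}
```

A request that exceeds the budget fails as an ordinary error at the call site, rather than surfacing later as a surprise.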

** Did I mention the word explicit? **

As hard as it may be to believe, Zig does even more to foster “simple” codebases such as omitting a “preprocessor” and “macros”. Don’t worry, we’ll get into what these terms mean.

You know what’s also simple? Telling your friends about this post / Why Now.

** Ok, take a sip of water, go get some sunshine etc., we’re onto Zig’s “toolchain” **

The meaning of “toolchain” is a little more difficult to accurately scope. However, the word typically means a set of utilities: libraries, compilers, build tools, etc., that the language, or users of the language, can leverage.

Libraries = code that someone else has written and packaged which can now be used by others to achieve a specific task. E.g., Rust’s Polars (pola.rs) library for data manipulation.
Compilers = take your high-level code and convert it to “machine code” (1s and 0s) that corresponds to a specific instruction set. Do some other helpful things like optimising your code (e.g., removing “dead code”).
Build Tools = a build tool manages the entire build process, which includes compilation but also includes dependency management, testing, packaging, etc. I wrote about “building” software in detail on my Nix primer.

We can use Zig’s stated goals (“maintaining robust, optimal and reusable software”) to fine-tune our definition.

Robust = software written in Zig works consistently, even during edge-cases.
Optimal = software written in Zig can be.. optimised.. for a specific task.
Reusable = software written in Zig is simple, scalable & portable.

With these goals in mind, I consider (go easy HN!) the Zig toolchain’s most notable features to be the following two:

Zig’s “Comptime”.
Zig’s 4 build modes.

Ok, I’m done writing in lists of twos & threes, I promise. Let’s delve into Zig’s “Comptime”.

Zig touts its Comptime as “A fresh approach to metaprogramming based on compile-time code execution and lazy evaluation.” Let’s unpack each of the key terms in turn: compile-time, lazy evaluation and metaprogramming. First, compile-time.

Software has a “lifecycle” that ultimately results in said software being executed (ie running on a computer):

Developers write code (think C), “compile” this code, “link” each compiled file generated (called an “object file”) into a final “executable” and then “run” (ie execute) this.. um.. executable.

Programming languages are typically evaluated at either compile-time (e.g., TypeScript) or runtime (JavaScript). “Evaluation” essentially means checking for errors, determining the “type” of a given variable, etc., all with the aim of ultimately executing a program.

Like any technical decision, there isn’t an objectively “correct” way to evaluate a program. Rather, there are trade-offs.

For example, if you evaluate a language’s “types” at compile-time, then you’ll pick-up the incorrect usage of a “string” in a function that expects an “integer” before you compile said language and run it somewhere. Thus picking up a “bug” before your software is deployed. Phew.
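A tiny sketch of this in Zig, which checks types at compile-time. `yearsUntilThirty` is a made-up function; the commented-out line is the “string where an integer was expected” bug described above.

```zig
const std = @import("std");

// Hypothetical function that insists on a 32-bit integer.
fn yearsUntilThirty(age: i32) i32 {
    return 30 - age;
}

pub fn main() void {
    std.debug.print("{d}\n", .{yearsUntilThirty(28)});

    // Uncommenting the next line stops the build with a type error,
    // because a string literal is not an i32:
    // _ = yearsUntilThirty("twenty-eight");
}
```

The mistake never makes it into an executable, let alone onto a user’s machine.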

The drawback of this compile-time eval is that developers have to specify the exact type of data they expect their function to receive. This can get rather tricky. Why? Well, end-users of software are unpredictable, they may end up inserting valid data types (e.g., an integer in a “first name” field on a form) that you may not expect.

Zig takes a more.. democratic, approach. The language enables developers to explicitly state which blocks of their code they’d like “evaluated” at compile-time vs. runtime. This is handled via Zig’s `comptime` keyword.
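Here’s a minimal sketch of the `comptime` keyword in action. The `fib` function is my own example; the point is that the very same function can be run by the compiler or by the finished program, depending on what you ask for.

```zig
const std = @import("std");

// An ordinary function: nothing about it is specific to compile-time.
fn fib(n: u32) u32 {
    var a: u32 = 0;
    var b: u32 = 1;
    var i: u32 = 0;
    while (i < n) : (i += 1) {
        const next = a + b;
        a = b;
        b = next;
    }
    return a;
}

pub fn main() void {
    // `comptime` forces this call to run inside the compiler, so the
    // executable only ever sees the finished value (55).
    const at_compile_time = comptime fib(10);

    // No `comptime` here, and `n` is a runtime variable, so this call is
    // left for the running program.
    var n: u32 = 10;
    n += 1;
    const at_runtime = fib(n);

    std.debug.print("{d} {d}\n", .{ at_compile_time, at_runtime });
}
```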

Taking all that we now know about Zig, we can assume that the primary goal of this explicit statement of compile-time vs. runtime evaluation is.. you guessed it, explicitness.

A developer reading your Zig code doesn’t have to identify/recall what’s being evaluated at compile-time, you literally tell them. Much like Zig’s control flow, nothing is “hidden” from the developer.

Ok, cool, we like explicitness. However, I want to also point out that Comptime reiterates Zig’s ability to be fine-tuned for performance.

For example, if we offload type inference to the developer who compiles their software, then the end-user (think a general “consumer”) doesn’t have to handle type inference on their own machine at runtime. Nice.

Given we like explicitness, I’ll be explicit. Tell your friends etc.!

Right, so we know what evaluation is and when it happens (compile-time / runtime). What’s “lazy” evaluation?

Well, thankfully, it’s rather self-explanatory. Lazy evaluation, much like a “lazy person”, isn’t proactive, it only completes a task at the last-minute, when it must.

I’ll make this more concrete with some (simple!) Zig code which we’ll build on.

Note that we’re using “string literals” here as a data type.

If we were to lazily evaluate this code, we would only check/determine the values of the variables: first_name (“Alex”) and second_name (“Mackenzie”), when we need them. In this case, we need these values to complete the first_name ++ second_name operation.
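For reference, a runnable sketch of the snippet being described. The names `first_name` and `second_name` come from the text above; `full_name` is my own addition.

```zig
const std = @import("std");

const first_name = "Alex";
const second_name = "Mackenzie";

// `++` joins arrays (string literals included) whose contents are known
// at compile-time.
const full_name = first_name ++ second_name;

pub fn main() void {
    std.debug.print("{s}\n", .{full_name});
}
```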

Why is lazy evaluation helpful you ask? Well, it means you’re not doing any heavy-lifting before you have to, which ultimately results in more-efficient resource allocation.

Why calculate the value of an expression if you’re only maybe (e.g., in the context of conditional logic) going to use it later? Smart.
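As I understand it, Zig’s flavour of laziness mostly shows up at compile-time: code that is never reached from a live branch isn’t analysed at all. The sketch below leans on that assumption. `use_expensive`, `expensive` and `pick` are hypothetical names.

```zig
const std = @import("std");

// Known at compile-time. Flip to `true` to pull `expensive` into the build.
const use_expensive = false;

// This body would halt compilation if the compiler ever analysed it.
fn expensive() u32 {
    @compileError("expensive() was analysed");
}

fn pick() u32 {
    // The condition is known at compile-time, so the untaken branch is
    // skipped entirely rather than evaluated "just in case".
    return if (use_expensive) expensive() else 42;
}

pub fn main() void {
    std.debug.print("{d}\n", .{pick()});
}
```

The program builds precisely because `expensive` is never needed. Set `use_expensive` to `true` and the build should stop at the `@compileError`.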

This said, whilst Zig makes this lazy eval.. explicit.. I do personally feel that lazy evaluation goes against Zig’s simplicity / feels a little “hidden”. This is a primer though, so let’s leave my judgement to the side.

Well, this is a “lengther”. Sorry, but programming languages are very much the sum of minute technical decisions that, in aggregate, support a handful of objectives. If you want to grok a language, you’ve got to appreciate its nuances.

Next, “Metaprogramming”. Remember our brief mention of “preprocessors” and “macros”? They’re back. Kinda.

Metaprogramming is common in systems-level programming languages like C, C++, Rust. It’s what you likely expect — a program, “programming” itself. Woah, meta.

In practice, metaprogramming involves leveraging compile-time information (e.g., type declarations like: var age: i32 = 28;) to manipulate (e.g., edit/generate code) your program in some way.

For example, with this “type information”, code that runs at compile-time could work out that our variable age never needs more than an “i8” and use that instead of an “i32”. i8 is a smaller data type, and hence, takes up less memory. Thus, through metaprogramming, we have optimised our Zig code at compile-time. C'est très cool!
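To be clear, the compiler won’t shrink your types unprompted; you write ordinary Zig that runs at compile-time and hands back a type. A sketch, assuming a made-up helper `SmallestSigned` and a made-up upper bound of 120 years:

```zig
const std = @import("std");

// Hypothetical helper: returns the smallest signed integer type (of the
// three considered) that can hold `max_value`. Types are just values at
// compile-time, so a function can return one.
fn SmallestSigned(comptime max_value: comptime_int) type {
    if (max_value <= std.math.maxInt(i8)) return i8;
    if (max_value <= std.math.maxInt(i16)) return i16;
    return i32;
}

// Assumption for the example: nobody in our data set is older than 120.
const Age = SmallestSigned(120);

pub fn main() void {
    var age: Age = 28;
    age += 1;
    std.debug.print("{d} fits in {d} byte(s)\n", .{ age, @sizeOf(Age) });
}
```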

Technical Detail: In C/C++ this style of metaprogramming is handled by a “preprocessor” program that expands “macros” (defined via directives like: #define) before the compiler proper ever sees your code.

Without getting unnecessarily into the weeds, these macros are complex/error-prone; so much so that they’re considered by some to be a “separate programming language” beyond C/C++.

Whereas Zig treats metaprogramming as a “first-class citizen”, and hence, tightly integrates the process with the rest of its “toolchain” (via Comptime).
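A short sketch of what “first-class” means here. Where a C programmer might reach for a `MAX` macro, Zig uses a regular function whose type is simply a compile-time parameter. This mirrors a pattern from Zig’s own documentation, but treat my version as illustrative.

```zig
const std = @import("std");

// `T` is supplied at compile-time. The compiler stamps out a properly
// type-checked version of `max` for each type it is called with.
fn max(comptime T: type, a: T, b: T) T {
    return if (a > b) a else b;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ max(i32, 28, 30), max(f64, 1.5, 0.5) });
}
```

There’s no second language to learn: it’s the same syntax, the same type checker and the same error messages as the rest of your program.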

Share so that other people share? Woah, meta.

Alright, we’re nearly wrapped up here with Zig. For those of you still here, nice job, this isn’t an easy read by any means. It certainly wasn’t a walk in the park for my ghost-writer to draft! (joke).

As mentioned, Zig is… supple. It has 4 “build modes”. Again, we discussed what “building software” is at length within my Nix primer so I shall point you there if you need a refresher.

Zig’s 4 build modes are:

Debug = used during development (ie writing your code) and prioritises fast compilation and ease of debugging over performance. In this mode, code is compiled with additional debugging info and runtime safety checks, but without optimisations. This is the default.
ReleaseSafe = optimises for performance but keeps runtime safety checks (e.g., for integer overflow or out-of-bounds access) switched on. Used for the final build of an application when correctness matters as much as speed.
ReleaseSmall = prioritises generating the smallest possible “executable”, with safety checks switched off. Achieved through techniques such as dead code (ie unused) elimination or “function/data merging” (removing duplicates). This mode’s particularly useful for embedded systems (e.g., a Ring doorbell) that have limited resources (compute/memory).
ReleaseFast = used for the final build of an application when raw performance is critical. Optimises for speed and switches runtime safety checks off.

You select one of these build modes via the command line, e.g. by passing `-O ReleaseSafe` to `zig build-exe`.
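If you’d like to see the selected mode for yourself, here’s a small sketch. The file name `main.zig` is an assumption; the commands in the comments use the compiler’s `-O` flag.

```zig
// Save as main.zig, then pick a mode with the compiler's -O flag, e.g.:
//   zig build-exe main.zig -O ReleaseSafe
//   zig run main.zig -O ReleaseSmall
// Leaving -O off gives you a Debug build.
const std = @import("std");
const builtin = @import("builtin");

pub fn main() void {
    // `builtin.mode` is fixed when the program is compiled, so the
    // executable "knows" which mode produced it.
    std.debug.print("built in {s} mode\n", .{@tagName(builtin.mode)});
}
```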

In particular, these build modes speak to Zig’s stated goal of producing optimal and reusable software. Wanna run some Zig code on your toaster? Cool, use ReleaseSmall. Fancy building a database? Impressive, but please use ReleaseSafe.

As hard as it may be to believe, there’s so much more (build.zig, cross-compilation, etc.) that I’d like to take you through re. Zig. However, I feel like the law of diminishing returns is almost certainly kicking in already.

I suspect you “get it”, you understand the essence and purpose of Zig. I’d encourage you to jump into the following posts/videos if you’re interested in learning more:

Mitchell Hashimoto: Zig Build Internals.
Fastly: Build an Efficient & Portable Programming Language with Zig.
Andrew Kelley: The Road to Zig 1.0.

As I have espoused many times, I like “Serious Software”. Think game engines, 3d modelling software, runtimes, etc., as the “real estate” of features that can (and should!) be optimised within them is tremendous. Good luck building Blender over a weekend!

Jarred puts it similarly when asked “Why is Bun Fast?”: “In one word: obsession. An enormous amount of time spent profiling, benchmarking and optimising things. The answer is different for every part of Bun”.

I suspect Zig will continue to make it easier for more folk like Jarred to lean into their obsessions and take on incumbents through fine-grained tweaks and tuning. If so, I am very excited to see what’s coming around the corner.

If you are building, or considering building, with Zig, I’d love to say hello! alex@tapestry.vc

