Whilst it’s not required, if you want to follow along, you can install Flox in an instant here.
Oh and thanks for the 1k subs. Pretty cool!
What’s cool about technology investing is that amongst all the hysteria, there are technologies out there, hiding in plain sight, that are criminally underrated. I recently pointed to canaries as but one example, but let’s not forget that the transformer was hanging around for ~5 years before ChatGPT.
Today, dear readers, I’m going to lay out why I think that the world’s 20 million-something developers (not to mention agents) will, eventually, all be using Flox.
Some background: I personally use Flox for every project, pretty much every day, and I can’t think of a reason why I would pick Docker over it for local development. This is particularly true when running AI models locally.
Even better, I don’t use Flox for just one task. The reason why Nix (Flox’s underlying technology) is infamously difficult to grok is that it is many things. If anything, Nix is more of a philosophy, an opinion. Thankfully, Flox takes those many things and makes them a little more.. approachable.
Want a reproducible developer environment like Python’s venv or Conda? Cool, Flox can do that, but like, it actually works. Want a package manager like npm? Yeah, Flox does that better too. Want to build packages? Again, Flox is the correct way to do this.
Even better, Flox passes Larry Page’s “toothbrush test”. It’s something that every developer should use every hour of every day. Products with these characteristics, naturally, capture our attention at Tapestry.
So, dear readers, my gift to thee is to: 1) explain why I feel so strongly and 2) help you install Flox yourselves. Yeno, if that’s your kinda thing.
Alright, so something a few of you enjoyed during my piece on Restate & Durable Execution is when I explained Restate through sections of their website. I figured that with a website as pretty as Flox’s, it’d be rude not to flaunt it a little too: flox.dev
Ok, so firstly.. what is a “dev environment”?
As developers, we use a tonne of third-party software. For example, instead of writing the raw Python code to do some specific Tapestry VC portfolio data analysis, I might use a third-party library or “dependency” like Pandas.
To use Pandas on my “machine” (i.e. laptop), I need to install it. If it’s not there already of course. I typically do this with a Python-specific (take note, this is important later) package manager like pip, by running `pip3 install pandas`.
Alas, this is where my problems begin. Look at what pip tells me when I run that command in my terminal.
It turns out that when we install the Pandas library, we’re also installing specific “versions” of other libraries that Pandas itself relies on. In this instance, we see that Pandas “requires” Numpy. Specifically, it requires any version of Numpy that is greater than 1.22.4 but less than 2.0.0.
Hmm. But what happens if I have the latest version of Numpy (2.0.0 at the time of writing) installed on my system already? I.e. before I installed Pandas. Then we’re in trouble. Why? Well, a single Python environment can only hold one version of Numpy at a time. So if we then install Pandas, pip either swaps Numpy 2.0.0 out for an older version (breaking whatever needed 2.0.0) or leaves us with a Numpy that Pandas says it can’t work with. C’est très problématique.
What’s the problem? Well, my machine has no way of knowing which version of Numpy each of my projects expects. Whilst we do have some smart ways of “resolving” these “dependency conflicts”, we still often run into issues. I assure you, there is nothing more frustrating than dependency conflicts as a developer.
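If the version maths feels abstract, here’s a minimal Python sketch of the check that a range like this boils down to. The bounds are lifted from the prose above, and I’m assuming the lower bound is inclusive. `parse_version` and `satisfies_pandas` are names I’ve made up for illustration; real resolvers like pip’s handle far more cases (pre-releases, wildcards, etc.).

```python
def parse_version(text):
    """Turn '1.26.4' into (1, 26, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Assumed range, taken from the prose above; lower bound treated as inclusive.
LOWER = parse_version("1.22.4")
UPPER = parse_version("2.0.0")


def satisfies_pandas(numpy_version):
    """True if numpy_version falls inside the assumed range [LOWER, UPPER)."""
    return LOWER <= parse_version(numpy_version) < UPPER


for candidate in ("1.22.4", "1.26.4", "2.0.0"):
    print(candidate, satisfies_pandas(candidate))
```

Comparing tuples of integers (rather than raw strings) matters here: as strings, "1.9.0" would sort after "1.22.4".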
This is a problem that every developer runs into, and hence, one where every developer needs some solution. This is where “developer environments” kick in. You’ve probably heard of a few of these guys in the past: venv, pipenv, conda, nvm, rvm, etc. When you learn to code in Python, venv is likely one of the first three tools you pick up. Let’s ‘av a look at how venv works, starting with `python3 -m venv myenv`:
Ok, so, here I’ve just created a developer or “virtual” environment and called it “myenv”. Let’s now “activate” this environment (`source myenv/bin/activate`) and then I’ll explain what this accomplishes:
Cool, I’m now “inside” my developer environment called “myenv”. Just think of developer environments as folders that store the dependencies a specific project requires. See below that “myenv” is a folder in the `whynow_code_examples` directory alongside some other Why Now folders [1][2]:
Within these environments, when we install packages like Pandas, they’re “isolated” or “contained” (sound familiar?) within the directory of our project.
Hence, this time when I run `pip3 install pandas` my terminal doesn’t tell me that the requirement of Numpy is “already satisfied”. Why? Well, my developer environment is isolated. It doesn’t have access to dependencies that are installed elsewhere. Thus, there’s no (or at least, less) room for dependency conflicts! Nice.
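For the curious, Python’s standard library exposes the same machinery as the venv command. The sketch below creates a throwaway environment in a temporary directory and peeks inside. `create_env` is a hypothetical helper, and I’m skipping pip (`with_pip=False`) purely to keep things quick.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir, name="myenv"):
    """Create a virtual environment folder, much like `python3 -m venv myenv`."""
    env_dir = Path(parent_dir) / name
    # with_pip=False is an illustrative shortcut: no pip is bootstrapped.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as scratch:
    env_dir = create_env(scratch)
    # The environment really is just a folder with a config file in it.
    print(sorted(entry.name for entry in env_dir.iterdir()))
    print((env_dir / "pyvenv.cfg").read_text())
```

The `pyvenv.cfg` file is where the isolation gets switched on: `include-system-site-packages = false` tells Python to ignore packages installed elsewhere.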
Ok, we’ve made a lot of progress here. One final thing to understand about developer environments is that they’re intended to be “portable”. I as SickSoftwareEng34 want to ensure that my esteemed peer BasedDev22 can successfully run the software that I’ve written, on their machine also.
In order to do this, they need to “reproduce” the exact developer environment that’s on my machine, on their own machine. I.e. they need specific versions of Pandas, Numpy, etc.
When using venv, and many other developer environments, we achieve this by using a requirements.txt file. Let me show you how:
Running `pip3 freeze` within my terminal outputs all of the dependencies installed by pip within the current environment (in this case, our developer environment). We can then save, or “redirect”, the output of this command to a text file like so: `pip3 freeze > requirements.txt`
Ok sweet, we now have a `requirements.txt` file that contains the `pip3 freeze` output:
Things are quite simple from here. I can then share this file with BasedDev22, they’ll run `pip3 install -r requirements.txt` et voilà, they’ll install the exact* same dependencies in their environment.
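To make “reproduce” a little more concrete, here’s a rough Python sketch of what comparing two environments via their pinned requirements looks like. The file contents and version numbers are hypothetical, as are the helper names `parse_pins` and `find_mismatches`; pip does considerably more than this under the hood.

```python
def parse_pins(text):
    """Map package name -> pinned version from `pip3 freeze`-style lines."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, separator, version = line.partition("==")
        if separator:  # ignore anything that isn't a strict `name==version` pin
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(wanted, installed):
    """Return {package: (wanted_version, installed_version_or_None)} where they differ."""
    return {
        name: (version, installed.get(name))
        for name, version in wanted.items()
        if installed.get(name) != version
    }


# Hypothetical contents; the version numbers are illustrative only.
MY_REQUIREMENTS = "numpy==1.26.4\npandas==2.2.2\n"
THEIR_ENVIRONMENT = "numpy==2.0.0\n"

print(find_mismatches(parse_pins(MY_REQUIREMENTS), parse_pins(THEIR_ENVIRONMENT)))
```

Note what the file can’t capture: anything that pip didn’t install (the Python interpreter itself, compilers, system libraries) never makes it into `requirements.txt`.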
*Of course, I wouldn’t be writing this primer on Flox if things worked so.. swimmingly. Tools like venv are far from complete.
First up in the case of Alex Mackenzie vs. venv, 2024 is that venv isn’t a package manager (yep, I said it). I need to use a separate tool, pip, when installing packages. I’m aware that this may sound a little pedantic.. but I’d prefer a consolidated toolset here.
Next, venv is language-specific. If I’m writing a node-based (i.e. JavaScript) web app, I can’t manage my JavaScript dependencies with venv. As a developer this is a PITA, as it means that I have to use an alternate tool for JS et al. Habits aren’t formed this way.
More generally, other language-specific environment or package managers like npm (JavaScript), Cargo (Rust) and RubyGems (Ruby) are similarly not language-agnostic.
Finally, package managers like npm et al can’t install “system-level” dependencies (e.g. compilers or other utilities). For this, we need to use other package managers like Homebrew for macOS or APT for Linux. Again, I hope the fragmented toolchain is overt here.
Ok, so to sum up our current woes as developers just tryin’ to get our jobs done:
1) Language-specific package managers like pip or npm aren’t versatile enough to accommodate the other programming languages I use.
2) Language-specific package managers can’t install the system-level dependencies I often need to install.
3) Many environment managers aren’t package managers (venv), many package managers aren’t environment managers (Homebrew).
My real “aha!” moment was when I realised that Flox (or, Nix) solves all of the above challenges (& some other pretty gnarly ones). As mentioned, you can install Flox in an instant if you want to follow along below. It’s totally fine to follow along without doing so too.
Alright, I’m now back in my `whynow_code_examples` directory:
Let’s spin up a Flox developer environment with the `flox init` command. This should feel quite familiar as Flox’s workflow is similar to venv’s:
Great, I’ve now created a virtual environment. Let’s “activate” the environment, climb inside it, and then I can show you some of Flox’s magic:
Want to install Python’s Pandas library? Let’s do it. Want to install JavaScript’s Node.js? We can do that too. We can even give Homebrew a run for its money and install a system-level utility like htop. Flox is both language and platform agnostic:
At this point, hopefully you’ve realised that Flox is acting as a developer environment too. When we’re inside our Flox environment, we can run `htop`:
When we `exit` our Flox environment, we can’t:
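Why does a command exist “inside” an environment but not “outside” it? A big part of the answer, for venv at least, is simply which directories your shell searches for programs (the `PATH`). The Python sketch below fakes this with a temporary directory and a stand-in executable. I’m assuming Flox’s activation behaves along similar lines; treat that as my assumption rather than a description of its internals. `make_fake_tool` and `find_tool` are hypothetical helpers, and the sketch assumes a Unix-like system.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_tool(directory, name):
    """Write a tiny executable script that stands in for a real tool."""
    tool = Path(directory) / name
    tool.write_text("#!/bin/sh\necho hello from the environment\n")
    tool.chmod(tool.stat().st_mode | stat.S_IXUSR)
    return tool


def find_tool(name, search_path):
    """Look a command up on a PATH-style string; returns None if it's absent."""
    return shutil.which(name, path=search_path)


with tempfile.TemporaryDirectory() as env_bin:
    make_fake_tool(env_bin, "demo-tool")
    outside = "/nonexistent-dir"                # PATH before "activation"
    inside = env_bin + os.pathsep + outside     # environment's bin dir goes first
    print("inside: ", find_tool("demo-tool", inside))
    print("outside:", find_tool("demo-tool", outside))
```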
Hmm. It would appear to me that Flox has just solved the three problems enumerated above? Furthermore, my workflow used to require an array of tools, now it’s converging towards just one. Nice work Flox.
Next - remember our `requirements.txt` file? Well, you guessed it, with Flox we don’t need it either. If I want to use a colleague’s work, I can simply activate their “remote environment” (published to FloxHub) and with one command, their software works flawlessly on my machine (irrespective of my OS).
Flox’s Ross Turk created the FLAIM (Flox AI Modelling) environment that lets us run Stable Diffusion locally, so let’s ‘av a bit of fun and do that. Later, we’ll get into why Flox is particularly suited to running AI workloads locally.
Again, with one command `flox activate -r flox/flaim` we can get to work:
Ok, the final test: has Flox installed the exact dependencies that enable Ross’ software to run successfully on my machine?
I mean, it’s not exactly the Emerald Isle’s tricolour, but hey, that’s Stable Diffusion’s problem. Ross’ software was successfully “reproduced” on my machine (macOS) and it’d run just as successfully on any Linux distribution. Now we can truly “write once, run anywhere”.
Now is when the dude in the back of the room, often with a long grey beard, usually asks: “what about Docker?”. A fair question. This is where things get particularly interesting, and where Nix really comes into its own.
Firstly, we need to level set on Docker. Docker, and the technology it democratised (containers), achieves comparable levels of isolation, reproducibility and portability to that of Flox / Nix. Shall we familiarise ourselves with an example?
First, let’s log into “Docker Hub” (hub.docker.com) and find a Docker “Image” that we want to “pull”. Docker Images are essentially developer environments: they describe the dependencies a given project requires, as well as some other important metadata like a project’s filesystem, etc.
Ok, cool, I’m logged into Docker Hub via my CLI. Next, much like when using Flox, I’ll search for a piece of software I want to reproduce and get to work via `docker search pandas`:
Ugh, I receive an error message. We’ll dissect this in a second, but note, I don’t run into this issue when using Flox:
Many of you who’re avid Docker users will immediately call me out on my bullshit (rightly so), but this is but one of many examples of working with Flox being just a little less cumbersome than Docker.
Let’s resolve this issue by running “Docker Desktop” (yes, a separate app I need to install on macOS / Windows..), which starts the “Docker daemon” that wasn’t running previously.
Now all works perfectly fine when I run `docker search pandas` (a Python library), `docker search nodejs` (a JavaScript runtime) or, our pal, `docker search htop` (a system utility). Much like Flox, Docker is language and platform agnostic.
Next, I’ll pull the Pandas Image I want and get to work:
I’m now running a shell within my Docker “container”. Within it, I have comparable levels of “isolation” to that of a Flox developer environment. Let’s use Pandas within our container to prove it:
For those unfamiliar with Python / Pandas, the code isn’t important. Just know that I’m using Pandas (the `import pandas as pd` snippet) and it’s running successfully within my shell (i.e. I’m not running into an error message). If I ran this code outside of my container, on a machine without Pandas installed, it’d fail.
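If you’d like to poke at this yourself, here’s a small Python sketch that checks whether Pandas is importable in whatever environment you happen to be in, and only then uses it. The rows are made up for illustration and `is_installed` is a hypothetical helper; this isn’t the snippet from my terminal, just something in the same spirit.

```python
import importlib.util


def is_installed(module_name):
    """True if `import module_name` would find the module in this environment."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("pandas"):
    import pandas as pd

    # Hypothetical rows, purely for illustration.
    frame = pd.DataFrame({"company": ["A", "B"], "round": ["Seed", "Series A"]})
    print(frame)
else:
    print("pandas isn't installed in this environment")
```

Run it inside and outside of a container (or a Flox environment) and the branch you land in tells you which dependencies that environment can actually see.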
So, if much like Flox / Nix, containers are language and platform agnostic developer environments, then what’s the catch? Well, like most technologies, containers come with trade-offs. We need to understand exactly how containers are “isolated”, and how they achieve this isolation, in order to appreciate their limitations.
Ok, commençons. People have a tendency to overcomplicate containers, but at a high-level, you can just think of a container as a “slice” or “partition” of your overall machine. Each container has its own:
Resource Limits: a slice of CPU, memory, and I/O resources. This ensures that a container does not over-consume resources and affect other containers or the host system.
Network Interfaces: a unique “address” within the overall system. This ensures that other containers running on the same machine can’t access any inbound/outbound traffic.
File System: everything needed to run a specific application, such as code, runtime, system tools, and libraries. This ensures that other containers running on the same machine can’t access my code etc (which is likely proprietary).
Etc.
Where folk often get tripped up is on the difference between containers and “virtual machines” (or “VMs”). Put very simply, the ~one resource containers do share is the host machine’s underlying operating system (specifically, its kernel). VMs are greedy: they want a copy of the operating system (a “guest OS”) and all.
Containers achieve this isolation by using two Linux kernel (note: Linux, not other OSes) features: Control Groups (or “cgroups”) and Namespaces. Alas, this is where our trade-offs begin (particularly when working with macOS / Windows).
Firstly, when using macOS or Windows (80%+ of desktops), I, naturally, can’t access Linux-native utilities like cgroups or namespaces. So, when working with containers, Docker actually first spins up a Linux virtual machine, and theeen, we access cgroups and namespaces to create a container. Rather circuitous, non?
This is one of the reasons why we have to install Docker Desktop when working with macOS. Docker Desktop does a lot of heavy-lifting to make containers work across all machines. Again, there’s no such requirement for Flox / Nix.
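You can see this Linux-only machinery for yourself. On Linux, every process’ namespaces and cgroup membership are exposed under `/proc`. The Python sketch below reads them, and returns empty lists anywhere `/proc` doesn’t offer them (macOS, for instance). The function names are my own.

```python
import os


def list_namespaces(pid="self"):
    """Names of the namespaces a process belongs to (Linux only), else []."""
    try:
        return sorted(os.listdir(f"/proc/{pid}/ns"))
    except OSError:
        # No /proc/<pid>/ns here, e.g. on macOS, or access was denied.
        return []


def read_cgroup_membership(pid="self"):
    """Lines describing which cgroups a process sits in (Linux only), else []."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


print("namespaces:", list_namespaces())
print("cgroups:", read_cgroup_membership())
```

Two empty lists on your laptop is precisely the gap that Docker’s Linux virtual machine is there to fill.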
I can get over this “virtualisation tax”. What I can’t get comfortable with is that containers’ strong isolation properties actually impede us from accelerating AI workloads via GPUs (amongst many other impediments). In order to begin to work with GPUs we need Nvidia’s Container Toolkit, which itself requires some config.
Whereas, as you may recall, when using Flox (FLAIM) earlier we had no problem using our GPU to accelerate Stable Diffusion. We used Flox in exactly the same manner as if we were running “traditional” software. This is how habits are formed.
So, to sum up, Flox is a language and platform agnostic developer environment and package manager. Whilst it rhymes with Docker, Flox doesn’t incur Docker’s virtualisation tax, it can seamlessly access system resources and it’s, frankly, a better developer experience. These properties are why Nix is often referred to as “containers without containers”. Or perhaps, containers, without Docker 😉 (I jest, they play nice).
What’s kinda incredible about Flox / Nix is that we haven’t really scratched the surface re. why its developer environments, builds, packages, etc., are objectively more reproducible (reducing performance/security issues) than Docker’s containers. Thankfully for you all, I’ve scratched this surface previously with a primer on Nix.
Ok, this primer is wrapping up. At this point, I want to be clear that whilst this post may have seemed “anti-Docker”, this is by no means the case. I have used Docker for most of my “professional” life, and will continue to do so for the use cases it’s suited to best.
If anything, the fact that Docker has been used for so much beyond its core use case (running cloud apps in production) is a testament to its technology and community. One of the perks of investing in infra is that it’s versatile. Just ask the Dow Chemical Company!
So, my final ask to you all is to unite these two worlds. 1) Give Flox a go if you haven’t already and 2) build a Docker image with FloxBuild (ready soon 👀).
Oh yeah, and subscribe etc if you enjoyed this and/or learned a thing or two. If you want to say hi, I’m at alex@tapestry.vc
Okay so if I'm pushing to Docker Hub, should I create equivalent Flox environments then?
Otherwise I assume Flox has a layers concept, allowing one to build off of several previous ones?