Squeezing Rust into production: Part 2

Post by Jalmari Ikävalko

Previously I wrote a small post (now unpublished, it was quite boring) about finding the smallest possible opening for introducing Rust into production use. In that case, Rust's place was quite minor; just the small script-like snippet for updating the installations of our main client suite.

But as of some days ago, we're now successfully running a bit more Rust in a quite central position amongst our products!

Image of the Rustacean Crab

Let's dive in (pictured is Ferris the Crab, the unofficial mascot of Rust):

The product

So the actual product where we're running Rust is a realtime audio analyzer. As of now, it only provides volume levels (in the form of RMS values), but the idea is to introduce some spectrum and beat analysis to it. Nothing overtly fancy - no need for any groundbreaking accuracy or anything - just simple stuff.
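For readers unfamiliar with RMS (root mean square), here is a minimal sketch of the calculation over one block of samples. The function name `rms` and the choice of `f32` samples are my assumptions for illustration, not necessarily how the analyzer itself does it.

```rust
/// Root mean square of one block of samples.
/// Returns 0.0 for an empty block instead of dividing by zero.
fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    // Square each sample, average the squares, then take the square root.
    let sum_of_squares: f32 = samples.iter().map(|s| s * s).sum();
    (sum_of_squares / samples.len() as f32).sqrt()
}

fn main() {
    // A made-up block of samples, alternating between +0.5 and -0.5.
    let block = [0.5_f32, -0.5, 0.5, -0.5];
    println!("RMS: {}", rms(&block));
}
```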

This program was previously written in Python (with Numpy) and there was a pre-existing protocol to follow. The quirk about this program is that it doesn't really do much as a standalone program. It's meant to be used side-by-side with our main product, which connects to the audio analyzer over a socket. So, everything the audio analyzer does, is initiated by our main product. The audio analyzer then sends analysis data over to our main product, which uses it for visualization, data gathering and entertainment purposes.

Also, due to the strict realtime requirements and potentially dozens of audio inputs it has to listen to, with several socket connections and clients to serve, it by its nature benefits a lot from some parallelism.

Why reimplement?

The main problem with the Python version of this product - or subproduct - was that it was pretty unstable, hard to extend and hard to debug. Python in itself is pretty fine as a language - at least I definitely love it - but when we're speaking of situations where you really want failure safety and have no easy way to get in and fix bugs that need fixing right away, it is a bit scary. Further, I do, personally, find that type systems where the type is hidden from the programmer are more difficult to document and to work with in the editor and linting tools we have.

Distribution and distribution sizes were also a bit of a problem. Python works great in the infrastructure, in web products, et cetera, but as desktop software installed on the clients, it's a bit painful. The packaging software I ended up with was cx_freeze, but with SciPy and Numpy and PortAudio and all the required dlls and bytecode files, the packaging size ended up at over 50 megabytes. Also, randomly, on every 10th or so time, starting the program would fail due to some weird race problem with locking file handles or something like that. Performance was also a bit fifty-fifty. It would start to show up as a meaningful hog of system resources when you had several connections open to it and were listening to several audio inputs at the same time.

Now, what we wanted was better runtime safety guarantees, better performance, smaller distribution size and better extendability of the codebase.

Making the choice

Now I do admit it straight away - I did have a bit of a personal bias for choosing Rust out of my own interest towards the language. It's a pretty tricky choice to make: when to reimplement instead of just refactoring, and how much your personal interest should be allowed to affect these decisions. As an employee responsible for leading the development of the central codebases of the company I work for, I should have particular responsibility for making choices that work in the long term as well as in the short term.

Still, despite my bias, I do of course believe that Rust was a good choice.

Let's look at this from the perspective of deciding which languages/frameworks to not use. The process of elimination, if you will (including just languages I am familiar enough with to confidently set into a project like this and that would have been plausible choices):

  • Java - Verbose, parallelism is quite scary. Not too excited about including a requirement for the Java runtime.
  • C++ - No package manager, adds a little bit of time in setting projects up. Can be very poorly approachable to people with no experience with it (or at least C). Fail to follow RAII, shoot yourself in the foot.
  • C - Yeah no.
  • Python - covered above, but: Performance concerns, distribution trouble, a bit hard to easily get back into after months of not having visited the project.
  • C# - ..and here I actually had a small stop. C#'s a solid language. But in the end, I didn't really need the OOP clutter for this project, which makes C# a bit verbose. Further, parallelism in C# is pretty easy to get wrong.

So why Rust, then?

Well:

  • It has decent safety guarantees. Our software runs in realtime, in situations where it'd be pretty bad if it suddenly crashed.
  • It makes parallelism almost fun. Rust is very strict about ownership and that rules out data races, which are a large share of race conditions, in one fell swoop.
  • It has good performance and a small distribution size.
  • Its ecosystem has gotten quite decent - at least for the purposes of this project - and its package manager is pretty sweet.
  • I like the idioms around Rust and Rust's development and I like the community around it.

The good bits

There are a lot of posts going through various things that people love about Rust. It's now been voted the most loved language of the year twice in a row. So I'll make this short and skip any deeper analysis, but what I particularly like about Rust is the variant-based enum and the associated match system, its totally fearless concurrency and the trait-based type abstraction system, which is a very gentle push towards more functional thinking for someone who can't quite stomach the purely functional languages, but would still like to slowly drift in that direction.
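To give a taste of the enum and match combination, here is a small sketch. The `Request` type and its variants are hypothetical, made up for this post, and are not the actual protocol of the analyzer. The point is that each variant can carry its own data and that `match` must handle every variant, or the code won't compile.

```rust
/// Hypothetical requests a client might send (illustration only).
enum Request {
    ListDevices,
    StartRms { device_id: u32, channel: u16 },
    Stop,
}

/// Turns a request into a human-readable description.
/// Removing any arm below makes this a compile error, not a runtime surprise.
fn describe(request: &Request) -> String {
    match request {
        Request::ListDevices => "list devices".to_string(),
        Request::StartRms { device_id, channel } => {
            format!("start RMS on device {}, channel {}", device_id, channel)
        }
        Request::Stop => "stop".to_string(),
    }
}

fn main() {
    let requests = [
        Request::ListDevices,
        Request::StartRms { device_id: 1, channel: 0 },
        Request::Stop,
    ];
    for request in &requests {
        println!("{}", describe(request));
    }
}
```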

The average bits

Now Rust is not all that old and there are a bunch of missing things and there's a bunch of things that are currently implemented in a way that might be a bit unintuitive at the first glance.

For example, initializing an array of vecs is oddly cumbersome as showcased in this StackOverflow question. I get the reason behind this - to fill the array, Rust copies the initial value, which requires the Copy trait, and Vec doesn't implement it - but I'd still imagine it could be a bit less cumbersome. Could it just execute the initializer many times if there's no Copy? I don't know.
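For what it's worth, Rust releases newer than this post (1.63 and later) ship `std::array::from_fn`, which does pretty much that: it runs a closure once per element. A minimal sketch, where the helper name `empty_buffers` and the `f32` element type are just my assumptions for illustration:

```rust
/// Builds an array of N empty vecs by calling the closure once per slot.
/// Note that `[Vec::new(); N]` would not compile, since Vec is not Copy.
fn empty_buffers<const N: usize>() -> [Vec<f32>; N] {
    std::array::from_fn(|_| Vec::new())
}

fn main() {
    // One sample buffer per (made-up) audio channel.
    let mut buffers = empty_buffers::<4>();
    buffers[2].push(0.25);
    println!("{:?}", buffers);
}
```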

As far as I can tell, there's no current support in the standard library for parallelizing the computation of non-overlapping slices into arrays and vecs. Instead, you'll probably want to rely on various libraries for it.
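This too has changed since the time of writing: Rust 1.63 and later include scoped threads (`std::thread::scope`), which combine nicely with `chunks_mut`. Below is a minimal sketch under that assumption. The function `square_in_parallel` is a hypothetical example of mine, and spawning one thread per chunk is only sensible for a handful of large chunks.

```rust
use std::thread;

/// Squares every element in place. Each thread gets its own
/// non-overlapping chunk, so no locking is needed.
fn square_in_parallel(data: &mut [f32], chunk_len: usize) {
    // chunks_mut panics on a chunk length of zero, so clamp it to at least 1.
    let chunk_len = chunk_len.max(1);
    thread::scope(|scope| {
        for chunk in data.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for value in chunk.iter_mut() {
                    *value = *value * *value;
                }
            });
        }
        // All spawned threads are joined automatically when the scope ends.
    });
}

fn main() {
    let mut samples = vec![1.0_f32, 2.0, 3.0, 4.0, 5.0];
    square_in_parallel(&mut samples, 2);
    println!("{:?}", samples);
}
```

The borrow checker is what makes this safe: `chunks_mut` hands out mutable slices that cannot overlap, and the scope guarantees that the threads finish before `data` can be used again.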

Then there are some missing links in the infra and development tools. For instance, Rust's current main component for linting - the Rust Language Server - or at least its vscode integration does not seem to support multiple crates in a project (crates being the libraries and binaries you generate from a project).

But yeah - as a young language, there's a whole galaxy of issues and feature requests, many of which are actively worked on.

The compromises

To put it mildly, Rust has some learning curve. People will struggle with changing from OOP to more functional/trait-based approaches. Many of those who've introduced Rust to their teams agree that it's not really the easiest language (comment #3). I think it's fair to say that you'll take some time to get used to the borrow checker and so on.

Personally I still find myself rereading large parts of the manuals. I still have to google a bit too often for particular issues I get with the borrow checker and how to best work around them. And as you'll be using quite a lot of pure reference docs, you'll really need to have a good amount of existing experience with whatever you're trying to solve. In this project we've now released to production, I was trying to figure out the potential error codes returned by TcpStreams. The manual about TcpStream didn't really help too much. It says it returns Errors. But where are the error codes? Well, Error has a kind() method that returns an ErrorKind, which has categories for errors that map to OS error codes. But which of the categories map to which error codes?

In the end, I used Error's raw_os_error and manually matched it to the error codes I wanted from Windows' documentations.
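Roughly, that approach looks like the sketch below. The helper name `describe_socket_error` is hypothetical and this is not the project's actual code. The numbers are Winsock error codes from Windows' documentation, so they only mean something on Windows.

```rust
use std::io;

/// Hypothetical helper: names a few Winsock error codes by hand.
fn describe_socket_error(err: &io::Error) -> &'static str {
    match err.raw_os_error() {
        Some(10053) => "connection aborted (WSAECONNABORTED)",
        Some(10054) => "connection reset by peer (WSAECONNRESET)",
        Some(10060) => "connection timed out (WSAETIMEDOUT)",
        Some(_) => "other OS error",
        // Errors built inside Rust code carry no OS error code at all.
        None => "not an OS error",
    }
}

fn main() {
    // Build an error from a raw code so the sketch runs without a real socket.
    let err = io::Error::from_raw_os_error(10054);
    println!("{}", describe_socket_error(&err));
}
```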

All in all, it's a bit of an uphill struggle to get going, the harder the more junior you are (I suppose that goes for any language).

One thing I do wonder a bit about is how excited I am going to be about returning to Rust after lengthy breaks. I'm pretty much a hopeless generalist. I might write code in 6 different languages for a dozen different projects in the span of a few weeks. At the same time, I might not work with a language for several months at a time. How will Rust tolerate my absences? Will it gleefully welcome me back to pick up from where I left off, or is the first thing it shows me going to be a pageful of borrow errors?

But I am happy with the choice!

There are still several kinds of projects for which I would not pick Rust, at least not quite yet. GUIs are the usual pain point, as in this discussion.

For some algorithms, like spatial partitioning schemes often found in game development, the fact that self-referencing data structures are a bit difficult or weird to do is also a major blocker, though there's an upcoming feature currently in the unstable API available to make it a bit easier!

However, for this particular project, I think Rust was a pretty fine choice. I get to sleep a little bit more soundly with the added safety guarantees. Rust's also pretty fun to code, at least when you don't end up spending too much time struggling with the borrow checker. It feels modern and you really see that stuff has been thought of from the ground up with the full availability of modern knowledge and best practices.

While it may again take a while until I get to use Rust in a project aiming for production use, I am pretty happy with knowing that it is a viable choice for those applications where its unique combination of performance and safety is central.

Link time!

I'm hoping to use some of my free time to continue on the audio analyser project to make it a bit more suitable to be a generic, extendable solution for audio analysis.

For these purposes, it's open source and can be found at: https://github.com/tzaeru/rust-audio-analysis

I need to fix the code up a bit at some point and put up some documentation, but basically it's a project that aims to:

  • Be a library, a server application which can be accessed remotely, and a simple GUI application that you can use to check mic levels etc.
  • Be easily extendable for new analysis components.
  • Be embeddable as a solid solution for getting simple volume/spectrum/beat detection data to a parent application.

But anyhow. Those were some of my thoughts on Rust & production.

Thanks for reading!