Packaging Scrutiny for NixOS
Update: Since writing this post, I’ve contributed Scrutiny upstream to nixpkgs/NixOS. You can see the write up of that effort here.
Introduction
In a recent (well, recent-ish) episode of the Self Hosted Show, there was some talk of a hard drive monitoring tool called Scrutiny. Scrutiny exposes S.M.A.R.T data in a nice, clean dashboard. It gathers that data using the venerable smartd, a Linux daemon that monitors S.M.A.R.T data from a huge number of ATA, IDE, and SCSI-3 drives. The code is available on GitHub.
The aim of running such monitoring is to detect and replace failing hard drives before they cause an outage, or any data loss. Depending on the firmware of a drive, there are potentially hundreds of S.M.A.R.T attributes that can be collected, so it can be hard to understand which to pay attention to.
As good as smartd is, it has some shortcomings such as not recording historical data for each attribute, and relying upon value thresholds that are set by the manufacturers of the drive. Scrutiny aims to provide historical tracking of attribute readings, and thresholds based on real-world data from Backblaze.
This all looked very compelling, but it hasn’t been packaged for NixOS as far as I can tell. This post is quite “code heavy” and is intended as a bit of a deep-dive/walkthrough for people who are new to packaging in Nix. I’m certainly no expert here, and if you spot something I’ve done wrong - I’d love to hear about it!
Below is a screenshot of Scrutiny’s dashboard:
Architecture
In order to get Scrutiny up and running, there were a few components that all needed packaging separately.
Scrutiny itself is made up of the following:
- A web application backend - written in Go
- A web application frontend - written in Node.js/Angular
- A collector service - written in Go
The web application is the dashboard that you’ll see the pretty screenshots of. The collector service is designed to be run on an interval to collect information from smartd, and send it to the web application’s API.
Scrutiny relies upon InfluxDB to store S.M.A.R.T attribute data as a time series.
Thinking about how to structure things, I decided that I would build two separate Nix packages: one for the dashboard and UI, and another for the collector. It seems one could run the collector and dashboard on different machines, so this seemed like a logical split.
Packaging for NixOS
Packaging the Dashboard
I decided to start with the web application. Because of how the project is laid out, this means combining two separate derivations built from the same source code: one for the UI and one for the backend. I began by creating a file that would carry common attributes for each of the derivations, such as the version, repository information, hash, etc.
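A reconstructed sketch of what that common file might contain. The owner/repo coordinates are Scrutiny's real ones, but the version and hashes below are placeholders rather than the values from my actual flake:

```nix
# common.nix - attributes shared by the frontend, backend and
# collector derivations. Version and hashes are placeholders.
{ lib, fetchFromGitHub }:

rec {
  version = "0.0.0"; # placeholder: pin to a real release tag

  src = fetchFromGitHub {
    owner = "AnalogJ";
    repo = "scrutiny";
    rev = "v${version}";
    hash = lib.fakeHash; # replace with the real hash on first build
  };

  # Hash of the vendored Go dependencies, shared by the Go derivations
  vendorHash = lib.fakeHash;
}
```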
I then started packaging the frontend:
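A sketch of that derivation, assuming the Angular sources live under webapp/frontend in the repository (the hash and output directory are placeholders):

```nix
{ lib, buildNpmPackage, fetchFromGitHub }:

let
  common = import ./common.nix { inherit lib fetchFromGitHub; };
in
buildNpmPackage {
  pname = "scrutiny-webapp-frontend";
  inherit (common) version src;

  # The Angular app lives in a subdirectory of the repository
  sourceRoot = "${common.src.name}/webapp/frontend";

  # Hash of the offline npm dependency cache (placeholder)
  npmDepsHash = lib.fakeHash;

  # Match the production build used by the upstream Makefile
  buildPhase = ''
    runHook preBuild
    npm run build:prod
    runHook postBuild
  '';

  installPhase = ''
    runHook preInstall
    mkdir -p $out
    cp -r dist/. $out/
    runHook postInstall
  '';
}
```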
This is a relatively simple derivation, mostly thanks to the magic of the buildNpmPackage function. This helper function takes care of creating an offline cache containing all of the Node.js dependencies required to build the frontend. Nix builds are always done in an offline environment to help ensure reproducibility, so source code and dependencies are fetched (and hashed) early in the process before any software is actually built.
I chose to override the build phase to match the process used by the upstream Makefile. The result of this derivation is a package named scrutiny-webapp-frontend, which contains just the built output from the npm run build:prod command.
Next up was the dashboard backend. Another pleasingly simple derivation thanks to some help from buildGoModule:
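Roughly, the backend derivation looks like this. The exact package path passed to go build is an assumption based on the repository layout:

```nix
{ lib, buildGoModule, fetchFromGitHub, scrutiny-webapp-frontend }:

let
  common = import ./common.nix { inherit lib fetchFromGitHub; };
in
buildGoModule {
  pname = "scrutiny-webapp-backend";
  inherit (common) version src vendorHash;

  # Build just the web application binary, mirroring the upstream Makefile
  buildPhase = ''
    runHook preBuild
    go build -o scrutiny ./webapp/backend/cmd/scrutiny
    runHook postBuild
  '';

  installPhase = ''
    runHook preInstall
    install -Dm755 scrutiny $out/bin/scrutiny

    # Bundle the pre-built frontend so a single package contains
    # both halves of the dashboard
    mkdir -p $out/share/scrutiny/web
    cp -r ${scrutiny-webapp-frontend}/. $out/share/scrutiny/web/
    runHook postInstall
  '';
}
```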
This one is a little more interesting. There are some commonalities, such as setting the vendorHash to ensure the correct Go dependencies are used (imported from common.nix in this case), and overriding the build to match the upstream process and ensure the right binary is built.
Where things differ is in the install phase, where the contents of the scrutiny-webapp-frontend derivation is copied into the output of this derivation - which will ultimately result in a single Nix package (named scrutiny-webapp-backend) which will contain both the frontend and backend components of the application. This is ultimately exposed in an overlay as a package simply named scrutiny.
You can see the finished product (with additional package metadata) in app.nix and common.nix on GitHub.
Packaging the Collector
The collector is just a single, statically compiled Go binary, and as such the derivation very much resembles that of the web application backend above:
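A sketch of the collector derivation. The build path and binary name are assumptions based on the upstream layout, but the wrapping approach is the one described below:

```nix
{ lib, buildGoModule, fetchFromGitHub, makeWrapper, smartmontools }:

let
  common = import ./common.nix { inherit lib fetchFromGitHub; };
in
buildGoModule {
  pname = "scrutiny-collector";
  inherit (common) version src vendorHash;

  nativeBuildInputs = [ makeWrapper ];

  buildPhase = ''
    runHook preBuild
    go build -o scrutiny-collector-metrics ./collector/cmd/collector-metrics
    runHook postBuild
  '';

  installPhase = ''
    runHook preInstall
    install -Dm755 scrutiny-collector-metrics \
      $out/bin/scrutiny-collector-metrics

    # The collector shells out to smartctl, so make sure it is
    # always on PATH regardless of the host's environment
    wrapProgram $out/bin/scrutiny-collector-metrics \
      --prefix PATH : ${lib.makeBinPath [ smartmontools ]}
    runHook postInstall
  '';
}
```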
Of interest here is the installPhase. The collector works by invoking smartctl to scrape information from smartd. Scrutiny expects that tool to be readily available in its PATH, and to accomplish that I used the makeWrapper helper to create a wrapper script that ensures scrutiny-collector-metrics is executed with the PATH set such that smartctl can be found.
The final derivation for the collector can be seen in collector.nix on GitHub.
Writing a NixOS Module
While I now had functioning packages for all of Scrutiny’s components, for them to function correctly on a machine there are a few things that need to be in place:
- The dashboard package must be installed and started
- The collector package must be installed and started
- InfluxDB must be installed and started
- Scrutiny dashboard must be configured to speak to the host’s InfluxDB
- smartd must be installed and running
This sort of challenge is exactly what the NixOS modules system aims to solve. Modules take a set of configuration attributes, and convert them into rendered system configuration in the form of systemd units, configuration files and more.
It’s tempting to try to support all possible configurations when writing a module, but I generally prefer to start small, and support only the configuration I need for my use-case. This keeps things easy to test (and leaves some fun for the future!). In this case, the module would be first installed on a server which hosts a set of services behind the Traefik reverse proxy. To work out what configuration I wanted to provide, I looked at the upstream’s example config file. The important things that stood out to me for consideration were:
- The location of the web UI files to serve
- The host/port of the InfluxDB instance
- The “base path” - Scrutiny will be exposed at https://<tailscale node>/scrutiny
As mentioned before - each NixOS module consists of some options, and some rendered config. The options block for my Scrutiny web app looks like the below snippet. I’ve omitted comments this time, as I think the language is quite descriptive here:
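An illustrative version of that options block (the option names and defaults here are a sketch, not the exact ones from my module):

```nix
options.services.scrutiny = {
  enable = lib.mkEnableOption "the Scrutiny web dashboard";

  host = lib.mkOption {
    type = lib.types.str;
    default = "0.0.0.0";
  };

  port = lib.mkOption {
    type = lib.types.port;
    default = 8080;
  };

  # Serve the dashboard under a URL prefix, e.g. "/scrutiny"
  basePath = lib.mkOption {
    type = lib.types.str;
    default = "";
  };

  influxdb = {
    host = lib.mkOption {
      type = lib.types.str;
      default = "localhost";
    };
    port = lib.mkOption {
      type = lib.types.port;
      default = 8086;
    };
  };
};
```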
This provides some basic configuration for the attributes I care about. Notably missing is any advanced configuration for InfluxDB (such as org, bucket, token), and any ability to configure notifications, which Scrutiny supports through the excellent shoutrrr library. These things will come later.
I needed a convenient way to convert this Nix-native configuration format into the right format for Scrutiny - I wrote a small Nix function to help with that, which takes configuration elements from the options defined above, and writes them into a small YAML file:
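One way to do this is with the pkgs.formats.yaml generator from nixpkgs; a sketch of that helper (the exact YAML keys are assumptions modelled on Scrutiny's example config):

```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.services.scrutiny;
  settingsFormat = pkgs.formats.yaml { };

  # Render the module options into Scrutiny's scrutiny.yaml.
  # The frontend path is hard-coded to the assets shipped inside
  # the scrutiny package.
  scrutinyYaml = settingsFormat.generate "scrutiny.yaml" {
    version = 1;
    web = {
      listen = {
        host = cfg.host;
        port = cfg.port;
        basepath = cfg.basePath;
      };
      src.frontend.path = "${pkgs.scrutiny}/share/scrutiny/web";
      influxdb = {
        host = cfg.influxdb.host;
        port = cfg.influxdb.port;
      };
    };
  };
in
{
  # options and rendered systemd units go here
}
```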
Note that this hard-codes some key elements, such as the path of the web application assets which are shipped as part of the scrutiny package.
Let’s take a look at the part of the module which turns these options into a rendered system configuration:
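In outline, it enables InfluxDB and defines a systemd service for the dashboard. This is a sketch; scrutinyYaml refers to the rendered YAML config from the helper described above, and the service options shown are illustrative:

```nix
config = lib.mkIf cfg.enable {
  # Scrutiny needs a local InfluxDB to store attribute history
  services.influxdb2.enable = true;

  systemd.services.scrutiny = {
    description = "Scrutiny hard drive monitoring dashboard";
    wantedBy = [ "multi-user.target" ];
    after = [ "network.target" "influxdb2.service" ];
    serviceConfig = {
      DynamicUser = true;
      Restart = "on-failure";
      # scrutinyYaml is the YAML config rendered from the module options
      ExecStart = "${pkgs.scrutiny}/bin/scrutiny start --config ${scrutinyYaml}";
    };
  };
};
```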
That’s enough to get the dashboard started, but it doesn’t take care of starting the collector. For that, I added a couple more configuration options to the module:
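Something along these lines, with names and defaults chosen for illustration:

```nix
options.services.scrutiny.collector = {
  enable = lib.mkEnableOption "the Scrutiny S.M.A.R.T metrics collector";

  # API endpoint of the dashboard the collector should report to
  endpoint = lib.mkOption {
    type = lib.types.str;
    default = "http://localhost:8080";
  };

  # How often to collect metrics, in systemd OnCalendar syntax
  interval = lib.mkOption {
    type = lib.types.str;
    default = "*:0/15"; # every 15 minutes
  };
};
```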
With that configuration in place, I needed to adjust the rendered configuration to include starting the collector. The collector is designed to run on an interval to post metrics to the dashboard. The upstream achieves this with cron, but I decided to use systemd timers:
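The oneshot service plus timer pattern looks roughly like this (the collector's CLI flags here are assumptions):

```nix
config = lib.mkIf cfg.collector.enable {
  # The collector scrapes smartd, so it must be running
  services.smartd.enable = true;

  systemd.services.scrutiny-collector = {
    description = "Scrutiny S.M.A.R.T metrics collector";
    serviceConfig = {
      Type = "oneshot";
      ExecStart =
        "${pkgs.scrutiny-collector}/bin/scrutiny-collector-metrics run"
        + " --api-endpoint ${cfg.collector.endpoint}";
    };
  };

  # Fire the oneshot service on a schedule instead of using cron
  systemd.timers.scrutiny-collector = {
    wantedBy = [ "timers.target" ];
    timerConfig = {
      OnCalendar = cfg.collector.interval;
      Persistent = true;
    };
  };
};
```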
And that’s it! You can see the final module all stitched together on GitHub.
Automated Testing
To me, a super exciting part of the NixOS ecosystem is its automated testing framework, which is used to great effect for testing hundreds of packages end-to-end before they’re released to the various NixOS channels. The NixOS testing framework provides a set of helpers for spawning fresh NixOS virtual machines, and ensuring that applications can start, services have the right side effects, etc.
I wanted to write a simple test to validate that Scrutiny will continue to work as I update my flake. In my view, the test needed to:
- Ensure that the packages could be built
- Ensure that when services.scrutiny.enable = true is set, the services start
- Ensure that the dashboard app renders the UI
- Ensure that the metrics collector can speak to the dashboard
Most of these are relatively trivial - one piece that’s a little harder is testing that the UI renders correctly. The application is rendered client-side using JavaScript, so a simple curl won’t get us the results we’re expecting. I had previously used Selenium for this purpose when I submitted a test for the LXD UI, so I chose to use that approach again.
The test code can be seen below, or on GitHub:
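A condensed sketch of such a test. The health-check endpoints used here are assumptions, and the Selenium-driven UI check is abbreviated to a simple fetch of the app shell:

```nix
pkgs.nixosTest {
  name = "scrutiny";

  nodes.machine = { pkgs, ... }: {
    services.scrutiny.enable = true;
    services.scrutiny.collector.enable = true;
    environment.systemPackages = [ pkgs.curl ];
  };

  testScript = ''
    machine.start()
    machine.wait_for_unit("scrutiny.service")
    machine.wait_for_open_port(8080)

    # The API should come up even before any metrics arrive
    # (endpoint path is an assumption)
    machine.succeed("curl --fail http://localhost:8080/api/health")

    # Run the collector once and check it can talk to the dashboard
    machine.succeed("systemctl start scrutiny-collector.service")

    # The real test drives Selenium to check the Angular UI renders;
    # fetching the app shell at least proves assets are served
    machine.succeed("curl --fail http://localhost:8080/web/")
  '';
}
```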
This relatively short code snippet will take care of:
- Building a dedicated virtual machine according to the machine spec
- Starting the VM
- Running the tests inside the VM (specified in testScript)
- Collecting the results
I ended up adding this check in a GitHub Actions workflow so that it’s run each time I make a change to my flake.
Try It!
You can give this a go too! If you’ve got a NixOS machine, or perhaps a virtual machine you’ve been experimenting with, then you need only add the following to your flake:
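The shape of that flake change is roughly the following; the scrutiny-flake input URL is a placeholder you would point at the flake that exports the packages, overlay and module:

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    # Placeholder: point this at the flake that ships the Scrutiny
    # packages, overlay and NixOS module
    scrutiny-flake.url = "github:<owner>/<repo>";
  };

  outputs = { self, nixpkgs, scrutiny-flake, ... }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ./configuration.nix
        scrutiny-flake.nixosModules.scrutiny
        { nixpkgs.overlays = [ scrutiny-flake.overlays.default ]; }
      ];
    };
  };
}
```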
And in your NixOS machine configuration:
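With the module imported, enabling everything is a couple of lines:

```nix
{
  services.scrutiny = {
    enable = true;            # dashboard + InfluxDB
    collector.enable = true;  # smartd + the metrics collector
  };
}
```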
The next time you rebuild your system, you should be able to browse to http://localhost:8080 and see some hard drive metrics!
What’s Next?
This was a pretty quick exercise - I’d estimate about 3 hours in total. No doubt I will find some bugs, but already in my mind are the following future improvements:
- Add some more tests that exercise more of the config options
- Enable notifications configuration in the module
- Submit the packages/module to the upstream or to nixpkgs
I also need to go and investigate this rather sad looking hard disk…
That’s all for now! Thanks for reading if you got this far!