Benchmarking NINA XISF Compression: Speed, File Size, and Byte Shuffling
NINA has several compression algorithms available when saving your astroimages as .xisf files. They're documented here, but the descriptions of each option's speed and compression effectiveness are only qualitative. My imaging PC is a MeLE mini PC -- with a measly N150 CPU -- and I upload my images to the cloud via a mobile hotspot, so I wanted some real data on how computationally intensive and space-efficient these compression algorithms are. I went looking for existing comparisons of the compression algorithms in NINA.
What I found is that NINA's compression techniques are all very common algorithms whose performance is thoroughly benchmarked for a wide variety of applications, and byte shuffling is likewise not unique to NINA. But astrophoto data -- being extremely dark -- is unusually sparsely populated, meaning that existing compression benchmarks may not be applicable to our purposes. And NINA's implementation of an algorithm might be more or less efficient than the same algorithm implemented in other software. So this necessitated some testing of my own.
Results Up Front
The to-be-compressed images were 16-bit, 26MP monochromatic images from my QHY268M Pro. To keep things fair between tests while averaging out statistical variations, the same batch of 10 images was tested for each permutation of algorithm & byte shuffling. More details about my testing methodology are below.
This is the resulting average file size, in megabytes, using each compression algorithm with byte shuffling enabled and disabled:

With no compression, the file size was 52.2 MB.
With byte shuffling disabled, LZ4 was 38.3 MB, LZ4HC was 29.1 MB, and Zlib was 23.8 MB.
With byte shuffling enabled, LZ4 was 27.1 MB, LZ4HC was 26.2 MB, and Zlib was 19.4 MB.
It's good to know that byte shuffling always further reduces file size. But does it come at the cost of additional computational effort to compress the images? This next chart compares the average times to compress an image, measured in seconds, for each compression algorithm with byte shuffling enabled and disabled:

Note: The absolute values shown here are system-dependent, but the relative values should still be roughly correct for your system. These timings measure compression + checksum computation only, not disk I/O, which was measured separately and found to be relatively constant across configurations.
With no compression, the time was 0.02s.
With byte shuffling disabled, LZ4 was 0.17s, LZ4HC was 2.51s, and Zlib was 17.43s (!!!)
With byte shuffling enabled, LZ4 was 0.12s, LZ4HC was 1.17s, and Zlib was 3.59s.
Notice how without byte shuffling, Zlib performs extremely poorly on raw astronomical data. Byte shuffling transforms the data into a form that Zlib can exploit efficiently: in a dark 16-bit image most pixels' high bytes are at or near zero, and shuffling groups those bytes together into long, highly repetitive runs that a deflate-style compressor handles quickly.
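To make that concrete, here's a stdlib-only sketch (not NINA's actual implementation) of 16-bit byte shuffling on synthetic "dark frame" data. The pixel values here are made up for illustration -- a very dark frame where every sample fits in the low byte:

```python
import random
import zlib

# Simulate a very dark 16-bit frame: values hug the bottom of the 0-65535
# range, so nearly every sample's high byte is zero. (Illustrative data only.)
random.seed(42)
pixels = [random.randint(0, 255) for _ in range(100_000)]
raw = b"".join(p.to_bytes(2, "little") for p in pixels)

# Byte shuffling for 16-bit data: de-interleave into all low bytes,
# then all high bytes.
shuffled = raw[0::2] + raw[1::2]

plain_size = len(zlib.compress(raw, 6))
shuffled_size = len(zlib.compress(shuffled, 6))
print(f"raw: {len(raw)} B, zlib: {plain_size} B, shuffle+zlib: {shuffled_size} B")
```

The shuffled buffer compresses noticeably smaller: the high-byte plane collapses into one long run of zeros that deflate encodes almost for free, whereas interleaved zeros between noisy low bytes defeat its match-finding.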
Conclusions
You're welcome to draw your own conclusions from my data, but my takeaways (and my recommendations going forward) are these:
- There's little reason to use no compression algorithm at all. LZ4 compression is practically instant, bringing moderate file size savings essentially for free.
- When using a compression algorithm, byte shuffling should always be enabled. I talked to the NINA developers about this, and my takeaway is that essentially every computer capable of running NINA reasonably is fast enough that byte shuffling is exclusively beneficial to both compression speed and effectiveness, never a downside.
- Each of the algorithms has a purpose. LZ4 is great if you're majorly concerned about computer performance. Zlib is great if you're majorly concerned about disk space. LZ4HC is a great middle-ground if you want most of the benefits of LZ4 and most of the benefits of Zlib.
- Checksums have a basically negligible performance hit (relative to compression & disk write time). I did not include plots of those results because they're not very interesting -- it's mostly just a flat line. I'm leaving mine set to SHA-1, as support for it is required by the XISF file format spec, so I figure it'll be future-proof while still providing more than enough corruption protection for anything I need.
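As a rough illustration of why the checksum barely registers next to compression, here's a stdlib sketch (not NINA's code; note the buffer is random bytes, so unlike real frame data it's incompressible, but the relative timing point stands):

```python
import hashlib
import os
import time
import zlib

# 16 MB stand-in buffer (random, hence incompressible -- illustration only).
data = os.urandom(16 * 1024 * 1024)

t0 = time.perf_counter()
digest = hashlib.sha1(data).hexdigest()
t_sha1 = time.perf_counter() - t0

t0 = time.perf_counter()
compressed = zlib.compress(data, 6)
t_zlib = time.perf_counter() - t0

print(f"SHA-1: {t_sha1 * 1000:.1f} ms   zlib level 6: {t_zlib * 1000:.1f} ms")
```

On any reasonably modern CPU, hashing the buffer finishes in a small fraction of the time deflate spends on it, which matches the "flat line" I saw when varying checksum strength.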
One final thing to note: In NINA, your sequence continues on as soon as the sensor readout is complete, with image compression, checksum, and disk writing happening in parallel in the background. Your choice of compression setting does not impact the time your camera spends imaging.
Methodology
I used NINA's Camera Simulator, pointing it at a directory containing 10 16-bit, monochromatic .tiff files from my QHY268M. The Camera Simulator bypasses sensor readout but preserves the file-writing and compression pipeline used during real acquisitions. The photos were light frames of the Crescent Nebula, taken at my usual gain (56) and offset (10), which are dialed in such that none of the pixels on my bias frames have an ADU value of zero, as is best practice.
There are 32 permutations of compression algorithm, byte shuffling enabled/disabled, and checksum strength. I coded a version of NINA 3.3 to save each image taken with the simulated camera using each of these 32 permutations of parameters, one after the other, with 10 seconds in between each save to prevent my laptop from building up heat between runs. I then ran this for each of the 10 images, for 320 saved images in total. In my NINA version I coded in a timer to measure the duration of the file compression & checksum computation, logging it to a CSV alongside details of the file saved. The data was manually reviewed to check whether there were any abnormal outliers -- there were none. The standard deviations were unremarkable and did not change any of the conclusions, so they were excluded from the charts shown here for the sake of clarity.
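My actual changes were in NINA's C# code, but the shape of the harness looks roughly like this Python sketch (zlib stands in for NINA's Zlib option; LZ4/LZ4HC are omitted since they aren't in Python's standard library, and the frame here is a placeholder rather than a real 26 MP image):

```python
import csv
import hashlib
import io
import time
import zlib

def byte_shuffle(raw: bytes) -> bytes:
    """De-interleave 16-bit samples: all low bytes, then all high bytes."""
    return raw[0::2] + raw[1::2]

def timed_save(raw: bytes, level: int, shuffle: bool):
    """Time compression + checksum for one permutation; return (seconds, size)."""
    t0 = time.perf_counter()
    payload = byte_shuffle(raw) if shuffle else raw
    blob = zlib.compress(payload, level) if level > 0 else payload
    hashlib.sha1(blob).digest()  # checksum over the stored bytes
    return time.perf_counter() - t0, len(blob)

frame = bytes(2 * 1024 * 1024)  # placeholder frame data
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["level", "shuffle", "seconds", "bytes"])
for level in (0, 1, 6):          # 0 = no compression
    for shuffle in (False, True):
        secs, size = timed_save(frame, level, shuffle)
        writer.writerow([level, shuffle, f"{secs:.4f}", size])
print(out.getvalue())
```

One CSV row per permutation per image, which is then easy to average per configuration and chart.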
The laptop I ran this on was a Dell XPS 9730 with an i7-13700H, 32 GB RAM, and an NVMe SSD. This is more powerful than many of the low-spec computers commonly used for imaging, so expect longer absolute times on slower hardware, but the relative comparisons should still hold.