LERC compression benchmarks

9 November 2021

Benchmarks

After looking at how LERC compression works last week, I compared the compression ratio and read/write speed for different LERC MAX_Z_ERROR values against other compression methods. I ran the analysis on my desktop computer (which has an SSD) and did the reading and writing with rasterio 1.2.6 (which uses GDAL 3.3.0).

I used the file Copernicus_DSM_COG_10_S43_00_E171_00_DEM.tif: a tile of the 30m Copernicus DEM containing elevation values for New Zealand’s West Coast. The file is 3600x3600 pixels, is in float32 format, has a 512x512 internal block size, and is about 56MB uncompressed.

DEM heatmap

The results roughly align with @kokoalberti’s GeoTIFF compression benchmarks. Each compression method was run 10 times and the median was taken. Higher is better for all measures.
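The timing procedure described above (10 runs per method, median taken) could be sketched roughly as follows. This is a minimal illustration, not the script used for these benchmarks: `median_runtime` and `write_raster` are hypothetical names, and the commented rasterio usage assumes a `profile` dict and a `data` array that are not defined here.

```python
import statistics
import time


def median_runtime(func, runs=10):
    """Call func() `runs` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical usage with rasterio (not executed here):
#
#   import rasterio
#
#   def write_raster():
#       with rasterio.open("out.tif", "w", **profile) as dst:
#           dst.write(data)
#
#   seconds = median_runtime(write_raster)

if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any raster data.
    print(median_runtime(lambda: sum(range(100_000))))
```

Taking the median rather than the mean keeps a single slow run (for example, one hit by disk caching effects) from skewing the result.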

Algorithm	Max error	Write speed	Read speed	Compression ratio
none	0	1087	3217	1.0
int16	0	1577	6700	2.0
int16 + ZSTD	0	560	1815	4.4
ZSTD	0	519	957	1.5
ZSTD (predictor 3)	0	224	367	2.0
Deflate	0	33	253	1.6
Deflate (predictor 3)	0	35	216	2.0
LERC	1e-5	315	473	1.9
LERC	1e-4	325	482	2.2
LERC	1e-3	337	492	2.7
LERC	1e-2	347	490	3.4
LERC	1e-1	361	511	4.6
LERC	1.0	377	519	7.0

Some observations

Even with a small MAX_Z_ERROR of 0.00001, LERC matches the DEFLATE compression ratio and massively outperforms it on speed: about 9x faster to write and about 2x faster to read.
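Compared to ZSTD, LERC with a MAX_Z_ERROR of 0.00001 achieves a similar compression ratio (1.9, versus 1.5 for plain ZSTD and 2.0 with predictor 3). Plain ZSTD reads about 2x faster and writes about 1.5x faster than LERC at every error value, whereas ZSTD with predictor 3 is slower than LERC on both.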
If you’re converting floating point data to integers and storing it as a compressed GeoTIFF, consider using LERC with an error of 0.5 instead: the compression ratio is better, though speeds are slower.
Increasing MAX_Z_ERROR has a big impact on compression ratio. Larger MAX_Z_ERROR also slightly improves read and write speeds.
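The suggestion to use an error of 0.5 in place of integer conversion follows from a simple bound: rounding to the nearest integer changes each value by at most 0.5. The sketch below checks this on synthetic values. `quantise` is a hypothetical helper that mimics an error-bounded quantiser for illustration; it is not the actual LERC codec.

```python
import random


def quantise(values, max_error):
    """Snap each value to a grid with spacing 2 * max_error.

    Simplified stand-in for an error-bounded quantiser (an assumption made
    for illustration, not LERC's real implementation): every reconstructed
    value lies within max_error of the original, up to float rounding.
    """
    step = 2 * max_error
    return [round(v / step) * step for v in values]


def max_abs_error(original, approx):
    """Largest absolute difference between paired values."""
    return max(abs(a - b) for a, b in zip(original, approx))


random.seed(0)
# Synthetic "elevations" in metres, purely illustrative.
elevations = [random.uniform(0, 3000) for _ in range(10_000)]

as_int = [round(v) for v in elevations]        # float -> integer conversion
as_grid = quantise(elevations, max_error=0.5)  # error-bounded quantisation

print(max_abs_error(elevations, as_int))   # never more than 0.5
print(max_abs_error(elevations, as_grid))  # never more than 0.5
```

Both approaches share the same worst-case error, so the comparison in the benchmark table comes down to compression ratio and speed.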

Stacked LERC compression

LERC can be stacked with other compression algorithms: first the data is compressed using LERC, then that result is compressed with a lossless algorithm like ZSTD or Deflate.

The idea is that the output of LERC compression is integers with high autocorrelation, so there is still some signal that can be used for compression.
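As a rough sketch of how plain and stacked LERC might be selected when writing with rasterio: GDAL's GTiff driver names the stacked variants `LERC_ZSTD` and `LERC_DEFLATE`, and takes the error bound through the `MAX_Z_ERROR` creation option. The helpers `lerc_profile` and `write_with_profile` are hypothetical, and the sketch assumes rasterio forwards the `compress` and `max_z_error` keys to GDAL as creation options and that the GDAL build includes LERC support.

```python
def lerc_profile(base_profile, max_z_error, stacked_with=None):
    """Return a copy of a rasterio-style profile dict with LERC options set.

    stacked_with: None for plain LERC, or "ZSTD" / "DEFLATE" for the
    stacked variants (LERC_ZSTD and LERC_DEFLATE in GDAL's GTiff driver).
    """
    if stacked_with not in (None, "ZSTD", "DEFLATE"):
        raise ValueError(f"unsupported second stage: {stacked_with!r}")
    profile = dict(base_profile)
    profile["compress"] = "LERC" if stacked_with is None else f"LERC_{stacked_with}"
    profile["max_z_error"] = max_z_error
    return profile


def write_with_profile(path, data, profile):
    """Write a 2D array as a single-band GeoTIFF (needs rasterio; not run here)."""
    import rasterio  # imported lazily so lerc_profile works without it

    with rasterio.open(path, "w", **profile) as dst:
        dst.write(data, 1)


# Illustrative base profile matching the tile described earlier.
base = {
    "driver": "GTiff",
    "dtype": "float32",
    "count": 1,
    "width": 3600,
    "height": 3600,
    "tiled": True,
    "blockxsize": 512,
    "blockysize": 512,
}

print(lerc_profile(base, 0.01)["compress"])                       # LERC
print(lerc_profile(base, 0.01, stacked_with="ZSTD")["compress"])  # LERC_ZSTD
```

A real file would also need a CRS and transform in the profile; they are left out here to keep the example short.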

In practice, I don’t recommend this. The first reason is support: LERC already has limited support in geo software, and stacked compression even less. Stacked LERC compression isn’t supported by tifffile, and has only recently been supported by rasterio.

But the main reason I don’t use stacked LERC compression is that it doesn’t really help that much. Read and write speeds are generally worse than with LERC alone: in the tables below, LERC+ZSTD reads at most about 8% slower (and usually within 2%) but writes roughly 10-20% slower, while LERC+Deflate is about 1.4-2x slower to read and 3-5x slower to write. Stacked compression does result in smaller files, but not by much: averaged across the MAX_Z_ERROR levels, the compression ratio improves by about 1% with LERC+ZSTD and about 2% with LERC+Deflate, with most of the gain at a MAX_Z_ERROR of 1.

Write speed

Algorithm / Max error	1	1e-1	1e-2	1e-3	1e-4	1e-5
lerc	377	361	347	337	325	315
lerc_deflate3	138	116	96	81	69	60
lerc_zstd3	344	325	304	282	270	253

Read speed

Algorithm / Max error	1	1e-1	1e-2	1e-3	1e-4	1e-5
lerc	519	511	490	492	482	473
lerc_deflate3	379	326	311	283	259	238
lerc_zstd3	478	498	494	489	477	468

Compression ratio

Algorithm / Max error	1	1e-1	1e-2	1e-3	1e-4	1e-5
lerc	7.00	4.61	3.42	2.71	2.25	1.92
lerc_deflate3	7.49	4.70	3.46	2.74	2.27	1.94
lerc_zstd3	7.27	4.63	3.42	2.71	2.25	1.93
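To compare the stacked variants against plain LERC, the relative differences can be computed directly from the three tables above. The sketch below averages the per-column percentage change. This is only one reasonable way to summarise the tables and is not necessarily how the figures quoted in the text were derived; `mean_percent_change` is a hypothetical helper.

```python
# Values copied from the write speed, read speed and compression ratio tables.
# Columns are MAX_Z_ERROR = 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5.
tables = {
    "write speed": {
        "lerc": [377, 361, 347, 337, 325, 315],
        "lerc_deflate3": [138, 116, 96, 81, 69, 60],
        "lerc_zstd3": [344, 325, 304, 282, 270, 253],
    },
    "read speed": {
        "lerc": [519, 511, 490, 492, 482, 473],
        "lerc_deflate3": [379, 326, 311, 283, 259, 238],
        "lerc_zstd3": [478, 498, 494, 489, 477, 468],
    },
    "compression ratio": {
        "lerc": [7.00, 4.61, 3.42, 2.71, 2.25, 1.92],
        "lerc_deflate3": [7.49, 4.70, 3.46, 2.74, 2.27, 1.94],
        "lerc_zstd3": [7.27, 4.63, 3.42, 2.71, 2.25, 1.93],
    },
}


def mean_percent_change(baseline, other):
    """Mean of the per-column percentage change of `other` relative to `baseline`."""
    changes = [100 * (o - b) / b for b, o in zip(baseline, other)]
    return sum(changes) / len(changes)


for measure, rows in tables.items():
    for name in ("lerc_deflate3", "lerc_zstd3"):
        change = mean_percent_change(rows["lerc"], rows[name])
        print(f"{measure:18s} {name:14s} {change:+.1f}%")
```

Negative values mean the stacked variant scored lower than plain LERC on that measure, positive values mean higher.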