In this post we use the word “normalization” to describe the process of adjusting raw data from multiple temperature sensors to remove the offsets between them, so their readings can be compared accurately when they are deployed as a set. This is not the same as calibration, which often uses references with exact physical values to improve accuracy. Normalization is a highly complementary way to improve precision, which can be vital when you are monitoring systems driven primarily by gradients rather than absolute values.

Here we use this rig to normalize the Si7051 reference sensors we employ to calibrate the NTC thermistors on every Cave Pearl logger. We have accumulated quite a few of those Si7051s over the years, and it will be interesting to see if the old ones have drifted. This 48-quart cooler also has enough volume to calibrate temperature sensors inside the large PVC housings on our older generation of loggers.
Given the significant setup time, I ran several ramps to collect some surplus data. Even with the thermal control of the cooker, I prefer slow passive cooling curves for normalization; the circulator pump continues to operate whenever the set temperature is below the actual water temperature, so the bath stays well mixed as it cools. The system took five days to cool from 40 to 20°C – a rate slow enough to avoid any lag issues from inconsistent circulation patterns.
We have published many sensor normalization tutorials over the years, but I will review the basic steps here for completeness. You start by calculating an average from your group of sensors. This is the value toward which all of the individual sensors are adjusted: (click to enlarge the images)
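If you prefer scripting over a spreadsheet, the same steps are easy to reproduce in Python. Here is a minimal sketch of this first step, assuming a hypothetical file normalization_run.csv with a timestamp column followed by one raw-temperature column per sensor (the file and column names are just placeholders for illustration):

```python
# Minimal sketch: compute the group average that every sensor gets fitted to.
# Assumes a hypothetical 'normalization_run.csv' with a 'timestamp' column
# followed by one raw-temperature column per sensor (e.g. si_01, si_02, ...).
import pandas as pd

df = pd.read_csv("normalization_run.csv", parse_dates=["timestamp"])
sensor_cols = [c for c in df.columns if c != "timestamp"]

# Row-wise mean across all sensors = the target each sensor is adjusted toward
df["group_avg"] = df[sensor_cols].mean(axis=1)
```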

With that average you can calculate normalization constants using Excel’s SLOPE & INTERCEPT functions. Using these formulas lets you copy/paste equations from one data column to the next, which dramatically speeds up the process when you are working through several sensors at a time. Excel also recalculates those constants dynamically if you add or delete data from your normalization set:
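The scripted equivalent of SLOPE & INTERCEPT is an ordinary least-squares fit of each sensor’s raw readings (x) against the group average (y). A sketch continuing from the hypothetical DataFrame above:

```python
# For each sensor, fit: group_avg = M * raw + B
# linregress returns the same M & B that Excel's SLOPE/INTERCEPT would.
from scipy.stats import linregress

constants = {}
for col in sensor_cols:
    fit = linregress(df[col], df["group_avg"])
    constants[col] = (fit.slope, fit.intercept)   # (M, B) per sensor
    print(f"{col}: M={fit.slope:.5f}  B={fit.intercept:+.4f}")
```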
The next step is to calculate the difference (called the residual) between each individual sensor’s data and the average: you do this both before and after the y = Mx + B corrections have been applied to the original temperature readings. These differences between the group average and a given sensor should be dramatically reduced by the adjustment:
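In script form, the pre-correction residual is simply raw minus average, while the post-correction residual uses the adjusted reading (same assumed data as the earlier sketches):

```python
# Residuals before and after applying each sensor's y = M*x + B correction.
for col in sensor_cols:
    M, B = constants[col]
    df[f"{col}_resid_pre"] = df[col] - df["group_avg"]
    df[f"{col}_resid_post"] = (M * df[col] + B) - df["group_avg"]
```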
Then create x/y scatter plots of the pre & post residuals so you can compare them side-by-side. The normalized residuals should be evenly distributed around ZERO:
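A quick matplotlib sketch of that side-by-side comparison, assuming the residual columns created above:

```python
# Side-by-side scatter of pre vs. post residuals for every sensor.
import matplotlib.pyplot as plt

fig, (ax_pre, ax_post) = plt.subplots(1, 2, sharey=True, figsize=(10, 4))
for col in sensor_cols:
    ax_pre.scatter(df["timestamp"], df[f"{col}_resid_pre"], s=2, label=col)
    ax_post.scatter(df["timestamp"], df[f"{col}_resid_post"], s=2)
ax_pre.set_title("Raw residuals")
ax_post.set_title("Normalized residuals")
ax_pre.set_ylabel("Deviation from group average (°C)")
ax_pre.legend(markerscale=4, fontsize="small")
plt.tight_layout()
plt.show()
```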

I must mention here how remarkable it was that this batch of sensors produced what I can only describe as textbook residuals, with soft fuzzy distributions showing no observable patterns. Usually a batch of mixed sensors will show zigzag structures in the residual plots – an indication that one (or more) of the sensors has some kind of physical issue and should be booted out of the average. Another common problem is accidentally including data with a timestamp offset from the rest of the group. In that case the structure visible in the residual plots will be a version of the same shapes you see in the raw sensor data – and it will show up in EVERY residual plot.
Si7051 sensors are specified at ±0.1 °C (maximum error) over the human body temperature range of 35.8 °C to 41 °C, widening to ±0.25 °C across the full -40 to +125 °C range. Even with this batch starting within that respectable spec, our normalization tightens the spread to about 0.05°C:

It’s important to note that there is no guarantee that fitting your sensors to an average will do anything to improve accuracy. However, sensors purchased from different vendors, at different times, tend to have randomly distributed offsets. In that case normalization tends to improve accuracy as well, but the only way to know whether that has happened is to validate against some external reference. The method described here only corrects differences in offset [with the B value] & gain/sensitivity [the M value] – more complex methods are needed to correct non-linearity issues.
After the normalization, data from those runs was also used to calibrate the NTC sensors on the new loggers (ice points were collected separately). NTCs deliver higher resolution than the Si7051s, allowing us to look more closely at the Anova’s performance:

I suspect the 16 litre volume being driven here is near the upper limit of the cooker’s capability, but the shape of that curve indicates a very good level of PID control – especially considering the used cooker was $35 on eBay and the cooler was $10 at the local Goodwill. I’m still scratching my head about why the little Si7051 module reads rising thermal spikes faster than the NTC while spikes in the cooling direction stay aligned. I suspect it’s due to conduction from the NTC into the larger thermal mass of the ProMini board.
References & Links:
PID Controller – MATLAB (Guide) by Mike Deffenbaugh
PID Temperature Controllers: TUTCO Conductive