How to Normalize a Set of Pressure Sensors

Once your project starts to grow it’s common to have multiple different sensors, from different vendors, measuring the same environmental parameter. Ideally, those sensors would produce the same readings in the same environment – but in practice there are significant offsets. Datasheets for the MS5837-02BA and MS5803-14BA that we will compare in this post claim an accuracy of (±0.5mbar) and (±2ºC) for the 2-bar while the 14-bar sensors are only rated to (±20mbar) and (±2ºC). Sensors from Measurement Specialties are directly code compatible so the units here were read with the same Over Sampling settings.

Barometric pressure from a set of nine MS58xx pressure sensors running on a bookshelf as part of normal burn-in testing. The main cluster has a spread of about 10millibar, with one dramatic outlier >20 mbar from the group. These offsets are much wider than the datasheet spec for those 2-bars sensors.

But this is only a starting point: manufacturers have very specific rules about things like the temperature ramps during reflow and it’s unlikely that cheap sensor modules get handled that carefully. Housing installation adds both physical stress and thermal mass which will induce shifts; as can the quality of your supply voltage. Signal conditioning and oversampling options usually improve accuracy, but there are notable exceptions like the BMP280 which suffers from self-heating if you run it at the startup defaults.

As described in our post on waterproofing electronics, we mount pressure sensors under mineral oil with a nitrile finger cot membrane.

Sensors like NTC thermistors are relatively easy to calibrate using physical constants. But finding that kind of high quality benchmark for barometric sensors is challenging if you don’t live near a government-run climate station. So we typically use a normalization process to bring a set of different sensors into close agreement with each other. This is a standard procedure for field scientists, but it’s hard to find because the word ‘normalization’ means different things in various industry settings. In Arduino maker forums it usually describes scaling the axes from a single accelerometer with (sensor – sensor.min )/( sensor.max – sensor.min ) rather than standardizing a group of sensors.

When calibrating to a good reference you generally assume that all the error is in your cheap DIY sensor and then do a linear regression by calculating a best fit line with the trusted data on they Y axis of a scatter plot.  However, even in the absence of a established benchmark you can use the same procedure with a ‘synthetic’ reference created by drawing an average from a group of sensors:

Note: Sensor #41 was the dramatic outlier more than 20millibar from the group (indicating a potential hardware fault) so this data is not include in our initial group average.

With that average you calculate y = Mx + B correction constants using the slope & intercept functions. This lets you copy/paste equations from one data set to the next which dramatically speeds up the process when you are working through several sensors at a time. It also recalculates those constants dynamically when you add or delete information:

The next step is to calculate the difference between the raw sensor data and the average: before and after these Y=Mx+B corrections have been applied to the original pressure readings. These differences between the group average and an individual sensor should be dramatically reduced by the adjustment:

After you copy/paste these calculations to each sensor, create x/y scatter plots of the residuals so you can examine them:

While the errors are now centered around zero, these graphs indicate that we are not quite finished. In the ideal case, residuals are soft fuzzy distributions with no observable patterns. But here we have a zigzag that is showing up in all the sensors. This is an indication that one (or more) of our sensors have an issue. Scrolling further along the columns identifies the offending sensors with nasty looking residual plots even after the corrections have been applied:

Sensor #41 (far right) was already rejected from the general average because of its enormous offset, but the high amplitude residual plots indicate that the data from #45 and #42 are also suspect. If we eliminate those two from the average the zigzag pattern virtually disappears from the rest of the sensors in the set:

There’s more we could learn from the residual distributions, but here we’ve simply used them to prune our reference data, preventing bad sensor input from harming the the average we use for our normalization.

And what do the sensor plots look like after the magic sauce is applied?

The same set of barometric pressure sensors, before and after normalization corrections. (minus #41 which could not be corrected)

It’s important to note that there is no guarantee that fitting your sensors to an average will do anything to improve accuracy. However, sensors purchased from different vendors, at different times, tend to have randomly distributed offsets. In those cases normalization improves both precision and accuracy, but the only way to know if that has happened is to validate against some external reference like the weather station at your local airport. There are several good long term aggregators that harvest METAR data from these stations like this one at Iowa State, or you can get the most recent week of data by searching for your local airport code at weather.gov

METAR is a format for weather reporting that is predominately used for pilots and meteorologists and they report pressure adjusted to ‘Mean Sea Level’. So you will have to adjust the MSL data before you can compare it to the pressure reported by your sensors. You will also need to know the exact altitude of your sensors when the data was gathered to remove the height offset between your location and the airport station.

Technically speaking, you could calibrate your pressure sensors directly to those official sources. However there are a lot of Beginner, Intermediate and Advanced details to take care of. Even then you still have to be close enough to know both locations are in the same weather system.
Here I’m just going to use the relatively crude adjustment equation:
Station Pressure = SLP – (elevation/9.2) and millibar = inchHg x 33.8639 to see if we are in the ballpark.

Barometric data from the local airport (16 miles away) overlayed on our normalized pressure sensors. It’s worth noting that the airport data is at a strange odd-minute intervals, with frequent dropouts which could complicate a real calibration.

Like most pressure sensors an MS58xx also records temperature because it needs that for internal calculations. So we can repeat the entire process with the temperature readings from this set:

Temperatures °C from a set of MS58xx Pressure sensors: before & after group normalization. Unlike pressure, this entire band was within the ±2ºC specified in the datasheet.

These sensors were sitting pretty far back on a bookshelf that was partly enclosed, so some of them were quite sheltered while others were exposed to direct airflow. So I’m not bothered by the spikes or the corresponding blips in those residual plots. I’m confident that if I had run this test inside a thermally controlled environment (ie: a styrofoam cooler with a small hole in the top) the temperature residuals would have been well behaved and smooth.

One of the loggers in this set had a calibrated NTC thermistor onboard. While this sensor had significant lag because it was located inside the housing, we can still use it to check if the normalized temperatures benefit from the same random distribution of errors that were corrected so nicely by the pressure normalization:

Once again, we have good alignment between a trusted reference and our normalized sensors.

Comments:

Normalization is a relatively low effort way to improve sets of sensors – and it’s vital if you are monitoring systems that are driven primarily by deltas rather than absolute values. This method generalizes to many other types of sensors although a simple y=Mx +B approach usually does not handle exponential sensors very well. As with calibration, the raw data used for normalization should span the range of values you expect to gather with the sensors later.

The method described here only corrects differences in Offset [with the B value] & Gain/Sensitivity [the M value] – more complex methods are needed to correct non-linearity problems. To have enough statistical weight for accuracy improvement you want a batch of ten or more sensors and it’s a good idea to exclude data from the first 24 hours of operation so new sensors have time to settle. Offsets are influenced by several factors and some sensors need to ‘warm up’ before they can be read. The code driving your sensors during normalization should be identical to the code used to collect data in the field.

All sensor parameters drift so, just like calibration, normalization constants have a shelf life. This is usually about one year, but can be less than that if your sensors are deployed in harsh environments. Fortunately this kind of normalization is easy to redo in the field, and it’s a good way to spot sensors that need replacing.


References & Links:

Decoding Pressure @ Penn State
Environmental Mesonet @ Iowa State
Calibrating your Barometer: Part1, Part2 & Part3
ISA Standard Atmosphere calculator
Starpath SLP calculator
SensorsONE Pressure Calculators
Mean Sea Level Pressure converter

Leave a comment