Data validity: are your numbers legit?

By Christopher G Relf*
Wednesday, 13 July, 2005


While the emergence of data acquisition technology has substantially changed the way we do business, overall system designs often invalidate the calibration and data integrity established at the lower levels. This month we look at the process of getting valid data right up the measurement chain, from sensor to disk.

Accuracy versus resolution

One issue I come up against time and time again is an engineer's inability to differentiate accuracy and resolution, and such confusion can often lead to poor system design and inappropriate data. Although related, accuracy and resolution are very different animals. With respect to a data acquisition (DAQ) instrument, accuracy is the device's degree of absolute correctness, whereas resolution is the smallest change that can be displayed or recorded by the device. For example, an instrument with an accuracy of say ±0.015% will represent a measurement as between 0.99985 and 1.00015 of the true value, whereas the display might have a resolution of say six digits. In this example, although a DC signal of 1.00000 V might be represented on the display as 1.00015 V, it should not be assumed that the last digit is meaningful - it isn't, as the instrument is not accurate enough to actually measure an input voltage step of 0.00001 V.
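
To make the distinction concrete, here is a minimal sketch in Python using the hypothetical figures above (±0.015% accuracy, six-digit display); the numbers are illustrative, not taken from any particular instrument:

```python
# Illustrative only: the 6-digit display and +/-0.015% accuracy figures are
# the hypothetical example values from the paragraph above.
true_value = 1.00000        # volts: the actual DC level applied
accuracy = 0.015 / 100      # +/-0.015% of reading
resolution = 1e-5           # smallest display step on a 6-digit readout (10 uV)

error_band = true_value * accuracy
print(f"Reading may lie anywhere between {true_value - error_band:.5f} V "
      f"and {true_value + error_band:.5f} V")
print(f"Display step is {resolution:.5f} V; the accuracy band spans "
      f"{2 * error_band / resolution:.0f} display counts")
```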

Calibration

Calibration is a necessary engineering evil. Raw numbers acquired from physical sensors mean very little without transformation to real-world units, and this process is called calibration.

Figure 1 shows three traces with identical data points (the red dots) and three methods of curve-fitting calibration. The first trace is fitted to a straight line using the old faithful y = mx + b equation. As this set of acquired data doesn't appear to be linear, this fit is inappropriate. The second trace is fitted with the ever-popular polynomial transform with an order of two, and although it follows the original data more closely than the preceding linear fit, it misses several of the outlying points. The third trace increases the polynomial order to three, and the mean squared error (MSE) between the data points and the fitted curve decreases further. So, which fit is the most appropriate? Well, it depends. If we assume that the traces shown represent the full scale of the acquisition system, then the linear fit is less appropriate than the polynomial fits. Although the 3rd order polynomial has a lower MSE than the 2nd order, this could be due to noise in the signal, and repeating the measurements several times or decreasing the voltage step between test samples may reveal that the curve is actually exponential. This example demonstrates that although polynomial fitting is seen as the best general-purpose fitting routine in practical terms, it pays to know the sensor's characteristics from its technical specifications, or at least do a little testing to find an appropriate transform.
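
As a rough sketch of the comparison above (the data points below are invented for illustration, not the values from Figure 1), NumPy's polyfit makes it easy to compare the MSE of fits of increasing order:

```python
import numpy as np

# Hypothetical calibration points: sensor output (x) vs reference value (y)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 0.9, 2.3, 4.1, 7.2, 11.8])   # clearly non-linear response

for order in (1, 2, 3):
    coeffs = np.polyfit(x, y, order)             # least-squares polynomial fit
    fitted = np.polyval(coeffs, x)
    mse = np.mean((y - fitted) ** 2)             # mean squared error of the fit
    print(f"order {order}: MSE = {mse:.4f}")
```

As noted above, a lower MSE alone doesn't make the higher-order fit the right choice; it may simply be following noise in the measurements.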

Consider the system: a whole made of many parts

System calibration is almost always a multi-step process, as the path from the physical phenomenon to the data storage location is usually rather complex. Too many system engineers assume that correctly coupled and calibrated hardware components will result in valid data, without considering how the hardware's drivers, DAQ applications, calibration transformation algorithms and even operating systems can transform their data. Every step of data transformation should be treated with caution to avoid clipping, re-mapping and coercion.

For example, a typical data acquisition path introduces transducer errors when the physical phenomenon is converted into a voltage or current; the resulting signal often degrades as it travels through connectors and along the sensor cable due to internal resistance and external noise; the DAQ hardware's analog-to-digital converters (ADCs) quantise the signal to make it suitable for logging; the operating system may impose further quantisation or range limits on the acquired data; and the logging software may coerce and convert the data representation before it is sent to disk. It is for these reasons that calibration services prefer to certify the calibration of systems rather than system components - the calibration of a transducer may be irrelevant when it is plugged into an uncalibrated DAQ card.
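
One common way to quantify the cumulative effect of such a chain (though by no means the only one) is a simple error budget that combines independent error sources as a root sum of squares. The contributions below are placeholder figures, not measured values:

```python
import math

# Hypothetical, independent error contributions, each as a percentage of reading
errors_pct = {
    "transducer": 0.10,            # physical quantity to voltage conversion
    "cabling and connectors": 0.02,
    "ADC quantisation": 0.05,
    "software coercion": 0.01,
}

rss = math.sqrt(sum(e ** 2 for e in errors_pct.values()))   # root sum of squares
worst_case = sum(errors_pct.values())                        # straight addition
print(f"Combined error: {rss:.3f}% (RSS) to {worst_case:.3f}% (worst case)")
```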

Quantisation: steps to valid data

As mentioned above, when the voltage or current signals are converted to their respective digital representations, they are quantised. As values of infinite precision are impossible in a digital system, the resulting signal is represented as a series of steps between the accepted levels. If the quantisation steps are coarser than the resolution of the physical signal (as is almost always the case), then some loss of accuracy is unavoidable. As you might expect, quantisation exists in both the amplitude and time domains, so the rate at which a signal is acquired will also introduce quantisation, as demonstrated in Figure 2.
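
The following sketch quantises an ideal sine wave with a hypothetical 3-bit, ±5 V converter, deliberately coarse so that the stepping and the resulting error are obvious; real hardware of course uses far more bits:

```python
import numpy as np

full_scale = 5.0                    # hypothetical +/-5 V input range
bits = 3                            # deliberately coarse to exaggerate the steps
step = 2 * full_scale / 2**bits     # quantisation step size (one LSB)

t = np.linspace(0.0, 1.0, 1000)
signal = 4.0 * np.sin(2 * np.pi * 5 * t)          # ideal analog signal
quantised = np.clip(np.round(signal / step) * step, -full_scale, full_scale)

print(f"step size = {step:.3f} V, "
      f"max quantisation error = {np.max(np.abs(signal - quantised)):.3f} V")
```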

Whilst on the topic of quantisation, non-DC signals must be acquired using a DAQ system that observes the Nyquist theorem, which states that in order to recover all Fourier components of a periodic waveform, it is necessary to sample at least twice as fast as the highest frequency present in the waveform. So if v is the sampling rate, the Nyquist frequency (v/2) is the highest frequency that can be coded at that sampling rate and still allow the signal to be fully reconstructed.

If data is acquired below the Nyquist rate, then data features are lost through aliasing, as shown in Figure 3. Although the example acquired signal is periodic like the input signal, it certainly isn't representative of the input signal.
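
A short sketch of that effect with arbitrary frequencies: a 9 Hz cosine sampled at only 10 samples per second (well below the 18 samples per second the Nyquist theorem demands) produces exactly the same sample values as a 1 Hz cosine.

```python
import numpy as np

f_signal = 9.0      # Hz: input frequency
f_sample = 10.0     # Hz: sampling rate, well below 2 x 9 Hz

n = np.arange(20)                                      # two seconds of samples
samples = np.cos(2 * np.pi * f_signal * n / f_sample)

# The 9 Hz input "folds back" to |f_sample - f_signal| = 1 Hz
alias = np.cos(2 * np.pi * 1.0 * n / f_sample)
print("max difference from a 1 Hz cosine:", np.max(np.abs(samples - alias)))
```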

It all starts in the hardware

By definition, transducers convert the representation of one physical phenomenon into another. For example, a fluid level transducer may convert the level of fuel in a tank into a corresponding voltage that excites the coil of a panel meter: an indicative representation of the amount of fuel. Although most traditional sensors have practically infinite resolution, the process of turning the detected physical phenomenon into a meaningful signal (voltage, current, etc) introduces errors. No transducer mapping is truly linear over its operational range, and the introduction of hardware and firmware transform systems now offers elaborate calibration algorithms until recently only seen in the software domain. These transforms can be attached to a specific transducer and/or included in the attached DAQ hardware. For example, the recently released National Instruments M Series DAQ cards boast a full-range, 29-point, 3rd order polynomial calibration curve, as opposed to the previous E Series standard of two (and therefore linear) points. Although this process verifies the calibration of the DAQ card's ADCs, it does not represent the calibration of the whole system. Custom systems have also been developed to support even more complex calibration algorithms for specialist transducers, including the ever-popular polynomial fit, often using field-programmable gate arrays (FPGAs): a fast and configurable way of embedding transparent calibration and transformation routines in firmware.
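
To illustrate what such a transform does, here is a minimal sketch of applying a 3rd order polynomial calibration curve per sample; the coefficients are invented, not taken from any real M Series card, and in practice the evaluation may happen in driver software or FPGA firmware rather than in Python:

```python
import numpy as np

# Hypothetical 3rd order calibration coefficients: c0 + c1*x + c2*x^2 + c3*x^3,
# as might be derived from a multi-point calibration of one ADC channel.
coeffs = (0.002, 1.010, -0.0004, 0.00002)

def calibrate(raw_volts):
    """Map raw ADC readings to calibrated voltages using the polynomial."""
    x = np.asarray(raw_volts, dtype=float)
    return coeffs[0] + coeffs[1] * x + coeffs[2] * x**2 + coeffs[3] * x**3

print(calibrate([0.0, 1.0, 5.0]))   # approximately [0.002, 1.0116, 5.0445]
```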

Software = hard transforms

Once the data is in the DAQ system, it is almost always manipulated by the software that displays and stores it. The acquired data is represented as a quantised number, and if the representation or 'width' of that number is smaller than that delivered by the acquisition hardware, then data loss is inevitable. For example, if data is acquired at a width of say 18 bits but is then represented in software as 16 bits, 2 bits are discarded. In the case of signed representations, this coercion may result in clipping of the data, and in the case of unsigned representations the data may wrap around to the opposite end of the scale.
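
A sketch of that coercion, assuming a raw reading that fits in 18 bits but not in 16; NumPy's fixed-width integer types make the difference between careful clipping and naive wrap-around easy to see:

```python
import numpy as np

raw = 100000                     # fits in 18 bits, but not in 16

# A careful conversion clips the value to the signed 16-bit range...
clipped = np.int16(np.clip(raw, -32768, 32767))
# ...whereas a naive cast keeps only the low 16 bits and wraps around.
wrapped_signed = np.array(raw, dtype=np.int64).astype(np.int16)
wrapped_unsigned = np.array(raw, dtype=np.int64).astype(np.uint16)

print(f"original value:     {raw}")
print(f"clipped to int16:   {clipped}")          # 32767
print(f"wrapped as int16:   {wrapped_signed}")   # -31072
print(f"wrapped as uint16:  {wrapped_unsigned}") # 34464
```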

Does it all really matter?

The answer is 'it depends'. Although any representation of real physical phenomena is only indicative, you must decide what degree of error you are comfortable with. There's no point specifying an expensive and complex system design to minimise data error when your end user doesn't require it and a 'rough' guide is all that's needed. That said, if you do need the numbers, it's certainly worth taking the time to do a comprehensive system analysis to find the cumulative errors in your data acquisition and handling.

* Christopher G Relf is the development manager/senior technical specialist for Neo Vista System Integrators Pty Ltd (www.nvsi.com.au). A keen software and hardware automation engineer, Christopher is a National Instruments Certified LabVIEW Architect and the author of 'Image Acquisition and Processing with LabVIEW' (CRC Press). Christopher's industrial experience includes the Division of Telecommunications and Industrial Physics of the CSIRO, JDS Uniphase and freelance technical journalism for Australian Consolidated Press.
