Orthogonal signal correction of near-infrared spectra

https://doi.org/10.1016/S0169-7439(98)00109-9

Abstract

Near-infrared (NIR) spectra are often pre-processed in order to remove systematic noise such as base-line variation and multiplicative scatter effects. This is done by differentiating the spectra to first or second derivatives, by multiplicative signal correction (MSC), or by similar mathematical filtering methods. This pre-processing may, however, also remove information from the spectra regarding Y (the measured response variable in multivariate calibration applications). We here show how a variant of PLS can be used to achieve a signal correction that is as close to orthogonal as possible to a given Y-vector or Y-matrix. Thus, one ensures that the signal correction removes as little information as possible regarding Y. In the case when the number of X-variables (K) exceeds the number of observations (N), strict orthogonality is obtained. The approach is called orthogonal signal correction (OSC) and is here applied to four different data sets of multivariate calibration. The results are compared with those of traditional signal correction as well as with those of no pre-processing, and OSC is shown to give substantial improvements. Prediction sets of new data, not used in the model development, are used for the comparisons.

Introduction

Near-infrared (NIR) spectroscopy is increasingly used for the characterisation of solid, semi-solid, fluid and vapour samples. Frequently the objective of this characterisation is to determine the value of one or several concentrations in the samples. Multivariate calibration is then used to develop a quantitative relation between the digitised spectra (the matrix X) and the concentrations (in the matrix Y), as reviewed by Martens and Naes [1]. NIR spectroscopy is also increasingly used to infer sample properties (Y) other than concentrations, e.g., the strength and viscosity of polymers, the thickness of a tablet coating, or the octane number of gasoline.

The first step of a multivariate calibration based on NIR spectra is often to pre-process the data. The reason is that NIR spectra often contain systematic variation that is unrelated to the responses (Y). For solid samples this systematic variation is due to, among other things, light scattering and differences in spectroscopic path length, and it may often constitute the major part of the variation of the sample spectra. Another reason for systematic but unwanted variation in the sample spectra may be that the analyte of interest absorbs only in small parts of the spectral region. The variation in X that is unrelated to Y may disturb the multivariate modelling and cause imprecise predictions for new samples.

For removal of undesirable systematic variation in the data, two types of pre-processing are commonly reported in the analytical chemistry literature: differentiation and signal correction. Popular approaches to signal correction include Savitzky–Golay smoothing [2], multiplicative signal correction (MSC) [1,3], Fourier transformation [4], principal components analysis (PCA) [5], variable selection [1,6], and baseline correction [1,11].

These signal corrections are actually different cases of filtering, where a signal (e.g., a NIR spectrum) is made to have 'better properties' by passing it through a 'filter', i.e., a mathematical function. The objectives of filtering are often rather vague; it is not always easy to specify what we mean by 'better properties'. Even if we, in the case of calibration, can specify this objective in terms of lowered prediction errors or simpler calibration models, it is difficult to construct general filters that indeed improve these properties of the data.

We here wish to report the first results of a very simple idea, namely to construct a filter that removes from the spectral matrix (X) only the part that definitely is unrelated to Y. This is done by ensuring that the removed part is mathematically orthogonal to Y, or as close to orthogonal as possible. We call this approach OSC, for orthogonal signal correction. As an illustration, we shall here use (a) the modelling of the viscosity of three sets of modified cellulose samples in terms of their NIR reflectance spectra and (b) a similar example where NIR spectra are used to model and predict 17 measured properties of pulp samples.

Differentiation to first and second derivatives, and variable selection, will not be further discussed, since these operations cannot easily be performed orthogonally to Y.

Section snippets

Notation

We shall use capital bold characters for matrices, e.g., X and Y, small bold characters for column vectors, e.g., v, non-bold characters for vector and matrix elements, e.g., x_ik and v_i, and for indices, e.g., i, j, k, and l, and capital non-bold characters for index limits, e.g., K, N, and A. Row vectors are seen as transposed vectors, e.g., v′, and hence transposition is indicated by a prime (′). The index i is used as sample index (rows in X and Y; i=1, 2, …, N) and the index k as index of

Filtering and calibration

Before multivariate calibration, the unanimous agreement was that ideal spectra looked like well resolved NMR or IR spectra, i.e., mainly a straight baseline plus some narrow and symmetrical peaks unambiguously rising above this baseline. Noise introduced 'wiggles', but could be removed by a judicious 'filtering' of the spectra. Many of the objectives of filtering are still formulated accordingly, as a way to make signals and spectra 'smooth' and pleasing to the eye, and thus easy to

Multiplicative signal correction (MSC)

Let us first consider the often used additive and multiplicative signal correction (MSC) of Geladi et al. [3]; see also Martens and Naes [1]. Here each digitized spectrum, the row vector x_i, is 'normalised' by regressing it against the average spectrum of the training set, m:

$$x_{ik} = a_i + b_i m_k + e_{ik}$$

The average training set spectrum has the elements:

$$m_k = \sum_{i=1}^{N} x_{ik} / N$$

Then, from each row, x_i, one subtracts the intercept (a_i) and divides by the multiplicative constant (b_i):

$$x_{i,\mathrm{corr}} = (x_i - a_i) / b_i$$

We see that this
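The MSC regression and correction described in this section can be illustrated with a minimal NumPy sketch. The function name `msc` and the optional `reference` argument are illustrative choices and do not come from the original text; the sketch assumes spectra are stored as rows and that no fitted slope b_i is zero.

```python
import numpy as np

def msc(X, reference=None):
    """Multiplicative signal correction for spectra stored as rows of X.

    Hypothetical helper for illustration; returns the corrected spectra
    and the reference spectrum m that was used.
    """
    X = np.asarray(X, dtype=float)
    # m_k = sum_i x_ik / N: average training-set spectrum (unless supplied)
    m = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    X_corr = np.empty_like(X)
    for i, x in enumerate(X):
        # Least-squares fit of x_ik = a_i + b_i * m_k + e_ik
        b, a = np.polyfit(m, x, deg=1)  # polyfit returns [slope, intercept]
        # x_i,corr = (x_i - a_i) / b_i
        X_corr[i] = (x - a) / b
    return X_corr, m

# Synthetic example: one underlying band with different offsets and gains.
base = np.sin(np.linspace(0.0, 3.0, 50))
X = np.array([a + b * base for a, b in [(0.1, 0.9), (0.3, 1.2), (-0.2, 1.0)]])
X_corr, m = msc(X)
print(np.abs(X_corr - m).max())  # close to zero: offsets and gains are removed
```

For new samples one would typically pass the training-set average as `reference`, so that the prediction set is corrected against the same spectrum as the calibration set.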

Orthogonal signal correction (OSC)

We will now investigate the possibility of removing bilinear components from X which are orthogonal to Y, i.e., of making a signal correction that does not remove Y-related information from X.

We do this by setting up a PCA/PLS-related solution which removes only so much of X as is unrelated (orthogonal) to Y. This approach is based on the fact that as long as the steps of the NIPALS iterative algorithm of the classical two-block PLS regression are retained, the weight vector (w) can be modified in any way to
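The idea of removing components whose scores are orthogonal to Y can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the text describes a NIPALS/PLS-based calculation of the weight vector w, whereas the sketch below substitutes a plain least-squares (pseudo-inverse) solution for w. The function name `osc_filter` and all parameter names are hypothetical, and X and Y are assumed to be column-centred.

```python
import numpy as np

def osc_filter(X, Y, n_components=2, n_iter=50, tol=1e-10):
    """Remove score vectors from X that are orthogonal to Y (simplified sketch).

    X (N x K) and Y (N x M) are assumed to be column-centred already.
    Returns the filtered X, the weights W (K x A) and the loadings P (K x A).
    """
    X = np.array(X, dtype=float)  # working copy, deflated in place of the input
    Y = np.asarray(Y, dtype=float).reshape(X.shape[0], -1)
    W, P = [], []
    for _ in range(n_components):
        # Start from the first principal-component score vector of X.
        U, s, _vt = np.linalg.svd(X, full_matrices=False)
        t = U[:, 0] * s[0]
        # Assumption: least-squares weights instead of the PLS step in the text.
        X_pinv = np.linalg.pinv(X, rcond=1e-10)
        for _it in range(n_iter):
            # Orthogonalise t against Y: t_new = (I - Y (Y'Y)^-1 Y') t
            t_new = t - Y @ np.linalg.lstsq(Y, t, rcond=None)[0]
            # Weights such that X w is as close as possible to t_new.
            w = X_pinv @ t_new
            t_next = X @ w
            converged = np.linalg.norm(t_next - t) <= tol * np.linalg.norm(t_next)
            t = t_next
            if converged:
                break
        # Loading vector and deflation: X <- X - t p'
        p = X.T @ t / (t @ t)
        X = X - np.outer(t, p)
        W.append(w)
        P.append(p)
    return X, np.array(W).T, np.array(P).T

# Synthetic example with more variables (K) than observations (N).
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 40))
X -= X.mean(axis=0)
Y = rng.normal(size=(12, 1))
Y -= Y.mean(axis=0)
X_osc, W, P = osc_filter(X, Y, n_components=2)
print(np.abs(Y.T @ (X - X_osc)).max())  # removed part is numerically orthogonal to Y
```

With K larger than N, as in this synthetic example, the removed part comes out orthogonal to Y up to rounding error. When N exceeds K, the least-squares step can only approximate the orthogonalised score vector, so orthogonality would hold only approximately.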

Related signal corrections

The target rotation method presented by Kvalheim and Karstang [8] and later reviewed by Christie [9] is a filtering method related to PCA and hence also to OSC. This target rotation method uses a specific target score vector to filter the X matrix. The idea of target rotation is to remove 'known' information from X (as expressed by the target vector) that masks the information of interest. But since no orthogonalization to Y is performed, there is a risk of removing information correlated

Scaling

The results of any projection, including OSC, are influenced by the scaling of the original data in X. In NIR applications one normally uses either un-scaled data, data scaled to unit variance (auto-scaling), or something in between these two, e.g., so-called Pareto scaling [10].

A problem with scaling of the original data occurs when much of the variation in the X-data is due to light scattering and other phenomena which will be removed by the OSC-filtration. Then, the auto-scaling or Pareto
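The three scaling options mentioned in this section can be compared with a short sketch. It assumes the common definition of Pareto scaling (division by the square root of the column standard deviation), which the excerpt does not spell out, and it also assumes column centring before scaling. The function name `scale_columns` is hypothetical.

```python
import numpy as np

def scale_columns(X, method="auto"):
    """Centre the columns of X and scale them (illustrative helper).

    method: "none"   -> centring only (un-scaled data)
            "auto"   -> unit variance (divide by the standard deviation)
            "pareto" -> divide by the square root of the standard deviation
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    s = X.std(axis=0, ddof=1)  # sample standard deviation per variable
    if method == "none":
        return centred
    if method == "auto":
        return centred / s
    if method == "pareto":
        return centred / np.sqrt(s)
    raise ValueError(f"unknown scaling method: {method!r}")

rng = np.random.default_rng(0)
X = rng.normal(scale=[0.1, 1.0, 10.0], size=(20, 3))
for method in ("none", "auto", "pareto"):
    print(method, scale_columns(X, method).std(axis=0, ddof=1))
```

Pareto scaling leaves each variable with a standard deviation equal to the square root of its original value, so large-variance wavelengths keep more weight than under auto-scaling but less than in un-scaled data.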

Data sets

In this study four different data sets were used for comparison of filtering (signal correction) methods. Three of the data sets are NIR data collected on cellulose derivatives in order to predict the measured viscosity. The fourth data set is NIR data on pulp samples from the pulp and paper industry on which 17 physical properties have been measured.

Each data set is divided into one calibration set and one external test set used for validation of the model predictive ability.

Results and discussion

For all four data sets two OSC components were used for the filtering.

For the cellulose data sets we see that the OSC method gives better calibration models, i.e., higher Q2, according to cross-validation (Table 1). Evaluation of the prediction errors for the external test sets reveals that the OSC treated data give substantially lower RMSEP values than the raw and MSC data (Table 2). Also, the OSC-filtered data give much simpler calibration models with fewer components than the ones based on
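For reference, the two figures of merit used in this comparison can be computed as sketched below. The definitions are the commonly used ones (RMSEP as the root mean squared prediction error on the test set, Q2 as one minus PRESS over the mean-centred sum of squares of the response); the excerpt does not state them explicitly, so they are assumptions here, as are the function names.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction for an external test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def q2(y_true, y_cv_pred):
    """Cross-validated Q2 = 1 - PRESS / SS, with SS taken about the mean of y."""
    y_true = np.asarray(y_true, dtype=float)
    y_cv_pred = np.asarray(y_cv_pred, dtype=float)
    press = np.sum((y_true - y_cv_pred) ** 2)      # prediction error sum of squares
    ss = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return float(1.0 - press / ss)

y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.0, 2.0, 3.0, 5.0]
print(rmsep(y, y_hat))  # 0.5
print(q2(y, y_hat))     # 0.8
```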

Conclusions

Since projection methods such as PLS are evidently affected by strong systematic variation in the predictor matrix (X) which is unrelated to the response matrix (Y), there is a need for removing such variation from X before further modelling. We have here presented an approach where signal correction (filtering) is performed in such a way that the removed parts are linearly unrelated (orthogonal) to the response matrix, Y. OSC seems to have additional advantages beyond improved predictability of the

Acknowledgements

Financial support to SW and HA by the Swedish Natural Science Research Council (NFR) and the Swedish Foundation for Strategic Research is gratefully acknowledged. We are most grateful for permission to use data from our collaborations with Akzo Nobel Surface Chemistry, Ö-vik, and Assi-Domän Corporate Research, Piteå, Sweden.

References (13)

  • O.M. Kvalheim et al., Interpretation of latent-variable regression models, Chemometrics and Intelligent Laboratory Systems (1989)
  • H. Martens, T. Naes, Multivariate Calibration, Wiley, New York,...
  • A. Savitzky et al., Anal. Chem. (1964)
  • P. Geladi et al., Linearization and scatter-correction for near-infrared reflectance spectra of meat, Appl. Spectrosc. (1985)
  • P.C. Williams, K. Norris, Near-Infrared Technology in Agricultural and Food Industries, American Cereal Association,...
  • J. Sun, Statistical analysis of NIR data: data pretreatment, J. Chemometr. (1997)
There are more references available in the full text version of this article.
