Research Article Open Access

Genetic Algorithm for Variable and Samples Selection in Multivariate Calibration Problems

Kelton de Souza Santiago1, Anderson Silva Soares1, Telma Woerle de Lima1, Clarimar José Coelho2 and Paulo Henrique Ribeiro Gabriel3
  • 1 Federal University of Goias, Brazil
  • 2 Pontifical University, Brazil
  • 3 Federal University of Uberlandia, Brazil

Abstract

One of the main problems in quantitative analytical chemistry is estimating the concentration of one or more species from the values of certain physicochemical properties of the system of interest. This requires constructing a calibration model, i.e., determining the relationship between the measured properties and the concentrations. Multivariate calibration is one of the most successful applications of statistical methods to chemical data, in both analytical and theoretical chemistry. Commonly used methods include Artificial Neural Networks (ANN), Nonlinear Partial Least Squares (N-PLS), Principal Components Regression (PCR) and Multiple Linear Regression (MLR). In addition to multivariate calibration methods, sample selection algorithms are used. These algorithms choose a subset of samples for the training set that adequately covers the sample space. On the other hand, modern scanning instruments typically measure a wide spectrum for each sample, generating hundreds of variables. Search algorithms have been used to identify the variables that contribute useful information about the dependent variable in the model. This paper proposes a Genetic Algorithm based on Double Chromosome (GADC) that performs both tasks, sample selection and variable selection, simultaneously. The results were compared with those of the well-known sample and variable selection algorithms Kennard-Stone, Partial Least Squares and Successive Projections Algorithm. We show that the proposed algorithm can obtain better calibration models in a case study involving the determination of protein content in wheat samples.
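The double-chromosome idea can be illustrated with a minimal Python sketch. The abstract does not specify the encoding, genetic operators or fitness function, so everything below is an assumption made for illustration: each individual carries two binary chromosomes (one over calibration samples, one over spectral variables), fitness is the validation RMSE of an MLR model fitted on the selected samples and variables, and the operators are binary tournament selection, one-point crossover applied to each chromosome separately, bit-flip mutation and elitism. The names `fitness` and `gadc` and all parameter values are hypothetical.

```python
import numpy as np


def fitness(sample_mask, var_mask, X_cal, y_cal, X_val, y_val):
    """Validation RMSE of an MLR model fitted on the selected rows and columns.

    Assumed fitness function; the abstract does not state which one is used.
    """
    n_vars = int(var_mask.sum())
    # MLR with an intercept needs more samples than coefficients.
    if n_vars == 0 or sample_mask.sum() <= n_vars + 1:
        return np.inf
    A = np.column_stack([np.ones(int(sample_mask.sum())),
                         X_cal[sample_mask][:, var_mask]])
    coef, *_ = np.linalg.lstsq(A, y_cal[sample_mask], rcond=None)
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, var_mask]])
    residual = y_val - A_val @ coef
    return float(np.sqrt(np.mean(residual ** 2)))


def gadc(X_cal, y_cal, X_val, y_val, pop_size=30, generations=40,
         p_mut=0.05, seed=0):
    """Illustrative double-chromosome GA (operators are assumptions)."""
    rng = np.random.default_rng(seed)
    n_samples, n_variables = X_cal.shape

    def score(ind):
        return fitness(ind[0], ind[1], X_cal, y_cal, X_val, y_val)

    # Individual = (sample chromosome, variable chromosome), both boolean.
    pop = [(rng.random(n_samples) < 0.7, rng.random(n_variables) < 0.2)
           for _ in range(pop_size)]
    best = min(pop, key=score)

    for _gen in range(generations):
        scores = [score(ind) for ind in pop]
        new_pop = [best]  # elitism: the best individual always survives
        while len(new_pop) < pop_size:
            # Binary tournament selection of two parents.
            parents = []
            for _k in range(2):
                i, j = rng.integers(pop_size, size=2)
                parents.append(pop[i] if scores[i] <= scores[j] else pop[j])
            child = []
            for c in range(2):  # recombine each chromosome on its own
                a, b = parents[0][c], parents[1][c]
                cut = rng.integers(1, len(a))  # one-point crossover
                genes = np.concatenate([a[:cut], b[cut:]])
                flip = rng.random(len(genes)) < p_mut  # bit-flip mutation
                child.append(genes ^ flip)
            new_pop.append(tuple(child))
        pop = new_pop
        best = min(pop, key=score)

    return best, score(best)
```

Calling `gadc(X_cal, y_cal, X_val, y_val)` returns the best pair of boolean masks found and its validation RMSE. Keeping the two chromosomes separate means crossover never mixes sample genes with variable genes, which is one plausible reading of the "double chromosome" design.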

Journal of Computer Science
Volume 11 No. 4, 2015, 621-626

DOI: https://doi.org/10.3844/jcssp.2015.621.626

Submitted On: 14 February 2015
Published On: 8 June 2015

How to Cite: Santiago, K. S., Soares, A. S., de Lima, T. W., Coelho, C. J. & Gabriel, P. H. R. (2015). Genetic Algorithm for Variable and Samples Selection in Multivariate Calibration Problems. Journal of Computer Science, 11(4), 621-626. https://doi.org/10.3844/jcssp.2015.621.626

Keywords

  • Genetic Algorithm
  • Variable Selection
  • Regression