Journal of Computer Science

HEURISTIC DISCRETIZATION METHOD FOR BAYESIAN NETWORKS

Mariana D.C. Lima, Silvia M. Nassar, Pedro Ivo R.B.G. Rodrigues, Paulo J. Freitas Filho and Carlos M.C. Jacinto

DOI: 10.3844/jcssp.2014.869.878

Volume 10, Issue 5

Pages 869-878

Abstract

A Bayesian Network (BN) is a classification technique widely used in Artificial Intelligence. Its structure is a Directed Acyclic Graph (DAG) used to model the associations among categorical variables. When the variables are numerical, however, they must first be discretized. Discretization methods are usually based on a statistical approach that uses the data distribution, such as division by quartiles. In this article we present a discretization method that uses a heuristic to identify events called peaks and valleys. A Genetic Algorithm was used to identify these events, with the objective function being the minimization of the error between the average estimated by the BN and the actual value of the numeric output variable. The BN was modeled from a database of bit Rate of Penetration from the Brazilian pre-salt layer, with five numerical variables and one categorical variable, using both the proposed discretization and the division of the data by quartiles. The results show that the proposed heuristic discretization achieves higher accuracy than the quartile discretization.
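As an illustration of the quartile baseline and of an error-based objective, the following minimal Python sketch may help. It is not the authors' implementation. The function names, the toy data, and the use of the per-interval mean of the output as a stand-in for the average estimated by the BN are all assumptions made for this example. The peak and valley heuristic and the Genetic Algorithm search are not reproduced here.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def quartile_cuts(values):
    """Return the three quartile cut points (Q1, Q2, Q3) of a numeric sample."""
    return quantiles(values, n=4, method="inclusive")


def discretize(values, cuts):
    """Map each value to the index of the interval it falls in (0..len(cuts))."""
    return [bisect_right(cuts, v) for v in values]


def mean_abs_error(x, y, cuts):
    """Hypothetical stand-in for the objective function (an assumption):
    estimate y by the mean of y inside each interval of x, then average
    the absolute difference between that estimate and the actual y."""
    bins = discretize(x, cuts)
    by_bin = {}
    for b, target in zip(bins, y):
        by_bin.setdefault(b, []).append(target)
    bin_mean = {b: mean(ts) for b, ts in by_bin.items()}
    return mean(abs(bin_mean[b] - target) for b, target in zip(bins, y))


# Toy data (invented for illustration): the output jumps between x = 3 and x = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [10.0, 10.0, 10.0, 30.0, 30.0, 30.0, 30.0, 30.0]

q_cuts = quartile_cuts(x)       # cut points fixed by the data distribution
candidate_cuts = [3.5]          # a cut placed at the jump, as a search might propose

print(discretize(x, q_cuts))
print(mean_abs_error(x, y, q_cuts))
print(mean_abs_error(x, y, candidate_cuts))
```

In this toy case the quartile cuts split the data into equally populated intervals regardless of where the output changes, whereas a cut chosen by minimizing the error can follow the change directly. A search procedure such as a Genetic Algorithm would evaluate many candidate cut sets against an objective of this kind.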

Copyright

© 2014 Mariana D.C. Lima, Silvia M. Nassar, Pedro Ivo R.B.G. Rodrigues, Paulo J. Freitas Filho and Carlos M.C. Jacinto. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.