LEARNING THE BAYESIAN STRUCTURE IN BIGDATA USING THE K2 ALGORITHM WITH MAPREDUCE

Arilene Santos De França, João Gabriel R. de O. Lima, Antonio F. L. Jacob Junior, Ádamo Lima De Santana

Abstract


This paper proposes a new approach for learning Bayesian networks in the context of Big Data using the K2 algorithm, which finds the most probable structure of a belief network from a given dataset. The main challenge is handling the complexity of the problem: reducing the execution time of the algorithm to produce faster responses without reducing the dataset and, consequently, losing useful information. To this end, the paper proposes a modification of the preparation and pre-processing steps of the KDD process, inserting an additional stage that optimizes the search for the frequencies of the analyzed data. To demonstrate the efficiency of this model, performance statistics for the technique are presented according to the number of states of each attribute, compared with the number of attributes.

Index Terms - Big Data, Data mining, K2 algorithm, NoSQL Databases.
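
As an illustration of why frequency lookup dominates the cost of K2, the following is a minimal sketch, not the authors' implementation, of counting frequencies in a map/reduce style and feeding them into the standard Cooper-Herskovits K2 metric. The function names, the dict-per-row data layout, and the toy dataset are all assumptions made for this example; a real deployment would run the map and reduce steps on a MapReduce framework rather than in a single process.

```python
from collections import Counter
from functools import reduce
from math import lgamma


def map_counts(chunk, child, parents):
    """Map step (hypothetical helper): count (parent configuration, child state)
    pairs within one chunk of rows."""
    return Counter((tuple(row[p] for p in parents), row[child]) for row in chunk)


def reduce_counts(a, b):
    """Reduce step (hypothetical helper): merge two partial count tables."""
    return a + b


def log_k2_score(counts, r):
    """Log of the Cooper-Herskovits K2 metric for one node with r states:
    sum over j of log((r-1)! / (N_ij + r - 1)!) plus sum over j,k of log(N_ijk!)."""
    n_ij = Counter()
    score = 0.0
    for (config, _state), n_ijk in counts.items():
        n_ij[config] += n_ijk
        score += lgamma(n_ijk + 1)              # log(N_ijk!)
    for total in n_ij.values():
        score += lgamma(r) - lgamma(total + r)  # log((r-1)! / (N_ij + r - 1)!)
    return score


# Toy dataset (assumed for illustration), split into two chunks as if stored
# on two separate nodes.
chunks = [
    [{"A": 0, "B": 0}, {"A": 0, "B": 1}],
    [{"A": 1, "B": 1}, {"A": 1, "B": 1}],
]


def score_node(child, parents, r):
    """Run the map step per chunk, reduce the partial counts, then score."""
    partials = [map_counts(chunk, child, parents) for chunk in chunks]
    counts = reduce(reduce_counts, partials, Counter())
    return log_k2_score(counts, r)


score_without_parent = score_node("B", [], r=2)
score_with_parent = score_node("B", ["A"], r=2)
```

In this sketch K2 would keep A as a parent of B only if `score_with_parent` exceeds `score_without_parent`. Because every candidate parent set needs its own count table, the cost of obtaining these frequencies grows with both the number of attributes and the number of states per attribute.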



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

----------------------------------------------------------------------

ISSN 2319-0507
