Introduction to Data Mining

Data mining is the nontrivial process of discovering valid, novel, potentially useful, and understandable patterns in a set of data, following the definition by Piatetsky-Shapiro published in "AI Magazine".
Put simply, data mining tries to extract knowledge from data.

A series of processes is applied to the raw data in successive phases. These processes are defined by an expert who knows what the data mean and has a clear idea of the goals being pursued. With them it is possible to extract relationships among the data, discover hidden patterns, and build models that describe this knowledge.
The phases of this knowledge discovery process are the following:

- Definition of the data mining task: what objective is being pursued?
- Data selection
- Data preparation
- Application of data mining processes to the prepared data
- Evaluation and interpretation of the model obtained
- Integration of the results into the information systems
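
As a rough illustration, the selection, preparation, mining, and evaluation phases listed above can be sketched in a few lines of Python. Everything here is an assumption made for the example: the toy records, the field names, and the very simple threshold rule used as the "model" are hypothetical and do not come from any particular tool.

```python
# Hypothetical toy records, invented for this sketch.
RAW = [
    {"age": 25, "income": 30000, "bought": "no"},
    {"age": 40, "income": 52000, "bought": "yes"},
    {"age": 33, "income": None, "bought": "yes"},  # incomplete record
    {"age": 51, "income": 61000, "bought": "yes"},
    {"age": 22, "income": 18000, "bought": "no"},
    {"age": 37, "income": 45000, "bought": "no"},
]

def select(records, fields):
    """Data selection: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def prepare(records):
    """Data preparation: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def count_hits(records, attribute, target, threshold):
    """Count records where 'attribute >= threshold' agrees with target == 'yes'."""
    return sum((r[attribute] >= threshold) == (r[target] == "yes") for r in records)

def mine(records, attribute, target):
    """Mining: pick the threshold on `attribute` that best separates `target`."""
    candidates = sorted({r[attribute] for r in records})
    return max(candidates, key=lambda t: count_hits(records, attribute, target, t))

def evaluate(records, attribute, target, threshold):
    """Evaluation: fraction of records the rule classifies correctly.

    Note: this sketch scores the rule on the same data it was built from,
    which overstates how well it would do on new data.
    """
    return count_hits(records, attribute, target, threshold) / len(records)

selected = select(RAW, ["income", "bought"])
prepared = prepare(selected)
threshold = mine(prepared, "income", "bought")
accuracy = evaluate(prepared, "income", "bought", threshold)
print(f"rule: income >= {threshold} -> bought (accuracy {accuracy:.2f})")
```

Task definition and integration are left out because they depend on the organization and its systems rather than on code. In a real project the evaluation would normally use data that was held back from the mining step.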

It is a continuous process that can consist of several iterations, where the results of one iteration feed the start of the next.
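
The iterative nature of the process can be illustrated with a minimal sketch in which each pass builds a simple description of the data and uses it to decide what the next pass starts from. The cleaning rule (discard values more than `k` standard deviations from the mean), the function name `refine`, and the sample values are assumptions chosen only for this example.

```python
from statistics import mean, pstdev

def refine(values, k=2.0):
    """One iteration: describe the data (mean and spread), then keep only
    the values that lie within k standard deviations of the mean."""
    center, spread = mean(values), pstdev(values)
    return [v for v in values if abs(v - center) <= k * spread]

# Hypothetical measurements with two suspicious values.
data = [10, 12, 11, 13, 12, 11, 95, 40]

iterations = 0
while True:
    iterations += 1
    refined = refine(data)
    if refined == data:  # nothing changed: the process has stabilized
        break
    data = refined       # the result of this pass feeds the next one

print(iterations, data)
```

With these values, the second suspicious value only becomes visible after the first one has been removed, which is why a single pass is not enough here.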

Of course, there are specialized tools that facilitate, or even make possible, working through all of these phases.
Two of the best known are SAS Enterprise Miner and SPSS Clementine.
There are also open source software projects, such as WEKA, developed at the University of Waikato, that make it possible to carry out data mining processes.