1. Scalability
To handle massive data sets, data mining algorithms must be scalable.
2. High Dimensionality
For some data analysis algorithms, the computational complexity increases rapidly as the dimensionality increases.
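As one illustration of this growth, consider a hypothetical grid-based method that splits every attribute into the same number of intervals. This is only a sketch of one way the cost can blow up, not a statement about any particular algorithm; the function name and the choice of 10 intervals are assumptions made for the example.

```python
def grid_cell_count(bins_per_attribute: int, num_attributes: int) -> int:
    """Count the cells in a grid that splits each attribute into equal-count intervals.

    Hypothetical helper for illustration: the number of cells an exhaustive
    grid-based method would have to consider grows exponentially with the
    number of attributes (dimensions).
    """
    return bins_per_attribute ** num_attributes


if __name__ == "__main__":
    # With 10 intervals per attribute, each added dimension multiplies the work by 10.
    for dims in (1, 2, 5, 10, 20):
        print(dims, grid_cell_count(10, dims))
```

Not every algorithm scales this badly, but any method whose work depends on covering the attribute space faces the same kind of growth.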
3. Heterogeneous and Complex Data
Data mining must handle data whose attributes are not all of the same type.
4. Data Ownership and Distribution
Data is geographically distributed among resources belonging to multiple entities.
5. Non-traditional Analysis
The traditional statistical approach is based on a hypothesize-and-test paradigm.
Current data analysis tasks often require generating and evaluating thousands of hypotheses. Consequently, some data mining techniques were developed to automate hypothesis generation and evaluation.
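A minimal sketch of what automating hypothesis generation and evaluation can look like, assuming a market-basket setting: every pair of items is treated as a candidate hypothesis ("these two items are often bought together") and is evaluated by its support, the fraction of transactions containing both. The function name, the toy transactions, and the 0.5 threshold are all invented for illustration, and the brute-force enumeration is not an efficient mining algorithm.

```python
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Generate every item pair as a hypothesis and keep those with enough support.

    transactions: list of sets of items (toy data in this sketch).
    min_support: minimum fraction of transactions that must contain both items.
    """
    items = sorted(set().union(*transactions))
    results = {}
    for pair in combinations(items, 2):  # hypothesis generation
        count = sum(1 for t in transactions if set(pair) <= t)
        support = count / len(transactions)  # hypothesis evaluation
        if support >= min_support:
            results[pair] = support
    return results


if __name__ == "__main__":
    # Invented toy data, not taken from any real data set.
    transactions = [
        {"bread", "milk"},
        {"bread", "diapers", "beer"},
        {"milk", "diapers", "beer"},
        {"bread", "milk", "diapers"},
    ]
    for pair, support in frequent_pairs(transactions, 0.5).items():
        print(pair, support)
```

With n distinct items this loop already tests n(n-1)/2 hypotheses, which shows why such work cannot be done by hand once the number of items reaches the thousands.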