[Data Mining] Basics Summary

By | 2014-10

Basics of Data Mining Summary

* reference: An Introduction to Data Mining (Dr. Saed Sayad)

* further readings: link

  1. Data Mining Process
    • Overall: Problem Definition > Data Preparation > Data Exploration > Modeling > Evaluation > Deployment
    • Explaining the past: Exploration > Univariate/Bivariate
    • Predicting the future: Modeling > Classification/Regression/Clustering/Association Rules
  2. (Past) Exploration: describing the data by means of statistical and visualization techniques
    1. Univariate Analysis:
      • Variables can be either numerical or categorical
      • numerical -> categorical: binning or discretization
      • categorical -> numerical: encoding
      • Proper handling of missing values is important
    2. Bivariate Analysis: looks at two variables together
      • Relationship: existence of an association and the strength of that association
      • Differences and the significance of the differences
      • Types
  3. (Future) Modeling: predictive modeling builds a model to predict an outcome
    1. Classification
      1. Frequency Table
      2. Covariance Matrix
      3. Similarity Functions
      4. Others
    2. Regression
      1. Frequency Table
      2. Covariance Matrix
      3. Similarity Functions
      4. Others
    3. Clustering
      1. Hierarchical
      2. Partitive
    4. Association Rules
      1. AIS Algorithm
      2. SETM Algorithm
      3. Apriori Algorithm
      4. AprioriTid Algorithm
      5. AprioriHybrid Algorithm
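
The univariate conversions in the outline (binning, encoding, missing values) can be illustrated with a small sketch. This is only one possible reading of those bullets: equal-width binning, one-hot encoding, and mean imputation are assumed choices, and the function names and sample values are hypothetical, not taken from the reference.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """numerical -> categorical: map each value to a bin index 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(labels):
    """categorical -> numerical: one 0/1 indicator column per category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical sample data, for illustration only.
ages = impute_mean([22, None, 35, 58, 41])
age_bins = equal_width_bins(ages, n_bins=3)
categories, encoded = one_hot(["red", "green", "red", "blue", "green"])

print(ages)
print(age_bins)
print(categories, encoded)
```

Equal-frequency binning or label encoding would fit the same bullets equally well; the choice depends on the data and on the model used later.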

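For the Apriori algorithm listed under Association Rules, the core idea is a level-wise search: frequent k-itemsets are built only from frequent (k-1)-itemsets. The sketch below is a minimal, unoptimized version of that idea and covers frequent-itemset mining only, not rule generation. The function name, the absolute `min_support` count, and the basket data are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k
        }
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

The prune step is what distinguishes Apriori from a brute-force count of all itemsets: a candidate is discarded without scanning the data as soon as one of its subsets is known to be infrequent.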