Data Mining

Information

Teacher coordinator: Gilles Gasso
Teacher(s): Gilles Gasso, Benoît Gaüzère
Language: English
Credits: 4.5
Teaching: Lectures: 21h, Exercises: 21h
Web site: https://moodle.insa-rouen.fr/course/view.php?id=92

Aims and objectives

  • Formalize a data-mining problem and identify the related issues
  • Become acquainted with, and apply, machine learning methods and algorithms for data mining

Learning outcomes

  • INSA reference data:
    • Design a data engineering system [3P]
    • Optimize a model [3P]
  • CNISF reference data:
    • J10P [2P]
    • J40K [1P]
1 - Notion, 2 - Concept, 3 - Application, I - fully, P - partially

Course description

  • Introduction - Dimensionality reduction method
    • Gentle introduction to Data-Mining and Statistical Learning
    • Principal Component Analysis (PCA)
  • Data Clustering in 3 lessons
    • Agglomerative hierarchical clustering
    • K-means algorithm
    • Mixture model and Expectation-Maximization (EM)
  • Classification methods (linear and non-linear approaches)
    • k-nearest neighbors (KNN) approach
    • Bayes decision theory, Bayes classifier
    • Support vector machines (SVM) - kernel methods

Prerequisites

Basic knowledge of statistics and MATLAB programming

Bibliography

  • Christopher Bishop, Pattern Recognition and Machine Learning, 2006
  • Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning (Data Mining, Inference, and Prediction), 2009
  • Richard Duda, Peter Hart, David Stork, Pattern Classification

Assessment

  • Final exam: 50%
  • Practical exam: 50%