Unsupervised Learning & Dimensionality Reduction (Week 10)
The focus shifts to unsupervised learning, covering clustering and dimensionality reduction. Key topics include Gaussian Mixture Models (GMMs) for clustering, anomaly detection algorithms, and Principal Component Analysis (PCA) for reducing dimensionality. Understanding the difference between feature selection and feature extraction rounds out the practical toolkit for data analysis.
Sections
Navigate through the learning materials and practice exercises.
What we have learnt
- Unsupervised learning helps discover patterns in unlabeled data.
- Gaussian Mixture Models provide a flexible approach to clustering with probabilistic assignments.
- Dimensionality reduction techniques like PCA simplify complex datasets while retaining essential information.
Key Concepts
- Gaussian Mixture Models (GMMs): A probabilistic model that assumes data points are generated from several Gaussian distributions, allowing for clusters that are non-spherical and of varying sizes.
- Anomaly Detection: A method in unsupervised learning to identify rare items or events that deviate significantly from the majority of the data.
- Principal Component Analysis (PCA): A linear dimensionality reduction technique that identifies directions of maximum variance in the data to reduce feature space while retaining as much information as possible.
- Feature Selection: The process of selecting a subset of relevant features for use in model construction, based on their contribution to model performance.
- Feature Extraction: The process of transforming data into a new space of features that capture the most informative characteristics from the original dataset.
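To make the GMM idea concrete, here is a minimal sketch using scikit-learn's `GaussianMixture` on synthetic data (the two-cluster dataset and all parameter values are illustrative assumptions, not from the course materials). The key point is that `predict_proba` returns soft, probabilistic cluster assignments rather than hard labels:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic Gaussian clusters with different centers and spreads
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=3.0, scale=1.5, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)        # hard assignment: most likely component per point
probs = gmm.predict_proba(X)   # soft assignment: one probability per component
print(probs.shape)             # one row per point, one column per component
```

Because each component has its own covariance matrix, the fitted clusters can be elliptical and of different sizes, which is what distinguishes GMMs from spherical methods like vanilla k-means.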
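PCA's "retain variance while dropping dimensions" behavior can be seen directly on data with a redundant feature. In this sketch (synthetic data; the near-linear third feature is an illustrative assumption), scikit-learn's `PCA` projects 3-D points onto 2 components while losing almost no variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# 3-D data where the third feature is nearly a linear combination of the first two
base = rng.normal(size=(200, 2))
third = base @ np.array([1.0, -0.5]) + rng.normal(scale=0.01, size=200)
X = np.column_stack([base, third])

pca = PCA(n_components=2).fit(X)
X_reduced = pca.transform(X)                 # project onto the top-2 directions
print(pca.explained_variance_ratio_.sum())   # near 1.0: little information lost
```

The `explained_variance_ratio_` attribute is the usual way to decide how many components to keep: one keeps enough components to cover, say, 95% of the total variance.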
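The selection-versus-extraction distinction is easiest to see side by side. In this sketch (the Iris dataset, `SelectKBest` with an ANOVA F-score, and `k=2` are all illustrative choices), selection keeps two of the original columns unchanged, while extraction builds two new features as combinations of all four:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Feature selection: keep 2 of the original 4 columns, values unchanged
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
X_selected = selector.transform(X)

# Feature extraction: construct 2 new features mixing all 4 originals
X_extracted = PCA(n_components=2).fit_transform(X)

print(X_selected.shape, X_extracted.shape)  # same shape, different meaning
```

Selected features remain interpretable in the original units; extracted features (principal components) trade interpretability for compactness.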
Additional Learning Materials
Supplementary resources to enhance your learning experience.