Supervised Learning - Classification Fundamentals (Week 5)
This chapter shifts the focus of supervised learning from regression to classification, where the goal is to predict discrete categories from labeled data. It covers binary and multi-class classification, introduces Logistic Regression as a key classification algorithm, explores evaluation metrics such as Precision, Recall, and F1-Score, and discusses K-Nearest Neighbors (KNN) as a 'lazy learning' method that defers computation until prediction time. Core challenges such as the curse of dimensionality are also covered, along with practical implementation through hands-on labs.
What we have learnt
- Classification predicts discrete categories from labeled data, differing fundamentally from regression.
- Binary classification separates two classes, while multi-class classification must decide among three or more, so the two settings call for different decision strategies.
- Logistic Regression uses the Sigmoid function to transform a linear combination of the input features into a probability, while KNN classifies a new instance by its proximity to labeled training instances.
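The two mechanisms named in the last point can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's lab code: the function names (`sigmoid`, `logistic_predict`, `knn_predict`), the 0.5 decision threshold, the Euclidean distance, and the default `k=3` are all assumptions chosen for the example, and the weights are taken as given rather than learned.

```python
import math
from collections import Counter


def sigmoid(z):
    """Map any real-valued score z to a value in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_predict(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one example.

    The weights and bias are assumed to be already trained;
    the threshold of 0.5 is an illustrative default.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    return probability, int(probability >= threshold)


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Distance is Euclidean here (an assumption); no model is fitted in
    advance, which is why KNN is called a 'lazy' learner.
    """
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data, made up for this example.
    print(logistic_predict([2.0, 1.0], weights=[1.0, -1.0], bias=-1.0))
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
    labels = ["a", "a", "a", "b", "b"]
    print(knn_predict(points, labels, query=(0.5, 0.5)))
```

Note the contrast in where the work happens: the logistic model does a single weighted sum per prediction, whereas the KNN sketch measures the distance to every training point each time it is asked to classify something.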
Key Concepts
- -- Classification
- A supervised machine learning task where a model learns from labeled data to predict the category or class of new instances.
- -- Logistic Regression
- A classification algorithm that predicts probabilities using the Sigmoid function, capable of handling binary outcomes and extendable to multi-class scenarios.
- -- K-Nearest Neighbors (KNN)
- A non-parametric, instance-based learning algorithm that classifies new instances based on the majority class of their 'K' closest neighbors from the training set.
- -- Confusion Matrix
- A table that categorizes the true positives, true negatives, false positives, and false negatives, providing insight into the performance of a classification model.
- -- Precision
- The ratio of true positive predictions to the total predicted positives (TP / (TP + FP)), indicating the quality of positive predictions.
- -- Recall
- The ratio of true positive predictions to the actual positives (TP / (TP + FN)), measuring the model's ability to identify all relevant positive cases.
- -- F1-Score
- The harmonic mean of precision and recall, providing a balanced measure between these two metrics, especially useful in imbalanced datasets.
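The confusion-matrix counts and the three metric formulas above can be checked with a short Python sketch. The helper names (`confusion_counts`, `precision_recall_f1`) and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not something the definitions prescribe.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Apply TP / (TP + FP), TP / (TP + FN), and their harmonic mean.

    Assumption: a zero denominator yields 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels, made up for this example.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(tp, fp, fn, tn)                   # 3 1 1 3
    print(precision_recall_f1(tp, fp, fn))  # (0.75, 0.75, 0.75)
```

True negatives do not appear in any of the three formulas, which is one reason these metrics stay informative when the negative class is much larger than the positive one.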