How does Netflix know what shows to recommend to you? Or how your phone recognizes your face? Or even how spam emails get filtered out of your inbox? The answer to all these questions is machine learning!
Machine learning is the process of building programs that learn from experience without being explicitly programmed with rules. Instead of writing step-by-step instructions for a computer to follow, we give the computer examples and let it figure out the patterns on its own.
Machine learning is about:
Think of it like teaching a child. You don't give them a rulebook for recognizing dogs—you show them many examples of dogs until they can identify one on their own!
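To make the contrast between explicit rules and learning from examples concrete, here is a minimal sketch. The data is made up for illustration, and the names `hand_written_rule`, `learn_threshold`, and `learned_rule` are hypothetical. It uses a single feature (the number of exclamation marks in an email) and compares a threshold chosen by a person with one chosen by searching the examples.

```python
# Made-up toy data: (exclamation marks in an email, is the email spam?)
examples = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]


def hand_written_rule(exclamations):
    # Explicit programming: a person picks the threshold by hand.
    return exclamations > 3


def learn_threshold(data):
    # Learning: try each observed value as a threshold and keep the one
    # that makes the fewest mistakes on the labeled examples.
    best_threshold, best_errors = None, len(data) + 1
    for t in sorted({x for x, _ in data}):
        errors = sum((x > t) != label for x, label in data)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


threshold = learn_threshold(examples)


def learned_rule(exclamations):
    # Same form as the hand-written rule, but the threshold came from data.
    return exclamations > threshold


print("learned threshold:", threshold)
print("is an email with 6 exclamation marks spam?", learned_rule(6))
```

Real spam filters use many more features and more capable models, but the idea is the same: the rule's parameters come from examples rather than from a programmer.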
Machine learning is commonly divided into a few broad categories. The two covered here are supervised learning and unsupervised learning.
Supervised Learning
In supervised learning, the algorithm learns from labeled data—examples that include both the input features and the correct output. The goal is to learn a mapping from inputs to outputs.
Training data: “examples” x with “labels” y
Classification: y is discrete; to simplify,
What is Classification?
Classification is when we want to predict a category or class. The output is discrete (belonging to specific groups).
Example: Is an email spam or not spam?
In a classification problem, we learn a decision boundary that separates the classes. For instance, if we classify fruits by their length and width, we might find that bananas are much longer than they are wide, while oranges have roughly equal length and width, so a boundary drawn between the two groups tells them apart.
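The fruit example can be sketched with a nearest-centroid classifier, which is one simple way to get a decision boundary (the text does not prescribe a particular algorithm). The measurements below are hypothetical, and `fit_centroids` and `predict` are illustrative names.

```python
import math

# Hypothetical training data: (length_cm, width_cm) with a class label.
training_data = [
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
    ((19.0, 3.0), "banana"),
    ((7.0, 7.0), "orange"),
    ((8.0, 7.5), "orange"),
    ((7.5, 8.0), "orange"),
]


def fit_centroids(data):
    """Average the feature vectors of each class."""
    sums = {}
    for (length, width), label in data:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += length
        total[1] += width
        total[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


centroids = fit_centroids(training_data)
print(predict(centroids, (17.0, 3.8)))  # a long, narrow fruit
print(predict(centroids, (7.2, 7.1)))   # a fruit about as wide as it is long
```

With two classes and Euclidean distance, the implied decision boundary is the straight line halfway between the two centroids.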
Regression
Regression is when we want to predict a continuous value.
Examples:
In regression, we're trying to find a line (or curve) that best fits the data points, allowing us to predict values for new data.
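As a minimal sketch of fitting a line, the snippet below uses ordinary least squares on a single input feature. The data points are made up, and `fit_line` is a hypothetical helper name.

```python
# Made-up data: inputs xs and continuous targets ys.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the squared vertical errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
# Use the fitted line to predict the value for an unseen input.
prediction = slope * 6.0 + intercept
print(f"y = {slope:.2f} * x + {intercept:.2f}; prediction at x=6: {prediction:.2f}")
```

The same idea extends to curves and to several input features, at the cost of more algebra or a numerical library.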
Unsupervised Learning
In unsupervised learning, the algorithm learns from unlabeled data—examples that only include the input features without any corresponding output. The goal is to find structure or patterns in the data.
Clustering
Clustering is about grouping similar examples together.
Training data: “examples” x
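One common clustering algorithm is k-means, sketched below as an illustration (the text does not name a specific method). The points are hypothetical. To keep the result reproducible, this sketch starts from the first k points, whereas typical implementations pick the starting centroids at random.

```python
import math

# Hypothetical unlabeled 2-D examples forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def k_means(points, k, iterations=10):
    # Assumption: deterministic start from the first k points.
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


centroids, clusters = k_means(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are involved at any point: the groups emerge only from distances between the examples.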
Clustering/segmentation:
Examples: