In linear regression, our goal is to learn a function that can predict an outcome from some input . Let's use a real example:
Imagine we have data collected from different students:
Student | Hours of study | Score |
---|---|---|
1 | 2 | 75% |
2 | 3.5 | 82% |
3 | 1.5 | 68% |
4 | 4 | 90% |
In mathematical notation, we write this as:
Once we have our best-fit line, we can:
The "linear" in linear regression means we're looking for a straight line that best fits our data. This line can be written as:
Where:
For example, we might find that:
This would mean:
The challenge is that real data is messy! Some students might study for 3 hours and get 85%, while others study the same amount and get 78%. Our task is to find the line that best represents the general trend - the "best fit" line through all these points.