Linear Regression From Scratch in Python (Just NumPy, No scikit-learn)

scikit-learn can fit a linear regression in one line. That's great for shipping — and terrible for actually understanding what just happened. So this is the version I wish someone had shown me first: linear regression built from scratch in about fifteen lines of NumPy, with the math explained as we go.

By the end you'll know exactly what "fit a line" means, where the slope and intercept come from, and how to measure whether the line is any good.

The idea in one sentence

Linear regression draws the straight line that sits as close as possible to all your points. "As close as possible" has a precise meaning: it is the line that makes the sum of the squared vertical distances from the points to the line as small as it can be. That's the least squares method, and for a straight line it has a clean closed-form solution, no training loop required.

We're fitting:

y = a + b·x

where b is the slope and a is the intercept.
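To make "as close as possible" concrete, here is a minimal sketch that scores a few candidate lines by their sum of squared errors. It uses the same five points as the rest of this post. The helper name `sse` and the first two candidate lines are illustrative picks, not part of the method.

```python
import numpy as np

# Same five points used in the rest of this post
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)

def sse(a, b):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return float(np.sum((y - (a + b * x)) ** 2))

# Hypothetical candidate lines as (intercept, slope) pairs
candidates = {
    "flat line at the mean of y": (y.mean(), 0.0),
    "y = 1 + 1x": (1.0, 1.0),
    "y = 1.8 + 0.8x": (1.8, 0.8),
}

for name, (a, b) in candidates.items():
    print(f"{name}: SSE = {sse(a, b):.2f}")
```

The third candidate scores lowest of the three. It is the least squares line for this data, so no other straight line can score lower.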

The two formulas

Least squares gives us the slope and intercept directly:

b = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
a = ȳ − b·x̄

In plain English: the slope is "how x and y vary together" divided by "how much x varies on its own," and the intercept just shifts the line so it passes through the mean point (x̄, ȳ).
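If you're curious where the two formulas come from, here is a short derivation sketch. The label S for the sum of squared errors is my own notation. The idea is to write down the total squared error, take its partial derivative with respect to a and to b, and set both to zero.

```latex
\begin{aligned}
S(a, b) &= \sum_i (y_i - a - b x_i)^2 \\
\frac{\partial S}{\partial a} &= -2 \sum_i (y_i - a - b x_i) = 0
  \;\Longrightarrow\; a = \bar{y} - b\,\bar{x} \\
\frac{\partial S}{\partial b} &= -2 \sum_i x_i (y_i - a - b x_i) = 0
  \;\Longrightarrow\; \sum_i x_i \bigl[(y_i - \bar{y}) - b (x_i - \bar{x})\bigr] = 0 \\
b &= \frac{\sum_i x_i (y_i - \bar{y})}{\sum_i x_i (x_i - \bar{x})}
   = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
\end{aligned}
```

The last step uses the fact that deviations from a mean sum to zero, so Σ(yᵢ − ȳ) = 0 and Σ(xᵢ − x̄) = 0. That lets you swap xᵢ for (xᵢ − x̄) in both sums without changing their value.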

The code

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# slope and intercept (least squares)
b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
a = y_mean - b * x_mean

print(f"Equation: y = {a:.2f} + {b:.2f}x")

Output:

Equation: y = 1.80 + 0.80x

That's it. No .fit(), no library — just the definition turned into NumPy.
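If you'd like a second opinion that stays inside NumPy, one option is to wrap the formulas in a function and compare against `np.polyfit`, which fits a degree-1 polynomial by least squares. This is a sketch: the function name `fit_line` and the zero-variance guard are my additions, not something the formulas require.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    sxx = np.sum(dx ** 2)
    if sxx == 0:
        # All x values identical: the slope formula would divide by zero
        raise ValueError("x must contain at least two distinct values")
    b = np.sum(dx * (y - y.mean())) / sxx
    a = y.mean() - b * x.mean()
    return a, b

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

a, b = fit_line(x, y)

# np.polyfit with degree 1 returns coefficients highest power first: [slope, intercept]
slope_np, intercept_np = np.polyfit(x, y, 1)

print(f"from scratch: a = {a:.2f}, b = {b:.2f}")
print(f"np.polyfit:   a = {intercept_np:.2f}, b = {slope_np:.2f}")
```

Both lines should print the same intercept and slope. The guard matters because a dataset where every x is identical has no well-defined slope.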

Is the line any good? Meet R²

The formulas will produce a line whenever x takes at least two distinct values, but that doesn't mean the line fits. The standard score for "how much of the variation in y does our line explain?" is R²:

y_pred = a + b * x

ss_res = np.sum((y - y_pred) ** 2)        # error the line leaves behind
ss_tot = np.sum((y - y_mean) ** 2)        # total variation in y
r2 = 1 - ss_res / ss_tot

print(f"R² = {r2:.3f}")

Output:

R² = 0.727

For a least-squares line like this one, scored on the data it was fit to, R² runs from 0 to 1. Here, about 73% of the variation in y is explained by x, which is decent for five noisy points. If you want to sanity-check your numbers without writing code, you can drop the same data into this linear regression calculator and confirm the slope, intercept, and R² match. (They will: same math.)
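Two properties of R² are worth checking for yourself. With a single feature, the R² of the least-squares line equals the square of the Pearson correlation between x and y. And the 0-to-1 range only applies to the least-squares line on its own data: a line that fits worse than the flat "always predict the mean" line scores below zero. The sketch below checks both. The helper name `r_squared` and the deliberately bad line y = 10 − x are illustrative choices of mine.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, the same definition used above."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

r2_fit = r_squared(y, 1.8 + 0.8 * x)     # the least-squares line from this post
r = np.corrcoef(x, y)[0, 1]              # Pearson correlation between x and y
r2_bad = r_squared(y, 10.0 - 1.0 * x)    # hypothetical, deliberately bad line

print(f"R² of fitted line: {r2_fit:.3f}")
print(f"r squared:         {r ** 2:.3f}")
print(f"R² of bad line:    {r2_bad:.3f}")
```

The first two numbers should agree, and the third should come out negative.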

Bonus: the gradient-descent version (where the calculus sneaks in)

The closed-form solution is all you need for a problem this small. But models with millions of parameters are typically trained by gradient descent: nudging the parameters downhill on the error surface. Here's linear regression learned that way:

b, a = 0.0, 0.0
lr = 0.01
n = len(x)

for _ in range(10_000):
    y_pred = a + b * x
    error = y_pred - y
    grad_a = (2 / n) * np.sum(error)        # ∂MSE/∂a
    grad_b = (2 / n) * np.sum(error * x)    # ∂MSE/∂b
    a -= lr * grad_a
    b -= lr * grad_b

print(f"y = {a:.2f} + {b:.2f}x")   # → y = 1.80 + 0.80x

Notice grad_a and grad_b — those are partial derivatives of the mean squared error. That's the whole secret of training: the derivative tells each parameter which direction reduces the error, and you take a small step that way, over and over. If the calculus feels fuzzy, it helps to play with a derivative calculator and watch how the slope of a function changes — because that slope is the gradient your model is following.

Run it and you'll land on the same line as the closed-form solution, at least to two decimal places: y = 1.80 + 0.80x. Two completely different methods, same answer, which is a nice sign the math holds together.
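To watch the descent happen rather than just see where it ends, here is a sketch that records the mean squared error at every step. It wraps the same update rule in a function, `gradient_descent`, which is a name I made up for this example. The second run uses a larger learning rate of 0.1, a hypothetical value picked only to show what overshooting looks like on this particular data.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)

def gradient_descent(x, y, lr, steps):
    """Run gradient descent on MSE for y = a + b*x; return (a, b, mse_history)."""
    a, b = 0.0, 0.0
    n = len(x)
    history = []
    for _ in range(steps):
        error = (a + b * x) - y
        history.append(float(np.mean(error ** 2)))  # MSE before this update
        a -= lr * (2 / n) * np.sum(error)           # step along -dMSE/da
        b -= lr * (2 / n) * np.sum(error * x)       # step along -dMSE/db
    return a, b, history

# Same learning rate as above: the error should shrink step after step
a, b, hist = gradient_descent(x, y, lr=0.01, steps=10_000)
print(f"lr=0.01: MSE {hist[0]:.2f} -> {hist[-1]:.2f}, y = {a:.2f} + {b:.2f}x")

# Hypothetical larger learning rate, chosen to show overshooting on this data
_, _, hist_big = gradient_descent(x, y, lr=0.1, steps=50)
print(f"lr=0.1:  MSE {hist_big[0]:.2f} -> {hist_big[-1]:.2e}")
```

With lr=0.01 the error falls steadily and settles. With lr=0.1 each step jumps past the minimum by more than it gained, so the error grows instead of shrinking. That is why the learning rate is usually the first thing to adjust when gradient descent misbehaves.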

What you actually learned

Least squares = the line that minimizes total squared error, with a closed-form slope/intercept.
R² tells you how much of the variation in y the line explains (0 = no better than predicting the mean, 1 = perfect).
Gradient descent reaches the same answer by following derivatives of the error, the same mechanism used to train neural networks.

model.fit() will never feel like a black box again. Next time you reach for scikit-learn, you'll know exactly what it's doing under the hood.

I write plain-English tutorials on the math behind machine learning, with free interactive tools, at mlforbeginners.com. If the math ever made you feel behind, that's who I build for.