Last modified: Jan 31 2026 at 10:09 PM • 2 mins read

Logistic Regression

Introduction

Logistic regression is a learning algorithm for binary classification problems where output labels $y$ are either 0 or 1.

Goal: Given an input feature vector $x$ (e.g., an image), output a prediction $\hat{y}$ that estimates $y$.

Formal Definition: $\hat{y} = P(y=1 \mid x)$

This represents the probability that $y = 1$ given the input features $x$.

Example: For a cat image classifier, $\hat{y}$ tells us the probability that the image contains a cat.

Logistic regression has two sets of parameters: a weight vector $w$ with the same dimension as $x$, and a bias $b$, which is a real number.

Question: Given input $x$ and parameters $w$ and $b$, how do we generate the output $\hat{y}$?

You might try: $\hat{y} = w^T x + b$

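Problem with this approach: $w^T x + b$ can be any real number, including values below 0 or above 1, so it cannot be read as a probability.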

This is why linear regression isn’t suitable for binary classification.
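To make the range problem concrete, here is a minimal sketch that evaluates the raw linear output $w^T x + b$ for two inputs. The parameter values, the inputs, and the helper name `linear_output` are made up for illustration; they do not come from a trained model.

```python
import numpy as np

# Hypothetical parameters, chosen only to illustrate the range problem.
w = np.array([2.0, -1.0])
b = 0.5


def linear_output(x, w, b):
    """Return the raw linear score w^T x + b, which can be any real number."""
    return float(np.dot(w, x) + b)


x_a = np.array([3.0, 1.0])   # 2*3 - 1*1 + 0.5 = 5.5
x_b = np.array([-1.0, 2.0])  # 2*(-1) - 1*2 + 0.5 = -3.5

out_a = linear_output(x_a, w, b)
out_b = linear_output(x_b, w, b)

print(out_a, out_b)  # neither value lies in [0, 1]
```

Both outputs fall outside the interval $[0, 1]$, so neither can be interpreted as $P(y=1 \mid x)$.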

Instead, logistic regression uses the sigmoid function to ensure output is between 0 and 1:

\[\hat{y} = \sigma(w^T x + b)\]

Where: $z = w^T x + b$

\[\sigma(z) = \frac{1}{1 + e^{-z}}\]
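The two formulas above can be sketched in NumPy as follows. The function names `sigmoid` and `predict_proba` and the parameter values are assumptions made for this example, and the sigmoid is written in its direct textbook form rather than a numerically hardened one.

```python
import numpy as np


def sigmoid(z):
    """Compute sigma(z) = 1 / (1 + exp(-z)), elementwise for arrays."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_proba(x, w, b):
    """Return y_hat = sigma(w^T x + b) for a single feature vector x."""
    z = np.dot(w, x) + b
    return sigmoid(z)


# Hypothetical parameters and input, for illustration only.
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([3.0, 1.0])

y_hat = predict_proba(x, w, b)  # z = 5.5, so y_hat is close to 1
print(y_hat)
```

For very negative $z$, `np.exp(-z)` can overflow and emit a runtime warning in this direct form; a library routine such as `scipy.special.expit` is one option if that matters in practice.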

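Visual behavior: $\sigma(z)$ is an S-shaped curve that rises smoothly from 0 to 1 and passes through 0.5 at $z = 0$.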

Mathematical analysis:

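When $z$ is very large (positive): $e^{-z} \approx 0$, so $\sigma(z) \approx \frac{1}{1 + 0} = 1$.

When $z$ is very small (large negative): $e^{-z}$ is a very large number, so $\sigma(z) = \frac{1}{1 + e^{-z}} \approx 0$.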
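The behavior at large positive and large negative $z$ can also be checked numerically. This is a small sketch; the sample points in `zs` are arbitrary choices for illustration.

```python
import numpy as np


def sigmoid(z):
    """Compute sigma(z) = 1 / (1 + exp(-z)), elementwise for arrays."""
    return 1.0 / (1.0 + np.exp(-z))


# Arbitrary sample points spanning negative, zero, and positive z.
zs = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
values = sigmoid(zs)

for z, s in zip(zs, values):
    print(f"z = {z:6.1f}  sigma(z) = {s:.6f}")
```

The printed values increase with $z$, stay strictly between 0 and 1, and equal 0.5 at $z = 0$.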

Your job when implementing logistic regression is to:

Learn parameters $w$ and $b$ such that $\hat{y}$ becomes a good estimate of the probability that $y = 1$.

Now that you understand the logistic regression model, the next step is to define a cost function to learn parameters $w$ and $b$.