Fitting a Line in Closed Form

Generate noisy points along a random line, derive the gradient of the squared-error loss (Problem 2.1), solve for the parameters in closed form (Problem 2.2), and verify the result against NumPy's least-squares solver. The problems are from chapter 2 of Understanding Deep Learning (Simon J.D. Prince), which is also where the referenced equation 2.5 appears.

In [19]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt

# Set number of points to generate
num_points = 10000

# Generate points along a line with normal error
x = np.linspace(0, 10, num_points)
coef = np.random.uniform(-10, 10, 2)
y = coef[0] * x + coef[1]

# Add normal error to x and y
x_error = x + np.random.normal(0, 0.5, num_points)
y_error = y + np.random.normal(0, 0.5, num_points)

# Plot the points with error
plt.scatter(x_error, y_error)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title(f'{num_points} Points Along a Line with Normal Error')
plt.show()

# Display the coefficients
coef
Out[19]:
array([ 6.17225641, -7.60517112])
In [20]:
# Define the line function with an initial parameter guess: phi = [slope, intercept]
phi = [0, 0]
line = lambda x: phi[0] * x + phi[1]

Problem 2.1: Calculate the gradient of the loss function

To walk “downhill” on the loss function (equation 2.5), we measure its gradient with respect to the parameters $\phi_0$ and $\phi_1$. Calculate expressions for the slopes $\frac{\partial L}{\partial \phi_0}$ and $\frac{\partial L}{\partial \phi_1}$.


The loss function calculated here is the sum of squared errors between the predicted line and the noisy points: $$L(\phi) = \sum_{i=1}^n (\phi_0 \cdot x_i + \phi_1 - y_i)^2$$

The gradient of the loss function with respect to $\phi_0$ and $\phi_1$:

$$\frac{\partial L}{\partial \phi_0} = \sum_{i=1}^n 2 \cdot (\phi_0 \cdot x_i + \phi_1 - y_i) \cdot x_i$$

$$\frac{\partial L}{\partial \phi_1} = \sum_{i=1}^n 2 \cdot (\phi_0 \cdot x_i + \phi_1 - y_i)$$
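These expressions can be sanity-checked numerically. The sketch below uses standalone synthetic data standing in for the notebook's `x_error`/`y_error` (the true slope and intercept here are arbitrary) and compares the analytic gradient against central finite differences:

```python
import numpy as np

# Standalone synthetic data (stand-in for the notebook's x_error / y_error)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 3.0 * x - 2.0 + rng.normal(0, 0.5, 100)

def loss(phi0, phi1):
    # Sum of squared errors, as defined above
    return np.sum((phi0 * x + phi1 - y) ** 2)

def grad(phi0, phi1):
    r = phi0 * x + phi1 - y              # residuals
    return np.sum(2 * r * x), np.sum(2 * r)

phi0, phi1, eps = 1.0, 0.5, 1e-5
g0, g1 = grad(phi0, phi1)
# Central finite differences should agree with the analytic slopes
fd0 = (loss(phi0 + eps, phi1) - loss(phi0 - eps, phi1)) / (2 * eps)
fd1 = (loss(phi0, phi1 + eps) - loss(phi0, phi1 - eps)) / (2 * eps)
print(g0 - fd0, g1 - fd1)  # both differences near zero
```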
In [21]:
loss = np.sum((line(x_error) - y_error)**2)
loss
Out[21]:
8586265.521386892
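Before solving in closed form, the gradients from Problem 2.1 can be used directly to walk downhill. A minimal gradient-descent sketch on standalone synthetic data (the learning rate and iteration count are arbitrary choices, and the gradients are divided by $n$ so the step size need not depend on the number of points):

```python
import numpy as np

# Standalone synthetic data (stand-in for the notebook's x_error / y_error)
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = 3.0 * x - 2.0 + rng.normal(0, 0.5, 200)

phi = np.zeros(2)   # [slope, intercept], matching line() above
lr = 0.02           # step size: arbitrary, small enough to be stable here
for _ in range(2000):
    r = phi[0] * x + phi[1] - y
    # Gradients from Problem 2.1, scaled by 1/n
    g = np.array([np.mean(2 * r * x), np.mean(2 * r)])
    phi -= lr * g

print(phi)  # approaches the true parameters [3, -2]
```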

Problem 2.2: Solve for $\phi_0$ and $\phi_1$ by setting the gradients to zero

Show that we can find the minimum of the loss function in closed form by setting the expression for the derivatives from problem 2.1 to zero and solving for $\phi_0$ and $\phi_1$.


To find the minimum of the loss function, we set the gradients to zero and solve for $\phi_0$ and $\phi_1$:

$$ \frac{\partial L}{\partial \phi_0} = 0 \implies \phi_0 \left( \sum\limits_{i=1}^n x_i^2 \right) + \phi_1 \left( \sum\limits_{i=1}^n x_i \right) = \sum\limits_{i=1}^n x_i y_i $$

$$ \frac{\partial L}{\partial \phi_1} = 0 \implies \phi_0 \left( \sum\limits_{i=1}^n x_i \right) + \phi_1 \cdot n = \sum\limits_{i=1}^n y_i $$

We can write this as:

$$ \begin{bmatrix} \sum\limits_{i=1}^n x_i^2 & \sum\limits_{i=1}^n x_i \\ \sum\limits_{i=1}^n x_i & n \end{bmatrix} \begin{bmatrix} \phi_0 \\ \phi_1 \end{bmatrix} = \begin{bmatrix} \sum\limits_{i=1}^n x_i y_i \\ \sum\limits_{i=1}^n y_i \end{bmatrix} $$

Solving the matrix equation we get:

$$ \begin{bmatrix} \phi_0 \\ \phi_1 \end{bmatrix} = \begin{bmatrix} \sum\limits_{i=1}^n x_i^2 & \sum\limits_{i=1}^n x_i \\ \sum\limits_{i=1}^n x_i & n \end{bmatrix}^{-1} \begin{bmatrix} \sum\limits_{i=1}^n x_i y_i \\ \sum\limits_{i=1}^n y_i \end{bmatrix} $$
In [32]:
# Build and solve the 2x2 system from Problem 2.2
m = np.array([
    [np.sum(x_error**2), np.sum(x_error)],
    [np.sum(x_error), len(x_error)]
])
b = np.array([
    [np.sum(x_error * y_error)],
    [np.sum(y_error)]
])

best_guess = np.linalg.solve(m, b)

best_guess
Out[32]:
array([[ 5.98658293],
       [-6.69455091]])

The 2×2 system above is the normal equation in disguise. Stack each input with a constant 1 into a design matrix $X$ and collect the targets into a vector $y$:

$$ X = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \implies X^T X = \begin{bmatrix} \sum\limits_{i=1}^n x_i^2 & \sum\limits_{i=1}^n x_i \\ \sum\limits_{i=1}^n x_i & n \end{bmatrix}, \quad X^T y = \begin{bmatrix} \sum\limits_{i=1}^n x_i y_i \\ \sum\limits_{i=1}^n y_i \end{bmatrix} $$

So the matrix equation from Problem 2.2 is exactly the normal equation, which solves for the coefficients that minimize the sum of squared errors: $$\phi = (X^T X)^{-1} X^T y$$

In [22]:
A = np.vstack([x_error, np.ones(len(x_error))]).T
best_guess = np.linalg.lstsq(A, y_error, rcond=None)[0]

best_guess
Out[22]:
array([ 5.98658293, -6.69455091])

In [23]:
# Plot the original line and the best guess line
best_x = np.linspace(0, 10, 400)
plt.plot(best_x, best_guess[0] * best_x + best_guess[1], label='Best Guess', color='r')
plt.plot(best_x, coef[0] * best_x + coef[1], label='Original', color='g')
plt.scatter(x_error, y_error)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Points Along a Line with Normal Error')
plt.legend()
plt.grid(True)
plt.show()
In [25]:
coef
Out[25]:
array([ 6.17225641, -7.60517112])
In [26]:
best_guess
Out[26]:
array([ 5.98658293, -6.69455091])

TODO

todo: show extension for multidimensional version of the above.

maybe put it in its own notebook?

upload to some cloudflare pages.
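As a starting point for that extension, a hedged sketch (feature count, true coefficients, and noise level are all arbitrary choices): the normal equation is unchanged in higher dimensions; only the design matrix gains extra feature columns.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
# Two features plus an intercept: y = 2*x1 - 3*x2 + 5 + noise
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
y = 2 * x1 - 3 * x2 + 5 + rng.normal(0, 0.5, n)

# Design matrix: one column per feature, plus a column of ones
X = np.column_stack([x1, x2, np.ones(n)])

# Same normal equation as the 1-D case: phi = (X^T X)^{-1} X^T y
phi = np.linalg.solve(X.T @ X, X.T @ y)
print(phi)  # close to the true coefficients [2, -3, 5]
```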
