An [[Overdetermined Linear System|Overdetermined System of Linear Equations]] is a system of linear equations with more equations than [[Degree of Freedom|degrees of freedom]] (unknowns).
>[!example]
>$\large \begin{align}
>2x+6y&=3\\
>4x+ y&=1\\
>-2x + y &= -2\\
>8x+ 2y&=4\\
>\end{align}$
> This system is overdetermined: there is no $x, y$ that satisfies all of these equations.
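To see the inconsistency concretely, here is a short sketch in Python using exact rational arithmetic: solving the first two equations pins down the only candidate $(x, y)$, which then fails the remaining two equations.

```python
from fractions import Fraction as F

# Solve the first two equations exactly:
#   2x + 6y = 3
#   4x +  y = 1
# via Cramer's rule on this 2x2 subsystem.
det = F(2) * 1 - F(6) * 4             # determinant = -22
x = (F(3) * 1 - F(6) * 1) / det       # x = 3/22
y = (F(2) * 1 - F(3) * 4) / det       # y = 5/11

# The only candidate (x, y) fails the other two equations,
# so no pair satisfies all four at once.
print(-2 * x + y)   # should be -2, but is 2/11
print(8 * x + 2 * y)  # should be 4, but is 2
```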
In these cases an exact solution usually does not exist, so we settle for an approximation.
What is meant by an 'approximation' is: for some [[System of Linear Equations]] represented by a [[Matrix]] as $A\vec x = \vec b$,
find a [[Vector]] $\vec c$ such that the error between $\vec c$ and $\vec b$ is minimized.
Error in this context is measured by the [[Least Squared Solutions|Least Squared Solution]] criterion:
$\large
(b_{1}-c_{1})^2+(b_{2}-c_{2})^2 + \cdots + (b_{k} - c_{k})^2
$
Which is the same as the [[Distance]] between $\vec b$ and $\vec c$ squared:
$ \large \lvert \lvert \vec b - \vec c \rvert \rvert^2 $
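A minimal sketch of this squared-error computation in Python, with hypothetical vectors $\vec b$ and $\vec c$:

```python
# Squared error between a target b and an approximation c.
b = [3, 1, -2, 4]
c = [2, 1, -1, 4]   # hypothetical approximation of b

err = sum((bi - ci) ** 2 for bi, ci in zip(b, c))
print(err)  # (3-2)^2 + 0 + (-2-(-1))^2 + 0 = 2
```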
If $A\vec x=\vec b$ is [[Consistent]], then the closest [[Approximation]] is $\vec b$ itself, since $\vec b \in U$ where $U=\op{Col}(A)$ and the error is zero.
If the equation is [[Inconsistent]], then the approximation is the [[Orthogonal Projection]] of $\vec b$ onto $\op{Col}(A)$, which gives the normal equations:
$\large
\begin{align}
A \vec x &= \vec b\\
A^\intercal A \vec x &= A^\intercal \vec b \\
\end{align}
$
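As a sketch, the normal equations can be formed and solved in pure Python for the four-equation example above (exact arithmetic via `fractions`; the 2x2 solve uses Cramer's rule):

```python
from fractions import Fraction as F

# The overdetermined example system from above.
A = [[2, 6], [4, 1], [-2, 1], [8, 2]]
b = [F(3), F(1), F(-2), F(4)]

# Form the normal equations  A^T A x = A^T b  by hand.
AtA = [[sum(r[i] * r[j] for r in A) for j in range(2)] for i in range(2)]
Atb = [sum(r[i] * bi for r, bi in zip(A, b)) for i in range(2)]

# Solve the resulting 2x2 system exactly with Cramer's rule.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
y = (AtA[0][0] * Atb[1] - Atb[0] * AtA[1][0]) / det
print(x, y)  # the least-squares solution (197/466, 205/699)
```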
#### Scatter plots
Given some lists of paired numbers $\vec x, \vec y$:
| $x$ | $y$ |
| -------- | -------- |
| $x_1$ | $y_{1}$ |
| $x_{2}$ | $y_{2}$ |
| $\vdots$ | $\vdots$ |
Fitting a single trend line $f(x)=ax+b$ through every pair $(x,y)$ constitutes an [[Overdetermined Linear System|Overdetermined System]].
The error would be the [[Least Squared Solutions|least-squared]] formula:
$\huge
\epsilon = \sum_{i=1}^n (y_i-f(x_{i}))^2
$
$\huge
\begin{align}
ax_{1}+b &= y_{1} \\
ax_{2} + b &= y_{2} \\
&\vdots\\
ax_{n} + b &= y_{n} \\
\end{align}
$
$\huge \begin{align}
\underbrace{\mat{
x_{1}&1\\
x_{2}&1 \\
\vdots&\vdots \\
x_{n} & 1
}}_{\huge X}
\underbrace{
\mat{a\\b}}_{\huge {\vec a}}
&=
\underbrace{
\mat{y_{1}\\y_{2}\\\vdots\\y_{n}}
}_{{\vec y}}
\end{align}$
This problem can be rephrased as: the vector $\vec y$ does not lie in the [[Column Space]] of $X$ ($\vec y \notin \op{Col}(X)$), so we are trying to find the vector $\vec y_{X} \in \op{Col}(X)$ that is closest to $\vec y$. In this case, our solution is the same as finding the [[Orthogonal Projection]] of $\vec y$ onto $\op{Col}(X)$:
$\huge \begin{align}
X\vec a &= \vec y\\
X^\intercal X \vec a &= X^\intercal \vec y \\
\end{align}$
This new system is one for which we can find an analytical solution $\vec a$ with minimized [[Least Squared Solutions|least squared error]]:
$\huge \begin{align}
X^{\intercal}X &= \mat{
\sigma_{xx} & \sigma_{x} \\
\sigma_{x} & n
} \\
X^{\intercal}\vec y &= \mat{ \sigma_{x y} \\ \sigma_{y}}
\end{align}$
where $\sigma_{x}=\sum_{i} x_{i}$, $\sigma_{xx}=\sum_{i} x_{i}^{2}$, $\sigma_{y}=\sum_{i} y_{i}$, and $\sigma_{xy}=\sum_{i} x_{i}y_{i}$.
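A short pure-Python sketch of this closed-form line fit using the sums above, on hypothetical data points (exact arithmetic via `fractions`):

```python
from fractions import Fraction as F

# Hypothetical data points; the fitted line is y = a*x + b.
xs = [1, 2, 3, 4]
ys = [F(2), F(3), F(5), F(6)]

n = len(xs)
sx = sum(xs)                               # sigma_x
sxx = sum(x * x for x in xs)               # sigma_xx
sy = sum(ys)                               # sigma_y
sxy = sum(x * y for x, y in zip(xs, ys))   # sigma_xy

# Solve the 2x2 normal equations with Cramer's rule.
det = sxx * n - sx * sx
a = (sxy * n - sx * sy) / det
b = (sxx * sy - sxy * sx) / det
print(a, b)  # slope 7/5, intercept 1/2
```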
>[!example]
>![[../../06 Mind Dump/202503260930|202503260930]]
## Non-Linear Functions used in [[Linear Combination]]
This method can also work for functions that are [[Linear Combination|Linear Combinations]] of non-linear functions.
For example if our function is:
$\huge
f(x) = a + bx + cx^{2}
$
$\huge \begin{align}
\underbrace{
\mat{
\vdots &
\vdots &
\vdots \\
1 & x_{i} & x_{i}^{2} \\
\vdots &
\vdots &
\vdots }
}_{X}
\mat{a\\b\\c} &=
\vec y
\end{align}$
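A sketch of this quadratic fit via the normal equations, on hypothetical data generated from an exact quadratic (so the fit recovers the coefficients exactly; the 3x3 solve is a small Gauss–Jordan elimination):

```python
from fractions import Fraction as F

def solve(M, v):
    """Gauss-Jordan elimination with exact fractions (pivots assumed nonzero,
    which holds here because X^T X is positive definite)."""
    n = len(M)
    M = [[F(x) for x in row] + [F(vi)] for row, vi in zip(M, v)]
    for i in range(n):
        M[i] = [x / M[i][i] for x in M[i]]
        for k in range(n):
            if k != i:
                M[k] = [p - M[k][i] * q for p, q in zip(M[k], M[i])]
    return [row[-1] for row in M]

xs = [0, 1, 2, 3]
ys = [1 + 2 * x + 3 * x**2 for x in xs]    # exact quadratic data

X = [[1, x, x**2] for x in xs]             # design-matrix rows (1, x_i, x_i^2)
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(3)]

a, b, c = solve(XtX, Xty)
print(a, b, c)  # recovers 1, 2, 3
```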
If your function is not a linear combination, this still works if you can manipulate it into one.
For example, with [[Zipf's Law]]:
$\huge y = kx^{-p} $
We can take the logarithm:
$\huge \ln(y) = \ln(k) - p\ln(x) $
Which leads to the normal equations, where the rows of $X$ are now $\mat{1 & \ln x_{i}}$ and the right-hand side uses $\ln y_{i}$:
$\huge X^{\intercal} X\mat{\ln k\\-p} = X^{\intercal} \mat{\ln y_{1}\\\vdots\\\ln y_{n}} $
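A sketch of this log-linearized fit in Python, on hypothetical data generated from an exact power law $y = 100x^{-1}$ (so the fit should recover $k \approx 100$, $p \approx 1$); it uses the standard least-squares slope/intercept formulas, which are what the normal equations reduce to for a line:

```python
import math

# Hypothetical rank/frequency data from y = 100 * x**-1.
xs = [1, 2, 4, 8, 16]
ys = [100 / x for x in xs]

u = [math.log(x) for x in xs]   # ln x
v = [math.log(y) for y in ys]   # ln y
n = len(u)

# Fit the line  v = m*u + c,  where  m = -p  and  c = ln k.
su, sv = sum(u), sum(v)
suu = sum(ui * ui for ui in u)
suv = sum(ui * vi for ui, vi in zip(u, v))
m = (n * suv - su * sv) / (n * suu - su * su)
c = (sv - m * su) / n

p, k = -m, math.exp(c)
print(p, k)  # approximately 1.0 and 100.0
```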