Testing a model ([[Logistic Function]]) with parameters $a, b$ on inputs $x_{i}$.

[[Probability]] of Success: $\sigma(x_{i};a,b)= \frac{1}{1+e^{-b(x_{i}-a)}}$

Given: $x_{1},x_{2},x_{3}, \dots$, $y_{1},y_{2},y_{3},\dots$

Testing: $a,b$

[[Likelihood]] of my model (rating what the model predicted against what actually happened):

$\huge \mathcal L\set{a,b} = \prod _{i=1}^{n} \begin{cases} \sigma(x_{i}; a,b) & \text{if } y_{i} = 1\\ 1-\sigma(x_{i}; a,b) & \text{if } y_{i} = 0\\ \end{cases} $

Using [[Gradient Descent]] with this: we use the logarithm of the likelihood, which turns the product into a sum.

$\huge \begin{align} \ln \pa{\mathcal L\set{a,b}} &= \sum^{n}_{i=1} \begin{cases} \ln(\sigma(x_{i}; a,b)) & \text{if } y_{i} = 1\\ \ln(1-\sigma(x_{i}; a,b)) & \text{if } y_{i} = 0\\ \end{cases} \end{align}$

$\large \begin{align} \ln(\sigma(x_{i};a,b)) &= \ln\pa{ \frac{1}{1+e^{-b(x_{i}-a)}} } \\
\ln(1-\sigma(x_{i};a,b)) &= \ln\pa{ \frac{1}{1+e^{b(x_{i}-a)}} } \end{align}$

Since $\frac{1}{1+e^{-z}} = \frac{e^{z}}{1+e^{z}}$, both cases combine into a single expression:

$\huge \begin{align} \ln\pa{\mathcal L\set{a,b}} &= \sum_{i=1}^{n} \ln\pa{ \frac{ e^{by_{i}(x_{i}-a)} }{ 1+ e^{b(x_{i}-a)} } } \end{align}$

$\large \begin{align} \ln \pa{\frac{ e^{b(x_{i}-a)y_{i}} }{ 1+ e^{b(x_{i}-a)} } } &= \ln\pa{e^{b(x_{i}-a)y_{i}}} - \ln\pa{ 1+e^{b(x_{i}-a)} }\\
&= b(x_{i}-a) y_{i} - \ln\pa{ 1+ e^{b(x_{i}-a)} } \end{align}$

Partial derivatives, using $\frac{e^{b(x_{i}-a)}}{1+e^{b(x_{i}-a)}} = \sigma(x_{i};a,b)$:

$\huge \begin{align} \pderiv{ \pa{\ln \mathcal L\set{a,b}} }{a} &= \sum_{i=1}^{n} \pa{ -by_{i} + \frac{be^{b(x_{i}-a)}}{1+e^{b(x_{i}-a)}} } \\
&= \sum_{i=1}^{n} -b \pa{ y_{i} - \sigma(x_{i};a,b) } \end{align}$

$\huge \begin{align} \pderiv{ \pa{\ln \mathcal L\set{a,b}} }{b} &= \sum_{i=1}^{n} \pa{ (x_{i}-a )y_{i} - \frac{(x_{i}-a)e^{b(x_{i}-a)}}{1+e^{b(x_{i}-a)}} } \\
&= \sum_{i=1}^{n} (x_{i}-a)\pa{y_{i}-\sigma(x_{i};a,b)} \end{align} $

We can use these to compute the gradient of the log-likelihood:

$\huge \nabla \ln \mathcal L\set{a,b} = \sum_{i=1}^{n} \mat{ -b\pa{y_{i}-\sigma(x_{i};a,b)} \\ (x_{i}-a)\pa{y_{i}-\sigma(x_{i};a,b)} \\ } $

[[Hessian Optimization]]:

$\begin{align} \pderiv{^{2}\ln \mathcal L\set{a,b}}{a^{2}} &= \sum_{i=1}^{n} -b^{2} \sigma(x_{i};a,b)(1-\sigma(x_{i};a,b))
\\ \pderiv{^{2}\ln \mathcal L\set{a,b}}{a\partial b} &= \sum_{i=1}^{n} \sigma(x_{i};a,b)-y_{i}+b(x_{i}-a)\sigma(x_{i};a,b)(1- \sigma(x_{i};a,b)) \end{align} $
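
A minimal Python sketch of the gradient step, for checking the algebra numerically. It assumes the log-likelihood $\sum_{i} \pa{b(x_{i}-a)y_{i} - \ln\pa{1+e^{b(x_{i}-a)}}}$ and differentiates it directly, giving $\sum_{i} b(\sigma_{i}-y_{i})$ with respect to $a$ and $\sum_{i}(x_{i}-a)(y_{i}-\sigma_{i})$ with respect to $b$. The function names, the toy data, the learning rate, and the step count are all hypothetical choices made for illustration, not part of the note.

```python
import math

def sigma(x, a, b):
    """Logistic success probability 1 / (1 + exp(-b (x - a)))."""
    z = b * (x - a)
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(xs, ys, a, b):
    """ln L = sum_i [ b (x_i - a) y_i - ln(1 + exp(b (x_i - a))) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = b * (x - a)
        # Stable softplus: ln(1 + e^z) = max(z, 0) + ln(1 + e^{-|z|}).
        total += z * y - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total

def gradient(xs, ys, a, b):
    """Gradient of ln L with respect to (a, b)."""
    da = sum(b * (sigma(x, a, b) - y) for x, y in zip(xs, ys))
    db = sum((x - a) * (y - sigma(x, a, b)) for x, y in zip(xs, ys))
    return da, db

def fit(xs, ys, a=0.0, b=1.0, rate=0.05, steps=5000):
    """Gradient ascent on ln L (equivalently, gradient descent on -ln L)."""
    for _ in range(steps):
        da, db = gradient(xs, ys, a, b)
        a += rate * da
        b += rate * db
    return a, b

# Hypothetical toy data; the labels overlap, so the maximum is finite.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

a_hat, b_hat = fit(xs, ys)
print(a_hat, b_hat, log_likelihood(xs, ys, a_hat, b_hat))
```

Because the update adds the gradient, it climbs $\ln \mathcal L$; flipping the sign of both components and subtracting gives the same iterates as gradient descent on $-\ln \mathcal L$. Comparing `gradient` against a central finite difference of `log_likelihood` is a quick way to catch sign errors in the derivation.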