diff --git a/Gram-matrix-parameterization.md b/Gram-matrix-parameterization.md
index 3e6407e..f9987e2 100644
--- a/Gram-matrix-parameterization.md
+++ b/Gram-matrix-parameterization.md
@@ -115,8 +115,8 @@ We minimize the loss function using a cheap imitation of Ueda and Yamashita's re
 The minimization routine is implemented in [`engine.rs`](../src/branch/main/app-proto/src/engine.rs). (In the old Julia prototype of the engine, it's in [`Engine.jl`](../src/branch/main/engine-proto/gram-test/Engine.jl).) It works like this.
 
 1. Do Newton steps, as described below, until the loss gets tolerably close to zero. Fail out if we reach the maximum allowed number of descent steps.
-   1. Find $-\operatorname{grad}(f)$, as described in "The first derivative of the loss function."
-   2. Find the Hessian $H(f) := d\operatorname{grad}(f)$, as described in "The second derivative of the loss function."
+   1. Find $-\operatorname{grad}(f)$, as described in ["The first derivative of the loss function."](Gram-matrix-parameterization#the-first-derivative-of-the-loss-function)
+   2. Find the Hessian $H(f) := d\operatorname{grad}(f)$, as described in ["The second derivative of the loss function."](Gram-matrix-parameterization#the-second-derivative-of-the-loss-function)
      * Recall that we express $H(f)$ as a matrix in the standard basis for $\operatorname{End}(\mathbb{R}^n)$.
    3. If the Hessian isn't positive-definite, make it positive definite by adding $-c \lambda_\text{min}$, where $\lambda_\text{min}$ is its lowest eigenvalue and $c > 1$ is a parameter of the minimization routine. In other words, find the regularized Hessian
      $$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} & \lambda_\text{min} \le 0 \end{cases}.$$
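
For concreteness, here is a minimal sketch of the Hessian-regularization step (sub-step 3 in the hunk above), written against the `nalgebra` crate. This is not code from `engine.rs`; the function name `regularize_hessian` and the way the parameter `c` is passed are illustrative assumptions.

```rust
use nalgebra::DMatrix;

// Sketch of the regularization step: if the Hessian's lowest eigenvalue
// λ_min is not positive, shift the spectrum by -c·λ_min (i.e. add
// -c·λ_min times the identity), so the smallest eigenvalue becomes
// (1 - c)·λ_min ≥ 0. The name `regularize_hessian` is hypothetical.
fn regularize_hessian(hessian: &DMatrix<f64>, c: f64) -> DMatrix<f64> {
    // The Hessian is symmetric, so a symmetric eigendecomposition applies.
    let lambda_min = hessian.clone().symmetric_eigen().eigenvalues.min();
    if lambda_min > 0.0 {
        hessian.clone()
    } else {
        let n = hessian.nrows();
        hessian + DMatrix::<f64>::identity(n, n) * (-c * lambda_min)
    }
}
```

In the routine above, $H_\text{reg}(f)$ would then stand in for $H(f)$ when solving for the Newton step.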