Bring up subtleties of regularization and frozen entries
parent bdd7926ce2
commit 00d7d1b369
1 changed file with 7 additions and 4 deletions
@ -119,11 +119,14 @@ The minimization routine is implemented in [`engine.rs`](../src/branch/main/app-

2. Find the Hessian $H(f) := d\operatorname{grad}(f)$, as described in "The second derivative of the loss function."

   * Recall that we express $H(f)$ as a matrix in the standard basis for $\operatorname{End}(\mathbb{R}^n)$ (see the flattening sketch after this list).

3. If the Hessian isn't positive definite, make it positive definite by adding $-c \lambda_\text{min}$ times the identity, where $\lambda_\text{min}$ is its lowest eigenvalue and $c > 1$ is a parameter of the minimization routine. In other words, find the regularized Hessian

   $$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} I & \lambda_\text{min} \le 0 \end{cases}.$$

   * The parameter $c$ is passed to `realize_gram` as the argument `reg_scale` (see the regularization sketch after this list).

   * When $\lambda_\text{min}$ is exactly zero, our regularization doesn't do anything, so $H_\text{reg}(f)$ isn't actually positive definite. Ueda and Yamashita add an extra regularization term that's proportional to a power of $\|\operatorname{grad}(f)\|$, which takes care of this problem.

4. Project the negative gradient and the regularized Hessian onto the orthogonal complement of the frozen subspace of $\operatorname{End}(\mathbb{R}^n)$.

   * For this write-up, we'll write the projection as $\mathcal{Q}$.

5. Find the base step $u \in \operatorname{End}(\mathbb{R}^n)$, which is defined by two properties: satisfying the equation $-\mathcal{Q} \operatorname{grad}(f) = H_\text{reg}(f)\,u$ and being orthogonal to the frozen subspace.

   * When we say in the code that we're "projecting" the regularized Hessian, we're really turning it into an operator that can be used to express both properties (see the projection sketch after this list).

6. Backtrack by reducing the step size until we find a step that reduces the loss at a good fraction of the maximum possible rate (see the backtracking sketch after this list).
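
Here's the flattening sketch referenced in step 2: a minimal illustration of how an endomorphism of $\mathbb{R}^n$ can be written as a coordinate vector in the standard basis of $\operatorname{End}(\mathbb{R}^n)$, so that $H(f)$ can act on it as an ordinary $n^2 \times n^2$ matrix. It uses nalgebra types for concreteness, and the column-major ordering of the basis is an assumption of this sketch rather than something fixed by the write-up.

```rust
use nalgebra::{DMatrix, DVector};

// Flatten an endomorphism of R^n (stored as an n-by-n matrix) into its
// coordinate vector in the standard basis of End(R^n). nalgebra stores
// matrices in column-major order, so the entries are listed column by
// column; any fixed ordering works, as long as the Hessian matrix is
// written in the same one.
fn vectorize(endo: &DMatrix<f64>) -> DVector<f64> {
    DVector::from_column_slice(endo.as_slice())
}
```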
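
Here's the regularization sketch for step 3, again using nalgebra types for concreteness. Only the argument name `reg_scale` comes from the write-up; the function name and the use of a dense symmetric eigendecomposition are illustrative rather than a description of what `realize_gram` actually does.

```rust
use nalgebra::DMatrix;

// Shift the spectrum of the symmetric Hessian toward positive definiteness:
// when the lowest eigenvalue lambda_min is not positive, add
// -c * lambda_min times the identity, where c = reg_scale > 1.
fn regularize_hessian(hessian: &DMatrix<f64>, reg_scale: f64) -> DMatrix<f64> {
    let lambda_min = hessian.clone().symmetric_eigen().eigenvalues.min();
    if lambda_min > 0.0 {
        hessian.clone()
    } else {
        let n = hessian.nrows();
        hessian + DMatrix::identity(n, n).scale(-reg_scale * lambda_min)
    }
}
```

Note that when $\lambda_\text{min}$ is exactly zero the shift is zero and the Hessian comes back unchanged, which is the corner case the last bullet of step 3 points out.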
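
Here's the projection sketch for steps 4 and 5, covering the case where the frozen subspace is spanned by standard basis elements of $\operatorname{End}(\mathbb{R}^n)$, in other words by individual frozen entries. The index list `frozen` and the function names are illustrative; this is the standard trick of zeroing frozen rows and columns and putting ones on the frozen diagonal, offered as a sketch rather than a description of the actual code in `engine.rs`.

```rust
use nalgebra::{DMatrix, DVector};

// Apply the projection Q: zero out the coordinates that span the frozen
// subspace, whose indices (in the flattened standard basis) are in `frozen`.
fn project(vector: &DVector<f64>, frozen: &[usize]) -> DVector<f64> {
    let mut projected = vector.clone();
    for &index in frozen {
        projected[index] = 0.0;
    }
    projected
}

// Turn the regularized Hessian into an operator that expresses both
// properties of the base step. Concretely this builds Q H_reg Q + (1 - Q)
// by zeroing the frozen rows and columns and putting 1 on the frozen
// diagonal entries, so that solving
//
//     projected * u = -(Q grad(f))
//
// forces u to vanish in the frozen coordinates while satisfying the
// projected Newton equation in the remaining ones.
fn project_hessian(hess_reg: &DMatrix<f64>, frozen: &[usize]) -> DMatrix<f64> {
    let mut projected = hess_reg.clone();
    for &index in frozen {
        projected.row_mut(index).fill(0.0);
        projected.column_mut(index).fill(0.0);
        projected[(index, index)] = 1.0;
    }
    projected
}
```

With these pieces, step 5 reduces to a single linear solve, for example `project_hessian(&hess_reg, &frozen).cholesky().map(|c| c.solve(&project(&neg_grad, &frozen)))`, since the projected operator is positive definite whenever $H_\text{reg}(f)$ is.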
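
Finally, here's the backtracking sketch for step 6: shrink the step until the loss actually drops by at least a set fraction of the decrease predicted by the directional derivative. The parameter names `min_efficiency`, `backoff`, and `max_backoff_steps` are chosen for illustration and needn't match the actual arguments of `realize_gram`.

```rust
use nalgebra::DVector;

// Armijo-style backtracking: accept the first step size whose actual loss
// reduction is at least `min_efficiency` times the reduction predicted by
// the slope of the loss along the base step.
fn backtrack<F>(
    loss: F,
    point: &DVector<f64>,
    grad: &DVector<f64>,
    base_step: &DVector<f64>,
    min_efficiency: f64,
    backoff: f64,
    max_backoff_steps: usize,
) -> Option<DVector<f64>>
where
    F: Fn(&DVector<f64>) -> f64,
{
    let loss_here = loss(point);
    // The base step is a descent direction, so this slope should be negative.
    let slope = grad.dot(base_step);
    let mut rate = 1.0;
    for _ in 0..max_backoff_steps {
        let candidate = point + base_step.scale(rate);
        if loss_here - loss(&candidate) >= min_efficiency * rate * (-slope) {
            return Some(candidate);
        }
        rate *= backoff;
    }
    None
}
```
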
### Reconstructing a rigid subassembly