diff --git a/Gram-matrix-parameterization.md b/Gram-matrix-parameterization.md index c4358d3..dad08d9 100644 --- a/Gram-matrix-parameterization.md +++ b/Gram-matrix-parameterization.md @@ -52,7 +52,7 @@ df & = \operatorname{tr}(d\Delta^\top \Delta) + \operatorname{tr}(\Delta^\top d\ & = 2\operatorname{tr}(\Delta^\top d\Delta). \end{align*} \] -To compute $d\Delta$, it will be helpful to write the projection operator $\mathcal{P}$ more explicitly. Let $\mathcal{C}$ be the set of indices where the Gram matrix is unconstrained. We can express $C$ as the span of the elementary matrices $\{E_{ij}\}_{(i, j) \in \mathcal{C}}$. Observing that $E_{ij} X^\top E_{ij} = X_{ij} E_{ij}$ for any matrix $X$, we can do orthogonal projection onto $C$ using elementary matrices: +To compute $d\Delta$, it will be helpful to write the projection operator $\mathcal{P}$ more explicitly. Let $\mathcal{C}$ be the set of indices where the Gram matrix is unconstrained. We can express $C$ as the span of the standard basis matrices $\{E_{ij}\}_{(i, j) \in \mathcal{C}}$. Observing that $E_{ij} X^\top E_{ij} = X_{ij} E_{ij}$ for any matrix $X$, we can do orthogonal projection onto $C$ using standard basis matrices: \[ \mathcal{P}(X) = \sum_{(i, j) \in \mathcal{C}} E_{ij} X^\top E_{ij}. \] It follows that \[ @@ -79,7 +79,7 @@ df & = 2\operatorname{tr}(-\Delta^\top \big[\mathcal{P}(A^\top Q\,dA)^\top + \ma & = -4 \operatorname{tr}\big(\Delta^\top \mathcal{P}(A^\top Q\,dA)\big), \end{align*} \] -using the transpose-invariance and cyclic property of the trace in the final step. Writing the projection in terms of elementary matrices, we learn that +using the transpose-invariance and cyclic property of the trace in the final step. Writing the projection in terms of standard basis matrices, we learn that \[ \begin{align*} df & = -4 \operatorname{tr}\left(\Delta^\top \left[ \sum_{(i, j) \in \mathcal{C}} E_{ij} (A^\top Q\,dA)^\top E_{ij} \right] \right) \\