Fix typos <noupdate>

2025-01-21 17:30:20 +01:00
parent beacf416b5
commit c9716b60ac


@@ -96,7 +96,7 @@
\item[Entropy] \marginnote{Entropy} \item[Entropy] \marginnote{Entropy}
Expected value of the self-information of a probability mass function: Expected value of the self-information of a probability mass function:
\[ H(p(\cdot)) = \mathbb{E}_{x \sim p} \left[ - \log(p(\cdot)) \right] \approx -\sum_{x \in \mathbb{X}} p(x) \log(p(x)) \] \[ H(p(\cdot)) = \mathbb{E}_{x \sim p} \left[ - \log(p(x)) \right] \approx -\sum_{x \in \mathbb{X}} p(x) \log(p(x)) \]
Intuitively, it measures the average surprise of a distribution. Intuitively, it measures the average surprise of a distribution.
\begin{example} \begin{example}
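A quick numerical check of the entropy definition touched by this hunk may help when reviewing it. The sketch below is illustrative only: the function name `entropy` and the choice of base 2 are assumptions, not part of the notes. For a discrete probability mass function the expectation can be written exactly as the sum over the support, which is what the code computes.

```python
import math

def entropy(p, base=2.0):
    """Shannon entropy of a discrete pmf given as a sequence of probabilities.

    Hypothetical helper for illustration; base 2 (bits) is an assumption.
    """
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p(x) = 0 are skipped: p log p tends to 0 as p -> 0.
    return -sum(px * math.log(px, base) for px in p if px > 0)

fair_coin = entropy([0.5, 0.5])    # maximal surprise for two outcomes
biased_coin = entropy([0.9, 0.1])  # less surprise on average
certain = entropy([1.0, 0.0])      # no surprise at all
```

The ordering `certain < biased_coin < fair_coin` matches the "average surprise" reading given in the text.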
@@ -218,8 +218,9 @@
\begin{split} \begin{split}
D_\text{EMD}(p || q) = \min_{\matr{P}}\left[ \sum_{i, j} \matr{P}_{i, j} |i-j| \right] \\ D_\text{EMD}(p || q) = \min_{\matr{P}}\left[ \sum_{i, j} \matr{P}_{i, j} |i-j| \right] \\
\begin{split} \begin{split}
\text{subject to}& \sum_{i} \matr{P}_{i, j} = p(i) \,\land \\ \text{subject to}
&\sum_j \matr{P}_{i,j} = q(j) \,\land \\ & \sum_{j} \matr{P}_{i, j} = p(i) \,\land \\
& \sum_{i} \matr{P}_{i,j} = q(j) \,\land \\
& \matr{P}_{i,j} \geq 0 & \matr{P}_{i,j} \geq 0
\end{split} \end{split}
\end{split} \end{split}
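The index fix in this hunk (rows of the transport plan sum to `p`, columns sum to `q`) can be sanity-checked on a tiny example. The sketch below is a minimal illustration under stated assumptions: the helper names `plan_is_feasible`, `plan_cost`, and `emd_1d` are hypothetical, and the comparison value relies on the known closed form for the one-dimensional case with cost |i-j| and equal total mass, namely the sum of absolute differences between the two cumulative distributions.

```python
from itertools import accumulate

def plan_is_feasible(P, p, q, tol=1e-9):
    """Check the constraints as corrected in the diff (hypothetical helper):
    sum_j P[i][j] = p[i], sum_i P[i][j] = q[j], and P[i][j] >= 0."""
    rows_ok = all(abs(sum(P[i]) - p[i]) <= tol for i in range(len(p)))
    cols_ok = all(
        abs(sum(P[i][j] for i in range(len(p))) - q[j]) <= tol
        for j in range(len(q))
    )
    nonneg = all(v >= -tol for row in P for v in row)
    return rows_ok and cols_ok and nonneg

def plan_cost(P):
    """Objective of the minimisation: sum_{i,j} P[i][j] * |i - j|."""
    return sum(P[i][j] * abs(i - j)
               for i in range(len(P)) for j in range(len(P[i])))

def emd_1d(p, q):
    """Closed form for the 1D case: sum of absolute CDF differences."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
# Move the mass at bin 0 to bin 1 and the mass at bin 1 to bin 2.
P = [[0.0, 0.5, 0.0],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
# Transposed plan: its rows sum to q and its columns to p instead.
P_T = [list(col) for col in zip(*P)]

feasible = plan_is_feasible(P, p, q)
transposed_feasible = plan_is_feasible(P_T, p, q)
cost = plan_cost(P)
reference = emd_1d(p, q)
```

Here `P` satisfies the corrected constraints and its cost agrees with the closed-form value, while the transposed plan does not satisfy them for this pair `(p, q)`.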
@@ -929,7 +930,7 @@
\begin{description} \begin{description}
\item[Generation architecture] \item[Generation architecture]
Standard U-Net or transformers to predict the noise. Standard U-Net or transformer to predict the noise.
\begin{description} \begin{description}
\item[U-Net with self-attention] \item[U-Net with self-attention]
@@ -1248,7 +1249,7 @@
\begin{split} \begin{split}
\varepsilon_t^{\text{cls}}(\x_t, c; \params) \varepsilon_t^{\text{cls}}(\x_t, c; \params)
&= \varepsilon_t(\x_t, c; \params) - w \nabla_{x_t}[ \log(p_\text{cls}(c \mid \x_t, t)) ] \\ &= \varepsilon_t(\x_t, c; \params) - w \nabla_{x_t}[ \log(p_\text{cls}(c \mid \x_t, t)) ] \\
&= - \big( - \varepsilon_t(\x_t, c; \params) + w \nabla_{x_t}[ \log(p_\text{cls}(c \mid \x_t, t)) ] \big) % &= - \big( - \varepsilon_t(\x_t, c; \params) + w \nabla_{x_t}[ \log(p_\text{cls}(c \mid \x_t, t)) ] \big)
\end{split} \end{split}
\] \]
By applying Bayes' rule on the second term, we have that: By applying Bayes' rule on the second term, we have that: