\(
\newcommand{\cat}[1] {\mathrm{#1}}
\newcommand{\catobj}[1] {\operatorname{Obj}(\mathrm{#1})}
\newcommand{\cathom}[1] {\operatorname{Hom}_{\cat{#1}}}
\newcommand{\multiBetaReduction}[0] {\twoheadrightarrow_{\beta}}
\newcommand{\betaReduction}[0] {\rightarrow_{\beta}}
\newcommand{\betaEq}[0] {=_{\beta}}
\newcommand{\string}[1] {\texttt{"}\mathtt{#1}\texttt{"}}
\newcommand{\symbolq}[1] {\texttt{`}\mathtt{#1}\texttt{'}}
\newcommand{\groupMul}[1] { \cdot_{\small{#1}}}
\newcommand{\groupAdd}[1] { +_{\small{#1}}}
\newcommand{\inv}[1] {#1^{-1} }
\newcommand{\bm}[1] { \boldsymbol{#1} }
\newcommand{\qed} { {\scriptstyle \Box} }
\require{physics}
\require{ams}
\require{mathtools}
\)
Math and science::INF ML AI
Negative log likelihood loss. A perspective.
Negative log likelihood loss is normally calculated as the negated mean log likelihood (the negation makes the loss non-negative, since log probabilities are at most zero). This is:
\[
\text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \log \mathcal{P}(\text{data}_i \mid \text{model\_out})
\]
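As a minimal sketch of this calculation in NumPy (the four probability values are made up purely for illustration, not taken from the card):

```python
import numpy as np

# Probabilities the model assigns to each observed sample.
# These values are assumed for the example only.
probs = np.array([0.9, 0.7, 0.4, 0.85])

# Negative log likelihood loss: the negated mean of the log probabilities.
# Each log probability is <= 0, so negating gives a loss >= 0.
nll_loss = -np.mean(np.log(probs))
print(nll_loss)  # approximately 0.385
```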
As this mean is taken over many samples, it approximates an expectation: an expectation of negative log probabilities. Sound familiar? This is an approximation to [what?].
\[ \text{entropy} = -\sum_{x} p(x) \log p(x) \]
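To spell out the connection (writing \(p\) for the data-generating distribution and \(q\) for the distribution the model outputs; these symbols are introduced here just for this argument), the sample mean converges to the expectation:
\[
-\frac{1}{N}\sum_{i=1}^{N} \log q(x_i)
\;\approx\;
\mathbb{E}_{x \sim p}\!\left[-\log q(x)\right]
= -\sum_{x} p(x) \log q(x),
\qquad x_i \sim p,
\]
which is the cross-entropy between \(p\) and \(q\); it reduces to the entropy above exactly when the model's distribution matches the data distribution, \(q = p\).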