\(
\newcommand{\cat}[1] {\mathrm{#1}}
\newcommand{\catobj}[1] {\operatorname{Obj}(\mathrm{#1})}
\newcommand{\cathom}[1] {\operatorname{Hom}_{\cat{#1}}}
\newcommand{\multiBetaReduction}[0] {\twoheadrightarrow_{\beta}}
\newcommand{\betaReduction}[0] {\rightarrow_{\beta}}
\newcommand{\betaEq}[0] {=_{\beta}}
\newcommand{\string}[1] {\texttt{"}\mathtt{#1}\texttt{"}}
\newcommand{\symbolq}[1] {\texttt{`}\mathtt{#1}\texttt{'}}
\newcommand{\groupMul}[1] { \cdot_{\small{#1}}}
\newcommand{\groupAdd}[1] { +_{\small{#1}}}
\newcommand{\inv}[1] {#1^{-1} }
\newcommand{\bm}[1] { \boldsymbol{#1} }
\require{physics}
\require{ams}
\require{mathtools}
\)
Math and science::INF ML AI
Negative log likelihood loss. A perspective.
Negative log likelihood loss is normally calculated as the mean log likelihood over the samples, negated so that it is a positive quantity to be minimized. This is:
\[
\text{loss} = -\frac{1}{N} \sum_{i=1}^{N} \log \mathcal{P}(\text{data}_i \mid \text{model\_out}_i)
\]
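As a quick worked example (a made-up toy batch of three samples, not from the source): if the model assigns probabilities \(0.5\), \(0.25\) and \(0.8\) to the three observed outcomes, then
\[
\text{loss} = -\tfrac{1}{3}\left(\log 0.5 + \log 0.25 + \log 0.8\right) \approx 0.77 \ \text{nats.}
\]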
As this mean is taken over many samples drawn from the data distribution, it approximates an expectation: the expectation of the model's negative log probabilities under the data distribution. Sound familiar? This is an approximation to [what?].
It approximates the cross-entropy between the data distribution \(p\) and the model's distribution \(q\):
\[
H(p, q) = -\sum_{x} p(x) \log q(x),
\]
which reduces to the entropy of the data, \(-\sum_{x} p(x) \log p(x)\), when the model matches the data exactly.
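To make the step explicit (a sketch, writing \(q(x_i) = \mathcal{P}(\text{data}_i \mid \text{model\_out}_i)\) and assuming the samples \(x_i\) are drawn i.i.d. from \(p\)), the law of large numbers gives
\[
-\frac{1}{N}\sum_{i=1}^{N} \log q(x_i) \;\xrightarrow[N \to \infty]{}\; \mathbb{E}_{x \sim p}\!\left[-\log q(x)\right] = -\sum_{x} p(x) \log q(x) = H(p, q).
\]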