
Score function

Say you observe data \( x \in \mathbb{R} \) drawn from a distribution that depends on a parameter \( \theta \in \mathbb{R} \), written \( x \sim p(x | \theta) \). The score function is the derivative of \( \log p(x | \theta) \) with respect to \( \theta \):

\[ s(x, \theta) = \frac{\partial}{\partial \theta} \log(p(x | \theta)) \]
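
As a worked example, assume (purely for illustration) that \( x \) is normally distributed with unknown mean \( \theta \) and known variance \( \sigma^2 \); the symbol \( \sigma \) is an assumption introduced here. Differentiating the log-density with respect to \( \theta \) gives the score:

\[ \log(p(x | \theta)) = -\frac{1}{2} \log(2 \pi \sigma^2) - \frac{(x - \theta)^2}{2 \sigma^2} \quad \Longrightarrow \quad s(x, \theta) = \frac{x - \theta}{\sigma^2} \]

In this case the score is positive when \( \theta < x \), negative when \( \theta > x \), and zero at \( \theta = x \).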

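Here the data \( x \) is held fixed and the score is viewed as a function of \( \theta \), which is why the score function is described as the derivative of the log-likelihood with respect to the parameter \( \theta \). Typically, it is calculated from a list of observations \( x_1, \ldots, x_n \); when these are independent, the log-likelihood is a sum over observations, so the score is the sum of the per-observation scores, \( \sum_{i=1}^{n} s(x_i, \theta) \).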
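
The calculation from a list of observations can be sketched in Python. This is a minimal illustration, not a general implementation: it assumes independent observations from a normal distribution with unknown mean \( \theta \) and known standard deviation `sigma`, for which the per-observation score is \( (x - \theta) / \sigma^2 \). The function names and the sample data are hypothetical. A finite-difference derivative of the log-likelihood is included as a sanity check.

```python
import math


def gaussian_score(x, theta, sigma=1.0):
    """Score of one observation under N(theta, sigma^2): d/dtheta log p(x | theta)."""
    return (x - theta) / sigma**2


def total_score(xs, theta, sigma=1.0):
    """Score for independent observations: the sum of per-observation scores."""
    return sum(gaussian_score(x, theta, sigma) for x in xs)


def log_likelihood(xs, theta, sigma=1.0):
    """Log-likelihood of independent observations under N(theta, sigma^2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - theta) ** 2 / (2 * sigma**2)
        for x in xs
    )


def numerical_score(xs, theta, sigma=1.0, h=1e-6):
    """Central finite-difference approximation of d/dtheta log-likelihood."""
    upper = log_likelihood(xs, theta + h, sigma)
    lower = log_likelihood(xs, theta - h, sigma)
    return (upper - lower) / (2 * h)


# Hypothetical data: the observations are fixed, and theta is the variable.
xs = [1.2, 0.7, 2.1, 1.5]
for theta in (0.5, 1.0, 1.375, 2.0):
    print(theta, total_score(xs, theta), numerical_score(xs, theta))
```

The analytic and numerical columns should agree closely. Under these assumptions the total score crosses zero at the sample mean of `xs` (here 1.375), which is where the log-likelihood is maximized.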