\(
\newcommand{\cat}[1] {\mathrm{#1}}
\newcommand{\catobj}[1] {\operatorname{Obj}(\mathrm{#1})}
\newcommand{\cathom}[1] {\operatorname{Hom}_{\cat{#1}}}
\newcommand{\multiBetaReduction}[0] {\twoheadrightarrow_{\beta}}
\newcommand{\betaReduction}[0] {\rightarrow_{\beta}}
\newcommand{\betaEq}[0] {=_{\beta}}
\newcommand{\string}[1] {\texttt{"}\mathtt{#1}\texttt{"}}
\newcommand{\symbolq}[1] {\texttt{`}\mathtt{#1}\texttt{'}}
\newcommand{\groupMul}[1] { \cdot_{\small{#1}}}
\newcommand{\inv}[1] {#1^{-1} }
\newcommand{\bm}[1] { \boldsymbol{#1} }
\require{physics}
\require{ams}
\)

Math and science::INF ML AI::probabilistic graphical models

# Motivation for graphical models

Daphne Koller describes the motivation for graphical models:

Our goal is to represent a joint distribution \( P \) over some set of random variables \( \mathcal{X} = \{X_1, ..., X_n\} \). Even in the simplest case where these variables are binary-valued, a joint distribution requires specifying

[...] numbers. For all but the smallest \( n \), the explicit representation of the joint distribution is unmanageable from every perspective. It is expensive both in terms of

[...] and

[...]. The numbers can be

[...] and not correspond to events that people can reasonably contemplate. If we wanted to learn the distribution from data, we would need ridiculously large

[...] to estimate this many parameters robustly. These problems were the main barrier to the adoption of probabilistic methods for expert systems until the development of the methodologies described in this book.

Probabilistic graphical models use a [...] as the basis for compactly encoding a complex distribution over a high-dimensional space.