
Preloaded dot product

Matrix products $W^TW$ and $(WW^T)^{-1}$ can be thought of as preloading a dot product of two vectors after both vectors are transformed by $W$ or $W^{-1}$ respectively.


$W^TW$

Consider the dot product $x^Ty$ of two vectors $x$ and $y$. Now consider the transformed vectors $Wx$ and $Wy$. The dot product of the transformed vectors is:

$$(Wx)^T(Wy) = x^TW^TWy = x^T(W^TW)y$$

So when looking at a matrix product $W^TW$, know that it will carry out the transformation and then the dot product of any two vectors placed on either side of the product.
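
As a quick numerical check, here's a minimal NumPy sketch (the matrix $W$ and the vectors $x$, $y$ are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))  # an arbitrary transformation
x = rng.normal(size=3)
y = rng.normal(size=3)

# Dot product of the transformed vectors...
lhs = (W @ x) @ (W @ y)
# ...equals x and y placed on either side of the preloaded product W^T W.
rhs = x @ (W.T @ W) @ y
assert np.isclose(lhs, rhs)
```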

Special case: length² of transformed vector

Let $b = Wx$. What is the length² of $b$? Let's make this a function that takes in any vector $x$ and returns the length-squared of the transformed vector $Wx$. This function is simply $f(x) = x^TW^TWx$. We can see $W^TW$ as being the implementation of this function.
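
This special case is easy to check with the same kind of sketch (the function name is mine):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
x = rng.normal(size=3)

# f(x) = x^T (W^T W) x, the preloaded length-squared function.
def length_sq_after_transform(W, x):
    return x @ (W.T @ W) @ x

b = W @ x
assert np.isclose(length_sq_after_transform(W, x), b @ b)  # b @ b == ||b||^2
```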

$WW^T$

$WW^T$ acts similarly, but with an inverse transformation. Again, consider the dot product $x^Ty$ of two vectors $x$ and $y$. Now consider the transformed vectors $W^{-1}x$ and $W^{-1}y$, where we are now inverting the transformation $W$. The dot product of the transformed vectors is:

$$(W^{-1}x)^T(W^{-1}y) = x^T(W^{-1})^TW^{-1}y = x^T(WW^T)^{-1}y$$

So when looking at a matrix product $WW^T$, know that when this matrix is inverted, $(WW^T)^{-1}$, it will carry out the inverse transformation and then the dot product of any two vectors placed on either side of the product.
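
The same NumPy check works here (a random Gaussian $W$ like this is invertible with probability 1, but invertibility is an assumption of the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))  # assumed invertible
x = rng.normal(size=3)
y = rng.normal(size=3)

W_inv = np.linalg.inv(W)
# Dot product of the inverse-transformed vectors...
lhs = (W_inv @ x) @ (W_inv @ y)
# ...equals x and y placed on either side of (W W^T)^{-1}.
rhs = x @ np.linalg.inv(W @ W.T) @ y
assert np.isclose(lhs, rhs)
```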

$QQ^T = (QQ^T)^{-1}$ for orthonormal $Q$

When $Q$ is orthonormal, $Q^{-1} = Q^T$, so it's easy to see that:

$$(QQ^T)^{-1} = (Q^T)^{-1}Q^{-1} = QQ^T$$

So orthonormal $QQ^T$ could be thought of as preloading the inverse-then-dot-product of two vectors, even without the matrix inverse operation appearing in the formula. However, more pertinently, we have $QQ^T = QQ^{-1} = I$, so $QQ^T$ is the identity matrix. And so we have arrived at the intuitive idea that a rotation transformation, $Q$, doesn't change the length of a vector or the angle between two vectors.
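
A small sketch with an explicit 2D rotation (the angle and test vectors are arbitrary):

```python
import numpy as np

theta = 0.7  # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthonormal 2D rotation

assert np.allclose(Q @ Q.T, np.eye(2))  # Q Q^T = I

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
# Rotation preserves dot products, hence lengths and angles.
assert np.isclose((Q @ x) @ (Q @ y), x @ y)
```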

$QQ^T$ when not full rank (projection)

If $Q$ has, say, 2 orthonormal columns in $\mathbb{R}^3$, then it doesn't have an inverse, but we can still ask for the projection of a vector $b$ onto the column space of $Q$ by computing $QQ^Tb$. Here, $Q^Tb$ gives the coordinates of $b$'s projection along each column of $Q$, and then $Q(Q^Tb)$ rehydrates the projection so that it's expressed in the original space.
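
A sketch of this projection; here I build an orthonormal $Q$ from a random matrix via QR, which is just a convenient way to get orthonormal columns:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))
Q, _ = np.linalg.qr(A)  # Q: 3x2 with orthonormal columns

b = rng.normal(size=3)
p = Q @ (Q.T @ b)  # projection of b onto the column space of Q

# The residual is orthogonal to the column space...
assert np.allclose(Q.T @ (b - p), 0)
# ...and projecting again changes nothing (Q Q^T is idempotent).
assert np.allclose(Q @ (Q.T @ p), p)
```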

Example

Consider a random vector $X \in \mathbb{R}^n$ distributed as a standard normal distribution, $X \sim N(0, I)$. We can derive a random vector $Y = WX$ for some invertible matrix $W$. Using the change of variables formula for distributions, we can derive the distribution of $Y$.

If $X$ and $Y$ were 1-dimensional random variables, with $Y = g(X)$, the change of variables formula would be:

$$P_Y(t) = P_X\big(g^{-1}(t)\big)\left|\frac{d}{dt}g^{-1}(t)\right|$$

The multidimensional version of this formula, with $Y = WX$, is:

$$P_Y(t) = P_X(W^{-1}t)\left|\det\!\left(\frac{d}{dt}W^{-1}t\right)\right|$$

and applying this to the standard normal distribution, we get:

$$
\begin{aligned}
P_Y(t) &= P_X(W^{-1}t)\left|\det(W^{-1})\right| \\
&= \text{Constant} \times \exp\!\left(-\tfrac{1}{2}(W^{-1}t)^T(W^{-1}t)\right) \\
&= \text{Constant} \times \exp\!\left(-\tfrac{1}{2}\,t^T(W^{-1})^TW^{-1}t\right) \\
&= \text{Constant} \times \exp\!\left(-\tfrac{1}{2}\,t^T(WW^T)^{-1}t\right)
\end{aligned}
$$

which is the density of $N(0, WW^T) = N(0, S)$, where $S = WW^T$ is the covariance matrix of $Y$ (aka the cross-covariance matrix $S = \mathrm{Cov}(Y, Y)$).
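
As a sanity check on this derivation, here's a sketch using SciPy's multivariate normal density (the 2×2 $W$ and the evaluation point $t$ are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 2))  # assumed invertible
t = rng.normal(size=2)

W_inv = np.linalg.inv(W)
# Change-of-variables expression: P_X(W^{-1} t) |det(W^{-1})| with X ~ N(0, I).
lhs = multivariate_normal(np.zeros(2), np.eye(2)).pdf(W_inv @ t) * abs(np.linalg.det(W_inv))
# Density of N(0, W W^T) evaluated at t.
rhs = multivariate_normal(np.zeros(2), W @ W.T).pdf(t)
assert np.isclose(lhs, rhs)
```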

So while the covariance matrix $S$ of $Y$ is what's specified to parameterize the multivariate normal distribution, under the hood its inverse $S^{-1} = (WW^T)^{-1}$ is preloading the dot product of two vectors that are brought back to the original space by inverting the transformation $W$.

Note: the fact that $WW^T$ is the covariance matrix of $Y$ is a separate result, a special case of the more general theorem:

$$\mathrm{Cov}(WZ) = W\,\mathrm{Cov}(Z)\,W^T$$

for which there is another page.
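
That theorem is also easy to check empirically; a Monte Carlo sketch (all matrices here are arbitrary examples, and the tolerance is loose to absorb sampling error):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200_000
W = rng.normal(size=(n, n))

# Give Z a non-trivial covariance C = A A^T by construction.
A = rng.normal(size=(n, n))
C = A @ A.T
Z = A @ rng.normal(size=(n, N))  # Cov(Z) = A A^T = C

# Empirically, Cov(WZ) should match W Cov(Z) W^T.
assert np.allclose(np.cov(W @ Z), W @ C @ W.T, atol=0.2)
```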