The Convolution Operation

$$ s(t) = \int x(a)\,w(t - a)\,da $$

which is typically denoted

$$ s(t) = (x \ast w)(t) $$

This operation is called convolution.
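The definition above can be translated directly into code. The sketch below (a naive loop, not an efficient implementation) computes the discrete convolution $s(t) = \sum_a x(a)\,w(t-a)$ and checks it against NumPy's built-in `np.convolve`:

```python
import numpy as np

def conv1d(x, w):
    """Discrete convolution s(t) = sum_a x(a) * w(t - a) ("full" output)."""
    out_len = len(x) + len(w) - 1
    s = np.zeros(out_len)
    for t in range(out_len):
        for a in range(len(x)):
            # Only terms where w(t - a) is defined contribute.
            if 0 <= t - a < len(w):
                s[t] += x[a] * w[t - a]
    return s

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, 0.25])
print(conv1d(x, w))  # same result as np.convolve(x, w)
```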

Convolutions over More than One Axis

$$ S(i, j) = (I \ast K)(i, j) = \sum_m \sum_n I(m, n)\,K(i - m, j - n) $$

which, because convolution is commutative, is equivalent to

$$ S(i, j) = (K \ast I)(i, j) = \sum_m \sum_n I(i - m, j - n)\,K(m, n) $$
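The commutative property can be verified numerically. A minimal sketch (a direct, unoptimized loop over the sums in the definition) computes the full 2-D convolution both ways and confirms $I \ast K = K \ast I$:

```python
import numpy as np

def conv2d(I, K):
    """S(i, j) = sum_m sum_n I(m, n) * K(i - m, j - n) (full 2-D convolution)."""
    H = I.shape[0] + K.shape[0] - 1
    W = I.shape[1] + K.shape[1] - 1
    S = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            for m in range(I.shape[0]):
                for n in range(I.shape[1]):
                    # Skip terms where the kernel index is out of range.
                    if 0 <= i - m < K.shape[0] and 0 <= j - n < K.shape[1]:
                        S[i, j] += I[m, n] * K[i - m, j - n]
    return S

I = np.arange(9.0).reshape(3, 3)
K = np.array([[1.0, -1.0], [0.0, 2.0]])
print(np.allclose(conv2d(I, K), conv2d(K, I)))  # True: convolution commutes
```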

Motivation

Sparse Interactions

Achieved by making the kernel smaller than the input: each output unit interacts with only a small number of input units.
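Sparsity can be seen by perturbing a single input unit: only as many outputs as the kernel has entries are affected, regardless of the input's size. A small illustration (the sizes here are arbitrary, chosen for the demo):

```python
import numpy as np

x = np.zeros(100)                 # long input
w = np.array([1.0, 2.0, 3.0])     # kernel much smaller than the input

base = np.convolve(x, w, mode="valid")
x2 = x.copy()
x2[50] = 1.0                      # perturb one input unit
changed = np.nonzero(np.convolve(x2, w, mode="valid") - base)[0]
print(len(changed))               # only len(w) = 3 outputs change
```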


Parameter Sharing


Because the same kernel weights are reused at every input position, the number of parameters to learn is much smaller than in a fully connected layer.
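The savings are easy to quantify. A back-of-the-envelope comparison (the layer sizes below are illustrative assumptions, not from the text):

```python
# Parameter count: fully connected layer vs. convolutional layer
# mapping a 32x32 input to a same-sized output (illustrative sizes).
input_size = 32 * 32
output_size = 32 * 32
kernel_size = 3 * 3

fc_params = input_size * output_size  # one weight per input-output pair
conv_params = kernel_size             # one kernel, shared at every position

print(fc_params)    # 1048576
print(conv_params)  # 9
```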

Equivariant Representation

Convolution is equivariant to translation: shifting the input and then convolving gives the same result as convolving and then shifting the output.
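Translation equivariance can be checked directly. The sketch below uses circular convolution via the FFT (an assumption made here to avoid boundary effects; ordinary zero-padded convolution is equivariant only away from the edges):

```python
import numpy as np

def circ_conv(x, w):
    """Circular convolution via the convolution theorem (FFT)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w, len(x))))

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([1.0, -1.0])
k = 2

# Shift-then-convolve equals convolve-then-shift:
a = circ_conv(np.roll(x, k), w)
b = np.roll(circ_conv(x, w), k)
print(np.allclose(a, b))  # True: the representation shifts with the input
```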