$$ s(t) = \int x(a)\,w(t - a)\,da $$
or, written in operator notation,
$$ s(t) = (x * w)(t) $$
This operation is called convolution.
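As a rough illustration, the integral can be replaced by a sum over integer positions, $s(t) = \sum_a x(a)\,w(t - a)$. The sketch below assumes finite sequences that are zero outside their stored range; the name `conv1d` is made up for this example and does not come from any library.

```python
def conv1d(x, w):
    """Full discrete convolution: s[t] = sum over a of x[a] * w[t - a].

    Assumption: x and w are zero outside their index range, so terms
    whose kernel index falls out of bounds are skipped.
    """
    out_len = len(x) + len(w) - 1
    s = [0] * out_len
    for t in range(out_len):
        for a in range(len(x)):
            k = t - a  # index into w, matching w(t - a) in the formula
            if 0 <= k < len(w):
                s[t] += x[a] * w[k]
    return s


x = [1, 2, 3]
w = [1, 0, -1]
print(conv1d(x, w))  # [1, 2, 2, -2, -3]
```

Swapping the arguments, `conv1d(w, x)`, returns the same list, which is the one-dimensional case of the commutative property.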
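For a two-dimensional input $I$ and kernel $K$, the discrete form is
$$ S(i, j) = (I * K)(i, j) = \sum_m \sum_n I(m, n)\,K(i - m, j - n) $$
which, because convolution is commutative, is equivalent to
$$ S(i, j) = (K * I)(i, j) = \sum_m \sum_n I(i - m, j - n)\,K(m, n) $$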
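The equivalence of the two double sums can be checked numerically. This is a minimal sketch, assuming that $I$ and $K$ are zero outside their stored entries and that the full output of size $(H + h - 1) \times (W + w - 1)$ is wanted. Both function names are hypothetical, chosen only to show which array the sum ranges over.

```python
def conv2d_sum_over_input(I, K):
    """S(i, j) = sum_m sum_n I(m, n) * K(i - m, j - n), with m, n ranging over I."""
    H, W = len(I), len(I[0])
    h, w = len(K), len(K[0])
    S = [[0] * (W + w - 1) for _ in range(H + h - 1)]
    for i in range(H + h - 1):
        for j in range(W + w - 1):
            for m in range(H):
                for n in range(W):
                    # Skip terms where the kernel index is out of bounds (treated as zero).
                    if 0 <= i - m < h and 0 <= j - n < w:
                        S[i][j] += I[m][n] * K[i - m][j - n]
    return S


def conv2d_sum_over_kernel(I, K):
    """S(i, j) = sum_m sum_n I(i - m, j - n) * K(m, n), with m, n ranging over K."""
    H, W = len(I), len(I[0])
    h, w = len(K), len(K[0])
    S = [[0] * (W + w - 1) for _ in range(H + h - 1)]
    for i in range(H + h - 1):
        for j in range(W + w - 1):
            for m in range(h):
                for n in range(w):
                    # Skip terms where the input index is out of bounds (treated as zero).
                    if 0 <= i - m < H and 0 <= j - n < W:
                        S[i][j] += I[i - m][j - n] * K[m][n]
    return S


I = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
K = [[1, 0],
     [0, -1]]

A = conv2d_sum_over_input(I, K)
B = conv2d_sum_over_kernel(I, K)
print(A == B)  # True
```

The second form loops over the kernel's indices rather than the input's, so when the kernel is small it visits far fewer terms per output element.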
This is achieved by making the kernel smaller than the input.



Because no training is involved, there are far fewer parameters.