Learning XOR

$$ f(x; W, c, w, b) = w^\top \max\{0, W^\top x + c\} + b $$

$$ W=\begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix} $$

$$ c=\begin{bmatrix} 0\\ -1\end{bmatrix} $$

$$ w=\begin{bmatrix} 1\\ -2 \end{bmatrix} $$

$$ b=0 $$
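As a quick check, the network above can be evaluated on the four binary inputs. This is a minimal sketch in plain Python; the names `relu`, `xor_net`, `inputs`, and `outputs` are illustrative and do not come from the notes.

```python
def relu(v):
    # Element-wise max{0, v}
    return [max(0, a) for a in v]

def xor_net(x, W, c, w, b):
    # Hidden pre-activation z = W^T x + c, so z[j] = sum_i W[i][j] * x[i] + c[j]
    z = [sum(W[i][j] * x[i] for i in range(len(x))) + c[j] for j in range(len(c))]
    h = relu(z)
    # Output f = w^T h + b
    return sum(wj * hj for wj, hj in zip(w, h)) + b

# Parameters exactly as given in the equations above
W = [[1, 1], [1, 1]]
c = [0, -1]
w = [1, -2]
b = 0

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [xor_net(x, W, c, w, b) for x in inputs]
print(outputs)  # [0, 1, 1, 0]
```

The hidden layer maps both [0, 1] and [1, 0] to h = [1, 0], while [0, 0] maps to [0, 0] and [1, 1] to [2, 1]. In that space a single linear output unit is enough to separate the classes.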

Gradient-Based Learning

$$ J(\theta) = -\mathbb{E}_{x,y\sim \hat{p}_{\text{data}}} \log p_{\text{model}}(y \mid x) $$
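On a finite training set, the expectation in this cost becomes an average over examples. The sketch below assumes the model's probabilities for the true labels are already available; the function name `negative_log_likelihood` and the values in `example_probs` are made up for illustration.

```python
import math

def negative_log_likelihood(probs_of_true_label):
    # Empirical version of J(theta): the mean of -log p_model(y | x)
    # over the examples drawn from the training data.
    n = len(probs_of_true_label)
    return -sum(math.log(p) for p in probs_of_true_label) / n

# Hypothetical probabilities the model assigns to the correct label
example_probs = [0.9, 0.8, 0.5]
cost = negative_log_likelihood(example_probs)

# A model that assigns probability 1 to every correct label has zero cost
perfect = negative_log_likelihood([1.0, 1.0])
```

The cost grows as the model assigns lower probability to the observed labels, so minimizing it is the same as maximizing the likelihood of the training data.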

Architecture Design
