Large-Scale Deep Learning

Computer Vision

Speech Recognition

Modern speech recognition systems often use convolutional neural networks (CNNs)

Natural Language Processing

N-Grams

$$ P(x_1, \dots, x_\tau) = P(x_1, \dots, x_{n-1}) \prod_{t=n}^{\tau} P(x_t \mid x_{t-n+1}, \dots, x_{t-1}) $$
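
The factorization above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes unsmoothed maximum-likelihood estimates taken from raw counts, and the names `ngram_counts` and `sequence_probability` and the toy corpus are made up for the example. Because the final (n-1)-gram of the corpus has no continuation, the conditional estimates are only approximately normalized.

```python
from collections import Counter


def ngram_counts(tokens, k):
    """Count every contiguous k-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1))


def sequence_probability(sequence, corpus, n):
    """Estimate P(x_1, ..., x_tau) with an unsmoothed n-gram model (n >= 2).

    Assumption: probabilities are plain count ratios, so any n-gram that
    never occurs in the corpus drives the whole product to zero.
    """
    if n < 2 or len(sequence) < n - 1:
        raise ValueError("need n >= 2 and at least n - 1 tokens")

    joint = ngram_counts(corpus, n)        # counts of n-grams
    context = ngram_counts(corpus, n - 1)  # counts of (n-1)-grams
    total_context = sum(context.values())
    if total_context == 0:
        return 0.0

    # First factor: P(x_1, ..., x_{n-1}) as a relative frequency.
    prob = context[tuple(sequence[:n - 1])] / total_context

    # Remaining factors: P(x_t | x_{t-n+1}, ..., x_{t-1}) for t = n, ..., tau.
    for t in range(n - 1, len(sequence)):
        ctx = tuple(sequence[t - n + 1:t])
        if context[ctx] == 0:
            return 0.0
        prob *= joint[ctx + (sequence[t],)] / context[ctx]
    return prob


# Toy usage with a bigram model (n = 2).
corpus = "the cat sat on the mat".split()
print(sequence_probability(["the", "cat", "sat"], corpus, n=2))
```

With this toy corpus the bigram estimate for "the cat sat" is (2/6) × (1/2) × (1/1), roughly 0.167. A sequence containing an unseen bigram gets probability zero, which is why practical n-gram models add smoothing.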

Neural Language Models

One example is word embeddings

High-Dimensional Outputs