Natural Language Processing
Word Embedding Lookup
How does an embedding layer solve the curse of dimensionality problem?
This article reviews A Neural Probabilistic Language Model (2003) by Yoshua Bengio et al. In the paper, the authors proposed to train a neural language model end-to-end, including a learnable word embedding layer.
Their neural language model significantly outperformed the best n-gram models of that time, thanks to the embedding layer solving the curse of dimensionality problem.
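As a quick preview of the core idea, an embedding layer is essentially a lookup table: a learnable matrix with one dense vector per word, indexed by the word's integer id. Below is a minimal sketch in NumPy; the vocabulary size, vector dimension, and word ids are illustrative, not taken from the paper.

```python
import numpy as np

vocab_size = 10_000  # illustrative vocabulary size
embed_dim = 64       # illustrative size of each word vector

# The embedding layer is a learnable matrix: one row per word in the vocabulary.
embedding = np.random.randn(vocab_size, embed_dim).astype(np.float32)

# "Looking up" a word is just selecting its row by integer id.
word_ids = np.array([42, 7, 1337])   # illustrative token ids for three words
word_vectors = embedding[word_ids]   # shape: (3, embed_dim)
print(word_vectors.shape)            # (3, 64)
```

During training, gradients flow into the selected rows, so the vectors themselves are learned end-to-end along with the rest of the model.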
We discuss the following topics:
- A Probabilistic Language Model
- Curse of Dimensionality
- Distributed Representation
- Embedding Lookup
- Word Similarity and Probability
- Neural Language Model
A Probabilistic Language Model
A probabilistic language model predicts the next word given the sequence of words that precede it.
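Formally, the model estimates the conditional probability of the next word given its history, and by the chain rule these conditionals multiply out to the probability of the whole sequence:

$$
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})
$$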
For example, given the following sequence of words, what would be the next word?