Natural Language Processing

Word Embedding Lookup

How does an embedding layer solve the curse of dimensionality problem?

Naoki
6 min read · Oct 11, 2021


This article reviews A Neural Probabilistic Language Model (2003) by Yoshua Bengio et al. In the paper, the authors proposed to train a neural language model end-to-end, including a learnable word embedding layer.

Their neural language model significantly outperformed the best n-gram models of the time, thanks to the embedding layer solving the curse of dimensionality problem.
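To give a concrete picture of what such a layer looks like, here is a minimal sketch using PyTorch's nn.Embedding. This is my own illustration, not the paper's original setup; the vocabulary size, embedding dimension, and word indices are made up.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only.
vocab_size = 10_000      # number of distinct words in the vocabulary
embedding_dim = 100      # size of each word vector

# A learnable lookup table: one dense vector per word index.
embedding = nn.Embedding(vocab_size, embedding_dim)

# Word indices for a short context, e.g. "the cat sat" -> [12, 845, 97].
word_indices = torch.tensor([12, 845, 97])

# Look up the dense vectors for those words.
vectors = embedding(word_indices)
print(vectors.shape)  # torch.Size([3, 100])

# Because nn.Embedding is a trainable layer, these vectors are updated by
# backpropagation together with the rest of the language model.
```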

We discuss the following topics:

  • A Probabilistic Language Model
  • Curse of Dimensionality
  • Distributed Representation
  • Embedding Lookup
  • Word Similarity and Probability
  • Neural Language Model

A Probabilistic Language Model

A probabilistic language model predicts the next word given the sequence of words that precede it.

For example, given the following sequence of words, what would be the next word?

[Image: an example sequence of words with the next word to be predicted]
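Before turning to the neural approach, it helps to see this idea in its simplest count-based form. The sketch below is my own toy example, not from the paper: it estimates the probability of each candidate next word from bigram counts in a tiny made-up corpus.

```python
from collections import Counter, defaultdict

# A tiny toy corpus; in practice the counts come from a large text corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigrams: how often each word follows a given previous word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next word | previous word) from bigram counts."""
    counts = following[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_probs("the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

A neural language model answers the same question, but instead of raw counts it learns a function that maps the preceding words to a probability distribution over the vocabulary.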