THE TRANSFORMER SERIES

Transformer’s Positional Encoding

How Does It Know Word Positions Without Recurrence?

Naoki
11 min read · Oct 30, 2021


In 2017, Vaswani et al. published a paper titled “Attention Is All You Need” at the NeurIPS conference.

They introduced the original Transformer architecture for machine translation, which performed better and trained faster than RNN-based encoder-decoder models.
