Learning Transferable Visual Models From Natural Language Supervision (2021) · Bridging the Gap Between Vision and Language — A Look at OpenAI’s CLIP Model · Aug 13, 2023
ICL: Why Can GPT Learn In-Context? (2022) · Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers · Apr 30, 2023
GPT-3: In-Context Few-Shot Learner (2020) · Language Models are Few-Shot Learners · Jan 4, 2023
GPT-2: Too Dangerous To Release (2019) · Language Models are Unsupervised Multitask Learners · Dec 30, 2022
DistilBERT — distilled version of BERT · How can we compress BERT while keeping 97% of the performance? · Mar 6, 2022
RoBERTa — Robustly optimized BERT approach · How did RoBERTa outperform XLNet with no architectural changes to the original BERT? · Feb 20, 2022
BERT — Bidirectional Encoder Representation from Transformers · How and Why Does It Use The Transformer Architecture? · Feb 6, 2022
Transformer’s Positional Encoding · How Does It Know Word Positions Without Recurrence? · Oct 30, 2021
BLEU (Bi-Lingual Evaluation Understudy) · How do we evaluate a machine translation with reference sentences? · Oct 19, 2021
Beam Search for Machine Translation · How Greedy, Exhaustive and Beam Search Algorithms Work · Oct 17, 2021
Word Embedding Lookup · How does an embedding layer solve the curse of dimensionality problem? · Oct 11, 2021
Neural Machine Translation with Attention Mechanism · How Does A Machine Translation Model Know Where To Look? · Sep 28, 2021
Long Short Term Memory · How LSTM Mitigated the Vanishing Gradients but not the Exploding Gradients · Sep 26, 2021