GPT-3: In-Context Few-Shot Learner (2020)

Language Models are Few-Shot Learners

Naoki
7 min read · Jan 4, 2023

#GPT #Transformer

In 2020, OpenAI announced GPT-3, a generative language model with 175 billion parameters, 10x more than any previous non-sparse language model, and published its performance on NLP benchmarks. However, it wasn’t just another size upgrade: GPT-3 showed a markedly improved ability to handle tasks purely through text interaction.

Those settings include zero-shot, one-shot, and few-shot learning, where the model is given a task description and/or a few examples and must perform the task without additional training; that is, no fine-tuning is involved. This mirrors how humans can pick up a new language task from only a few examples or from simple instructions. Yet, in some cases, GPT-3 nearly matches the performance of SOTA (state-of-the-art) fine-tuned systems.
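To make the distinction concrete, here is a minimal sketch of what such prompts look like, using the English-to-French translation example from the GPT-3 paper. Everything the model "learns" from lives in the prompt text itself; the `complete` function below is a hypothetical stand-in for a call to the language model, not a real API.

```python
def complete(prompt: str) -> str:
    """Hypothetical placeholder for a language-model completion call."""
    raise NotImplementedError

# Zero-shot: a task description only, no demonstrations.
zero_shot = (
    "Translate English to French:\n"
    "cheese =>"
)

# One-shot: the task description plus a single demonstration.
one_shot = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "cheese =>"
)

# Few-shot: the task description plus a handful of demonstrations.
few_shot = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "plush giraffe => girafe peluche\n"
    "cheese =>"
)

# In every case the model simply continues the text (here it should
# produce "fromage"); no gradient updates or fine-tuning take place.
```

The only thing that changes across the three settings is how many demonstrations appear in the prompt; the model weights stay fixed throughout.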

This article explains how in-context learning works.

GPT-3 and Meta-Learning

Kids build language capability by absorbing experiences without concrete tasks or instructions. They acquire skills like understanding the language structure and recognizing patterns in conversations and written texts. Eventually, they can predict what should come next, given the context. Furthermore, with enough language skills, they become proficient in…
