AI & ML
Transformer
The neural network architecture underlying modern LLMs, introduced in the paper 'Attention Is All You Need' (2017). Transformers use self-attention to process all positions of an input sequence in parallel, unlike recurrent networks, which process tokens one at a time. Key components: multi-head attention, positional encoding, feedforward layers, and layer normalization. Variants include encoder-only (BERT), decoder-only (GPT), and encoder-decoder (T5).
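To make the self-attention idea concrete, here is a minimal sketch of single-head scaled dot-product attention in plain Python. It is illustrative only: the function names (`softmax`, `self_attention`) and the toy embeddings are assumptions made for this example, and it leaves out the learned projection matrices, multiple heads, masking, and positional encoding that a real Transformer layer would include.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking.

    Each argument is a list of equal-length vectors, one per token.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this token's query to every token's key,
        # scaled by sqrt(d_k) to keep the scores in a moderate range.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of all value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy input: three tokens with 2-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Assumption for brevity: queries, keys, and values reuse the embeddings
# directly; a real layer would first apply learned linear projections.
attended = self_attention(tokens, tokens, tokens)
```

Each pass through the outer loop depends only on the inputs, never on another token's output, which is why the computation can run for all positions at once. Production implementations express the same arithmetic as matrix multiplications rather than Python loops.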
Related terms
AI & ML
LLM (Large Language Model)
A neural network trained on vast text corpora to understand and generate human language. LLMs (GPT-4, Claude, Llama, Gem...
AI & ML
Attention Mechanism
A neural network component that allows models to weigh the relevance of different parts of the input when producing outp...