AI & MLtransformer

Transformer

The neural network architecture underlying modern LLMs, introduced in 'Attention Is All You Need' (2017). Transformers use self-attention mechanisms to process input sequences in parallel (unlike recurrent networks). Key components: multi-head attention, positional encoding, feedforward layers, and layer normalization. Variants include encoder-only (BERT), decoder-only (GPT), and encoder-decoder (T5).

Decode this term

Related terms

AI & ML

LLM (Large Language Model)

A neural network trained on vast text corpora to understand and generate human language. LLMs (GPT-4, Claude, Llama, Gem...

AI & ML

Attention Mechanism

A neural network component that allows models to weigh the relevance of different parts of the input when producing outp...