Research notes.
-
Attention Is All You Need
Introduces the Transformer — a sequence-to-sequence architecture built entirely on attention, dispensing with recurrence and convolutions. Foundation of every modern LLM.
Introduces the Transformer — a sequence-to-sequence architecture built entirely on attention, dispensing with recurrence and convolutions. Foundation of every modern LLM.