Section 13

Transformer Architectures

Deep dives into model internals: building multi-head attention mechanisms from the ground up.

Projects in this section: 2

Transformer LLM from scratch
Transformer Architectures · GitHub

A complete transformer-based language model built from scratch.
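For a sense of what building such a model involves, below is a minimal sketch of the pre-norm decoder block a from-scratch transformer LM typically stacks, assuming PyTorch; the class and parameter names are illustrative, not taken from the project's code.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Illustrative pre-norm decoder block: masked self-attention + MLP.

    A sketch only; names and layout are assumptions, not the project's code.
    """

    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = x.size(1)
        # Causal mask: True marks future positions a token may not attend to.
        causal = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal, need_weights=False)
        x = x + attn_out               # residual connection around attention
        x = x + self.mlp(self.ln2(x))  # residual connection around the MLP
        return x
```

A full language model would embed token IDs, add positional information, stack N such blocks, and project the final hidden states to vocabulary logits.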

Multi-Head Attention from Scratch
Transformer Architectures · Local path

Building the attention mechanism tensor by tensor.
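As a sketch of that tensor-by-tensor construction, here is a minimal multi-head attention module, assuming PyTorch; the names (MultiHeadAttention, d_model, n_heads) and the mask convention are assumptions for illustration, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Scaled dot-product attention split across several heads (a sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must divide evenly into heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # One projection each for queries, keys, values, plus the output merge.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor, mask=None) -> torch.Tensor:
        b, t, _ = x.shape
        # Project, then reshape to (batch, heads, seq, d_head).
        q = self.w_q(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_k(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_v(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        # Attention scores, scaled by sqrt(d_head) to keep softmax well-behaved.
        scores = q @ k.transpose(-2, -1) / self.d_head**0.5
        if mask is not None:
            # Convention assumed here: positions where mask == 0 are blocked.
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        # Weighted sum of values, then merge heads back to d_model.
        out = (weights @ v).transpose(1, 2).contiguous().view(b, t, -1)
        return self.w_o(out)

# Quick shape check with illustrative sizes.
mha = MultiHeadAttention(d_model=64, n_heads=4)
x = torch.randn(2, 10, 64)
print(mha(x).shape)  # torch.Size([2, 10, 64])
```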