Content on this site is AI-generated and may contain errors. If you find issues, please report them at GitHub Issues.

Transformer Core Mechanisms

Deep dive into every component of the Transformer, from architecture to attention

  1. Transformer Architecture Overview
     Intermediate
     #transformer #architecture
  2. QKV Data Structures and Intuition
     Intermediate
     #transformer #attention #qkv
  3. Attention Computation in Detail
     Intermediate
     #transformer #attention #softmax
  4. Multi-Head Attention
     Intermediate
     #transformer #attention #multi-head
  5. MQA and GQA
     Advanced
     #transformer #attention #mqa #gqa #kv-cache
  6. Attention Variants: From Sliding Window to MLA
     Advanced
     #transformer #attention #mla #sliding-window #cross-attention
  7. KV Cache Fundamentals
     Advanced
     #inference #kv-cache #memory #optimization
  8. Prefill vs Decode Phases
     Intermediate
     #inference #prefill #decode #performance
  9. Flash Attention Tiling Principles
     Advanced
     #attention #hardware-optimization #flash-attention #memory
  10. Positional Encoding: Giving Transformers a Sense of Order
      Intermediate
      #transformer #attention #positional-encoding
  11. Sampling & Decoding: From Probabilities to Text
      Intermediate
      #inference #sampling #decoding #perplexity
  12. Speculative Decoding: Accelerating LLM Inference via Guessing
      Advanced
      #inference #optimization #speculative-decoding
  13. Mixture of Experts: Sparsely Activated Large Model Architecture
      Advanced
      #transformer #moe #routing #deepseek #mixtral
  14. State Space Models and Mamba
      Advanced
      #ssm #mamba #state-space-model #selective-scan #sequence-modeling
  15. Hybrid Architectures: Fusing Mamba with Attention
      Advanced
      #hybrid #mamba #jamba #zamba #hymba #architecture
  16. Qwen3-Coder-Next Architecture: When SSM, Attention, and MoE Converge
      Advanced
      #hybrid #moe #ssm #deltanet #qwen #architecture