My Blog

May 17, 2025

Layer Normalization as a Projection: The Complete Geometric Interpretation

Apr 27, 2025

Multi-Head Latent Attention: The RoPE Compatibility Problem - A Mathematical Analysis

Jan 1, 2025

Analysis of Matrix Multiplications in Transformer Architectures

May 27, 2024

Balancing Memory & Compute: Strategies to Manage KV Cache in LLMs