My Blog
May 17, 2025
Layer Normalization as a Projection: The Complete Geometric Interpretation
Apr 27, 2025
Multi-Head Latent Attention: The RoPE Compatibility Problem - A Mathematical Analysis
Jan 1, 2025
Analysis of Matrix Multiplications in Transformer Architectures
May 27, 2024
Balancing Memory & Compute: Strategies to Manage KV Cache in LLMs