Writing
I write about Large Language Models, Generative AI, and the practical challenges of building production ML systems. Here you’ll find deep dives into model architectures, training techniques, and lessons learned from deploying AI at scale.
Understanding Transformer Architectures
January 23, 2025
The Transformer architecture revolutionized NLP by building sequence models entirely on attention, which earlier models had used only alongside recurrence. This deep dive explores how transformers work, from self-attention to positional encoding, and why they've be...