Cross Entropy

Cut Cross Entropy: 20x Memory reduction in LLM training through optimized cross entropy kernels

Introduction 2024 saw the first $100M+ training runs. While the enormous compute requirements of training Large Language Models are no secret, this jump brought more attention to the need for tricks and methods that optimize both the transformer architecture and the training process (infrastructure). The biggest and most successful of these optimizations in recent times has been FlashAttention (Dao et al., 2022), an idea that focuses on how the compute-hungry O(N²) self-attention mechanism is computed, tiling the operation so the attention matrices stay in fast on-chip SRAM instead of being materialized in HBM. (Essentially, the idea is that self-attention was optimized by modifying how the operation is performed on the GPU.) The trend of squeezing as much performance as possible out of the training infrastructure continued, with researchers writing their own optimized CUDA kernels. DeepSeek took this a step further, writing their own distributed file system (Fire-Flyer File System), a new attention mechanism (Multi-head Latent Attention, with custom kernels), a highly tuned communication library for mixture-of-experts models (DeepEP), and DeepGEMM, an FP8-optimized matrix multiplication kernel library. ...
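For intuition on why the cross-entropy step is a memory hotspot (the problem Cut Cross Entropy targets), here is a back-of-the-envelope sketch of the logit tensor that a naive implementation materializes; the batch, sequence, and vocabulary sizes are illustrative assumptions, not figures from the post:

```python
# Rough memory cost of materializing the full logits tensor for cross entropy.
# Shapes are illustrative; modern LLM vocabularies run to ~128K tokens.
batch, seq_len, vocab = 8, 4096, 128_256   # vocab sized like Llama 3's tokenizer
bytes_per_elem = 2                         # bf16

logits_bytes = batch * seq_len * vocab * bytes_per_elem
print(f"logits alone: {logits_bytes / 2**30:.1f} GiB")  # ~7.8 GiB, before gradients
```

The kernel-level trick the title alludes to is avoiding ever holding this full matrix in global memory, which is where the headline memory reduction comes from.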

January 16, 2026 · 8 min · 1566 words · Damilola John

Understanding Differential Attention.

Introduction Over the last few years, Transformers have emerged as the de facto deep learning architecture for modelling language. Their unprecedented success at solving complex language tasks, reasoning (or mimicking it), and solving math and coding problems has ushered in a new era in AI, powering successful AI products like ChatGPT. The key innovation of transformers lies in the self-attention mechanism, which allows each token in the input sequence to attend to every other token in the sequence. ...
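As a refresher on the mechanism the post builds on, here is a minimal single-head self-attention sketch in PyTorch, where every token's query is scored against every token's key; names and shapes are illustrative, not from the post:

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention: each token attends to every token."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # project tokens to queries/keys/values
    d_k = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d_k**0.5  # (seq, seq) pairwise scores -> O(N^2)
    weights = F.softmax(scores, dim=-1)          # attention each token pays to each other token
    return weights @ v                           # weighted sum of value vectors

x = torch.randn(8, 64)                           # 8 tokens, 64-dim embeddings
w = [torch.randn(64, 64) for _ in range(3)]
out = self_attention(x, *w)                      # (8, 64)
```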

December 11, 2024 · 7 min · 1387 words · Damilola John

What do you do when the majority of your Coroutines are blocking?

Introduction I recently had to implement a feature at work that sent a ton of emails, with custom attachments, to buyers across the world. Since we had a broker for placing mails on a queue, where they were picked up and sent by workers, our architecture was already primed for asynchronous communication. However, on the client side, placing these emails asynchronously wasn’t as simple. The main path between creating a mail and actually placing it on the service bus (broker) consisted mostly of blocking, CPU-heavy tasks: reading Excel files, adding user information, serializing Excel files to an in-memory buffer, base64-encoding these bytes and decoding them to strings, before finally creating the message JSON and sending it to the queue (this last part is non-blocking). Implementing an efficient way of asynchronously performing these operations was an interesting problem, keeping in mind Python’s GIL among the other limitations I mentioned earlier. I thought I’d write this blog post on my approach ...
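The excerpt cuts off before the post’s actual solution, but a common pattern for exactly this situation is offloading the blocking, CPU-bound steps to a process pool so the event loop stays responsive; processes rather than threads, because the GIL stops CPU-bound threads from running in parallel. A minimal sketch with illustrative names:

```python
import asyncio
import base64
from concurrent.futures import ProcessPoolExecutor

def build_attachment(path: str) -> str:
    """Blocking, CPU-heavy work: read the file and base64-encode it."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

async def main(paths: list[str]) -> list[str]:
    loop = asyncio.get_running_loop()
    # A process pool, not a thread pool: CPU-bound work would serialize on the GIL.
    with ProcessPoolExecutor() as pool:
        tasks = [loop.run_in_executor(pool, build_attachment, p) for p in paths]
        return await asyncio.gather(*tasks)

# asyncio.run(main(["buyers_a.xlsx", "buyers_b.xlsx"]))  # paths are illustrative
```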

December 11, 2024 · 1 min · 163 words · Damilola John

Finetuning GPT2 to Reconstruct Sentences

Two words are anagrams if one can be formed by permuting the letters of the other. Applying the same logic to a sentence would be saying that two sentences are anagrams (no such thing) if their component words can be permuted to form clones of each other. I thought it would be interesting to teach a language model to do this. You might be thinking that simply re-arranging words in a sentence doesn’t require intelligence and can be done with very trivial algorithms, and you would be right, but I added an edge to this task: given a random sequence of words, the language model has to return a grammatically correct sequence using the same set of words. For example, the following sequence: ...
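The excerpt truncates before its example, but the training pairs such a task needs are easy to picture: shuffle the words of a correct sentence for the input, keep the original as the target. A hypothetical sketch, not the post’s code:

```python
import random

def make_example(sentence: str, seed: int = 0) -> dict:
    """Build a (shuffled words -> original sentence) training pair."""
    words = sentence.split()
    shuffled = words[:]
    random.Random(seed).shuffle(shuffled)   # seeded so examples are reproducible
    return {"input": " ".join(shuffled), "target": sentence}

pair = make_example("the quick brown fox jumps over the lazy dog")
# pair["input"] is some permutation of the words; pair["target"] is the original sentence.
```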

June 15, 2024 · 10 min · 2047 words · Damilola John

Classifying Code snippets with BERT.

This is a fun side project where I explored transformer-based text classification for the first time by training BERT to identify 15 of the most popular programming languages. I started with simple machine learning approaches and gradually worked my way up to more complex methods until I had a satisfactory solution. The Dataset Our dataset is a CSV containing 45,000 samples. It is made up of two columns: the ‘code’ feature contains the code snippets we want to classify, and the ‘language’ column, our label, contains the programming language each snippet belongs to. Our train and test datasets were created via stratified sampling based on the target variable. ...
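The stratified split mentioned above can be reproduced with scikit-learn; the column names follow the excerpt, while the file path and split ratio are illustrative assumptions:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("code_snippets.csv")  # columns: 'code', 'language'

# Stratify on the label so every language keeps its proportion in both splits.
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df["language"], random_state=42
)
print(train_df["language"].value_counts(normalize=True))  # matches test_df's distribution
```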

August 19, 2023 · 4 min · 841 words · Damilola John
tokenizers

Byte-Pair Encoding, The Tokenization algorithm powering Large Language Models.

Tokenization is an umbrella term for the methods used to turn text into chunks of words or sub-words. Tokenization has many applications in computer science, from compilers to Natural Language Processing. In this article, we will focus on tokenizers in language models, in particular a method of tokenization called Byte Pair Encoding. The last few years have witnessed a revolution in NLP, catalyzed mainly by the introduction of the transformer architecture in 2017 with the paper ‘Attention Is All You Need’ and epitomized by the introduction of ChatGPT in late 2022. ...
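As a taste of the algorithm the article covers: one BPE training step counts adjacent symbol pairs across the corpus and merges the most frequent pair into a new symbol. A minimal sketch on a toy corpus, not the article’s example:

```python
from collections import Counter

def most_frequent_pair(words: dict[tuple, int]) -> tuple:
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words: dict[tuple, int], pair: tuple) -> dict[tuple, int]:
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if tuple(symbols[i : i + 2]) == pair:
                out.append(symbols[i] + symbols[i + 1])  # fuse the two symbols
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words pre-split into characters, with occurrence counts.
words = {("l", "o", "w"): 5, ("l", "o", "g"): 3, ("n", "e", "w"): 2}
pair = most_frequent_pair(words)   # ('l', 'o'), seen 8 times
words = merge_pair(words, pair)    # {('lo', 'w'): 5, ('lo', 'g'): 3, ('n', 'e', 'w'): 2}
```

Repeating this merge step for a fixed number of iterations yields the merge table a BPE tokenizer applies at encoding time.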

July 20, 2023 · 13 min · 2564 words · Damilola John
image sensor

A guide on how AI is changing Computational Photography

And Enhance!! (from Blade Runner), that’s Computational Photography. Computational photography describes the signal processing techniques and algorithms that allow computers to replicate photographic processes like motion-blur correction, auto-focus, depth-sensing and zoom, features that would otherwise be impossible without optics. While some of these processes use artificial intelligence techniques, Computational Photography is more than just AI: it covers the whole chain of processing that takes an image from the ones and zeros captured by image sensors to the final image displayed on screens. This article is mainly focused on computational photography techniques that employ AI. Smartphone cameras have compensated for their hardware limitations, both the limited space to fit actual optics (like movable lenses to alter focus or depth of field) and the limitations that come with the technology behind digital cameras (CMOS sensors), with the enormous computational power of their processors, using clever algorithms to provide features like zoom and object-sensitive focus, among others. These algorithms have incorporated AI techniques in recent times to provide some unimaginable features, like Google Pixel’s night mode, which allows you to take high-definition pictures in extremely low light. ...