All Articles

Cut Cross Entropy: 20x Memory reduction in LLM Pre-training through optimized cross entropy kernels

Jan 2026

Whilst working on pretraining SabiYarn in 2025, I came across a really interesting paper by a team at Apple called “Cut Your Losses In Large-Vocabulary Language...

Transformers Deep Learning LLMs

Handling Blocking Coroutines effectively in Asyncio

Dec 2024

I recently had to implement a feature at work that sent a ton of emails with custom attachments to buyers across the world. Since we had a broker,...

Async Python Coroutines

Finetuning GPT2 to Reconstruct Sentences

Jun 2024

Two words are anagrams if one can be formed by permuting the letters of the other. Applying the same logic to sentences, two sentences are anagrams (no such...

Natural Language Processing Transformers

Classifying Code snippets with BERT.

Aug 2023

This is a fun side project where I explored transformer-based sentiment classification for the first time by training BERT to identify 15 of the most popular programming...

Natural Language Processing Deep Learning Transformers

Byte-Pair Encoding, The Tokenization algorithm powering Large Language Models.

Jul 2023

Tokenization is an umbrella term for the methods used to split text into chunks of words or sub-words. Tokenization has many applications in computer science, from compilers...

Natural Language Processing Deep Learning Transformers

A guide on how AI is changing Computational Photography

May 2023

And Enhance!! (from Blade Runner), that’s Computational Photography. Computational photography describes signal processing techniques and algorithms that allow computers...

Artificial Intelligence Deep Learning Computer Vision