ML/NLP March 28, 2025

Grappling with LLM Evaluation: A Student's Field Notes

Word count: 4.3k · Reading time: 4 mins

One of the biggest shifts for me moving from traditional NLP tasks to working with large language models has been evaluation. In my earlier cou... Read article

ML/NLP February 22, 2025

Trying to Understand MoE: How LLMs Get Both Bigger and Smarter

Word count: 3.7k · Reading time: 3 mins

Every few months, a new paper or model release sends a shockwave through the NLP community. Recently, it was all about models with “trillions of parameters.” My first reaction was, “How is that even possible?” The computational cost to run a dense model of that size would be astronomical. The answer, as I learned after a deep dive with my reading group, lies in a clever architecture called Mixture of Experts (MoE). The “Committee o... Read article

ML/NLP December 20, 2024

How I Shrank My LLM: A Student's Dive into Model Quantization

Word count: 4.3k · Reading time: 4 mins

One of the most exciting and frustrating moments in my Master’s program was when I finally got my hands on a powerful, pre-trained language model. The excitement came from its incredible capabilities; the frustration came when I realized it was too big to run on my university-provided GPU for any serious fine-tuning. This sent me down the rabbit hole of model compression, and my first major stop was quantization. What Exactly is Qua... Read article

ML/NLP October 25, 2024

On Data Contamination in LLMs: A Grad Student's Perspective

Word count: 4.7k · Reading time: 4 mins

In my NLP seminar last semester, a recurring theme was the integrity of our evaluation benchmarks. We spent weeks discussing how to measure progress, but one topic that really stuck with me was data contamination—the subtle, almost accidental way we can end up “cheating” on our tests. It’s a problem that seems technical on the surface but cuts to the very core of our field’s credibility. The Core of the Problem: When Test Data Becomes... Read article