「 ML/NLP 」
January 18, 2025
Word count: 4.3k · Reading time: 4 mins.
Building My First RAG System: Grounding LLMs in Reality
One of the first things you learn about large language models is their “knowledge cutoff.” Ask a model about an event that happened yesterday, and it will politely tell you it doesn’t have access to real-time information. For a project last semester, we were tasked with building a Q&A bot about recent developments in our field, and this limitation was a huge roadblock. That’s when our professor introduced us to Retrieval-Augmented Generation...
「 ML/NLP 」
November 28, 2024
Word count: 4.8k · Reading time: 4 mins.
A Grad Student’s Guide to Fine-Tuning LLMs: From Brute Force to Finesse
When I was assigned my first big research project, the goal was to adapt a general-purpose large language model for a very specific task: analyzing sentiment in financial news. My first thought was, “Easy, I’ll just fine-tune it.” I quickly learned that “just fine-tuning” is a massive oversimplification. The journey taught me a ton about the different strategies we have at our disposal.
The Default: Full-Parameter Fine-Tuning...
「 ML/NLP 」
September 30, 2024
Word count: 4.6k · Reading time: 4 mins.
Making Sense of Multimodality: How Models See and Read
For the longest time in my NLP studies, the world was made of text. Then came the rise of multimodal AI, and suddenly models could see, hear, and read all at once. For my seminar on advanced models, I had to do a deep dive into how exactly you get a model to understand both an image and a sentence at the same time. It turns out there are a few competing philosophies, each with its own flavor.
The First Big Question: When Do We Mix the Ingredients...