
Alan

AI Practitioner

The Hidden Cost of LLM Training: Why Optimizers Gulp Down VRAM

The error of death

Have you been constantly battling VRAM limits while fine-tuning an LLM, repeatedly hitting the error RuntimeError: CUDA error: out of memory? That message is the destroyer of joy, the sudden stop of happiness, and probably the most dreadful error you will face while trying to train an AI model, or more specifically an LLM (since it is the most VRAM-intensive of the bunch).
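To see why training, not inference, is where VRAM runs out, here is a rough back-of-envelope sketch. The per-parameter byte counts below are my own illustrative assumptions for mixed-precision training with Adam (they are not from the post), and activations are ignored entirely:

```python
def training_memory_gib(n_params, bytes_weights=2, bytes_grads=2,
                        bytes_optimizer=8, bytes_master=4):
    """Rough per-parameter VRAM cost for mixed-precision Adam training.

    Assumed breakdown (illustrative):
      - fp16 weights:                    2 bytes
      - fp16 gradients:                  2 bytes
      - Adam moments (two fp32 tensors): 8 bytes
      - fp32 master weights:             4 bytes
    Activations and framework overhead are not counted.
    """
    per_param = bytes_weights + bytes_grads + bytes_optimizer + bytes_master
    return n_params * per_param / 1024**3

# A 7B-parameter model already needs ~104 GiB before a single activation:
print(f"7B model: ~{training_memory_gib(7e9):.0f} GiB")
```

Note that 12 of the assumed 16 bytes per parameter belong to the optimizer and its fp32 master copy, which is exactly why the optimizer is the hidden cost.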

Beyond the Hype: Use Case-Driven AI Development

Today I went biking and realized something I wanted. While riding, I could not use my phone for navigation; I was listening to music but could not change tracks easily. It made me realize how much I need a hands-free solution to control my phone while I ride a bicycle. I work in AI; by this point I should have something that just parses my voice and picks what I need, a better (yes) Siri or something like that. Guess what: something like that, for a cyclist, is literally nowhere to be seen!

Multi Modal Tokenizing With Chameleon

In LLMs, a very fundamental step is tokenization. To make the LLM understand what you are inputting, you need to convert text into numbers. But one might wonder: what about images, sounds, and everything else besides text? That is exactly the question I will answer today.

The Logic Behind Tokenizing an Image

To tokenize an image, we must first understand the fundamental principles behind tokenizing text. There are three key aspects of text tokenization that differ significantly from image tokenization:
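To make "convert text into numbers" concrete, here is a minimal character-level tokenizer sketch. It is a toy of my own for illustration; Chameleon itself uses learned subword and image tokenizers, not this scheme:

```python
class CharTokenizer:
    """Toy tokenizer: each distinct character gets one integer id."""

    def __init__(self, corpus):
        vocab = sorted(set(corpus))                      # fixed character vocabulary
        self.stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> id
        self.itos = {i: ch for ch, i in self.stoi.items()}  # id -> char

    def encode(self, text):
        return [self.stoi[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("hello world")
ids = tok.encode("hello")
print(ids)                                # integer ids the model would consume
assert tok.decode(ids) == "hello"         # encoding is losslessly reversible
```

The key property worth noticing is the round trip: text maps to a sequence of integers and back without loss. Images have no such ready-made discrete alphabet, which is precisely what makes tokenizing them a different problem.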