Radek Osmulski

Hey friends —

I'm Radek. I'm a Senior Data Scientist at NVIDIA and an author.

On this site I write about machine learning techniques and strategies you can use to learn faster.

How did I get here?
Featured

An introductory chapter to a book on learning machine learning that I wrote.
Nov 9, 2021 3 min read
Introduction to Proximal Policy Optimization (PPO)

The previous blog post looked at the Vanilla Policy Gradient (VPG) method. Trust Region Policy Optimization and Proximal Policy Optimization build on top of VPG and aim to address its shortcomings. The two main problems of training on-policy reinforcement learning algorithms that PPO addresses are: 1. Training data distribution shifts
Jun 13, 2024 4 min read
Understanding Policy Gradient - a fundamental idea in RL

How do you begin to learn Reinforcement Learning? My preferred approach is to study code. Reading and analyzing code can help disambiguate many ideas and concepts in papers or blog posts that can be hard to understand otherwise. Crafting an optimal policy by learning value functions is a very straightforward
Jun 12, 2024 7 min read
Diving into Diffusion Policy with LeRobot

In a recent blog post, we looked at the Action Chunking Transformer (ACT). At the heart of ACT lies an encoder-decoder transformer that, when passed an image, the current state of the robot, and an optional style variable z, generates the next chunk_size number of actions. But even
Jun 2, 2024 11 min read
Meta Learning: Addendum or a revised recipe for life

In 2021 I published Meta Learning: How To Learn Deep Learning And Thrive In The Digital World. The book is based on 8 years of my life where nearly every day I thought about how to learn machine learning and how to do machine learning efficiently and at a high
Jun 1, 2024 10 min read
How to teach your computer to play video games

Teaching your computer to play video games has all the components of a sublime storyline: it is ingenious, beautiful in its simplicity, and utterly surprising. I will explain the main ideas of deep Q learning, using as few big and scary nouns as possible, as we teach our computer to
May 26, 2024 4 min read
An Introduction to the Action Chunking Transformer

This is a gentle introduction to training two robotic arms using transformers. If you would rather jump straight into code, please find a batteries-included notebook here. Humans have figured out how to do a lot of neat and useful things. Wouldn't it be great if we could teach
May 5, 2024 10 min read
How to train an Alpaca?

There used to be a time when fine-tuning LLMs on off-the-shelf hardware wasn't a thing. Then the Llama weights got leaked, Stanford Alpaca was released, and the rest is history. So how was Alpaca fine-tuned? And why might we care? On one hand, Alpaca is where the Cambrian
Aug 25, 2023 7 min read
How to fine-tune a Transformer (pt. 2, LoRA)

In part 1 of this series, I fine-tuned a Transformer using techniques straight from Universal Language Model Fine-tuning for Text Classification published in 2018. But so much has happened in the last 5 years! My plan was to read a couple of papers next, but I stumbled across LoRA: Low-Rank
Aug 12, 2023 8 min read
How to fine-tune a Transformer?

I have only just started to learn about LLMs, and in this blog post I share how I would approach fine-tuning a Transformer today. Which of the techniques I learned years ago still work in the era of the Transformer? Also, toward the end of the blog post, I address the training
Aug 6, 2023 5 min read
SLURM survival guide

This post is written from the perspective of someone learning to run SLURM jobs. There might be some inaccuracies but the idea is to get you up and running fast. Unfortunately, the only material on SLURM I have been able to find was written by MLOps folks for MLOps folks
Jul 31, 2023 4 min read
How to evaluate an LLM on your data?

Being able to evaluate the outputs of an LLM on your test set is a very valuable problem to solve. Imagine this scenario. You work at ACME Inc and one morning you pour yourself a great cup of coffee, open Slack, and see this: Employee #342526, we need you
Jul 15, 2023 7 min read
Use ChatGPT inside Jupyter Notebook
personal project

Bringing the new tool as close as possible to where people already do their work is key.
Apr 3, 2023 3 min read
An IDE for the era of AI

So much code that I would have to write by hand automagically appears on my screen!
Apr 3, 2023 4 min read
There is something weird about the current generation of AI — better pay attention

Hype aside, there is something very uncanny about the most recent generation of AI models.
Mar 27, 2023 3 min read
How to reach the top of the imagenette leaderboard?

How to make your NNs more shift-invariant? What are some hyperparameter changes worth considering when training with a limited budget of epochs?
Aug 18, 2021 5 min read
Going From Not Being Able to Code to Deep Learning Hero

A detailed plan for going from not being able to write code to being a deep learning expert. Advice based on personal experience.
Aug 18, 2021 9 min read
How to build a Deep Learning system that will answer questions about the Harry Potter universe?

Riva is a set of APIs into a very complex, very well staffed AI research organization.
Aug 6, 2021 4 min read
20 Years of Tech Startup Experiences in One Hour by Jeremy Howard
notes

There is no such thing as business... there is only such a thing as making things people want and selling them to them.
Jul 29, 2021 3 min read
How to use the power of the community to learn faster

Community is the most powerful force behind online learning. It is the reason why MOOCs have a limited impact and tight-knit communities like fast.ai consistently produce unbelievable results.
May 21, 2021 13 min read
How to train and validate on Imagenet
howto

Training on Imagenet is something that is completely trivial after you do it once, but if you are just someone on the Internet without such prior experience, it is an insurmountable task. Up until a couple of days ago, I didn't even know how to get the data!
May 8, 2021 6 min read
Machine Learning and Testing

The rewards of testing can be immense, but so can the price of testing poorly.
Mar 8, 2019 5 min read
How to train your neural network

Evaluation of cosine annealing.
Mar 12, 2018 5 min read
Why take the log of a continuous target variable?

In this article, we’ll look at a simple but useful concept that often gets overlooked.
Mar 5, 2018 4 min read
How to do machine learning efficiently

The only way to maintain your sanity in the long run is to be paranoid in the short run.
Jan 22, 2018 5 min read
Radek Osmulski © 2025