Join us for an exclusive live demo of Group Relative Policy Optimization (GRPO), the newest feature in Predibase’s SDK, the first fully managed, serverless SDK for reinforcement fine-tuning (RFT).
With GRPO, you can fine-tune open-source models with reinforcement learning: no complex infrastructure, no PhD required. Guide your models with custom reward functions and get optimized results from as few as 10 labeled examples.
What you’ll learn:
✅ How GRPO works and why it’s a game-changer for model customization
✅ A live demo fine-tuning an LLM with just a few examples
✅ How to design and implement effective reward functions to align model behavior with your goals
Whether you're deploying LLMs in production or just exploring reinforcement fine-tuning, this hands-on session will show you how to use GRPO and custom reward functions to improve model performance on your own tasks.
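
To give a flavor of what "designing a reward function" can look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Predibase SDK's API: the function names, the `<answer>` tag format, and the scoring scheme are all hypothetical. The last function shows the group-relative idea behind GRPO's name: several completions are sampled for the same prompt, and each one is scored relative to the group's mean reward.

```python
import re
from statistics import mean, pstdev

# Hypothetical reward functions: names, signatures, and the <answer> tag
# convention are assumptions for illustration, not a real SDK interface.
ANSWER_RE = re.compile(r"<answer>(.+?)</answer>", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion contains an <answer>...</answer> block."""
    return 1.0 if ANSWER_RE.search(completion) else 0.0


def correctness_reward(completion: str, expected: str) -> float:
    """Return 1.0 if the tagged answer matches the labeled example exactly."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == expected.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and spread of its group.

    Assumption: population standard deviation is used here; implementations
    differ on this detail. eps avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, a group of sampled completions, one labeled answer.
completions = ["<answer>4</answer>", "I think it is 4", "<answer>5</answer>"]
rewards = [format_reward(c) + correctness_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)
```

In this sketch the well-formatted, correct completion ends up with a positive advantage and the untagged one with a negative advantage, which is the signal a GRPO-style update would use to favor the former.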