Chang She & Noah Shpak - The Hierarchy of Needs for Training Dataset Development

Video Available!

Training and fine-tuning models depend critically on how you construct your dataset. Dataset construction is part art, part science, and we'll share practical lessons from doing it at Character AI, along with how to build a data platform that supports rapid, iterative refinement of training data. For LLMs, data scale is much larger and workloads are more diverse, especially for multimodal datasets. To deal with these challenges, we'll show how LanceDB is used in production to solve many pain points around the storage, management, and querying of large-scale AI data.

Chang She

Chang She, CEO

Noah Shpak

Noah is a Research Engineer with a passion for building data systems and ML platforms from the ground up.

He leads the Data Platform team at Character, focusing on accelerating foundation model research, alignment, and product development through internet-scale data mining, prompting tools, and retrieval systems. Making data go vroom while GPUs go brrrr is what makes him (and the team) tick!

Noah Shpak, Member of Technical Staff

We have now sold out of Early Bird tickets; General Admission has also sold out.
Please join us online for the free livestream.