Ievgen Vakulenko - Accelerating Mixture of Experts Training With Rail-Optimized InfiniBand Networking in Crusoe Cloud

Video Available!

State-of-the-art machine learning models increasingly use techniques like mixture of experts, which enable larger-scale models to be trained more efficiently by distributing layers of the model across multiple expert networks. This sparse distribution of model state puts increasing pressure on cluster-level networking during training. At Crusoe Cloud, we’ve built a high-performance InfiniBand network that’s designed to provide the highest possible performance for these state-of-the-art training techniques. We use a “rail-optimized” design, which reduces the number of hops between any set of GPUs in our cluster, accelerates all-to-all performance, and shortens training time. Learn more about how to use Crusoe Cloud rail-optimized networks to accelerate your training workloads.

Ievgen Vakulenko

Ievgen is a product manager at Crusoe, focused on building reliable and scalable AI cloud infrastructure. He defines and guides the design of large and ultra-large-scale, multi-tenant GPU clusters, enabling customers to use thousands of GPUs simultaneously for ML training and inference. Before joining Crusoe, Ievgen held several technical and product positions at networking vendors.

Early Bird and General Admission tickets have both sold out.
Please join us online for the free livestream.