Ievgen Vakulenko - Accelerating Mixture of Experts Training With Rail-Optimized InfiniBand Networking in Crusoe Cloud

State-of-the-art machine learning models increasingly use techniques such as mixture of experts (MoE), which make larger models more efficient to train by distributing layers of the model across multiple expert sub-networks. This sparse distribution of model state puts increasing pressure on cluster-level networking during training. At Crusoe Cloud, we’ve built a high-performance InfiniBand network designed to provide the highest possible performance for these training techniques. We use a “rail-optimized” design that reduces the number of hops between any set of GPUs in our cluster, which accelerates all-to-all communication and reduces training time. Learn how to use Crusoe Cloud rail-optimized networks to accelerate your training workloads.
