bdbdbdb 8 hours ago

What does this actually mean for LLMs? Cheaper training?

  • MarkusQ 6 hours ago

    Yes, provided it works as well as they claim.

    Not only cheaper but also faster, since in this case money ≈ hardware-cost × time. They claim that training throughput can even approach the throughput of batch inference:

    > EGGROLL's efficiency results in a hundredfold increase in training throughput for billion-parameter models at large population sizes, nearly reaching the throughput of pure batch inference
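
    To make the money ≈ hardware-cost × time point concrete, here is a rough back-of-the-envelope sketch. Every number in it (amount of work, throughput, hourly rate) is made up for illustration and none of it comes from the paper. It simply assumes a hundredfold throughput gain holds for a whole run on the same hardware.

```python
def training_cost(work_units, throughput_per_hour, hourly_rate):
    """Return (hours, cost) for a fixed amount of training work.

    cost ~= hardware cost per hour * time, and
    time  = work / throughput.
    """
    hours = work_units / throughput_per_hour
    return hours, hours * hourly_rate


# Hypothetical numbers, chosen only to illustrate the arithmetic.
WORK_UNITS = 1_000_000          # e.g. population evaluations needed (assumed)
HOURLY_RATE = 2.0               # hardware cost per hour (assumed)
BASELINE_THROUGHPUT = 1_000     # work units per hour before the speedup (assumed)
SPEEDUP = 100                   # the "hundredfold" figure from the quote

baseline_hours, baseline_cost = training_cost(
    WORK_UNITS, BASELINE_THROUGHPUT, HOURLY_RATE
)
fast_hours, fast_cost = training_cost(
    WORK_UNITS, BASELINE_THROUGHPUT * SPEEDUP, HOURLY_RATE
)

print(f"baseline: {baseline_hours:.0f} h, cost {baseline_cost:.0f}")
print(f"speedup:  {fast_hours:.0f} h, cost {fast_cost:.0f}")
```

    With the hardware held fixed, wall-clock time and cost shrink by the same factor as the throughput gain, which is why "cheaper" and "faster" are the same claim here. The quote does not say here what baseline the hundredfold figure is measured against, so whether the saving carries over to a real training budget depends on that.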