Oct '24 (edited) • 💬 General
Move over Groq: Cerebras is now the fastest at inference
"""
  • Cerebras' Wafer-Scale Engine has outperformed Groq in delivering the fastest AI inference.
  • Cerebras Inference clocks up to 1,800 tokens per second while running the 8B model and 450 tokens per second on the 70B model.
  • In comparison, Groq reaches up to 750 T/s and 250 T/s while running 8B and 70B models, respectively.
  • """
Now Llama 70B runs at over 2,000 tokens per second.
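A quick sanity check on the figures quoted above (a minimal sketch; the numbers are the throughput claims from the post, and the per-token latency and speedup figures are simply derived from them):

```python
# Throughput claims quoted in the post, in tokens per second.
claims = {
    "Cerebras 8B": 1800,
    "Cerebras 70B": 450,
    "Groq 8B": 750,
    "Groq 70B": 250,
}

# Average per-token generation latency implied by each throughput figure.
for name, tok_s in claims.items():
    latency_ms = 1000 / tok_s
    print(f"{name}: {tok_s} tok/s ~= {latency_ms:.2f} ms/token")

# Relative speedups implied by the quoted numbers.
print(f"8B speedup:  {claims['Cerebras 8B'] / claims['Groq 8B']:.1f}x")   # 2.4x
print(f"70B speedup: {claims['Cerebras 70B'] / claims['Groq 70B']:.1f}x")  # 1.8x
```

So by the post's own numbers, Cerebras leads Groq by roughly 2.4x on the 8B model and 1.8x on the 70B model.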