Groq, a novel artificial intelligence (AI) model, has captivated social media audiences with its remarkable speed and with hardware that could eliminate the need for GPUs altogether.
Groq shot to prominence when its public benchmark tests circulated widely on the social media platform X, showcasing computation and response speeds that outpace those of the well-known AI chatbot ChatGPT.
The first public demo using Groq: a lightning-fast AI Answers Engine.
It writes factual, cited answers with hundreds of words in less than a second.
More than 3/4 of the time is spent searching, not generating!
The LLM runs in a fraction of a second. https://t.co/dVUPyh3XGV https://t.co/mNV78XkoVB pic.twitter.com/QaDXixgSzp
— Matt Shumer (@mattshumer_) February 19, 2024
This achievement can be attributed to Groq’s development team creating a custom application-specific integrated circuit (ASIC) chip tailored for large language models (LLMs), enabling it to produce approximately 500 tokens per second. In contrast, GPT-3.5, the model behind the commonly available version of ChatGPT, manages around 40 tokens per second.
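To put those quoted throughput figures in perspective, here is a back-of-the-envelope sketch; the 300-token response length is an illustrative assumption, and the rates are the figures cited above rather than measured benchmarks.

```python
# Illustrative arithmetic only: estimated time to generate one response
# at the throughput figures quoted above (500 vs. 40 tokens/second).

def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate num_tokens at a given throughput."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 300  # roughly a few hundred words (assumed length)

groq_time = generation_time(RESPONSE_TOKENS, 500)  # -> 0.6 s
gpt35_time = generation_time(RESPONSE_TOKENS, 40)  # -> 7.5 s

print(f"Groq:    {groq_time:.1f} s")
print(f"GPT-3.5: {gpt35_time:.1f} s")
```

At these rates, an answer that Groq streams out in under a second would take GPT-3.5 several seconds, which is the difference users highlighted in the side-by-side comparisons.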
Groq Inc., the company behind this model, asserts that it has introduced the first language processing unit (LPU) to power its model, diverging from the conventional reliance on scarce and expensive graphics processing units (GPUs) for AI model execution.
Wow, that’s a lot of tweets tonight! FAQs responses.
• We’re faster because we designed our chip & systems
• It’s an LPU, Language Processing Unit (not a GPU)
• We use open-source models, but we don’t train them
• We are increasing access capacity weekly, stay tuned pic.twitter.com/nFlFXETKUP
— Groq Inc (@GroqInc) February 19, 2024
Although the origins of the Groq project date back to 2016, its recent surge in attention coincides with the rise of Elon Musk’s AI model, Grok, spelled with a “k.” The original Groq developers addressed Musk’s naming choice in a blog post, urging him to select a distinct name due to potential confusion.
Despite Groq’s social media prominence, neither Musk nor the Grok page on X has addressed the similarity in names between the two models. Nevertheless, users on the platform have begun drawing comparisons between the LPU model and established GPU-based models.
AI development practitioners have called Groq a “game changer” for applications requiring low latency, citing its ability to process requests and deliver responses almost instantly.
Additionally, there is speculation that Groq’s LPUs could significantly enhance AI applications’ performance compared to GPUs, potentially serving as a viable alternative to Nvidia’s high-performance hardware like the A100 and H100 chips, which are currently in high demand.
side by side Groq vs. GPT-3.5, completely different user experience, a game changer for products that require low latency pic.twitter.com/sADBrMKXqm
— Dina Yerlan (@dina_yrl) February 19, 2024
This trend aligns with a broader industry movement where major AI developers are exploring in-house chip development to reduce dependency on Nvidia’s offerings. OpenAI, for instance, is reportedly pursuing substantial funding to develop its own chip technology, aiming to address scalability issues with its products.