Think small to achieve big!
Why Huge LLMs Aren’t the Solution for Everything
Imagine you need to hire someone for a simple task: sorting the emails in your company inbox into four categories (spam, leads, orders, and support). Who would you pick? An office assistant? Or someone who studied quantum physics, is an expert on ancient history, can quote every Bible verse perfectly, and also knows every paragraph Darwin ever wrote?
Sounds absurd, right? Yet that’s exactly what we do these days. For such simple tasks, we often employ Large Language Models (LLMs), and not just any LLMs, but the largest ones available, like the models behind ChatGPT. We pass trivial spam emails from Nigerian princes through trillions of parameters (parameters containing Shakespeare’s works, every rocket science formula, recipes for nearly every meal ever cooked, and a deep understanding of Immanuel Kant’s writings) just to extract the obvious conclusion: “Yeah, it’s spam.”
Ridiculous, isn’t it?
So, why do we do it? Because it’s convenient. Add an OpenAI API key, write a two-line prompt, and you’re done. A shiny product is born. It’s like shooting a dove with an atomic bomb—just because you have one lying around.
The cost of convenience: That Nigerian prince’s email? It produced about 3 grams of CO₂ and consumed 3 Wh of energy—enough to light a room for 15 minutes. Still sound trivial? Scale this up to a 100-employee organization categorizing their emails daily. In one year, you’d generate 8.8 metric tons of CO₂, burn 12,600 kWh of energy, and spend approximately $200,000.
All this because a spam email passed through the trillions of parameters that the world’s largest models need to make sense of quantum physics, protein folding, and the lyrics of millions of songs before concluding the obvious.
So, what’s the alternative?
Experts. A more suitable small model, like DistilBERT, would need just around 0.00003 times the resources. It would consume only 0.48 kWh over an entire year, produce just 52.8 grams of CO₂, and cost a tiny fraction of the money.
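To make the arithmetic behind this kind of comparison explicit, here is a rough back-of-the-envelope sketch. The per-email figures (3 Wh, 3 g CO₂) and the 0.00003 ratio are taken from the text above. The email volume of 80 per employee per day is purely an assumption for illustration, and the function and class names are hypothetical. The totals depend entirely on the volume you plug in, so expect the output to differ from the annual figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnualFootprint:
    emails: int        # emails classified per year
    energy_kwh: float  # energy per year in kWh
    co2_kg: float      # CO2 per year in kg


def annual_footprint(wh_per_email, g_co2_per_email, employees,
                     emails_per_employee_per_day, days_per_year=365):
    """Scale per-email energy and CO2 figures up to one year."""
    emails = employees * emails_per_employee_per_day * days_per_year
    return AnnualFootprint(
        emails=emails,
        energy_kwh=emails * wh_per_email / 1000.0,  # Wh -> kWh
        co2_kg=emails * g_co2_per_email / 1000.0,   # g  -> kg
    )


def scaled(footprint, ratio):
    """Apply a resource ratio, e.g. small model vs. large model."""
    return AnnualFootprint(
        emails=footprint.emails,
        energy_kwh=footprint.energy_kwh * ratio,
        co2_kg=footprint.co2_kg * ratio,
    )


if __name__ == "__main__":
    # 3 Wh and 3 g per email come from the text; 80 emails per
    # employee per day is an ASSUMED volume, not a figure from the text.
    large = annual_footprint(3.0, 3.0, employees=100,
                             emails_per_employee_per_day=80)
    # 0.00003 is the resource ratio quoted for a small expert model.
    small = scaled(large, 0.00003)
    print(f"large model: {large.energy_kwh:.1f} kWh, {large.co2_kg:.1f} kg CO2")
    print(f"small model: {small.energy_kwh:.4f} kWh, {small.co2_kg:.4f} kg CO2")
```

Swapping in your own mailbox volume and per-request measurements is the whole point: the gap between the two rows stays at several orders of magnitude whatever the absolute numbers turn out to be.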
Yes, from a software developer’s perspective, it’s tempting to go big. Hyper-large LLMs require little effort—a few lines of code, and you’ve got a functioning product. But in the end, it’s your customers who pay the massive bill—and not even to you. What have we truly achieved? A flashy AI that serves neither us nor our customers, but instead drives revenue straight into the pockets of OpenAI, Microsoft, Amazon, and the like.
Congratulations! We’ve created business for big business.
At ObviousFuture, we take a different path. We invest heavily in engineering, expertise, and optimization to develop custom models embedded into intelligent compounds. It’s not easy, and it’s not cheap. This approach demands a deep understanding of machine learning, right down to the mathematical foundations. It takes creativity, effort, and significant engineering to train models and build systems so efficient they can even run on-premises.
Why bother? Because in the long run, this approach pays off—for us and our customers. Our investments deliver ML solutions that are not only sustainable and cost-efficient but also make solid business sense.
As software and solution developers, it’s our responsibility to invest. And investing in machine learning means going the extra mile. That means not just downloading a model from Hugging Face or relying on a generic API prompt, but doing what we’re paid to do: work hard.
Emerging technologies always evolve through two phases: possibility and efficiency. The first airplanes were revolutionary simply because they proved, “Wow, it can fly!” But soon, these early breakthroughs led to practical questions:
“How much does it cost?” “How efficient is it?” “How can it help me generate revenue?”
We are already transitioning out of AI’s “possibility phase” and entering a new era: the “efficiency phase”. While today’s AI still amazes us with groundbreaking features, we are fast approaching a time when success will hinge on engineering efficiency. This will require us to do real ML—not rely on uneconomical mega models, but focus on tailored expert solutions.