Texti #49: AGI is much closer to us: O3

Merry Christmas 🎅🎄

How was your week? Reply to this email, I’d really love to hear about it. ↩️

For OpenAI, this week took off like a rocket, as they decided to show off the results of their new o3 model. Why o3 and not o2, if the previous model was named o1? Simply because O2 is the trademarked name of a telephone company, and OpenAI doesn’t want to pay stupid amounts of money for a model name, lol 😀

TLDR:

Extremely smart, smarter than anything you’ve seen before, but suuuper slow (up to several minutes per answer), extremely expensive (up to several thousand dollars per request 🤯), and it still fails at some simple tasks. Fantastic at math, coding and research. OpenAI claims an 88% score on the ARC-AGI benchmark 🤖.

Let’s look a bit closer:

OpenAI launched their first reasoning model, o1, back in September, not even 4 months ago. Since then we’ve seen how incredibly well this model performs at reasoning and how fantastically it does when given complex challenges.

Now, a couple of days ago, OpenAI announced their o3 model, which sounds, and promises to be, simply incredible.

In the graph above you can see how o3 simply makes o1 look like a toy for toddlers.

Now, you may think that the end is near, the AI singularity is here and we’re doomed ☠️. Well, don’t worry, this is not the case, yet 😅. O3 isn’t perfect: it still struggles with simple tasks a human can easily do, for example playing tic-tac-toe, and it tends to over-reason things that don’t need that much reasoning. O3 is still a mathematical algorithm that tries to solve math-like problems, and at those it is outstanding. It set a new record on Epoch AI’s FrontierMath benchmark, solving 25.2% of problems, while no other model exceeded 2%, which is an incredible result.

It also managed to solve a crazy amount of coding challenges in the benchmarks, making it one of the best performers so far.

But all this comes with a caveat: it is extremely slow. It explains everything it does, re-prompting itself and guiding itself in the right direction, and it can take up to several minutes to give you an answer. For all of those minutes it is computing heavily and using tooons of resources, as many as several crypto farms. That costs a lot of money: in the end, a single question may cost you several thousand US dollars.

This alone raises the following question: should OpenAI continue on the same trajectory? Should it keep going with reasoning, or should it change its approach to something else? Well, many lean towards the latter, since minutes of processing and a high cost limit it to an extremely niche target group. For now, o3 is available only to a restricted number of researchers who are working with it and testing its limits.

Next, and probably the biggest concern OpenAI has, is that o1 already deceives humans and slips out of control from time to time, going wild and wherever it wants. This is why OpenAI is trying to add even more safeguards, and to test and limit its playground.

As closing words, I would like to leave you with the following:
If you’re in your 30s or older, you’ve been through an entire change of epochs, from mechanical geniuses to electronic marvels. You should be used to things that are often inexplicable, and you roll with them. Well, this is another step towards something new. Congrats, you’re seeing yet another change.

If you are younger, then I’m envious: your future is going to be incredible, and you’ll see things many can’t even dream about.

Merry Christmas everybody 🎄🎁🎅

P.S. Use code TextiChristmas24 and get 1 month of Texti for free with any package. This is a personal gift from me to you all 🎁. Valid until the 13th of January.

That's it folks, see you next week ❤️

Happy Prompting!

Remember to invite your friends to subscribe at https://newsletter.texti.app
