Texti #52: Google plays catch-up

Hey dear Texti lovers, here’s a quick recap of the last couple of weeks.

Elon, the face of DOGE (Department of Government Efficiency), led an investor bid to buy OpenAI. An offended 👿 Sam replied that ClosedAI, sorry, I meant OpenAI, is not for sale! 😈
OpenAI, pushed by the Chinese competition, has released a new model to the public: o3-mini, their newest reasoning model, built to beat DeepSeek at reasoning, which it does by an extraordinarily small margin.
Facebook got caught torrenting (read: pirating) over 81.7 terabytes of data to train their open source Llama models, according to court filings. That amounts to millions of books from millions of authors. But worry not, they said it wasn’t pirating, it was just for analysis and investigation purposes. Basically, if you pay a good enough bribe in the US, you won’t spend your life in jail whatever you do. Artem Vaulin, the alleged owner of the KickassTorrents website, was arrested and hit with US criminal charges. But Zuckerberg, whose company used millions of books for monetization, is out there working out in Hawaii. Anyway, life is unfair and everybody knows that, so don’t expect anything else.
Elon has announced Grok 3, and it beats everyone! (In a test they invented themselves, and excluding o3 from OpenAI.)
Lastly, Google launched Gemini 2.0, a fully revised, fast reasoning model that can do much more than just think. It can also search Google, analyse, watch YouTube videos, and much more.

Over the last few weeks there were also a couple of other incredible announcements, which are not really AI-related but definitely worth mentioning.

France’s CEA (Atomic Energy Commission) kept a fusion plasma running for a record-breaking 22 minutes! That’s a major milestone. It means we’re making serious progress on nuclear fusion with actual experiments. That’s nuts! We might even live to see the day when everybody has a nuclear fusion reactor at home generating practically infinite energy. Or we’ll have planes and boats running on nuclear fusion, which would make transport so much cheaper. 🧑‍🔬

Microsoft announced Majorana 1, a quantum processor that works with qubits. A little note here: Microsoft is a bit late to the competition, as Google has already shown off its own quantum computer. Google’s is heavily math-oriented, though, and Microsoft promises to be a bit more multi-purpose. That said, Microsoft doesn’t have a clean record, so they might just be over-hyping. Regardless, I really hope this turns out to be true, as it would mean we’re close to making incredible progress.

Now, I want to talk about Gemini today for a specific reason: its deep integration with the Google suite. Since Google surrounds all my work, I can use Gemini across different products.

Gemini can read my emails and summarize them, it can watch YouTube videos, it can do research, it can search through my Google Drive. It can do so many things.

Gemini has the biggest context window, meaning it can process more documents and content at once and give you an answer based on all of it, without relying on RAG systems or whatever else.

What I’m trying to say is that Google has a serious edge over everybody else when it comes to the user ecosystem.

I decided to play around with it, and the UI is pretty sweet. For example, I found an old Google Sheet where I did some estimations for client proposals, and asked Gemini to summarize it.

Now that Gemini knew what the project was about, I asked it for recommendations. Naturally it proposed some, including a few that clearly have to be delegated to an AI. This is where the lies began ☹️

That made me real sad. I think Gemini was just hallucinating: instead of acting on its promises, it imagined it could do things that in fact it cannot 😢

But Gemini 2.0 isn’t only about text. It can now also process images, videos, audio, etc., so I gave it a try.

I gave it a PDF with about 30 pages of content, mainly tables, and when I asked it to sum up the values it failed miserably. It said it needed time to re-evaluate its decision, but when I asked when I would get the answer, it stalled and then said it didn’t know what I was talking about.

To be frank, LLMs always have trouble with precision. Adding numbers is a complex task for an approximation algorithm. Ultimately it’s not really intelligent, it’s just approximating values. So a task like this requires some extra input and functionality to be coded separately.
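
Here’s a minimal sketch of what “coded separately” could look like: the model (or a PDF parser) only extracts the rows, and plain Python does the adding. The rows and the names `parse_amount` and `sum_column` are made up for illustration. This is not how Gemini works internally, just one way to get exact totals.

```python
from decimal import Decimal

# Hypothetical rows, standing in for a table extracted from a PDF.
rows = [
    ("Design", "1,200.50"),
    ("Development", "8,450.00"),
    ("Testing", "2,310.25"),
]


def parse_amount(text: str) -> Decimal:
    """Turn a string like '1,200.50' into an exact Decimal."""
    return Decimal(text.replace(",", "").strip())


def sum_column(table_rows) -> Decimal:
    """Add up the second column exactly, with no approximation involved."""
    return sum((parse_amount(value) for _, value in table_rows), Decimal("0"))


total = sum_column(rows)
print(total)
```

The point is the split of work: let the LLM do the fuzzy part (finding the numbers), and let boring deterministic code do the arithmetic.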

Since approximation is the name of the game, I leaned into it for the next task. I uploaded an image and asked about the ceramic paintings in it. To make sure it was talking about the right place, I also asked it to point me to the location. Since LLMs are good at approximations, it should be able to point me in the right direction.

And it was correct: it offered me two options, highlighting option number one, the Chapel of Souls in Porto, Portugal. The second option was Porto Cathedral, which is also close enough given the lack of context in the image. That’s a cool one. This is how you can cheat at GeoGuessr. 😉

I didn’t try video and the other stuff because, well, the concept is pretty clear to me: it approximates really well, but it struggles to compute hard facts. There’s still some way to go, and depending on the use case, some things are better computed with alternative solutions.

Final words for today: the magic is still in the works, but we already get to watch it happen. Today’s AI is the worst it will ever be, and the future is going to be astonishing. 🚀

That's it folks, see you next week ❤️

Happy Prompting!

Remember to invite your friends to subscribe at https://newsletter.texti.app
