In a blog post earlier this year, I stated, “Advances in AI in 2024 will dwarf what we saw in 2023.”

In various seminars, I showed an image that was created in Midjourney 1.0 such as this one:

The prompt was: “New York City submerged like Atlantis. Fish, whales, sea turtles and sharks swim through the streets of New York.”

I then ran the exact same prompt through the current version of Midjourney and got a result like this:

Besides the resolution, this is an amazing difference in just one year’s time.

Then, I would talk about how this is easy to see because graphics are visual. But the advances in text generation during this past year were just as great as those in text-to-image generation. And then I would make my prediction: 2024 is going to blow this away! Imagine if you could feed this prompt into an AI and have it generate a video of this scene!

Well, we don’t have to wait a whole year, because this is already happening. OpenAI has demonstrated Sora, a text-to-video generator. Using this same prompt, here is the result:

[Embedded YouTube video]

But as I said, text-to-text generation is advancing at the same rate.

A few weeks ago, OpenAI (the makers of ChatGPT) said they were testing the ability for ChatGPT to remember things you discuss to make future chats more helpful. Previously, ChatGPT didn’t remember anything between chats. Within a chat, each request sends the entire conversation history, so “memory” is limited by the context size: the input prompt (the history plus the latest request) and the output together have to fit within it. ChatGPT will now “remember” details from previous chats rather than starting from scratch each time. This is a big leap forward, but the next feature is going to blow you away.
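To make the size limit described above concrete, here is a minimal Python sketch of how a chat application might trim old messages so the history plus the latest request still leaves room for the reply. This is an illustration only, not how ChatGPT is actually implemented: the function names, the word-based token estimate, and the numbers are all my own assumptions.

```python
import math

# Assumption: roughly 0.75 words per token, a common rule of thumb.
# Real tokenizers count differently; this is only an estimate.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text):
    """Estimate the token count of a piece of text from its word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def build_prompt(history, new_message, context_window, max_output_tokens):
    """Return the messages to send: as much history as fits, plus the new request.

    The input prompt and the output must share the context window, so the
    budget for the prompt is the window minus the space reserved for the reply.
    """
    budget = context_window - max_output_tokens
    messages = history + [new_message]  # new list, the caller's history is untouched
    # "Forget" the oldest messages until the prompt fits the budget.
    while len(messages) > 1 and sum(estimate_tokens(m) for m in messages) > budget:
        messages.pop(0)
    return messages


if __name__ == "__main__":
    history = ["one two three", "four five six", "seven eight nine"]
    print(build_prompt(history, "ten eleven twelve",
                       context_window=20, max_output_tokens=8))
```

With a tiny made-up window like the one in the example, the oldest message gets dropped. That is the "forgetting" you run into in a long chat.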

Google retired its previous chatbot, Bard, and has released Gemini 1.0 Ultra. It is also testing Gemini 1.5 (currently only available to developers and enterprise customers). Gemini 1.5 has an enormous context window, which means it can handle much larger queries and look at much more information at once. That window is a whopping 1 million tokens, compared to 128,000 for OpenAI’s GPT-4 and 32,000 for the current Gemini Pro. Tokens are a tricky metric to understand so to make it simpler: “It’s about 10 or 11 hours of video, tens of thousands of lines of code, or 750,000 words (about the size of all the Harry Potter books combined).” In principle, a context window that large lets the AI bot work with that much content, input plus output, in a single request.
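If tokens still feel abstract, a little arithmetic helps. The short Python sketch below takes the figures quoted above (1 million tokens is about 750,000 words) and applies the same ratio to the other two context windows. The ratio is only a rough average, and the dictionary and function names are my own, so treat the results as ballpark numbers.

```python
# Context window sizes quoted in this post, in tokens.
CONTEXT_WINDOWS = {
    "Gemini 1.5": 1_000_000,
    "GPT-4": 128_000,
    "Gemini Pro": 32_000,
}

# Ratio implied by the quoted figures: 750,000 words per 1,000,000 tokens.
# Assumption: the same average ratio applies to every model.
WORDS_PER_TOKEN = 750_000 / 1_000_000


def approx_words(tokens):
    """Convert a token count to an approximate word count."""
    return round(tokens * WORDS_PER_TOKEN)


if __name__ == "__main__":
    for name, tokens in CONTEXT_WINDOWS.items():
        print(f"{name}: {tokens:,} tokens is roughly {approx_words(tokens):,} words")
```

By this estimate, GPT-4's window holds something like a full-length novel, while Gemini 1.5's holds a whole series.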

Did you catch the significance of that? We will soon be able to have an AI write an entire book for us without having to guide it along a few paragraphs at a time.

Notice that Gemini is not better than ChatGPT. It does have better internet access (although it is rumored that OpenAI, the creator of ChatGPT, is working on a search engine to try to compete with Google). But Gemini disabled its image generator feature because of built-in biases. It is said to be better at writing code, but I have not found this to be true. Gemini does not have a custom GPT feature like ChatGPT does. Everyone is still trying to outdo each other, and of course, competition is always good for the consumer.

But the advances in AI that I thought might take a year are already coming to fruition. It’s going to be a wild ride!

What is the significance of this article for ecommerce website owners and small entrepreneurs? Why not try an experiment and ask the chatbot in the lower right corner of this site!