There has been a lot of hype recently about how DALL-E will mean you no longer have to pay for a Midjourney or Leonardo subscription to get photo-quality images. But is DALL-E really ready to take on Midjourney?
DALL-E is an artificial intelligence (AI) system created by OpenAI that can produce realistic images from text prompts. The name DALL-E is a blend of Salvador Dalí, the famous artist, and WALL-E, the title character of the Pixar movie.
The reports are that DALL-E can produce images that are as good as or better than those produced by Midjourney. Further, it is going to be built into ChatGPT 4, so you will be able to refine your output just as you do with text when using ChatGPT. It will also be able to generate real, readable text, not just alien text, much like Ideogram. All of that would be a dream come true, of course, if it is as good as the marketing proclaims.
Here is the promo video:
So how can you get your hands on this and start playing with it? There are two ways that I am aware of:
- Have a paid subscription to ChatGPT Plus (you can pay either OpenAI or Midjourney; there is no free lunch). Not all Plus users have access yet, although it is supposed to be rolling out as a beta this month. This will be the real test, because option 2 is limited.
- Use the Bing Image Creator. Microsoft, the primary investor in OpenAI, has already started using DALL-E in Bing Image Creator. However, even though it uses DALL-E 3 as its engine, it does not have the image editing tools or chat features that are supposed to be the Midjourney killer. It is, however, free.
Let’s compare the two. Using the exact same prompt, “a detailed photo of a log house surrounded by pine trees”, here are the results:
|DALL-E 3 (using Bing Image Creator)||Midjourney|
By comparison, here are the results from DALL-E 2:
|DALL-E 2 (unedited)||Prompt changed to: “a zoomed out, detailed photo of a log house, showing the entire house, surrounded by pine trees”|
Clearly, DALL-E 3 is better than DALL-E 2, although I question whether it is better than Midjourney.
I also thought it would be fun to try this in Leonardo and Stable Diffusion for comparison.
But what if we try something other than a photo, such as a logo? Here is the prompt I used: “a simple logo for a house painting company named “Designs and Colors Painting”. vector style. plain white background.”
|DALL-E 3 (unedited)||Midjourney (also unedited)|
Now the results paint a different story (yes, that was intentional). Clearly, DALL-E is taking steps in the right direction on getting the text right, whereas Midjourney is still struggling with alien text.
The other thing that Midjourney has always struggled with is human hands. Let’s compare these as well.
|DALL-E 3||Midjourney (the left hand still looks funny)|
I am excited about the direction that generative AI image creation is going. DALL-E is making huge strides. I’m not ready to give up my Midjourney subscription (I also have a ChatGPT Plus subscription). As it stands, with just Bing Image Creator, DALL-E 3 is just another option, but I am eager to try the chat option when it is made available to me, as I think this could be a game changer. What do you think?