Artificial intelligence (AI) has proven to be a useful tool for many businesses over the past couple of years. From process automation and improved data analytics to employee and customer engagement, AI has touched multiple areas of business, making it easier for internal and external users to accomplish what they need to do in a short amount of time.

 

Recently, a new trend has emerged in the use of AI: AI art. Once machines became able to understand language and create images from it, the foundations for AI image generation were established. Several companies have since made considerable headway in this direction: users only need to input descriptive text, and the AI interprets it to create an image.

 

Some of the more popular companies active in this space are the following:

 

Google Imagen

Google’s Imagen website walks you through the thought process and technical details behind how the team created Imagen. Unlike the other two in this list, Google provides preset text for the user to choose from, which the AI then uses to create an image.

There doesn’t seem to be a place on the site where users can enter their own text; the site seems geared more towards explaining the work done by the Google team than towards open-ended image creation. Still, Google’s work is so far the most photorealistic of the three, and Google reports that Imagen outperforms the others on the COCO (Common Objects in Context) benchmark.

Below are some of the other samples that Imagen is able to produce.

DALL-E / DALL-E 2

OpenAI is the company behind DALL-E, which was first announced in January 2021. DALL-E 2, the latest iteration, was announced in April 2022. Users are invited to write descriptive text, which the AI uses as the basis for the image. On this platform, outputs can look photorealistic or painted/artistic, depending on what the user specified in the prompt.

Aside from creating images, DALL-E 2 also has the capability to add specific elements to an existing photo and make several variations of it.

Those who wish to try out the service need to join a waitlist at https://labs.openai.com/waitlist for the time being. There is, however, also DALL-E mini, an independent project not made by OpenAI, which captures the text-to-image capabilities of DALL-E but not its level of photorealism. It is enough, though, if the user just wants a tool for imagining.

Midjourney AI

The newest in this list is Midjourney AI. It launched around February 2022 and is still in open beta, and its Discord server currently has almost a million members. Similar to DALL-E, Midjourney uses the text prompts provided by users to create a set of 4 images. The user can then decide whether to redo the whole set, upscale/enlarge a specific image, or make a variation of one of the images in the set.

Upscaling an image adds more detail, and a higher-resolution version can then be downloaded.

One key difference from Imagen and DALL-E is that Midjourney’s outputs are intentionally not photorealistic and have a more artistic feel instead. This lets its users imagine and express themselves, much as people do when making artwork in a traditional or digital medium.

 

Having these tools opens up a lot of possibilities, especially for people who don’t see themselves as artists or who find it challenging to express themselves. It also gives many people a chance to co-create with others, as they get inspired by the images that other people create. Businesses will likely find a use for these tools as well, whether in ideation sessions or, once the technology has advanced enough, possibly even in their actual marketing materials.

 

Of course, there are still challenges and issues to tackle as this kind of technology continues to evolve. With the rise of fake news and deepfakes, guidelines or constraints will most probably need to be built into these systems so that those with malicious intent can’t use them to spread more disinformation. There’s also the question of who owns the copyright on the images created: is it the user who provided the prompts, or the company whose AI made those images possible?

 

Many things still need to be defined so that everyone feels safe using these tools and conflicts can be avoided in the long run. For now, while image-generation AI is still in its early stages, it’s exciting to see where the technology is headed and what use cases may emerge for both individuals and businesses.

Author: Angela “Krinkle” Garcia