Avocado Chairs – An Advancement of AI Technology

In 2020, machine learning advanced tremendously, especially in NLP (natural language processing). NLP is a branch of AI that studies how machines can read and comprehend human language.

GPT-3: The largest NLP model

OpenAI, a leading artificial intelligence lab, created GPT-3 (Generative Pre-trained Transformer 3), an NLP language model with 175 billion parameters. At the time of its release, it was the largest NLP model to date. GPT-3 can perform a wide range of tasks, such as generating many kinds of text that resemble what a human would write. The text it generates is often hard to tell apart from text written by humans. GPT-3 was able to write short stories, poems, and even technical manuals. It was also able to solve simple math problems and generate program code. With GPT-3, OpenAI showed that a single deep-learning model can be trained to use language in a variety of ways just by giving it a very large amount of text.

New Models – DALL·E and CLIP

Recently, OpenAI released two models: DALL·E and CLIP. Both combine language and images to help AI better understand words and what they refer to.

DALL·E

DALL·E is a smaller version of GPT-3 with 12 billion parameters.
It has been trained to generate images from text descriptions such as captions.
DALL·E receives both the text and the image as a single stream of data containing up to 1280 tokens.
It is trained using maximum likelihood estimation to generate all of the tokens, one after another.
The vocabulary of DALL·E contains tokens for both text and image concepts.
This training procedure lets DALL·E not only create an image from scratch but also regenerate any rectangular region of an existing image that extends to the bottom-right corner.
DALL·E is able to create plausible images for a wide variety of sentences.
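
The token-stream points above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's implementation. The split of the 1280 tokens into 256 text tokens and 1024 image tokens, and the two vocabulary sizes, follow OpenAI's public description of the model and should be treated as assumptions here. The names `PAD_ID`, `build_stream`, `log_likelihood`, and `uniform_model` are hypothetical and exist only for this example.

```python
import math

# Assumed layout of the 1280-token stream: 256 text + 1024 image tokens.
TEXT_LEN = 256
IMAGE_LEN = 1024
# Assumed vocabulary sizes; both token types share one combined vocabulary.
TEXT_VOCAB = 16384
IMAGE_VOCAB = 8192
PAD_ID = 0  # hypothetical padding id for captions shorter than TEXT_LEN


def build_stream(text_tokens, image_tokens):
    """Join caption tokens and image tokens into one sequence."""
    if len(text_tokens) > TEXT_LEN or len(image_tokens) > IMAGE_LEN:
        raise ValueError("stream would exceed 1280 tokens")
    padded_text = list(text_tokens) + [PAD_ID] * (TEXT_LEN - len(text_tokens))
    # Shift image ids past the text ids so the two ranges never collide.
    shifted_image = [TEXT_VOCAB + t for t in image_tokens]
    return padded_text + shifted_image


def log_likelihood(stream, model):
    """Sum of log p(token | all earlier tokens) over the stream.

    Maximum likelihood training adjusts the model to make this sum
    as large as possible on the training data.
    """
    return sum(math.log(model(stream[:i], tok)) for i, tok in enumerate(stream))


def uniform_model(prefix, token):
    """Toy stand-in for a trained model: every token is equally likely."""
    return 1.0 / (TEXT_VOCAB + IMAGE_VOCAB)


if __name__ == "__main__":
    stream = build_stream([5, 6], [1, 2, 3])  # partial image: 3 tokens so far
    print(len(stream), log_likelihood(stream, uniform_model))
```

Because image tokens are generated in order after the text, supplying the first part of an image as a prefix and letting the model continue is what makes it possible to fill in the remainder of an image down to the bottom-right corner.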

Here is an example of what DALL·E can do.

Creating spectacular avocado chairs

“An armchair in the shape of an avocado”.

Image source: OpenAI

When given this text caption, DALL·E produced a range of armchairs that clearly take the shape of an avocado. The designs look like something a human designer might have made.

Prompted with the phrase ‘in the shape of’, DALL·E came up with many different avocado armchairs. It related the shape of a halved avocado to the back of the armchair, and the avocado pit to the cushion.

The armchairs created by DALL·E looked like both avocados and chairs. DALL·E can take two unrelated concepts and combine them into a functional-looking result.

CLIP (Contrastive Language-Image Pre-training)

CLIP is an image recognition system that learns to recognize images from natural language supervision.
It identifies visual concepts from images and their captions.
It learns about images from full descriptions rather than single-word labels.
CLIP is trained to pick the correct caption for an image out of 32,768 randomly sampled captions.
This neural network efficiently connects text and images.
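
The caption-matching objective described above can be sketched as follows. This is a toy illustration in plain Python, not CLIP's actual code: the embeddings are hand-written 2-D vectors rather than encoder outputs, the batch holds 2 image-caption pairs instead of 32,768, and the temperature value and the symmetric (image-to-text plus text-to-image) form of the loss are assumptions. All function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Row i, column j: scaled cosine similarity of image i and caption j."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target index."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average loss for matching image k with caption k, in both directions."""
    logits = similarity_logits(image_embs, text_embs, temperature)
    n = len(logits)
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    image_to_text = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    text_to_image = sum(cross_entropy(columns[k], k) for k in range(n)) / n
    return (image_to_text + text_to_image) / 2


def best_caption(image_emb, text_embs):
    """Index of the caption whose embedding is most similar to the image."""
    scores = similarity_logits([image_emb], text_embs)[0]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]    # toy image embeddings
    captions = [[0.9, 0.1], [0.2, 0.8]]  # toy caption embeddings, same order
    print(best_caption(images[0], captions), contrastive_loss(images, captions))
```

The loss is low when each image is closest to its own caption and high when the pairs are mismatched, which is the signal that pushes matching text and images together during training.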

Models like DALL·E and CLIP will make AI algorithms better at understanding words and what they actually mean. These language models still have limitations at present. However, given the pace of research in the field, AI may soon come to understand far more of the concepts in the human world. Language models powered by AI are the future.
