Image Description

Generate descriptions of images

Image Description

The following demonstration uses OpenAI's GPT-4o mini and Anthropic's Claude 3.5 Sonnet to generate descriptions of images.

Instructions

Select a sample image below, upload one of your own or take a picture using your webcam.
Once the image has been analysed the results will be shown below. The generated descriptions will be shown within tabs beneath your chosen reference image.

More information about this demo

As continual improvements and advances are being made in the field of Generative AI we have witnessed the emergence of a number of Multimodal Large Language Models (LLMs) like those utlilized within this demo.

At the time of writing this description GPT-4o mini is a OpenAI's smallest, most lightwweight and affordable model that accepts multimodal input, which allows for the input of both text and images resulting in textual outputs. It is well suited for the type of task outlined within this demo.

Anthropic's Claude 3.5 Sonnet is their most intelligent and capabale model. It's training data cut off is April 2024 and it offers a middle ground between the powerful high perfomance of their Opus Model and the speed of their fastest model Haiku, whilst allowing multimodal inputs.

Things to consider

We can start to see from demos such as this one how AI can be an assistive tool, especially when it comes to accessibility. The success of how these models perform at certain tasks is intrinsically linked to the quality of the prompt you provide. When creating a chat instance with these models it is possible to provide some context for how you want the agent to perform; providing context to the agent, e.g. 'You are a helpful agent that aims to provide accessibility information about the contents of an image, you should include information about the position of each element of the image and what the overall impression left by the image is'. You can also include information about the format in which you would like the information returned. It is is essential to remember that whilst it can be a great time saver to use AI tools for task such as this it is important that a the quality of AI provided answers are assessed thoroughly by a human to ensure it is providing accurate information.