Saturday, June 17, 2023

AI Street View Hallucinations

I have been a huge admirer of OpenStreetMap Haiku for a number of years. OpenStreetMap Haiku is a clever map that can write a short poem about any location in the world based on the OpenStreetMap data for that location. 

Share your location with OpenStreetMap Haiku and it will generate a unique haiku from OpenStreetMap data retrieved with an Overpass query. I thought that it might be interesting to use the same process to create AI Street View images of locations based purely on the OpenStreetMap data for that location. Luckily for me the blog post "OpenStreetMap Haiku: Using OSM and Overpass for generative poetry" does a very good job of explaining how to frame an Overpass Turbo query to retrieve all the OSM data in a radius around a specified location.
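To give a rough idea of what such a query looks like, here is a minimal Python sketch that builds an Overpass QL query using the `around` filter. The function names, the 50 metre radius and the coordinates are my own illustrative assumptions, not values taken from the OpenStreetMap Haiku post, and the endpoint shown is the public overpass-api.de instance.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass API instance (assumed here; other instances exist).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(lat: float, lon: float, radius_m: int = 50) -> str:
    """Return an Overpass QL query for all nodes and ways near a point."""
    around = f"(around:{radius_m},{lat},{lon})"
    return (
        "[out:json][timeout:25];\n"
        "(\n"
        f"  node{around};\n"
        f"  way{around};\n"
        ");\n"
        "out tags;"
    )


def fetch_osm_elements(query: str) -> list:
    """POST the query to the Overpass API and return the matched elements."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=body) as response:
        return json.load(response).get("elements", [])


# Approximate coordinates near West 23rd Street & 8th Avenue (illustrative).
query = build_overpass_query(40.7447, -73.9988, radius_m=50)
print(query)
```

The same query text can be pasted straight into Overpass Turbo, which is how I ran my own queries. Calling `fetch_osm_elements(query)` would send it to the API instead.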

If you are very clever and rich you could build an interactive map that creates an AI-generated image for any location you click on, by querying the Overpass API and feeding the resulting data as a prompt into Midjourney or DALL·E 2.

I'm neither rich nor clever. I can't afford to make thousands of queries to an AI image generator. So rather than making a map like OpenStreetMap Haiku for AI Street View, I decided to just manually enter a few prompts (based on Overpass Turbo queries for specific locations) into Bing's free Image Creator.

Here are my first AI Street View hallucinations, created by Bing's Image Creator using prompts from Overpass Turbo queries:
 
West 23rd Street & 8th Avenue, New York, NY, USA

Google Street View (left) and Bing Image Creator (right)


In Crissy Field, San Francisco looking at the Golden Gate Bridge

Google Street View (left) and Bing Image Creator (right)


On Westminster Bridge, London looking at Big Ben and the Houses of Parliament

Google Street View (left) and Bing Image Creator (right)

After attempting only a handful of AI Street View hallucinations I can report that it is easier to create accurate images for very well-known locations than it is for lesser-known backstreet locations. Presumably this is because DALL·E (which Bing Image Creator uses) has a huge pool of training images for the most well-known and most photographed locations.

The New York AI Street View above was created using the prompt "A road junction in New York with traffic signals. A subway entrance. A Starbucks coffee shop". This prompt was compiled using OSM data extracted with Overpass Turbo for West 23rd Street & 8th Avenue. In generating the AI image from this prompt, Bing Image Creator is creating a kind of generic New York road junction based on its training images. The AI image bears only a passing resemblance to the real junction at West 23rd Street & 8th Avenue. The generated Street View images for Westminster Bridge and the Golden Gate Bridge are, as you might expect, much more accurate than the image generated for a random New York junction.
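I compiled my prompts by hand, but the step from OSM tags to a text prompt could be automated along these lines. This is only a sketch under my own assumptions: the `PHRASES` lookup table, the `elements_to_prompt` function and the sample elements are all hypothetical, and a real map would need a far larger tag vocabulary.

```python
# Hypothetical mapping from OSM key/value pairs to prompt phrases.
PHRASES = {
    ("highway", "traffic_signals"): "Traffic signals",
    ("railway", "subway_entrance"): "A subway entrance",
    ("amenity", "cafe"): "A {name} coffee shop",
}


def elements_to_prompt(elements: list, place: str) -> str:
    """Turn Overpass elements into a short image prompt, skipping repeats."""
    parts = [f"A street scene in {place}"]
    for element in elements:
        tags = element.get("tags", {})
        for (key, value), template in PHRASES.items():
            if tags.get(key) == value:
                # Fall back to a generic word when the feature has no name.
                phrase = template.format(name=tags.get("name", "local"))
                if phrase not in parts:
                    parts.append(phrase)
    return ". ".join(parts)


# Made-up sample data shaped like the "elements" list in Overpass JSON output.
sample_elements = [
    {"type": "node", "tags": {"highway": "traffic_signals"}},
    {"type": "node", "tags": {"railway": "subway_entrance"}},
    {"type": "node", "tags": {"amenity": "cafe", "name": "Starbucks"}},
    {"type": "node", "tags": {"highway": "traffic_signals"}},
]

prompt = elements_to_prompt(sample_elements, "New York")
print(prompt)
```

Repeated features are collapsed into a single phrase, since a junction with four sets of traffic signals does not need to mention them four times in the prompt.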
