Saturday, December 28, 2024

The AI Map Benchmark Test

a map showing the capital cities within a 500-mile radius of Paris.
Godview is a new AI 'map with an LLM'. It is the latest of a number of interactive maps released in 2024 which use AI models to let users carry out geographical searches with natural language queries.

I have decided it is time I had a series of ten questions which I can use to bench-test AI maps - to test which spatial queries they can handle and which they struggle with. Because I am lazy I asked ChatGPT to "devise ten questions which would work well as a benchmark test of AI maps using an LLM. Also explain what aspects each question is designed to test".

ChatGPT gave me ten benchmark questions which can be used to road-test an AI-powered map. I used those ten questions to road-test Godview:

1. Show me the capital cities within a 500-mile radius of Paris. (geographical understanding)

Godview responded with a map of London, Amsterdam, Berlin, Brussels, Luxembourg City and Bern.
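An answer to a radius query like this can be checked independently. The following is a minimal sketch, assuming straight-line (great-circle) distance is what "radius" means and using approximate city-centre coordinates that I have typed in by hand; the function names are my own and have nothing to do with how Godview works internally.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


PARIS = (48.8566, 2.3522)

# Approximate coordinates (assumed, for illustration only) of the
# capitals that Godview returned.
CAPITALS = {
    "London": (51.5074, -0.1278),
    "Amsterdam": (52.3676, 4.9041),
    "Berlin": (52.5200, 13.4050),
    "Brussels": (50.8503, 4.3517),
    "Luxembourg City": (49.6116, 6.1319),
    "Bern": (46.9480, 7.4474),
}


def capitals_within(radius_miles, origin=PARIS, capitals=CAPITALS):
    """Return (name, distance) pairs inside the radius, nearest first."""
    hits = []
    for name, (lat, lon) in capitals.items():
        dist = haversine_miles(origin[0], origin[1], lat, lon)
        if dist <= radius_miles:
            hits.append((name, dist))
    return sorted(hits, key=lambda pair: pair[1])


for name, dist in capitals_within(500):
    print(f"{name}: {dist:.0f} miles")
```

By this straight-line measure Berlin comes out at roughly 545 miles from Paris, so borderline answers are worth checking rather than taking on trust. The list above only contains the cities Godview returned, so it cannot tell you whether any capitals were missed.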

2. What’s the fastest route from my location to the nearest national park that allows camping? (route planning)

Godview responded with an empty map. I suspect this is a general limitation of Godview rather than a problem with this particular query. As a follow-up question I asked Godview to "show my current location on the map". It failed to do so - which presumably means that Godview does not currently have the capability to locate the user. This should be easy for the developer to remedy.

I then amended the original question to include a named location - "What’s the fastest route from Times Square, New York to the nearest national park that allows camping?" 

In response Godview showed me a map with a marker for 'Delaware Water Gap National Recreation Area'. The marker included an address but did not show me a route to the address from Times Square, New York. So it appears that Godview does not handle route planning. 

3. Find neighborhoods in New York City known for jazz music and good late-night food. (semantic comprehension)

In response Godview suggested 6 New York neighborhoods (Harlem, Hell's Kitchen, Koreatown, Greenwich Village, Lower East Side and East Village). 

4. Where can I go hiking near San Francisco with ocean views and easy trails for beginners? (contextual search) 

Godview responded with suggestions which included the Gray Whale Cove Trail, the Muir Beach Overlook Trail and Mori Point.

5. What are some family-friendly events happening in London this weekend near Hyde Park? (temporal awareness)

Godview responded with suggestions including Kids' Weekend Workshop at Science Museum, Weekend Children's Activities at V&A Museum, and Family Activities at Natural History Museum. These are plausible suggestions but they look generic rather than specific to this weekend - so I suspect Godview does not have access to local event listings.

6. If I’m in Tokyo and want to visit Mount Fuji, where are the best spots for photos at sunset? (logical reasoning - a challenge to combine spatial reasoning, photography insights, and specific timing)

Godview suggested a number of locations, including Lake Yamanaka, the Chureito Pagoda and Lake Kawaguchiko.

7. Find historical landmarks in Rome that are less crowded on weekdays. (historical and cultural context)

Godview listed a number of suggestions (such as the Protestant Cemetery) which I had personally never heard of. I therefore suspect that they are good suggestions for historical landmarks that are not on the most popular tourist trails.

8. Show me scenic train routes in Europe that pass through multiple countries. (multimodal awareness)

Godview responded with a number of well-known scenic train routes. Most of these, such as the Glacier Express and the Golden Pass Line, did not meet the criterion of passing through multiple countries. In fact, of Godview's suggestions only the Orient Express met this 'multiple countries' requirement.

9. Where is the nearest hospital or urgent care center to my current location? (emergency use case)

As already established, Godview doesn't seem to be able to detect the user's current location. However, when I added my postal address to the query Godview correctly identified my nearest hospitals.

10. Plan a road trip from Chicago to Los Angeles, stopping at national parks, and avoiding highways. (multi-stop query handling)

Godview responded by showing me a number of markers highlighting national parks between Chicago and Los Angeles but did not provide a route for the road trip.

ChatGPT's road-test questions worked well in identifying some of the strengths and weaknesses of Godview. 

Currently Godview cannot detect the user's location, does not handle route planning and does not appear to have access to local event listings. However, Godview does have good geographical and semantic understanding. It can handle complex contextual searches and also has a good grasp of historical and cultural contexts.
