
Geospatial Mechanistic Interpretability of Large Language Models

Stef De Sabbata, Stefano Mizzaro, Kevin Roitero
Release Date: 5/13/2025
Abstract

Large Language Models (LLMs) have demonstrated unprecedented capabilities across various natural language processing tasks. Their ability to process and generate viable text and code has made them ubiquitous in many fields, while their deployment as knowledge bases and "reasoning" tools remains an area of ongoing research. In geography, a growing body of literature has been focusing on evaluating LLMs' geographical knowledge and their ability to perform spatial reasoning. However, very little is still known about the internal functioning of these models, especially about how they process geographical information.

In this chapter, we establish a novel framework for the study of geospatial mechanistic interpretability - using spatial analysis to reverse engineer how LLMs handle geographical information. Our aim is to advance our understanding of the internal representations that these complex models generate while processing geographical information - what one might call "how LLMs think about geographic information", if such phrasing were not an undue anthropomorphism.

We first outline the use of probing in revealing internal structures within LLMs. We then introduce the field of mechanistic interpretability, discussing the superposition hypothesis and the role of sparse autoencoders in disentangling polysemantic internal representations of LLMs into more interpretable, monosemantic features. In our experiments, we use spatial autocorrelation to show how features obtained for placenames display spatial patterns related to their geographic location and can thus be interpreted geospatially, providing insights into how these models process geographical information. We conclude by discussing how our framework can help shape the study and use of foundation models in geography.
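To give a concrete sense of what probing internal structures looks like, here is a minimal sketch, not the authors' implementation: a linear probe is fitted on hidden-state vectors to test whether placename coordinates are linearly decodable from them. The array names, shapes, and random placeholder data are all hypothetical.

```python
# Minimal probing sketch: can placename coordinates be read off an LLM's
# hidden states with a linear map? All data below are random placeholders.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_places, d_model = 500, 768
hidden = rng.normal(size=(n_places, d_model))                 # stand-in hidden-state vectors
coords = rng.uniform([-60, -180], [75, 180], (n_places, 2))   # stand-in (lat, lon) pairs

X_tr, X_te, y_tr, y_te = train_test_split(hidden, coords, random_state=0)

# Linear probe: a high held-out R^2 would indicate that geographic location is
# linearly decodable from the representations (here it stays near 0, the data is random).
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("probe R^2:", r2_score(y_te, probe.predict(X_te)))
```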
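The sparse-autoencoder step can be sketched in a similar spirit. The snippet below uses one common formulation (a ReLU encoder into an overcomplete feature layer, trained with an L1 sparsity penalty); the PyTorch framework, dimensions, sparsity coefficient, and random "activations" are assumptions for illustration rather than the configuration used in the chapter.

```python
# Minimal sparse-autoencoder sketch: reconstruct activations through an
# overcomplete ReLU feature layer, with an L1 penalty encouraging sparse,
# more interpretable features. Sizes and data below are placeholders.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x):
        feats = torch.relu(self.encoder(x))   # non-negative, sparse feature activations
        return self.decoder(feats), feats

d_model, d_features, l1_coeff = 768, 4096, 1e-3
sae = SparseAutoencoder(d_model, d_features)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
activations = torch.randn(1024, d_model)      # stand-in for residual-stream activations

for _ in range(100):                          # toy training loop
    recon, feats = sae(activations)
    loss = ((recon - activations) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```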
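Finally, the spatial-autocorrelation analysis can be approximated with standard geospatial Python tooling. The sketch below computes Moran's I for a single (placeholder) feature across placename locations using libpysal and esda; these libraries, the k-nearest-neighbour weights, and all values are assumptions for illustration, not necessarily the authors' setup.

```python
# Minimal sketch: spatial autocorrelation (Moran's I) of one feature across
# placenames; coordinates and feature values are random placeholders.
import numpy as np
from libpysal.weights import KNN
from esda.moran import Moran

rng = np.random.default_rng(0)
coords = rng.uniform([-60, -180], [75, 180], (500, 2))   # stand-in (lat, lon) per placename
feature = rng.normal(size=500)                           # stand-in activation of one feature

# k-nearest-neighbour spatial weights linking each placename to its 8 neighbours
# (Euclidean distance on raw coordinates; fine for a toy sketch)
w = KNN.from_array(coords, k=8)

# Moran's I near +1 with a small pseudo p-value would indicate that nearby
# placenames activate the feature similarly, i.e. a geospatially interpretable feature.
mi = Moran(feature, w, permutations=999)
print(f"Moran's I = {mi.I:.3f}, pseudo p-value = {mi.p_sim:.3f}")
```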