PhD in Geospatial Data Science - How AI Learns a City: Open Representation across open data sources

About the Project

This PhD offers a 3-year, fully funded research project that investigates how transformer models learn an internal representation of urban space from open geospatial place data, and how that learned representation changes when the same city is described by different data sources. Cities are increasingly interpreted through open, crowdsourced, and crawled geospatial data. Yet the composition and quality of these data vary: they are shaped by participation, incentives, and refresh cycles.
This project will build comparable urban language models from open place datasets, initially using open source alternatives like Overture and Foursquare. By converting neighbourhood context into sequences of places, categories, or urban functions, the research will build and compare models that learn patterns of spatial proximity, social similarity, centres and corridors, and broader urban ecologies. The goals of the project are twofold: first, to build an understanding of how cities can be represented using state-of-the-art machine learning and artificial intelligence; second, to assess which kinds of urban structure are stable across sources and which are source-dependent.
The knowledge gained during the PhD will have particular relevance for the production of urban “digital twins”: embeddings can act as a form of compression that allows for parsimonious “twinning”, while bias and missingness pose important challenges for the success of these digital representations of cities. Part of this PhD will explore these applications.
Methodologically, the project will combine geospatial data science, spatial statistics, urban analytics, and explainable machine learning. A central aim is to develop transparent and explainable diagnostics for where learned urban representations fail, whom those failures affect, and how they relate to data quality, data provenance, and uneven urban visibility. By showing how AI learns different urban realities from different geospatial data sources, the project will connect fundamental questions in representation learning with practical questions of geospatial data quality, provenance, and responsible urban AI.
The project will produce both methodological and empirical contributions. Methodologically, it will develop new ways to evaluate embeddings and foundation models, with a particular focus on bias, uncertainty, and missingness. Empirically, it will generate new evidence on how open urban data infrastructures represent neighbourhoods, chains, local businesses, and everyday urban life. Outputs will include reproducible tools for learning and representing cities using transformers, with the potential to explore other modelling approaches such as graph neural networks, as well as techniques for assessing these models.

Enquiries/requests for further information are welcome: please contact ana.basiri@glasgow.ac.uk or Andrew.Renninger@glasgow.ac.uk
