The current proposal for how to handle location data, largely shaped by the ATGeo Working Group discussions (e.g., ATGeo WG: Designing for interoperability with a common data model for places and ATgeo WG: Next Steps for Garganorn and Geographic Data in ATProtocol), defines a “gazetteer service” as an API server that provides place data using ATGeo Lexicon interfaces. The goal is to support location queries and enable AT Protocol applications to employ, create, update, reference, and exchange geospatial data, with a strong focus on interoperability.
The idea is that a gazetteer server would translate off-protocol reference place data into Lexicon-specified AT Protocol objects and serve them via XRPC, without requiring the data to be stored in a Personal Data Server (PDS). Garganorn is a demo implementation acting as a data relayer. Even with the most performant and optimized architecture possible today, the fundamental problem with this approach is that it creates a *centralized* service for location amid the decentralized architecture of ATProto. With this model, there is no good solution for sharing location data or derivatives. In the worst case, developers end up with a model very similar to commercial, proprietary APIs, where queries are made at runtime and data is stored and ranked by each individual app developer. In the words of seabass.bsky.social, simply duplicating address records “thousands and thousands of times throughout ATProto” feels wrong.
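To make the translation step described above concrete, here is a minimal sketch of what a gazetteer server does conceptually: it takes a row from an off-protocol reference dataset, reshapes it into a Lexicon-style object, and exposes it at an XRPC endpoint. The NSIDs (`org.example.geo.place`, `org.example.geo.searchPlaces`), the field names, and the host are all hypothetical placeholders for illustration. They are not the ATGeo Lexicon definitions and not how Garganorn is implemented.

```python
from urllib.parse import urlencode

# Hypothetical NSIDs, for illustration only; not the ATGeo Lexicon names.
PLACE_NSID = "org.example.geo.place"
SEARCH_NSID = "org.example.geo.searchPlaces"


def to_place_record(source: str, row: dict) -> dict:
    """Map one off-protocol reference row onto a Lexicon-style object.

    The field names here are assumptions made for this sketch.
    """
    return {
        "$type": PLACE_NSID,
        "name": row["name"],
        # Coordinates are kept as strings because the AT Protocol data
        # model does not support floating-point values.
        "latitude": f"{row['lat']:.6f}",
        "longitude": f"{row['lon']:.6f}",
        # Keep a pointer back to the upstream dataset and its identifier.
        "source": source,
        "sourceId": str(row["id"]),
    }


def xrpc_query_url(host: str, nsid: str, params: dict) -> str:
    """Build the URL for an XRPC query, which is an HTTP GET under /xrpc/."""
    return f"https://{host}/xrpc/{nsid}?{urlencode(params)}"


record = to_place_record(
    "example-source",
    {"id": 101, "name": "Ferry Building", "lat": 37.7955, "lon": -122.3937},
)
url = xrpc_query_url(
    "gazetteer.example.com", SEARCH_NSID, {"q": "ferry building", "limit": 5}
)
```

Note that every app calling this endpoint depends on the one host in the URL, and nothing in the returned object lives in a repository that other apps can reference. That is the centralization concern in miniature.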
The solution I proposed in my last post, a Federated Location Data Store, is, I believe, the intuitive missing data structure that resolves this incongruity. It creates a traversable, decentralized, and deduplicable catalog of public location data that is maintained by app developers and others with a specific ATProto use case in mind. It aligns with ATProto architecture and philosophy, supports common Bluesky app development use cases such as search, geotagging, geofencing, and topology, and respects the terms of data provider licenses. In line with the architectural principles behind ATProto, it treats location as a first-class primitive and a core element of the decentralized (geospatial) web.
However, the problem of creating a canonical and authoritative index of places goes deeper than redundant querying and storage. Deduplicating location data, particularly across different data sources, is an impossible problem to solve, even setting aside custom or bespoke definitions, because records for the same place may carry slightly different names and coordinates depending on which source they come from. This is arguably the reason why walled gardens of proprietary location data exist: you can’t normalize “place” or “location”, so the best you can do is put a lot of effort into creating your own version of it. Furthermore, contrary to the sometimes stated assumption of the ATGeo Working Group, location data changes *all the time*. Data freshness is a notorious problem among POI data vendors, and public venues like malls, concert and sports arenas, and conference venues move walls and rearrange their spaces regularly, sometimes with patterns, sometimes without.
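A small sketch helps show why cross-source matching is so slippery. The heuristic below treats two records as the same place when they are close together and their names are similar. It is only an illustration, not a method proposed by the ATGeo Working Group or used by any provider, and the thresholds (`max_m`, `min_sim`) and sample records are arbitrary assumptions.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    r = 6_371_000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in the range [0, 1]."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def probably_same_place(a: dict, b: dict, max_m: float = 50.0, min_sim: float = 0.8) -> bool:
    """Naive match: close enough AND similarly named. Thresholds are arbitrary."""
    close = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_m
    similar = name_similarity(a["name"], b["name"]) >= min_sim
    return close and similar


# Two made-up records for the same cafe, as two sources might describe it.
source_a = {"name": "Joe's Coffee", "lat": 40.0, "lon": -75.0}
source_b = {"name": "Joes Coffee", "lat": 40.0001, "lon": -75.0001}

loose = probably_same_place(source_a, source_b)             # default thresholds
strict = probably_same_place(source_a, source_b, max_m=10)  # tighter distance
```

The same pair of records matches under the default thresholds and fails to match when the distance threshold is tightened to 10 meters. Whichever thresholds you pick, you are encoding a judgment about what counts as "the same place", and another developer with another use case will reasonably pick differently.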
One possible way to merge the work being done by the currently active members of the ATGeo Working Group with this proposal would be to create a “gazetteer LDS”. This could bootstrap efforts to federate private LDS instances by hosting collections of places and geofences sourced from publicly available data and making them available across the distributed geospatial web.
Whereas there has been much emphasis in ATGeo Working Group discussions on remaining data-provider agnostic when building a gazetteer, for a gazetteer LDS I would suggest starting with one provider rather than trying to harmonize across them all. Place is socially constructed, and locations have different representations depending on what you need them to mean. It simply is not possible to make location data that has been curated, created, and maintained by different providers perfectly interchangeable, and I would argue that this should not be the goal. In many ways, a collection of location data, or a gazetteer, is the schema more so than it is the place or location, so representations by different providers are not interchangeable unless they were designed to be.
OK, so which one, you ask? One obvious choice might be to port Who’s on First, a web-available and downloadable gazetteer with stable IDs and tested schemas, to STAC and host it. However, for myriad reasons this could be a substantial effort, more than the current group of volunteers and enthusiasts should undertake.
The other option is Overture Maps. Among the providers discussed in this post, Overture Maps is the most comprehensive and permissive, and it is likely the best alternative for creating a centralized gazetteer. Overture favors CDLA Permissive 2.0 for places. That means ATProto devs can build shared catalogs, remix, and redistribute, whereas the Apache license Foursquare uses leaves more ambiguity about whether all of these operations are permitted. Furthermore, it is very likely that Foursquare will be folded into Overture Maps at some point. The maintainers of Overture Maps are actively developing and experimenting with STAC (e.g. here is the catalog of Overture releases) and will very likely be open to, and enthusiastic about, working with ATProto folks, so all in all it is a much easier lift.
There are additional reasons, and I don’t know who needs to hear this at this point, but Foursquare primarily focuses on business and municipal listings with point geometries only, whereas Overture Maps curates a broad variety of places and potential geofences, including administrative designations and buildings. Furthermore, Overture Maps is under active development and has put a lot of thought into building stable IDs and stable infrastructure, with polygons suitable for geofencing: not just points within municipalities, but also administrative boundaries, buildings, and a host of other features.
Whew! This concludes my post blitz on location data in ATProto. I’ll be off the grid for a week, and after that I’m happy to talk more about this proposal. Otherwise, stay tuned for a future post, where I plan to outline in greater depth the schema patterns and implementation details behind Location Data Stores, with working examples.
In the meantime, tell me what you think!