In my previous post, I outlined four key applications of location data for ATProto apps. In this post, we will discuss the cases where location data should remain the domain of individual app developers, and the reasons why.
First, location data should remain the domain of individual app developers for these important reasons:
- Data licensing restrictions on mapping SDKs and APIs: if you are using MapKit or Google Maps, the licenses restrict you from extracting, scraping, or storing the data, and from mixing and matching APIs and services (see the table below). It is therefore best practice to use these resources only within the specific platforms for which they are designed.
- Sensitivity around personally identifiable information: as a best practice, developers are generally discouraged from storing any location information, such as a latitude and longitude obtained from EXIF metadata or mobile phones. This is primarily due to emerging regulatory concerns. That means sharing it within the ATmosphere is out of the question unless and until there are community-supported tools and processes, in line with privacy regulations (the strongest of which currently come out of the EU and California), for storing anonymized user data that is never traceable back to a person. This is a big challenge with significant risks involved, and it is not currently on any community roadmap.
- UX concerns around showing locations on a map: even if you are not storing or sharing anything, developers within the Bluesky ecosystem should be discouraged from displaying personally identifiable information on a map without careful attention to UX and privacy. The current backlash against Instagram is just one recent example of how even major tech companies get this wrong. If you don’t handle the display of user location with extreme caution and sensitivity, then even if nothing bad happens as a result, many or even most users will be spooked and simply turn location services off, eliminating this whole category of capabilities.
I will propose ways to deal with personally identifiable information in a later post. UX concerns around showing locations on a map are a deeper dive for another time.
To underscore why the data licensing restrictions are a significant tradeoff, the table below summarizes the common data providers you might use to build out these applications, contrasting their license restrictions with their data structures and usage patterns. The TL;DR here is that the proprietary data sources are much easier for an individual app or developer to work with, but there are tradeoffs in terms of what you can aggregate, abstract, or share.
| Provider | Restrictiveness Ranking | Data Structure / Schema | How to Use / Extract |
|---|---|---|---|
| Overture Maps | Least restrictive – open under CDLA for Places (ODbL for some other layers if they come from OpenStreetMap) | Open schema: Places, Buildings, Transportation, Admin Boundaries. Stable IDs + linkages. | Bulk download, conflate, manage own DB. |
| Foursquare Open Source Places | Permissive – open dataset (Apache 2.0) | Structured POIs with fsq_place_id, categories, hierarchy, geometry, metadata. | Bulk download (Parquet), conflate, manage own DB. |
| Apple Maps (MapKit / Places API) | Restrictive – cannot store, redistribute or mix with external DBs | Apple placeId + categories, opaque schema. | Query via MKLocalSearch. Supports forward and reverse geocoding. IDs must remain internal. |
| Google Places API | Restrictive – cannot store, redistribute or mix with external DBs | Google place_id, categories, metadata. | API (places/textSearch, places/details), supports forward and reverse geocoding |
| Mapbox Places / Geocoding API | Restrictive – similar to Google/Apple; cannot store, redistribute or mix with external DBs | id, text, place_type, geometry, properties | API (/geocode) supports forward and reverse geocoding |
| Yelp Fusion API | Most restrictive – cannot store or redistribute POIs, reviews, or ratings; even ephemeral storage is highly constrained | Businesses w/ Yelp IDs, categories, attributes, reviews. | Lookup via API (/businesses/search, /businesses/:id), business search only, not a geocoder |
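
One way to keep the table's restrictions from being violated by accident is to encode them as a guard in front of any shared store. The snippet below is only a sketch: the provider keys, the `PlaceRef` type, and the `shareable` helper are hypothetical names made up for illustration, and the flags are a simplified reading of the table rather than legal guidance, so check each provider's current terms.

```python
from dataclasses import dataclass

# Hypothetical policy flags summarizing the table above (illustrative only).
# True means the dataset is openly licensed and may be stored and shared.
MAY_PERSIST_AND_SHARE = {
    "overture": True,
    "foursquare_os_places": True,
    "apple_mapkit": False,
    "google_places": False,
    "mapbox_geocoding": False,
    "yelp_fusion": False,
}


@dataclass(frozen=True)
class PlaceRef:
    provider: str  # one of the hypothetical keys above
    place_id: str  # the provider's own identifier


def shareable(refs: list[PlaceRef]) -> list[PlaceRef]:
    """Keep only references from providers that allow storage and sharing.

    Unknown providers are treated as restricted, which is the safe default.
    """
    return [r for r in refs if MAY_PERSIST_AND_SHARE.get(r.provider, False)]
```

In this sketch, anything returned by a proprietary API stays inside the app that fetched it, and only references backed by open datasets are candidates for a shared record.
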
Some further salient observations:
- Location data providers like Overture Maps and Foursquare Open Source Places expect you to spin up your own server and host their data in a database, but they publish it under open licenses, so you are free to mix and match.
- Mapping APIs like Apple MapKit, Yelp, and Google Maps / Places APIs are intended for use exclusively by individual developers through calls at runtime, and the ToS explicitly restrict you from mixing and matching services (e.g. you can’t show Google Places on a MapKit map and vice versa, or mix them together for purposes of search, geotagging, geofencing, etc.).
- Mapbox has contributed a lot to the open source mapping world, but their Places API is proprietary, just as restrictive, and roughly comparable to Google or Apple.
- Similarly, Yelp has great categories and attributes that make it ideal for location search (sans maps) and geotagging, but the licensing is restrictive in ways that are similar to Apple and Google. A notable difference is that it will not resolve addresses to places (as you would expect of a geocoder) and has limited functionality around what it returns given a latitude and longitude (i.e. reverse geocoding).
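
To make "mix and match" concrete for the open datasets, here is a minimal sketch of conflating two sets of place records by name and proximity. It is not how Overture or Foursquare perform conflation. The `Place` type, the source labels, the IDs, and the 50 m threshold are all assumptions chosen for illustration, and a real pipeline would load the Parquet files into a database with a spatial index instead of comparing every pair.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Place:
    source: str     # illustrative label, e.g. "overture" or "foursquare"
    source_id: str  # the dataset's own stable ID
    name: str
    lat: float
    lon: float


def haversine_m(a: Place, b: Place) -> float:
    """Great-circle distance between two places, in metres."""
    earth_radius_m = 6_371_000
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(h))


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def conflate(left: list[Place], right: list[Place],
             max_distance_m: float = 50.0) -> list[tuple[str, str]]:
    """Pair records with the same normalized name within max_distance_m.

    Naive O(n * m) comparison, fine for a sketch but not for bulk data.
    """
    matches = []
    for a in left:
        for b in right:
            same_name = normalize(a.name) == normalize(b.name)
            if same_name and haversine_m(a, b) <= max_distance_m:
                matches.append((a.source_id, b.source_id))
    return matches
```

Because both datasets are openly licensed, the resulting ID pairs can be stored in your own database, which is exactly what the proprietary APIs do not permit.
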
For you as an individual developer, the convenience of the proprietary APIs may outweigh the value of the less restrictive licenses. This is unlikely to change entirely, but as I’ll describe in the remainder of this post and dive into more deeply in a future post, there are opportunities to build infrastructure around locations as first-class primitives in a decentralized web.