PyGeoHash 2.1.0: Modernizing a Geospatial Python Library
Back in 2015, I created PyGeoHash as a Python 3 compatible fork of the GPL3-licensed geohash library. After nearly a decade of neglect, I’m excited to announce the release of PyGeoHash 2.1.0, which brings significant modernization and new features to this lightweight geospatial library.
What is a Geohash?
Before diving into the updates, let me briefly explain what geohashes are and why they’re useful.
A geohash is a hierarchical spatial encoding that turns a pair of geographic coordinates (latitude and longitude) into a short string of letters and digits. Geohashes were invented by Gustavo Niemeyer in 2008 and have since become a popular way to represent and work with location data.
The beauty of geohashes lies in their simplicity and utility:
- Precision Control: By adjusting the length of the geohash string, you can control the precision of the location. A longer string represents a smaller area.
- Proximity Inference: Geohashes that share a prefix fall inside the same cell, so a longer shared prefix generally means two points are closer together. This makes geohashes excellent for quick proximity searches.
- Compact Representation: Geohashes provide a compact way to represent coordinates, which is useful for storage and transmission.
- Grid System: Geohashes implicitly define a grid system over the Earth, making them useful for spatial indexing and binning.
For example, the geohash ezs42 represents an area that includes the coordinates 42.6°N, 5.6°W. If we extend it to ezs42e, we get a smaller area within that region.
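To make the relationship between string length and precision concrete, here is a minimal pure-Python sketch of the standard geohash encoding: the longitude and latitude ranges are halved in turn, each halving yields one bit, and every five bits become one base-32 character. The function name encode_sketch is hypothetical and this is not PyGeoHash's own implementation, just an illustration of the idea.

```python
# Illustrative geohash encoder; not PyGeoHash's implementation.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def encode_sketch(latitude: float, longitude: float, precision: int = 12) -> str:
    """Encode a point by repeatedly halving the longitude and latitude ranges."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0         # accumulates five bits per output character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, value = (lon_range, longitude) if use_lon else (lat_range, latitude)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # point is in the upper half of the range
        else:
            bits = bits << 1
            rng[1] = mid  # point is in the lower half of the range
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(encode_sketch(42.6, -5.6, precision=5))  # ezs42
print(encode_sketch(42.6, -5.6, precision=6))  # ezs42e
```

Each extra character adds five more halvings, which is why the six-character result starts with the five-character one and describes a smaller cell inside it.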
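You can also do very fast, approximate proximity checks by comparing the first N characters of two geohashes. For example, to find records near the geohash ezs42e44y, you can simply look for any that start with ezs42, which selects everything in the same cell, an area a few kilometers across. Two points just across a cell boundary can be close together yet share a shorter prefix, so treat the result as an approximation.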
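The prefix lookup described above needs nothing more than string matching. The sketch below uses a hypothetical in-memory dictionary of records with sample geohashes; in a database or search engine the same idea would usually be a prefix query on an indexed string column.

```python
# Hypothetical records: name -> geohash (sample values for illustration).
records = {
    "cafe": "ezs42e44y",
    "museum": "ezs42s7kq",
    "harbor": "ezs4b1234",
    "airport": "u4pruydqq",
}


def with_prefix(items: dict, geohash: str, length: int) -> list:
    """Return the names whose geohash shares the first `length` characters."""
    prefix = geohash[:length]
    return sorted(name for name, gh in items.items() if gh.startswith(prefix))


def shared_prefix_length(a: str, b: str) -> int:
    """Count how many leading characters two geohashes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


print(with_prefix(records, "ezs42e44y", 5))          # ['cafe', 'museum']
print(with_prefix(records, "ezs42e44y", 4))          # ['cafe', 'harbor', 'museum']
print(shared_prefix_length("ezs42e44y", "ezs4b1234"))  # 4
```

Shortening the prefix widens the search area, so a common pattern is to start with a long prefix and drop characters until enough candidates are found.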
Why PyGeoHash?
PyGeoHash was created to provide a simple, dependency-free way to work with geohashes in Python 3. The original geohash library was Python 2 only, and I wanted to ensure this useful functionality was available to Python 3 users.
We used this heavily at Predikto, where we often dealt with geospatial data in Elasticsearch and needed performant approximate queries.
The library offers:
- Zero required dependencies (works with just the Python standard library)
- Optional support for faster math with Numba
- Optional support for visualization with matplotlib
- A simple, intuitive API
- Lightweight implementation with minimal overhead
- Robust geohash operations
What’s New in PyGeoHash 2.1.0?
The latest release brings several exciting updates:
1. Bounding Box Module
The new bounding box module provides functionality for working with geographic bounding boxes, which are rectangular areas defined by minimum and maximum latitude and longitude coordinates. This is particularly useful for:
- Defining search areas
- Clipping or filtering geographic data
- Visualizing regions on maps
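To show what a geohash bounding box looks like in practice, here is a minimal standard-library sketch that recovers the rectangle covered by a geohash and tests whether a point falls inside it. The names Box, box_for_geohash, and contains are hypothetical and chosen for this illustration; they are not the bounding box module's actual API, so check the library's documentation for the real function names.

```python
from typing import NamedTuple

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet


class Box(NamedTuple):
    """Hypothetical bounding box type: min/max latitude and longitude."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def box_for_geohash(geohash: str) -> Box:
    """Narrow the full lat/lon ranges one bit at a time, longitude first."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    use_lon = True
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # five bits per character, high bit first
            rng = lon if use_lon else lat
            mid = (rng[0] + rng[1]) / 2
            if (value >> shift) & 1:
                rng[0] = mid  # bit 1 keeps the upper half
            else:
                rng[1] = mid  # bit 0 keeps the lower half
            use_lon = not use_lon
    return Box(lat[0], lon[0], lat[1], lon[1])


def contains(box: Box, latitude: float, longitude: float) -> bool:
    """Check whether a point lies inside the box (edges included)."""
    return (box.min_lat <= latitude <= box.max_lat
            and box.min_lon <= longitude <= box.max_lon)


box = box_for_geohash("ezs42")
print(box)
print(contains(box, 42.6, -5.6))   # True
print(contains(box, 48.85, 2.35))  # False
```

The same rectangle can serve as a search area or as a filter for clipping a list of points.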
2. Visualization Module
One of the most exciting additions is the new visualization module, which allows you to:
- Plot geohashes on static maps using matplotlib
- Create interactive maps with Folium
- Visualize geohash grids at different precision levels
- Customize colors and styles for better data representation
This makes it much easier to understand and communicate geospatial data encoded as geohashes.
3. Improved Documentation
We’ve updated docstrings across the library to provide better guidance on how to use each function and module. This should make it easier for new users to get started and for experienced users to discover advanced functionality.
Modernizing the Project Structure
In addition to new features, version 2.0.3 (released just before 2.1.0) brought significant modernization to the project structure:
PyProject.toml-Based Configuration
We’ve moved from the traditional setup.py to a modern pyproject.toml-based project structure, following current Python packaging best practices. This makes the project more maintainable and easier to work with using modern Python tools.
Ruff and Tox Configurations
We’ve added configurations for Ruff (a fast Python linter) and Tox (a testing automation tool), which help maintain code quality and ensure compatibility across different Python versions. The test suite runs against Python 3.8 and above.
GitHub Actions Workflows
Continuous integration has been set up using GitHub Actions, automating testing and deployment processes to ensure reliability with every change.
How AI Helped Modernize PyGeoHash
An interesting aspect of this update is how modern AI tools played a crucial role in revitalizing the project. Cursor and Claude 3.7 were instrumental in helping to quickly:
- Refactor the codebase to use modern Python practices
- Set up GitHub Actions CI workflows
- Migrate to a pyproject.toml-based configuration
- Implement new features like the visualization module
- Improve documentation and docstrings
The combination of AI assistance and human guidance allowed me to make significant improvements to the library in a fraction of the time it would have taken otherwise. None of these updates are particularly complex, but they would have taken me much longer to do manually. The time-consuming work of library maintenance is a great candidate for AI assistance.
In fact, I’ve left my Cursor rules files in the public repository as a reference for others who might want to see how AI can be integrated into their development workflow.
Getting Started with PyGeoHash
If you’re interested in trying out PyGeoHash, installation is straightforward:
# Basic installation
pip install pygeohash
# With visualization support
pip install pygeohash[viz]
And here’s a quick example of how to use it:
import pygeohash as pgh
# Encode coordinates to geohash
geohash = pgh.encode(latitude=42.6, longitude=-5.6)
print(geohash) # 'ezs42e44yx96'
# Control precision
short_geohash = pgh.encode(latitude=42.6, longitude=-5.6, precision=5)
print(short_geohash) # 'ezs42'
# Decode geohash to coordinates
lat, lng = pgh.decode(geohash='ezs42')
print(lat, lng) # 42.6 -5.6
# Calculate approximate distance between geohashes (in meters)
distance = pgh.geohash_approximate_distance(geohash_1='bcd3u', geohash_2='bc83n')
print(distance) # 625441
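The distance in the last example is consistent with a simple lookup on how many leading characters the two geohashes share: 'bcd3u' and 'bc83n' agree on their first two characters. The sketch below illustrates that idea with a hypothetical function and an assumed table of rough cell sizes; only the two-character value (625441) comes from the example above, and this is not presented as the library's actual code.

```python
# Assumed rough distances in meters, keyed by shared prefix length.
# Only the entry for 2 is confirmed by the example output above.
APPROX_METERS = {
    0: 20_000_000,
    1: 5_003_530,
    2: 625_441,
    3: 123_264,
    4: 19_545,
    5: 3_803,
    6: 610,
}


def approximate_distance_sketch(geohash_1: str, geohash_2: str) -> int:
    """Estimate distance from the number of matching leading characters."""
    shared = 0
    for a, b in zip(geohash_1, geohash_2):
        if a != b:
            break
        shared += 1
    # Cap at the longest prefix length the table covers.
    return APPROX_METERS[min(shared, max(APPROX_METERS))]


print(approximate_distance_sketch("bcd3u", "bc83n"))  # 625441
```

Because the estimate depends only on the shared prefix, it is coarse: two nearby points on opposite sides of a cell boundary can report a much larger distance than their true separation.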
Looking Forward
I’m excited to see how people use PyGeoHash in their geospatial projects. The library remains lightweight and dependency-free at its core, while offering optional visualization capabilities for those who need them.
If you’re working with location data in Python, give PyGeoHash a try and let me know what you think. Contributions, bug reports, and feature requests are always welcome on the GitHub repository.
Here’s to another decade of simple, effective geospatial tools in Python!