Starting GovDataHub — Collecting Public Data in One Place
Author: Bryan Beltran

Government data is public. Finding it in a form you can actually use is another story.
I started GovDataHub because I kept hitting the same friction: interesting datasets scattered across federal agencies, inconsistent formats, and no single place to bookmark what I had already explored.
This is a dev log, not a launch announcement. The project is in progress.
The problem I am solving
When I research a topic — housing, transit, agriculture, whatever — I end up with:
- A dozen browser tabs across .gov domains
- CSV downloads with incompatible schemas
- No memory of which dataset I already evaluated and rejected
GovDataHub is my attempt to build a personal index: sources I care about, metadata that helps me compare them, and a path toward exploration without starting from zero each time.
Design constraints
Personal tool rules apply:
- Start small — One domain I actually research beats a generic crawler for everything
- Provenance matters — Every dataset needs source URL, retrieval date, and license notes
- Boring storage first — SQLite or Postgres before anything clever
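To make the provenance rule concrete, here is a minimal sketch of a record that refuses to exist without a source URL, retrieval date, and license notes. The class name, field names, and the example URL are all placeholders for illustration, not the project's actual data model.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record; names are illustrative only."""

    source_url: str
    retrieved_on: date
    license_notes: str

    def __post_init__(self):
        # Reject anything that is not an absolute http(s) URL.
        parsed = urlparse(self.source_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {self.source_url!r}")
        # License notes are required, even if they only say "unknown".
        if not self.license_notes.strip():
            raise ValueError("license_notes must not be empty")


# Placeholder values, not a real dataset.
record = Provenance(
    source_url="https://example.gov/data/housing.csv",
    retrieved_on=date(2024, 1, 15),
    license_notes="Public domain (placeholder note)",
)
```

Validating at construction time means a dataset without provenance never reaches storage, which is cheaper than auditing the catalog later.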
I am not trying to replace Data.gov. I want a workspace that respects how individual developers and researchers actually work.
Technical direction
Stack is still settling. Likely Python for ingestion scripts, a simple API layer, and a frontend when the data model stops changing every week.
Early milestones:
- Catalog schema (source, title, format, update cadence)
- One end-to-end ingest pipeline for a single agency feed
- Search and filter over what has been collected
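The first and third milestones could be sketched with Python's built-in sqlite3 module, in keeping with the boring-storage rule. This assumes a single table whose columns mirror the fields listed above. The table name, function names, and sample rows are hypothetical.

```python
import sqlite3

# Assumed schema: one row per dataset, columns taken from the milestone list.
SCHEMA = """
CREATE TABLE IF NOT EXISTS datasets (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    title TEXT NOT NULL,
    format TEXT NOT NULL,
    update_cadence TEXT
);
"""


def open_catalog(path=":memory:"):
    """Open (or create) the catalog database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # allows access by column name
    conn.execute(SCHEMA)
    return conn


def add_dataset(conn, source, title, fmt, update_cadence=None):
    """Insert one catalog entry and return its row id."""
    cur = conn.execute(
        "INSERT INTO datasets (source, title, format, update_cadence) "
        "VALUES (?, ?, ?, ?)",
        (source, title, fmt, update_cadence),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, text="", fmt=None):
    """Substring match on title, optionally filtered by exact format."""
    sql = "SELECT * FROM datasets WHERE title LIKE ?"
    params = [f"%{text}%"]
    if fmt is not None:
        sql += " AND format = ?"
        params.append(fmt)
    sql += " ORDER BY title"
    return conn.execute(sql, params).fetchall()


# Placeholder rows, not real datasets.
conn = open_catalog()
add_dataset(conn, "Example Agency", "Housing starts by county", "csv", "monthly")
add_dataset(conn, "Example Agency", "Transit ridership", "json", "weekly")
hits = search(conn, "housing", fmt="csv")
```

SQLite's LIKE is case-insensitive for ASCII by default, so "housing" matches "Housing" here. A plain substring query is probably enough until the catalog is large enough to justify full-text search.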
Related work on this site
GovDataHub sits alongside other side projects: SeedStarter for gardening and Browser Listener for page instrumentation experiments. Each solves a narrow problem I keep running into.
I will post updates as the catalog and ingestion pieces land. If public data organization is your kind of rabbit hole, the repo is on GitHub.