Part of the Scheduled Map Rebuild Workflows guide.
Operative rule: commit the GeoJSON only when git diff --staged --quiet returns non-zero. With nothing staged, git commit exits with an error and fails the run, and forcing a commit of an identical file (for example with --allow-empty) inflates repository history and triggers unnecessary downstream cache invalidation.
How It Works
GitHub Actions evaluates cron expressions in UTC and queues a runner to execute your workflow at roughly the scheduled time; scheduled runs can be delayed when the service is under heavy load, and they only fire for the workflow file on the default branch. The runner checks out your repository, runs your transformation script, and, if the output differs from the last committed version, pushes a new commit back via the default GITHUB_TOKEN. The entire pipeline is serverless: no persistent VM, no cron daemon, no external scheduler service.
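Because the schedule is evaluated in UTC, a desired local run time has to be converted before it goes into the cron field. The following is a minimal sketch of that arithmetic, assuming a fixed UTC offset; the helper name utc_cron_for_local_time is hypothetical and not part of the workflow.

```python
from datetime import datetime, timedelta, timezone


def utc_cron_for_local_time(hour: int, minute: int, utc_offset_hours: float) -> str:
    """Return a daily cron expression (in UTC) for a wall-clock time at a fixed UTC offset."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # Any date works because the offset is fixed; only the time of day matters.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


# 03:00 in a UTC+1 zone corresponds to 02:00 UTC
print(utc_cron_for_local_time(3, 0, 1))  # 0 2 * * *
```

Cron has no notion of daylight saving time, so a schedule derived this way drifts by an hour in local terms when the offset changes.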
The critical mechanism is the idempotency guard. Because raw spatial sources sometimes return identical data on consecutive nights, a naive git add && git commit would find nothing to commit on those nights, exit non-zero, and mark the run as failed even though nothing went wrong. Checking git diff --staged --quiet before committing makes the workflow safe to run as often as needed. This pairs naturally with cache invalidation strategies: your CDN or tile cache only needs to be busted when the committed file actually changes.
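The same guard can also be applied one step earlier, inside the rebuild script, so that an unchanged result never touches the working tree. This is a minimal sketch of that idea, not the workflow's own mechanism: write_if_changed is a hypothetical helper, and git diff --staged --quiet remains the check that decides whether a commit happens.

```python
import hashlib
from pathlib import Path


def write_if_changed(path: Path, new_text: str) -> bool:
    """Overwrite `path` only when its content would change; return True if written."""
    new_bytes = new_text.encode("utf-8")
    if path.exists():
        old_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new_digest = hashlib.sha256(new_bytes).hexdigest()
        if old_digest == new_digest:
            return False  # identical output: leave the existing file untouched
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(new_bytes)  # full overwrite, never an append
    return True
```

A script could log the return value so that a "no change" night is visible in the run output without inspecting git.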
The diagram below shows the full execution path from cron trigger to a dashboard consuming the committed file.
Production-Ready Workflow Configuration
Place the following YAML in .github/workflows/nightly-geojson.yml. It triggers at 02:00 UTC daily, accepts a manual dispatch for debugging, installs dependencies from a lockfile for reproducible builds, runs the transformation, validates output, and pushes only when the file has changed.
name: Nightly GeoJSON Rebuild
on:
  schedule:
    - cron: '0 2 * * *' # 02:00 UTC every day
  workflow_dispatch: # manual trigger for debugging
permissions:
  contents: write
jobs:
  rebuild:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: true
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Fetch & Transform
        env:
          SOURCE_API_URL: ${{ secrets.SOURCE_API_URL }}
          API_TOKEN: ${{ secrets.API_TOKEN }}
        run: node scripts/rebuild-geojson.js
      - name: Validate GeoJSON
        run: npx @mapbox/geojsonhint data/output.geojson
      - name: Commit & Push
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add data/output.geojson
          git diff --staged --quiet || \
            git commit -m "chore: nightly GeoJSON rebuild [skip ci]"
          git push
Key points in this configuration:
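- persist-credentials: true (the actions/checkout default) keeps the GITHUB_TOKEN in the local git configuration so the later git push can authenticate; the write access itself comes from permissions: contents: write.
- npm ci installs from package-lock.json, ensuring the same @turf/turf version runs in CI as locally.
- [skip ci] in the commit message keeps push-triggered workflows from running on the resulting commit. Pushes made with the default GITHUB_TOKEN do not start new workflow runs anyway, so the marker mainly matters if you later push with a personal access token or deploy key.
- Secrets are injected via env, never interpolated into run command strings, so they stay out of the generated shell script; GitHub also masks their values in logs.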
Transformation Script
The script fetches, maps, and writes the FeatureCollection. Coordinate truncation to five decimal places reduces file size by 30–50% with no perceptible impact on web map rendering, which directly benefits dashboard load times and CDN cache efficiency.
// scripts/rebuild-geojson.js
import fs from 'fs/promises';
import * as turf from '@turf/turf';
import { fileURLToPath } from 'url';
import { dirname, join } from 'path';
const __dirname = dirname(fileURLToPath(import.meta.url));
const OUT_PATH = join(__dirname, '../data/output.geojson');
async function rebuild() {
  // 1. Fetch source data
  const res = await fetch(process.env.SOURCE_API_URL, {
    headers: { Authorization: `Bearer ${process.env.API_TOKEN}` }
  });
  if (!res.ok) throw new Error(`Source API returned ${res.status}`);
  const raw = await res.json();
  // 2. Build FeatureCollection (RFC 7946: [lng, lat] coordinate order)
  const features = raw.items.map(item => ({
    type: 'Feature',
    properties: {
      id: item.id,
      category: item.category,
      // Assumption: the source exposes its own updated_at field. A build-time
      // new Date() here would change the file on every run and defeat the
      // git diff --staged guard.
      updated_at: item.updated_at ?? null
    },
    geometry: {
      type: 'Point',
      coordinates: [item.lng, item.lat] // longitude first
    }
  }));
  const collection = { type: 'FeatureCollection', features };
  // 3. Truncate to 5 decimal places (~1 m precision at the equator)
  const optimized = turf.truncate(collection, { precision: 5 });
  // 4. Write: fs.writeFile truncates the file before writing
  await fs.writeFile(OUT_PATH, JSON.stringify(optimized, null, 2), 'utf8');
  console.log(`Wrote ${features.length} features to ${OUT_PATH}`);
}
rebuild().catch(err => {
  console.error('GeoJSON rebuild failed:', err.message);
  process.exit(1); // non-zero exit halts the workflow before commit
});
The catch block calls process.exit(1), which causes the workflow step to fail and prevents the commit step from running — no corrupt or empty file ever reaches the repository.
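The effect of the five-decimal truncation in step 3 can be checked on a small sample before trusting it on real data. The sketch below uses only the standard library and rounds Point coordinates by hand; the helper name and the sample point are made up for illustration, and the saving you measure will depend on how many digits your source supplies.

```python
import json


def round_point_coordinates(collection: dict, precision: int = 5) -> dict:
    """Return a copy of a Point-only FeatureCollection with rounded coordinates."""
    rounded = []
    for feature in collection["features"]:
        lng, lat = feature["geometry"]["coordinates"][:2]
        geometry = {
            "type": "Point",
            "coordinates": [round(lng, precision), round(lat, precision)],
        }
        rounded.append({**feature, "geometry": geometry})
    return {"type": "FeatureCollection", "features": rounded}


# Made-up sample point carrying more digits than a web map can display
sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"id": 1},
        "geometry": {"type": "Point", "coordinates": [13.404953999999, 52.520006599999]},
    }],
}
slim = round_point_coordinates(sample)
print(slim["features"][0]["geometry"]["coordinates"])  # [13.40495, 52.52001]
print(len(json.dumps(sample)), "->", len(json.dumps(slim)), "bytes")
```

One degree of latitude is roughly 111 km, so the fifth decimal place corresponds to about 1.1 m, which matches the precision noted in the script comment.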
Alternative: Python Variant
Teams using GeoPandas in the same repository can replace the Node.js script with the following. It uses geopandas 0.14+ and writes output.geojson via the built-in to_file driver.
# scripts/rebuild_geojson.py
import os
import sys
import requests
import geopandas as gpd
from shapely.geometry import Point
SOURCE_URL = os.environ["SOURCE_API_URL"]
API_TOKEN = os.environ["API_TOKEN"]
OUT_PATH = "data/output.geojson"
def rebuild() -> None:
    resp = requests.get(
        SOURCE_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["items"]
    gdf = gpd.GeoDataFrame(
        [{"id": i["id"], "category": i["category"]} for i in items],
        geometry=[Point(i["lng"], i["lat"]) for i in items],
        crs="EPSG:4326",  # always store in WGS 84 for web maps
    )
    # Round coordinates to 5 decimal places before export
    gdf.geometry = gdf.geometry.apply(
        lambda geom: geom.__class__(
            *[round(c, 5) for c in geom.coords[0]]
        )
    )
    gdf.to_file(OUT_PATH, driver="GeoJSON")
    print(f"Wrote {len(gdf)} features to {OUT_PATH}")
if __name__ == "__main__":
    try:
        rebuild()
    except Exception as exc:
        print(f"Rebuild failed: {exc}", file=sys.stderr)
        sys.exit(1)
Adjust the crs parameter if your source data uses a projected coordinate system — always reproject to EPSG:4326 before writing GeoJSON destined for a web map. CRS & Projection Management covers the reprojection step in detail.
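In geopandas the reprojection itself is a call to GeoDataFrame.to_crs("EPSG:4326"). To make clear what that step does, here is a minimal standard-library sketch for one specific case, assuming the source coordinates are in Web Mercator (EPSG:3857); the function name is hypothetical, and other projected systems need a proper library rather than this formula.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857


def web_mercator_to_wgs84(x: float, y: float) -> tuple[float, float]:
    """Convert EPSG:3857 metres to (longitude, latitude) in degrees."""
    lng = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lng, lat


# The projection origin maps to longitude 0, latitude 0 (up to float rounding)
print(web_mercator_to_wgs84(0.0, 0.0))
```

The return order is (longitude, latitude), which matches the coordinate order GeoJSON expects.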
Verification Steps
Run these checks locally before relying on the scheduled run:
- Copy .env.example to .env and populate SOURCE_API_URL and API_TOKEN.
- Run npm ci && node --env-file=.env scripts/rebuild-geojson.js (Node.js 20.6+ reads the file with this flag), or pip install -r requirements.txt && python scripts/rebuild_geojson.py with the same variables exported in your shell.
- Validate the output: npx @mapbox/geojsonhint data/output.geojson. A clean exit means no structural violations.
- Inspect the file: confirm type is "FeatureCollection", that features is a non-empty array, and that coordinates are [longitude, latitude] (not reversed).
- Stage and diff: git add data/output.geojson && git diff --staged. Verify only the expected properties changed.
- Once the workflow file exists on the default branch (manual dispatch and schedules both require that), push your changes to a feature branch and run the workflow against it from the Actions tab via workflow_dispatch before relying on the nightly cron.
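The manual inspection of type, features, and coordinate order can be partly automated. The following is a rough sketch with a hypothetical helper, find_structural_problems, that only looks at Point geometries; it complements geojsonhint rather than replacing it, and it can only detect swapped axes when the swapped value falls outside the valid latitude range.

```python
def find_structural_problems(collection: dict) -> list[str]:
    """Return human-readable problems found in a Point FeatureCollection."""
    problems = []
    if collection.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
    features = collection.get("features")
    if not isinstance(features, list) or not features:
        problems.append("features is missing or empty")
        return problems
    for index, feature in enumerate(features):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch only inspects Point geometries
        lng, lat = geometry["coordinates"][:2]
        if not -180 <= lng <= 180:
            problems.append(f"feature {index}: longitude {lng} out of range")
        if not -90 <= lat <= 90:
            problems.append(f"feature {index}: latitude {lat} out of range (axes swapped?)")
    return problems
```

An empty list means none of these specific checks failed. A point such as [13.4, 52.5] swapped to [52.5, 13.4] still passes, so a visual check on a map remains worthwhile.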
Common Errors & Fixes
The workflow runs but no commit appears in the repository
The git diff --staged --quiet guard exited with code 0, meaning the fetched data is identical to the last committed version. This is correct behaviour. If you expected a change, add a debug step — git diff HEAD data/output.geojson — to confirm the file content before the diff check runs.
geojsonhint reports “right-hand rule violation”
RFC 7946 requires polygons to follow the right-hand rule: exterior rings counter-clockwise, holes clockwise. Add turf.rewind(collection) before turf.truncate in the Node.js script; the default reverse: false already produces this winding, whereas reverse: true would flip exterior rings to clockwise. In the Python variant, import orient from shapely.ops and apply it to each geometry with gdf.geometry = gdf.geometry.apply(lambda g: orient(g, 1.0)), where a sign of 1.0 means counter-clockwise exteriors.
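To see what a rewind does to the coordinates, the winding direction of a ring can be read from the sign of its shoelace area. This is a minimal standard-library sketch with hypothetical helper names; it assumes closed, two-dimensional rings and treats longitude and latitude as planar, which is adequate for small polygons away from the antimeridian.

```python
def signed_area(ring: list[list[float]]) -> float:
    """Shoelace formula; positive for counter-clockwise rings on (lng, lat) axes."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def rewind_polygon(rings: list[list[list[float]]]) -> list[list[list[float]]]:
    """Return rings with a counter-clockwise exterior and clockwise holes."""
    fixed = []
    for index, ring in enumerate(rings):
        want_ccw = index == 0  # the first ring is the exterior, the rest are holes
        is_ccw = signed_area(ring) > 0
        fixed.append(ring if is_ccw == want_ccw else ring[::-1])
    return fixed


# A unit square listed clockwise: up, right, down, back to the start
clockwise_square = [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]
print(signed_area(clockwise_square))                      # -1.0
print(signed_area(rewind_polygon([clockwise_square])[0])) # 1.0
```

Reversing a closed ring keeps it closed, since the first and last positions are identical.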
The runner times out fetching from the source API
The default job timeout is 360 minutes (6 hours), and a network call with no timeout of its own can hang for that long. Add a timeout-minutes: 10 key to the rebuild job. In the fetch call, pass signal: AbortSignal.timeout(25_000) (Node.js 18+); the Python variant already sets timeout=30 in requests.get. Either way the process exits cleanly rather than hanging.
Duplicate features appear after several nightly runs
The script is appending rather than overwriting. fs.appendFile and Python’s open(path, 'a') both add to existing content. Use fs.writeFile (Node.js) or open(path, 'w') (Python) — both truncate the file before writing.
Related
- Scheduled Map Rebuild Workflows — parent guide covering scheduler options and rebuild frequency trade-offs
- Cache Invalidation Strategies — coordinate CDN and browser cache expiry with the nightly commit
- Clearing Browser Tile Cache After Python Data Updates — browser-side cache busting after a GeoJSON commit lands
- Triggering Map Refresh via Supabase Webhooks — event-driven alternative to scheduled rebuilds
- Data Refresh & Automation Pipelines — broader pipeline architecture context