feat: add NGWMN getters as an ogc sibling; extract a shared OGC engine #324
Draft
thodson-usgs wants to merge 1 commit into
35500b3 to 9e4c0ca
feat: add NGWMN getters as an ogc sibling; extract a shared OGC engine

Ports the NGWMN functions from the R dataRetrieval PR
(DOI-USGS/dataRetrieval#904) and, per review, refactors the Water Data
OGC machinery into a shared engine so NGWMN and Water Data are sibling
layers on top of it rather than NGWMN depending on Water Data.

Architecture
------------
dataretrieval/ogc/         generic OGC engine (no API-specific config):
  chunking.py              (moved from waterdata/) the multi-value chunker
  filters.py               (moved) cql-text filter splitting
  progress.py              (moved from waterdata/_progress.py)
  engine.py                request build, paginate, parse, finalize,
                           the chunked get_ogc_data entry point,
                           arg handling
dataretrieval/waterdata/   thin Water Data layer on the engine:
  utils.py                 service->id map, stats API path, profile
                           checks, WATERDATA_DIALECT, and a
                           get_ogc_data wrapper that injects the
                           Water Data defaults (re-exports engine
                           symbols so api.py/ratings.py are unchanged)
dataretrieval/ngwmn.py     sibling module: get_sites, get_water_level,
                           get_lithology, get_well_construction,
                           get_providers; imports the engine from
                           dataretrieval.ogc only

The engine is API-agnostic:
`get_ogc_data(args, service, output_id, *, base_url, extra_id_cols, dialect)`.
An `OgcDialect(cql2_services, date_only_services)` (threaded via a
context variable, like the base-url context) carries the per-API quirks:
Water Data POSTs CQL2 for monitoring-locations and renders `daily` time
args date-only; NGWMN needs neither. `ogc.engine` and
`dataretrieval.ngwmn` both import with zero `dataretrieval.waterdata`
dependency.

NGWMN response-shape fixes in the engine (the NGWMN API differs from
the main one): key the empty-result short-circuit off `features` rather
than the `numberReturned` NGWMN omits; and tolerate observation features
that carry no `geometry` key.

PEP naming: the engine now snake_cases any non-snake column in finalize,
so the package always returns PEP-8 column names regardless of the
upstream API (a no-op today since both APIs are already snake_case, but
enforced).

Tests: live NGWMN tests for all five getters (tests/ngwmn_test.py); a
`_to_snake_case` unit test; mock.patch sites repointed to ogc.engine; a
module-level fixture activates WATERDATA_DIALECT for the direct
_construct_api_requests unit tests. 285 unit tests pass; mypy --strict
and ruff clean. waterdata_test.py shows only the 3 known pre-existing
live-API drift failures (fixed by DOI-USGS#323), unrelated to this
change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
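
For reviewers, the column normalization mentioned under "PEP naming" could look roughly like the sketch below. This is an illustration only: the names `to_snake_case` and `normalize_columns` and the regex approach are assumptions, not the PR's actual `_to_snake_case` implementation.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert camelCase, PascalCase, or kebab-case to snake_case (hypothetical helper)."""
    # Split an uppercase run from a following capitalized word: "HTTPCode" -> "HTTP_Code"
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Split a lowercase letter or digit from a following uppercase letter: "siteId" -> "site_Id"
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    # Collapse whitespace, hyphens, and dots into a single underscore
    s = re.sub(r"[\s\-.]+", "_", s)
    return s.lower()


def normalize_columns(columns: list[str]) -> list[str]:
    """Apply to_snake_case to every column name; already-snake names pass through unchanged."""
    return [to_snake_case(c) for c in columns]
```

Names that are already snake_case are returned as-is, which matches the "no-op today" behavior described in the commit message.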
9e4c0ca to bf02ce3
Ports the NGWMN functions from the R `dataRetrieval` PR (DOI-USGS/dataRetrieval#904) and, per review, refactors the Water Data OGC machinery into a shared engine so NGWMN and Water Data are sibling layers on top of it; NGWMN does not depend on Water Data.

Architecture

The engine is API-agnostic: `get_ogc_data(args, service, output_id, *, base_url, extra_id_cols, dialect)`. An `OgcDialect(cql2_services, date_only_services)` (threaded via a context variable, like the base-url context) carries per-API quirks: Water Data POSTs CQL2 for `monitoring-locations` and renders `daily` time args date-only; NGWMN needs neither. Both `ogc.engine` and `dataretrieval.ngwmn` import with zero `dataretrieval.waterdata` dependency.

The multi-value chunker (recently fixed in #322) is generic and applies to NGWMN unchanged; verified that a forced-small-budget multi-site NGWMN query chunks and unions correctly.
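
As a rough illustration of the dialect threading described above, here is a minimal sketch using `contextvars`. Only the `OgcDialect` name and its two fields come from this PR; the `use_dialect`, `http_method`, and `render_time` helpers, the `WATERDATA_LIKE` values, and the date-trimming rule are assumptions made for the example and may not match the engine's internals.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class OgcDialect:
    """Per-API quirks; field names follow the PR description."""
    cql2_services: frozenset[str] = frozenset()
    date_only_services: frozenset[str] = frozenset()


# Default dialect has no quirks, which is what an API like NGWMN would need.
_dialect: ContextVar[OgcDialect] = ContextVar("ogc_dialect", default=OgcDialect())


@contextmanager
def use_dialect(dialect: OgcDialect) -> Iterator[None]:
    """Activate a dialect for the duration of a block (hypothetical helper)."""
    token = _dialect.set(dialect)
    try:
        yield
    finally:
        _dialect.reset(token)  # restore whatever dialect was active before


def http_method(service: str) -> str:
    """POST when the active dialect sends CQL2 for this service, else GET."""
    return "POST" if service in _dialect.get().cql2_services else "GET"


def render_time(service: str, value: str) -> str:
    """Trim an ISO-8601 timestamp to its date part for date-only services."""
    if service in _dialect.get().date_only_services:
        return value[:10]
    return value


# Assumed values, inferred from the description of Water Data's behavior.
WATERDATA_LIKE = OgcDialect(
    cql2_services=frozenset({"monitoring-locations"}),
    date_only_services=frozenset({"daily"}),
)
```

Inside `with use_dialect(WATERDATA_LIKE):` the helpers pick up the Water Data quirks; outside the block they fall back to the quirk-free default, so a sibling module never has to import anything from `dataretrieval.waterdata`.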
Engine fixes (NGWMN's API differs from the main one)

- Key the empty-result short-circuit off `features` rather than the `numberReturned` that NGWMN omits (otherwise pages with data were silently dropped).
- Tolerate observation features that carry no `geometry` key (`GeoDataFrame.from_features` can't index a missing key).

PEP naming

The engine snake_cases any non-snake column in `finalize`, so the package always returns PEP-8 column names regardless of the upstream API. This is a no-op today (both APIs are already snake_case) but enforced going forward.

Tests

Live NGWMN tests for all five getters (`tests/ngwmn_test.py`); a `_to_snake_case` unit test; `mock.patch` sites repointed to `ogc.engine`; a module fixture activates `WATERDATA_DIALECT` for the direct `_construct_api_requests` unit tests. 285 unit tests pass; `mypy --strict` and `ruff` are clean.

Note

CI will show 3 pre-existing failures (`test_get_daily_properties`/`_id`, `test_get_continuous`): the live-API drift fixed by #323, not introduced here (the branch is off `main`). They go green once #323 merges.

🤖 Generated with Claude Code
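
The two response-shape fixes listed under "Engine fixes" can be sketched as follows. The function names `page_is_empty` and `with_geometry_key` are hypothetical and the code is a simplified stand-in for the engine's parsing path, assuming each page is a GeoJSON-style dict with an optional `features` list.

```python
from typing import Any


def page_is_empty(page: dict[str, Any]) -> bool:
    """Decide emptiness from `features` alone.

    `numberReturned` is deliberately ignored, since some responses omit it
    and a missing count must not be read as "no data".
    """
    return not page.get("features")


def with_geometry_key(features: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return shallow copies of the features with a `geometry` key always present.

    Features that already carry a geometry keep it; the rest get None.
    """
    return [{**f, "geometry": f.get("geometry")} for f in features]
```

After this normalization, code that indexes `feature["geometry"]` no longer raises `KeyError` on observation features that have properties only.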