fix(waterdata): drop id from wire properties so daily/continuous don't 400#323
Draft
thodson-usgs wants to merge 1 commit into
Draft
fix(waterdata): drop id from wire properties so daily/continuous don't 400#323thodson-usgs wants to merge 1 commit into
properties so daily/continuous don't 400#323thodson-usgs wants to merge 1 commit into
Conversation
…s don't 400
The USGS OGC API now rejects the feature `id` as a selectable `properties`
value on the `daily` and `continuous` collections (HTTP 400
"InvalidParameterValue. unknown properties specified"; `latest-daily` still
accepts it). `_switch_properties_id` translated the user-facing id column
(e.g. `daily_id`) to `id` and *kept* it in the wire `properties` list, so any
call passing the id column in `properties` started failing — breaking
`test_get_daily_properties` and `test_get_daily_properties_id`.
The feature `id` is always returned and is renamed to the service-specific id
column in `_arrange_cols`, so it never needs to be requested. Drop every id
alias (`id`, the service id like `daily_id`, and its singular form) plus
`geometry` (controlled by `skip_geometry`) from the wire list, matching the R
dataRetrieval package's `switch_properties_id`. User-facing behavior is
unchanged: callers still pass `daily_id`/`id` in `properties` and get the same
output columns; only the request the chunker sends is corrected.
Also update the stale `test_get_continuous` geometry assertion: with
`skip_geometry` unset the server now includes geometry by default, which the
library already documents ("the server defaults to including geometry"). The
user-facing default is intentionally left unchanged.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
thodson-usgs
added a commit
to thodson-usgs/dataretrieval-python
that referenced
this pull request
Jun 12, 2026
Ports the NGWMN functions from the R dataRetrieval PR (DOI-USGS/dataRetrieval#904) and, per review, refactors the Water Data OGC machinery into a shared engine so NGWMN and Water Data are sibling layers on top of it rather than NGWMN depending on Water Data. Architecture ------------ dataretrieval/ogc/ generic OGC engine (no API-specific config): chunking.py (moved from waterdata/) the multi-value chunker filters.py (moved) cql-text filter splitting progress.py (moved from waterdata/_progress.py) engine.py request build, paginate, parse, finalize, the chunked get_ogc_data entry point, arg handling dataretrieval/waterdata/ thin Water Data layer on the engine: utils.py service->id map, stats API path, profile checks, WATERDATA_DIALECT, and a get_ogc_data wrapper that injects the Water Data defaults (re-exports engine symbols so api.py/ratings.py are unchanged) dataretrieval/ngwmn.py sibling module: get_sites, get_water_level, get_lithology, get_well_construction, get_providers — imports the engine from dataretrieval.ogc only The engine is API-agnostic: `get_ogc_data(args, service, output_id, *, base_url, extra_id_cols, dialect)`. An `OgcDialect(cql2_services, date_only_services)` (threaded via a context variable, like the base-url context) carries the per-API quirks — Water Data POSTs CQL2 for monitoring-locations and renders `daily` time args date-only; NGWMN needs neither. `ogc.engine` and `dataretrieval.ngwmn` both import with zero `dataretrieval.waterdata` dependency. NGWMN response-shape fixes in the engine (the NGWMN API differs from the main one): key the empty-result short-circuit off `features` rather than the `numberReturned` NGWMN omits; and tolerate observation features that carry no `geometry` key. PEP naming: the engine now snake_cases any non-snake column in finalize, so the package always returns PEP-8 column names regardless of the upstream API (a no-op today since both APIs are already snake_case, but enforced). Tests: live NGWMN tests for all five getters (tests/ngwmn_test.py); a `_to_snake_case` unit test; mock.patch sites repointed to ogc.engine; a module-level fixture activates WATERDATA_DIALECT for the direct _construct_api_requests unit tests. 285 unit tests pass; mypy --strict and ruff clean. waterdata_test.py shows only the 3 known pre-existing live-API drift failures (fixed by DOI-USGS#323), unrelated to this change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
thodson-usgs
added a commit
to thodson-usgs/dataretrieval-python
that referenced
this pull request
Jun 12, 2026
Ports the NGWMN functions from the R dataRetrieval PR (DOI-USGS/dataRetrieval#904) and, per review, refactors the Water Data OGC machinery into a shared engine so NGWMN and Water Data are sibling layers on top of it rather than NGWMN depending on Water Data. Architecture ------------ dataretrieval/ogc/ generic OGC engine (no API-specific config): chunking.py (moved from waterdata/) the multi-value chunker filters.py (moved) cql-text filter splitting progress.py (moved from waterdata/_progress.py) engine.py request build, paginate, parse, finalize, the chunked get_ogc_data entry point, arg handling dataretrieval/waterdata/ thin Water Data layer on the engine: utils.py service->id map, stats API path, profile checks, WATERDATA_DIALECT, and a get_ogc_data wrapper that injects the Water Data defaults (re-exports engine symbols so api.py/ratings.py are unchanged) dataretrieval/ngwmn.py sibling module: get_sites, get_water_level, get_lithology, get_well_construction, get_providers — imports the engine from dataretrieval.ogc only The engine is API-agnostic: `get_ogc_data(args, service, output_id, *, base_url, extra_id_cols, dialect)`. An `OgcDialect(cql2_services, date_only_services)` (threaded via a context variable, like the base-url context) carries the per-API quirks — Water Data POSTs CQL2 for monitoring-locations and renders `daily` time args date-only; NGWMN needs neither. `ogc.engine` and `dataretrieval.ngwmn` both import with zero `dataretrieval.waterdata` dependency. NGWMN response-shape fixes in the engine (the NGWMN API differs from the main one): key the empty-result short-circuit off `features` rather than the `numberReturned` NGWMN omits; and tolerate observation features that carry no `geometry` key. PEP naming: the engine now snake_cases any non-snake column in finalize, so the package always returns PEP-8 column names regardless of the upstream API (a no-op today since both APIs are already snake_case, but enforced). Tests: live NGWMN tests for all five getters (tests/ngwmn_test.py); a `_to_snake_case` unit test; mock.patch sites repointed to ogc.engine; a module-level fixture activates WATERDATA_DIALECT for the direct _construct_api_requests unit tests. 285 unit tests pass; mypy --strict and ruff clean. waterdata_test.py shows only the 3 known pre-existing live-API drift failures (fixed by DOI-USGS#323), unrelated to this change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem (CI is red on
main)The live
dailyandcontinuousintegration tests began failing around Jun 11–12, independent of any code change:test_get_daily_properties/_id→400: InvalidParameterValue. unknown properties specifiedtest_get_continuous→assert 'geometry' not in df.columns(geometry is present)These are live-API drift, not a regression in the repo (main's last green CI ran Jun 10; the API changed after). They red every PR's
ubuntumatrix until fixed.Root cause
I probed the live API to characterize the change:
properties=id,valueproperties=valuedailycontinuouslatest-dailyidis no longer a selectable property ondaily/continuous(the featureidis always returned anyway)._switch_properties_idtranslated the user-facing id column (daily_id) toidand kept it in the wireproperties, so requests that listed the id column started 400ing. (latest-dailystill toleratesid, which is why the CQL test kept passing.)skipGeometryis unset — which the library already documents (skip_geometry=None"leaves skipGeometry unset (the server defaults to including geometry)").Fix
_switch_properties_idnow drops every id alias (id, the service id likedaily_id, and its singular form) plusgeometryfrom the wireproperties, instead of translating toidand keeping it. The featureidalways comes back and is renamed to the service id column in_arrange_cols, so it never needs to be requested. This matches the RdataRetrievalpackage'sswitch_properties_id(which removesid/geometry).daily_id/idinpropertiesand get the same output columns — only the request the chunker sends is corrected.test_get_continuous: updated the stale geometry assertion to match the documented default (server includes geometry whenskip_geometryis unset). The user-facing default is intentionally not changed.Note on consistency (raised in review)
The R package excludes geometry for
continuousby forcingskipGeometry = TRUEinread_waterdata_continuous(high-frequency data → geometry per row is pure bloat). Python'sget_continuousinstead defaultsskip_geometry=None(server default). Aligning Python would be a user-facing default change, which is deliberately out of scope here — flagging it as a possible follow-up enhancement.Verification
The 3 originally-failing tests pass against the live API, and the related id/geometry tests stay green (
test_get_daily_no_geometry,test_get_cql_properties_id_translation,test_get_latest_daily_properties_geometry, …). Fullwaterdata_test.py(75) pluswaterdata_utils/filters/chunkingsuites pass;ruffandmypy --strict dataretrieval/clean.🤖 Generated with Claude Code