fix: soilweb links #340

Open
jjmaynard wants to merge 8 commits into main from fix/soilweb-links

@jjmaynard jjmaynard commented Oct 1, 2025

Collaborator

Description

This PR fixes URL mapping in the US soil identification system. Soil components were receiving incorrect SDE/SEE URLs because components were reordered during processing, which misaligned the indices used to look the URLs up. #294

Problem

  • Soil components were receiving incorrect SDE (Soil Data Explorer) and SEE (Soil Series Explorer) URLs
  • The URL lists were built in the original component order, but Site list generation indexed into them with indices from `mucompdata_cond_prob.iterrows()`, which had been reordered by multiple sorting operations
  • This index misalignment caused components to receive URLs intended for different components
  • Non-deterministic component ordering in groupby operations added further inconsistency
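
A minimal sketch of the failure mode described above, not the actual code from this branch. The component names, cokeys, probabilities, and `example.org` URLs are made up, and the `reset_index` call is an assumption about how the positional indices ended up out of step with the URL list.

```python
import pandas as pd

# Hypothetical components; cokeys, names, and probabilities are illustrative only.
components = pd.DataFrame(
    {
        "cokey": [101, 102, 103],
        "compname": ["Alpha", "Bravo", "Charlie"],
        "cond_prob": [0.2, 0.5, 0.3],
    }
)

# URL list built in the ORIGINAL row order (placeholder URLs, not real SoilWeb links).
see_urls = [f"https://example.org/see/{name.lower()}" for name in components["compname"]]

# Later processing re-sorts the frame and renumbers the rows (assumed here).
reordered = components.sort_values("cond_prob", ascending=False).reset_index(drop=True)

# Positional lookup now pairs each row with a URL built for a different row.
mismatched = {row["compname"]: see_urls[i] for i, row in reordered.iterrows()}

for name, url in mismatched.items():
    print(name, "->", url)
```

In this toy data every component ends up with another component's URL, which is the same kind of mismatch reported in the issue.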

Solution

URL Mapping Overhaul:

  • Replaced index-based URL storage (SDE_URL/SEE_URL arrays) with a mapping keyed by component key (cokey)
  • Implemented a `cokey_to_urls` dictionary for safe component-to-URL lookups
  • URLs are now retrieved by component key instead of by positional index
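
The keyed lookup can be sketched roughly as follows. Only the `cokey_to_urls` name comes from this PR; the dictionary layout (`"sde"`/`"see"` keys), the sample data, and the `example.org` URLs are assumptions for illustration.

```python
import pandas as pd

# Hypothetical components; values are illustrative only.
components = pd.DataFrame(
    {
        "cokey": [101, 102, 103],
        "compname": ["Alpha", "Bravo", "Charlie"],
        "cond_prob": [0.2, 0.5, 0.3],
    }
)

# Build the mapping once, keyed by cokey rather than by row position.
# The {"sde": ..., "see": ...} layout is an assumption, not the branch's actual structure.
cokey_to_urls = {
    cokey: {
        "sde": f"https://example.org/sde/{name.lower()}",
        "see": f"https://example.org/see/{name.lower()}",
    }
    for cokey, name in zip(components["cokey"], components["compname"])
}

# Any amount of re-sorting is now harmless, because lookups no longer use positions.
reordered = components.sort_values("cond_prob", ascending=False).reset_index(drop=True)

missing = {"sde": None, "see": None}  # fallback for a cokey with no URLs
sites = [
    {"name": row["compname"], **cokey_to_urls.get(row["cokey"], missing)}
    for _, row in reordered.iterrows()
]
```

Using `.get` with a fallback means a component without an entry gets empty URLs rather than raising or silently borrowing a neighbour's.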

Deterministic Ordering:

  • Changed `groupby("cokey", sort=False)` to `groupby("cokey", sort=True)` for consistent component processing
  • Added deterministic component name handling for duplicate soil series
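
A small sketch of why the `sort` flag matters. With `sort=False`, pandas yields groups in order of first appearance, so group order follows whatever order the rows arrive in; with `sort=True`, groups come back in key order regardless of row order. The horizon data below is invented for the example.

```python
import pandas as pd

# Hypothetical horizon rows; cokeys and depths are illustrative only.
horizons = pd.DataFrame(
    {
        "cokey": [303, 101, 202, 101, 303],
        "hzdept_r": [0, 0, 0, 20, 15],
    }
)

# sort=False: group order follows first appearance in the input rows.
first_seen_order = [key for key, _ in horizons.groupby("cokey", sort=False)]

# sort=True: group order follows the sorted keys.
sorted_order = [key for key, _ in horizons.groupby("cokey", sort=True)]

# Shuffling the rows does not change the sort=True group order.
shuffled = horizons.sample(frac=1, random_state=0)
shuffled_sorted_order = [key for key, _ in shuffled.groupby("cokey", sort=True)]

print(first_seen_order)
print(sorted_order)
```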

@jjmaynard jjmaynard changed the title Fix/soilweb links Fix: soilweb links Oct 1, 2025
@jjmaynard jjmaynard changed the title Fix: soilweb links fix: soilweb links Oct 1, 2025
@jjmaynard jjmaynard requested a review from garobrik October 1, 2025 00:06
garobrik and others added 3 commits February 26, 2026 16:22
…soil identification

- Replace index-based URL storage with component key mapping to fix URL mismatches
- Add deterministic sorting to groupby operations for consistent component ordering
- Improve component name duplication handling with sorted processing
- Fix Series URL generation logic to properly match components with their URLs

Resolves issues where soil components received incorrect SDE/SEE URLs due to sorting misalignment between URL lists and component data ordering.
Improves code readability by reformatting long sort_values and other function calls across the file. No functional changes were made; only code style and formatting were updated for clarity and consistency.

@garobrik garobrik left a comment

Member

hey jon, i'm seeing the expected URL fix, but i'm surprised to also see that the data scores have changed by up to 10%, is that an expected outcome for this PR as well?

jjmaynard added 2 commits May 14, 2026 12:31
- Update US snapshot fixtures to reflect current deterministic outputs captured from fixture-backed
  test execution.
- Keep snapshot set aligned with branch baseline before subsequent logic changes.
- No production code changes in this commit; snapshot JSON artifacts only.
- Replace positional OSD/ESD alignment paths with normalized cokey-keyed mapping to avoid
  component drift when grouped data orders differ.
- Harden texture infill behavior by supporting partial-missing horizons and filling only missing
  layer values from OSD while preserving available SSURGO values.
- Update getTexture classification handling: return None for missing/NaN sand-clay inputs,
  refresh classification doc details, and adjust condition/choice evaluation for safer mapping.
- Normalize texture modifiers (e.g. very fine/fine/medium/coarse) before getSand/getClay lookups
  so OSD labels map consistently to canonical texture classes.
- Fix information_gain weighted entropy computation by using aligned numeric Series reduction and
  normalizing target_col handling to prevent int/str aggregation TypeError in soil_sim.
- Regenerate point validation artifact for 32.25459,-106.76431 and update related test/output files
  produced by the branch workflow.
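
As a rough illustration of the partial-missing texture infill described in the commit message above: the sketch fills only the missing SSURGO layer values from OSD and keeps every SSURGO value that is present. Using `DataFrame.combine_first` is an assumption made for this example, and the branch may implement the infill differently. The frames, layer index, and sand/clay values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical per-layer texture data; rows are layers 0..2, values are percentages.
ssurgo = pd.DataFrame(
    {
        "sand": [40.0, np.nan, np.nan],
        "clay": [20.0, 25.0, np.nan],
    }
)
osd = pd.DataFrame(
    {
        "sand": [35.0, 38.0, 42.0],
        "clay": [18.0, 22.0, 30.0],
    }
)

# combine_first keeps SSURGO values where present and takes OSD values only where
# SSURGO is NaN, aligning on the layer index and column names.
filled = ssurgo.combine_first(osd)

print(filled)
```

Layer 1 keeps its SSURGO clay value while taking sand from OSD, which is the partially missing horizon case the commit calls out.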
@jjmaynard jjmaynard force-pushed the fix/soilweb-links branch 2 times, most recently from fdbb709 to 7936deb Compare May 15, 2026 20:38
3 participants