
Fix(table_diff): Make --limit per-sample and not across all samples#4727

Merged
erindru merged 1 commit into main from erin/table-diff-fix
Jun 11, 2025

@erindru (Collaborator) commented Jun 11, 2025

Prior to this PR, the sample data was fetched in a single query and then split client-side into "source_only", "target_only" and "common".

The --limit was applied to the overall query, which meant that common rows with mismatches could be pushed out of the sample by target-only rows.

This led to output like the following (note that the screenshot uses --limit 2 to illustrate the problem):
Screenshot From 2025-06-12 11-05-11

Notice the partial match, yet the sample claims that all joined rows match and no source-only rows are shown.
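To make the failure mode concrete, here is a minimal sketch in plain Python rather than the actual table_diff code. The row layout, the `classify` helper, and the order in which rows come back are all assumptions made for illustration; the point is only that a single limit applied before the client-side split can starve some categories.

```python
# Hypothetical joined rows: (key, in_source, in_target, values_match).
# The ordering is an assumption; a real query may return rows in any order.
rows = [
    (1, False, True, None),   # target only
    (2, False, True, None),   # target only
    (3, True, True, False),   # common, but values differ
    (4, True, False, None),   # source only
]


def classify(row):
    """Hypothetical helper: name the sample category a row belongs to."""
    _, in_source, in_target, _ = row
    if in_source and in_target:
        return "common"
    return "source_only" if in_source else "target_only"


def sample_with_overall_limit(rows, limit):
    """Apply the limit once to the combined result, then split client-side."""
    buckets = {"source_only": [], "target_only": [], "common": []}
    for row in rows[:limit]:
        buckets[classify(row)].append(row)
    return buckets


overall = sample_with_overall_limit(rows, limit=2)
print(overall)
```

With a limit of 2, both slots are taken by target-only rows, so the mismatching common row and the source-only row never reach the sample.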

This PR changes how the sample data is fetched by applying the --limit on a per-sample basis ("source_only", "target_only", "common") and not on an overall basis.

This updates the above output with --limit 2 to look like:
Screenshot From 2025-06-12 11-03-52
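The effect of a per-sample limit can be sketched in plain Python as follows. This is an illustration only, not the PR's implementation (which changes how the sample data is fetched); the row layout and the `classify` helper are hypothetical.

```python
# Hypothetical joined rows: (key, in_source, in_target, values_match).
rows = [
    (1, False, True, None),   # target only
    (2, False, True, None),   # target only
    (3, True, True, False),   # common, but values differ
    (4, True, False, None),   # source only
]


def classify(row):
    """Hypothetical helper: name the sample category a row belongs to."""
    _, in_source, in_target, _ = row
    if in_source and in_target:
        return "common"
    return "source_only" if in_source else "target_only"


def sample_with_per_sample_limit(rows, limit):
    """Keep up to `limit` rows for each category independently."""
    buckets = {"source_only": [], "target_only": [], "common": []}
    for row in rows:
        bucket = buckets[classify(row)]
        if len(bucket) < limit:
            bucket.append(row)
    return buckets


per_sample = sample_with_per_sample_limit(rows, limit=2)
print(per_sample)
```

With the same limit of 2, each category now gets its own budget, so the mismatching common row and the source-only row both appear alongside the two target-only rows.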

@erindru force-pushed the erin/table-diff-fix branch from 99509b8 to 601f0cf June 11, 2025 23:15
@erindru enabled auto-merge (squash) June 11, 2025 23:20
@erindru merged commit 11ce8c8 into main Jun 11, 2025
25 checks passed
@erindru deleted the erin/table-diff-fix branch June 11, 2025 23:28