Fix(table_diff): Make --limit per-sample and not across all samples #4727
Merged
izeigerman
approved these changes
Jun 11, 2025
99509b8 to
601f0cf
Prior to this PR, the sample data was fetched in a single query and then split client-side into "source_only", "target_only" and "common".
The `--limit` was applied to the overall query, which meant that common rows with mismatches could be pushed out by target-only rows. This led to misleading output. In the accompanying screenshot, which uses `--limit 2` to illustrate the problem, the summary reports a partial match, yet the sample claims that all joined rows match and no source-only rows are shown.
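To make the failure mode concrete, here is a minimal sketch in plain Python. It is not the SQLMesh implementation; the row layout, `classify`, and `sample_with_overall_limit` are hypothetical names chosen for illustration. It assumes the joined result happens to return target-only rows first.

```python
from itertools import islice

# Hypothetical joined rows: (key, in_source, in_target).
# Assumption: the single query happens to return target-only rows first.
rows = [
    (1, False, True),   # target only
    (2, False, True),   # target only
    (3, True, True),    # common (imagine its values mismatch)
    (4, True, False),   # source only
]


def classify(row):
    """Assign a row to one of the three sample buckets."""
    _, in_source, in_target = row
    if in_source and in_target:
        return "common"
    return "source_only" if in_source else "target_only"


def sample_with_overall_limit(rows, limit):
    """Apply the limit to the whole result first, then split client-side."""
    buckets = {"source_only": [], "target_only": [], "common": []}
    for row in islice(rows, limit):
        buckets[classify(row)].append(row)
    return buckets


old_style = sample_with_overall_limit(rows, 2)
print(old_style)
```

With a limit of 2, both slots are consumed by target-only rows, so the "common" and "source_only" buckets come back empty even though such rows exist.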
This PR changes how the sample data is fetched by applying the `--limit` on a per-sample basis ("source_only", "target_only", "common") rather than to the overall query. The PR includes a screenshot of the updated output for the same `--limit 2` run.
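The per-sample idea can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the query SQLMesh generates: the `joined` table, its `s_exists` / `t_exists` columns, and `fetch_samples` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined source/target comparison result.
conn.execute("CREATE TABLE joined (id INTEGER, s_exists INTEGER, t_exists INTEGER)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [
        (1, 0, 1),  # target only
        (2, 0, 1),  # target only
        (3, 0, 1),  # target only
        (4, 1, 1),  # common
        (5, 1, 0),  # source only
    ],
)

# One fixed predicate per sample; these strings are constants, not user input.
SAMPLE_FILTERS = {
    "source_only": "s_exists = 1 AND t_exists = 0",
    "target_only": "s_exists = 0 AND t_exists = 1",
    "common": "s_exists = 1 AND t_exists = 1",
}


def fetch_samples(conn, limit):
    """Run one limited query per sample so no bucket can starve another."""
    samples = {}
    for name, predicate in SAMPLE_FILTERS.items():
        query = f"SELECT id FROM joined WHERE {predicate} ORDER BY id LIMIT ?"
        samples[name] = [row[0] for row in conn.execute(query, (limit,))]
    return samples


samples = fetch_samples(conn, 2)
print(samples)
```

Because each bucket gets its own `LIMIT`, the three target-only rows can no longer crowd out the common or source-only rows; each sample returns up to `limit` rows independently.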