Introduce config to allow fail on fallback #12279

Open
felixloesing wants to merge 2 commits into apache:main from felixloesing:felixloesing/fail_on_fallback_config
@felixloesing

Contributor

What changes are proposed in this pull request?

Adds spark.gluten.sql.columnar.failOnFallback (default false). When enabled, the query throws a GlutenException listing the operators and reasons whenever any operator falls back to Spark instead of running natively.

We are rolling out Gluten to only queries that have been determined to be fully offloadable and use this config to guard against query changes after migration.

GlutenFallbackReporter now drives both its log output and the SQL UI summary through a shared GlutenExplainUtils.visitFallbackNodes helper, so all three views (logs, UI, failOnFallback) agree on what counts as a fallback. As a side effect, untagged vanilla operators that were previously silent in the reporter logs are now reported with the reason "Gluten does not touch it or does not support it", matching what the SQL UI has always shown.
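
To illustrate the idea of a single traversal feeding every consumer, here is a minimal, self-contained sketch. It is not the code in this patch: `PlanNode`, `FallbackException`, `collectFallbacks`, and `checkPlan` are hypothetical stand-ins, and the real `GlutenExplainUtils.visitFallbackNodes` operates on Spark plans and may have a different signature. Only the config key and the default reason string are taken from the description above.

```scala
// Hypothetical, simplified model. None of these types are Gluten's real classes.
object FailOnFallbackSketch {
  // Config key and default reason as quoted in the PR description.
  val FailOnFallbackKey = "spark.gluten.sql.columnar.failOnFallback"
  val DefaultReason = "Gluten does not touch it or does not support it"

  // Toy plan node: `native` marks offloaded operators; `fallbackReason`
  // is the tag left by validation, if there is one.
  final case class PlanNode(
      name: String,
      native: Boolean,
      fallbackReason: Option[String] = None,
      children: Seq[PlanNode] = Nil)

  // Stand-in for GlutenException.
  final class FallbackException(message: String) extends RuntimeException(message)

  // The one traversal that decides what counts as a fallback.
  // Untagged non-native operators are reported with the default reason.
  def visitFallbackNodes(plan: PlanNode)(visit: (PlanNode, String) => Unit): Unit = {
    if (!plan.native) visit(plan, plan.fallbackReason.getOrElse(DefaultReason))
    plan.children.foreach(child => visitFallbackNodes(child)(visit))
  }

  // Consumer shared by logging, a UI summary, or the fail-on-fallback check.
  def collectFallbacks(plan: PlanNode): Seq[(String, String)] = {
    val found = Seq.newBuilder[(String, String)]
    visitFallbackNodes(plan)((node, reason) => found += (node.name -> reason))
    found.result()
  }

  // Throws when the flag is on and at least one operator fell back;
  // otherwise returns the (possibly empty) fallback list for reporting.
  def checkPlan(plan: PlanNode, failOnFallback: Boolean): Seq[(String, String)] = {
    val fallbacks = collectFallbacks(plan)
    if (failOnFallback && fallbacks.nonEmpty) {
      val details = fallbacks.map { case (op, reason) => s"  $op: $reason" }.mkString("\n")
      throw new FallbackException(
        s"$FailOnFallbackKey is enabled and ${fallbacks.size} operator(s) fell back:\n$details")
    }
    fallbacks
  }
}
```

In this sketch, a plan with one tagged and one untagged vanilla operator yields two entries from `checkPlan(plan, failOnFallback = false)`, and the same call with `failOnFallback = true` throws an exception whose message names the config key. Because every consumer goes through `visitFallbackNodes`, they cannot disagree about which operators are fallbacks.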

How was this patch tested?

  • GlutenFallbackSuite: a new test, "fail job on fallback when failOnFallback is enabled", covers both the default-off case (no exception) and the enabled case (throws a GlutenException whose message mentions the config key).
    • Existing fallback logging and event tests still pass.
  • Manual: run a query with an unsupported operator under failOnFallback=true and confirm the job fails with a useful operator/reason list.
  • This has been rolled out internally.

Was this patch authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8

@github-actions bot added the CORE (works for Gluten Core) and DOCS labels on Jun 11, 2026
@github-actions

Run Gluten Clickhouse CI on x86

.doc(
"When true, throw an exception if any operator falls back to Spark" +
" instead of running on the native engine.")
.booleanConf

Member

Maybe add .internal() to mark this config as not intended for end users.

}
}
}
}

Member

Since this test is independent of any Spark version, can we move it to the gluten-ut/test module—perhaps as a new test suite—so we only need to maintain a single copy of the test code?

Contributor Author

Thanks, @philo-he, for your quick review! I addressed both of your comments. Please take another look.

@github-actions bot removed the DOCS label on Jun 11, 2026
@github-actions

Run Gluten Clickhouse CI on x86

@philo-he

Member

cc @FelixYBW @zhztheplayer

Labels

CORE works for Gluten Core

2 participants