dxf: document ADD INDEX disk space precheck #23190

Open
expxiaoli wants to merge 3 commits into pingcap:master from expxiaoli:fix_69399

@expxiaoli

What is changed, added or deleted? (Required)

  • Add documentation for the DXF ADD INDEX TiKV disk space precheck.
  • Add the enforce_disk_space_precheck_before_add_index system variable, including its default value, scope, and enforcement behavior.
  • Describe the TiKV capacity thresholds, pseudo-statistics behavior, timeout behavior, and TiDB log fields for observing predicted and ingested SST bytes.
  • Update the system variable reference page.

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

ti-chi-bot (bot) added the first-time-contributor label (indicates that the PR was contributed by an external member and is a first-time contributor) on Jul 2, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jul 2, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign overvenus for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pingcap-cla-assistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

ti-chi-bot (bot) added two labels on Jul 2, 2026: missing-translation-status (this PR does not have translation status info) and size/M (denotes a PR that changes 30-99 lines, ignoring generated files)
expxiaoli requested review from dveeden and likidu on July 2, 2026 07:22

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request documents the new system variable enforce_disk_space_precheck_before_add_index and details the TiKV disk space precheck mechanism for ADD INDEX tasks within the Distributed Execution Framework (DXF). The review feedback focuses on improving clarity, avoiding passive voice, and ensuring consistent terminology (specifically using "non-pseudo statistics" instead of "non-pseudo table statistics") across the updated documentation files.

Comment thread system-variables.md Outdated
- Default value: `OFF`
- This variable controls whether TiDB rejects a DXF [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) task when the TiKV disk space precheck predicts insufficient TiKV capacity.
- When the value is `OFF`, TiDB still performs the precheck and logs warnings for insufficient TiKV capacity, but it does not reject the DDL job.
- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks).

Contributor

low

To maintain consistency with the rest of the documentation, use the term 'non-pseudo statistics' instead of 'non-pseudo table statistics'.

Suggested change
Before:
- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks).
After:
- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks).
References
  1. Use consistent terminology. (link)

Comment thread tidb-distributed-execution-framework.md Outdated

For [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) tasks executed by the DXF, TiDB collects a TiKV capacity snapshot before submitting the distributed task, predicts the TiKV index size based on block sampling, table statistics, and the replica count, and then checks whether TiKV has enough remaining disk space for the task. This precheck applies to both local sort and Global Sort execution paths.

TiDB considers TiKV disk space insufficient if either of the following conditions is met:

Contributor

low

Simplify the sentence and avoid passive voice ('is met') by using 'in either of the following cases:'.

Suggested change
Before:
TiDB considers TiKV disk space insufficient if either of the following conditions is met:
After:
TiDB considers TiKV disk space insufficient in either of the following cases:
References
  1. Avoid passive voice overuse and unnecessary words. (link)

Comment thread tidb-distributed-execution-framework.md Outdated
```sql
SET GLOBAL enforce_disk_space_precheck_before_add_index = ON;
```

When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and continues to submit the DDL job.
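
The enforcement rules described in this paragraph can be summarized as a small decision function. The following is only an illustrative sketch in Python, not TiDB's implementation: the names `Outcome`, `PrecheckResult`, and `decide` are hypothetical, and the capacity thresholds themselves are not modeled because they are not shown in this excerpt.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Hypothetical outcomes of the precheck, named for illustration only."""
    SUBMIT = "submit"
    SUBMIT_WITH_WARNING = "submit_with_warning"
    REJECT = "reject"


@dataclass
class PrecheckResult:
    """Hypothetical summary of one precheck run."""
    completed_in_time: bool  # False if the snapshot or prediction exceeded the 5-second limit
    insufficient: bool       # True if the prediction indicates insufficient TiKV capacity
    pseudo_stats: bool       # True if the prediction used pseudo statistics


def decide(enforce: bool, result: PrecheckResult) -> Outcome:
    """Map the documented rules to an outcome.

    `enforce` stands for enforce_disk_space_precheck_before_add_index (ON = True).
    """
    # Timeout: log a warning, skip the precheck, and submit the DDL job.
    if not result.completed_in_time:
        return Outcome.SUBMIT_WITH_WARNING
    # Enough predicted capacity: nothing to warn about.
    if not result.insufficient:
        return Outcome.SUBMIT
    # Reject only when enforcement is ON and the prediction used non-pseudo statistics.
    if enforce and not result.pseudo_stats:
        return Outcome.REJECT
    # OFF, or ON with pseudo statistics: warn but do not reject.
    return Outcome.SUBMIT_WITH_WARNING
```

Under this sketch, `decide(True, PrecheckResult(True, True, False))` is the only combination that yields a rejection; every other insufficient-capacity or timeout case submits the job with a warning.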

Contributor

low

Improve clarity and conciseness by changing 'non-pseudo table statistics' to 'non-pseudo statistics' for consistency, replacing the repetitive use of 'TiDB' with 'it', and simplifying 'continues to submit' to 'submits'.

Suggested change
Before:
When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and continues to submit the DDL job.
After:
When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, it logs a warning, skips the precheck, and submits the DDL job.
References
  1. Use consistent terminology and avoid unnecessary words and repetition. (link)

Comment thread system-variables.md Outdated
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
- Type: Boolean
- Default value: `OFF`
- This variable controls whether TiDB rejects a DXF [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) task when the TiKV disk space precheck predicts insufficient TiKV capacity.

Member

Does this only apply to DXF?

Author

It will apply to both DXF and classic. Should I update the doc now to state that it applies to both environments?

@ti-chi-bot

ti-chi-bot Bot commented Jul 3, 2026

@wjhuang2016: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@expxiaoli

Author

/hold

ti-chi-bot (bot) added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Jul 3, 2026

Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • first-time-contributor: Indicates that the PR was contributed by an external member and is a first-time contributor.
  • missing-translation-status: This PR does not have translation status info.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.

2 participants