From a70796d48169619b754114450b5508a0377f9b24 Mon Sep 17 00:00:00 2001 From: "xiao.li" Date: Thu, 2 Jul 2026 14:16:00 +0800 Subject: [PATCH 1/3] master: document DXF add index disk space precheck --- system-variable-reference.md | 7 +++++++ system-variables.md | 11 +++++++++++ tidb-distributed-execution-framework.md | 21 ++++++++++++++++++++- 3 files changed, 38 insertions(+), 1 deletion(-) diff --git a/system-variable-reference.md b/system-variable-reference.md index d99498487995f..45417535a2d62 100644 --- a/system-variable-reference.md +++ b/system-variable-reference.md @@ -370,6 +370,13 @@ Referenced in: - [System Variables](/system-variables.md#error_count) - [TiDB 2.1 RC1 Release Notes](/releases/release-2.1-rc.1.md) +### enforce_disk_space_precheck_before_add_index + +Referenced in: + +- [System Variables](/system-variables.md#enforce_disk_space_precheck_before_add_index) +- [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks) + ### foreign_key_checks Referenced in: diff --git a/system-variables.md b/system-variables.md index 5aa7d2fc61a8c..2a045e6b640ef 100644 --- a/system-variables.md +++ b/system-variables.md @@ -525,6 +525,17 @@ For more possible values of this variable, see [Authentication plugin status](/s - Default value: `0` - A read-only variable that indicates the number of errors that resulted from the last statement that generated messages. +### enforce_disk_space_precheck_before_add_index + +- Scope: GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Boolean +- Default value: `OFF` +- This variable controls whether TiDB rejects a DXF [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) task when the TiKV disk space precheck predicts insufficient TiKV capacity. 
+- When the value is `OFF`, TiDB still performs the precheck and logs warnings for insufficient TiKV capacity, but it does not reject the DDL job. +- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks). + ### foreign_key_checks - Scope: SESSION | GLOBAL diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index 2ceafdc4a917b..20c6247d378c5 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -89,6 +89,25 @@ Adjust the following system variables related to Fast Online DDL: * [`tidb_ddl_error_count_limit`](/system-variables.md#tidb_ddl_error_count_limit) * [`tidb_ddl_reorg_batch_size`](/system-variables.md#tidb_ddl_reorg_batch_size): use the default value. The recommended maximum value is `1024`. +## TiKV disk space precheck for `ADD INDEX` tasks + +For [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) tasks executed by the DXF, TiDB collects a TiKV capacity snapshot before submitting the distributed task, predicts the TiKV index size based on block sampling, table statistics, and the replica count, and then checks whether TiKV has enough remaining disk space for the task. This precheck applies to both local sort and Global Sort execution paths. + +TiDB considers TiKV disk space insufficient if either of the following conditions is met: + +- After subtracting the predicted index size, the remaining TiKV cluster capacity is less than 20% of the total TiKV cluster capacity. +- After subtracting the per-store predicted index size, the remaining capacity of any TiKV store is less than 15% of that store's total capacity. 
+ +By default, [`enforce_disk_space_precheck_before_add_index`](/system-variables.md#enforce_disk_space_precheck_before_add_index) is `OFF`. In this mode, TiDB logs a warning if the precheck predicts insufficient TiKV capacity, but it still submits the DDL job. To reject such DDL jobs before task submission, set this variable to `ON`: + +```sql +SET GLOBAL enforce_disk_space_precheck_before_add_index = ON; +``` + +When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and continues to submit the DDL job. + +To observe the predicted and actual TiKV storage usage of a DXF `ADD INDEX` task, check the TiDB logs. TiDB logs prediction fields such as `block_sample_predicted_tikv_index_all_replica_bytes`, `block_sample_predicted_tikv_index_single_replica_bytes`, and `block_sample_mvcc_overhead_total_bytes` at task submission time. After the task succeeds, TiDB also logs fields such as `logical_index_kv_bytes`, `ingested_sst_bytes`, `ingested_sst_bytes_source`, and `ingested_sst_bytes_reliable`. + ## Task scheduling By default, the DXF schedules all TiDB nodes to execute distributed tasks. Starting from v7.4.0, for TiDB Self-Managed clusters, you can control which TiDB nodes can be scheduled by the DXF to execute distributed tasks by configuring [`tidb_service_scope`](/system-variables.md#tidb_service_scope-new-in-v740). 
@@ -128,4 +147,4 @@ As shown in the preceding diagram, the execution of tasks in the DXF is mainly h * [Execution Principles and Best Practices of DDL Statements](https://docs.pingcap.com/tidb/stable/ddl-introduction) - \ No newline at end of file + From e83348e31f68d207ad16af85e6eb2beae91c468e Mon Sep 17 00:00:00 2001 From: "xiao.li" Date: Thu, 2 Jul 2026 15:34:47 +0800 Subject: [PATCH 2/3] master: address DXF add index precheck review comments --- system-variables.md | 2 +- tidb-distributed-execution-framework.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/system-variables.md b/system-variables.md index 2a045e6b640ef..f7a8bcc42d9a4 100644 --- a/system-variables.md +++ b/system-variables.md @@ -534,7 +534,7 @@ For more possible values of this variable, see [Authentication plugin status](/s - Default value: `OFF` - This variable controls whether TiDB rejects a DXF [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) task when the TiKV disk space precheck predicts insufficient TiKV capacity. - When the value is `OFF`, TiDB still performs the precheck and logs warnings for insufficient TiKV capacity, but it does not reject the DDL job. -- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks). +- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. 
For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks). ### foreign_key_checks diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index 20c6247d378c5..dbd7672c59871 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -93,7 +93,7 @@ Adjust the following system variables related to Fast Online DDL: For [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) tasks executed by the DXF, TiDB collects a TiKV capacity snapshot before submitting the distributed task, predicts the TiKV index size based on block sampling, table statistics, and the replica count, and then checks whether TiKV has enough remaining disk space for the task. This precheck applies to both local sort and Global Sort execution paths. -TiDB considers TiKV disk space insufficient if either of the following conditions is met: +TiDB considers TiKV disk space insufficient in either of the following cases: - After subtracting the predicted index size, the remaining TiKV cluster capacity is less than 20% of the total TiKV cluster capacity. - After subtracting the per-store predicted index size, the remaining capacity of any TiKV store is less than 15% of that store's total capacity. @@ -104,7 +104,7 @@ By default, [`enforce_disk_space_precheck_before_add_index`](/system-variables.m SET GLOBAL enforce_disk_space_precheck_before_add_index = ON; ``` -When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo table statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and continues to submit the DDL job. 
+When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and submits the DDL job. To observe the predicted and actual TiKV storage usage of a DXF `ADD INDEX` task, check the TiDB logs. TiDB logs prediction fields such as `block_sample_predicted_tikv_index_all_replica_bytes`, `block_sample_predicted_tikv_index_single_replica_bytes`, and `block_sample_mvcc_overhead_total_bytes` at task submission time. After the task succeeds, TiDB also logs fields such as `logical_index_kv_bytes`, `ingested_sst_bytes`, `ingested_sst_bytes_source`, and `ingested_sst_bytes_reliable`. From eba6f23c5e989feba9b61f47f471b8b02c70557b Mon Sep 17 00:00:00 2001 From: "xiao.li" Date: Fri, 3 Jul 2026 13:41:02 +0800 Subject: [PATCH 3/3] master: clarify ADD INDEX disk precheck scope --- sql-statements/sql-statement-add-index.md | 19 +++++++++++++++++++ system-variable-reference.md | 1 + system-variables.md | 4 ++-- tidb-distributed-execution-framework.md | 17 +---------------- 4 files changed, 23 insertions(+), 18 deletions(-) diff --git a/sql-statements/sql-statement-add-index.md b/sql-statements/sql-statement-add-index.md index 07e6afcc199bf..fcf73443f9f7a 100644 --- a/sql-statements/sql-statement-add-index.md +++ b/sql-statements/sql-statement-add-index.md @@ -93,6 +93,25 @@ mysql> EXPLAIN SELECT * FROM t1 WHERE c1 = 3; 2 rows in set (0.00 sec) ``` +## TiKV disk space precheck + +Before executing `ADD INDEX`, TiDB collects a TiKV capacity snapshot, predicts the TiKV index size based on block sampling, table statistics, and the replica count, and then checks whether TiKV has enough remaining disk space. This precheck applies to `ADD INDEX` jobs whether or not the job is executed by the DXF. 
+ +TiDB considers TiKV disk space insufficient in either of the following cases: + +- After subtracting the predicted index size, the remaining TiKV cluster capacity is less than 20% of the total TiKV cluster capacity. +- After subtracting the per-store predicted index size, the remaining capacity of any TiKV store is less than 15% of that store's total capacity. + +By default, [`enforce_disk_space_precheck_before_add_index`](/system-variables.md#enforce_disk_space_precheck_before_add_index) is `OFF`. In this mode, TiDB logs a warning if the precheck predicts insufficient TiKV capacity, but it still executes the DDL job. To reject such DDL jobs before execution, set this variable to `ON`: + +```sql +SET GLOBAL enforce_disk_space_precheck_before_add_index = ON; +``` + +When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and executes the DDL job. + +To observe the predicted and actual TiKV storage usage of an `ADD INDEX` job, check the TiDB logs. TiDB logs prediction fields such as `block_sample_predicted_tikv_index_all_replica_bytes`, `block_sample_predicted_tikv_index_single_replica_bytes`, and `block_sample_mvcc_overhead_total_bytes` before execution. After the job succeeds, TiDB also logs fields such as `logical_index_kv_bytes`, `ingested_sst_bytes`, `ingested_sst_bytes_source`, and `ingested_sst_bytes_reliable`. + ## MySQL compatibility * TiDB accepts index types such as `HASH`, `BTREE` and `RTREE` in syntax for compatibility with MySQL, but ignores them. 
diff --git a/system-variable-reference.md b/system-variable-reference.md index 45417535a2d62..1432bdf86c2b8 100644 --- a/system-variable-reference.md +++ b/system-variable-reference.md @@ -374,6 +374,7 @@ Referenced in: Referenced in: +- [ADD INDEX](/sql-statements/sql-statement-add-index.md#tikv-disk-space-precheck) - [System Variables](/system-variables.md#enforce_disk_space_precheck_before_add_index) - [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks) diff --git a/system-variables.md b/system-variables.md index f7a8bcc42d9a4..f655392d14106 100644 --- a/system-variables.md +++ b/system-variables.md @@ -532,9 +532,9 @@ For more possible values of this variable, see [Authentication plugin status](/s - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Boolean - Default value: `OFF` -- This variable controls whether TiDB rejects a DXF [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) task when the TiKV disk space precheck predicts insufficient TiKV capacity. +- This variable controls whether TiDB rejects an [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) job when the TiKV disk space precheck predicts insufficient TiKV capacity. - When the value is `OFF`, TiDB still performs the precheck and logs warnings for insufficient TiKV capacity, but it does not reject the DDL job. -- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo statistics. If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck for `ADD INDEX` tasks](/tidb-distributed-execution-framework.md#tikv-disk-space-precheck-for-add-index-tasks). +- When the value is `ON`, TiDB rejects the DDL job if the precheck predicts insufficient TiKV capacity and the prediction uses non-pseudo statistics. 
If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. For more information, see [TiKV disk space precheck](/sql-statements/sql-statement-add-index.md#tikv-disk-space-precheck). ### foreign_key_checks diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index dbd7672c59871..2b59c56f36614 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -91,22 +91,7 @@ Adjust the following system variables related to Fast Online DDL: ## TiKV disk space precheck for `ADD INDEX` tasks -For [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) tasks executed by the DXF, TiDB collects a TiKV capacity snapshot before submitting the distributed task, predicts the TiKV index size based on block sampling, table statistics, and the replica count, and then checks whether TiKV has enough remaining disk space for the task. This precheck applies to both local sort and Global Sort execution paths. - -TiDB considers TiKV disk space insufficient in either of the following cases: - -- After subtracting the predicted index size, the remaining TiKV cluster capacity is less than 20% of the total TiKV cluster capacity. -- After subtracting the per-store predicted index size, the remaining capacity of any TiKV store is less than 15% of that store's total capacity. - -By default, [`enforce_disk_space_precheck_before_add_index`](/system-variables.md#enforce_disk_space_precheck_before_add_index) is `OFF`. In this mode, TiDB logs a warning if the precheck predicts insufficient TiKV capacity, but it still submits the DDL job. To reject such DDL jobs before task submission, set this variable to `ON`: - -```sql -SET GLOBAL enforce_disk_space_precheck_before_add_index = ON; -``` - -When this variable is `ON`, TiDB rejects the DDL job only if the prediction uses non-pseudo statistics. 
If the prediction uses pseudo statistics, TiDB logs a warning and does not reject the DDL job. If TiDB cannot collect the TiKV capacity snapshot or complete the prediction within 5 seconds, TiDB logs a warning, skips the precheck, and submits the DDL job. - -To observe the predicted and actual TiKV storage usage of a DXF `ADD INDEX` task, check the TiDB logs. TiDB logs prediction fields such as `block_sample_predicted_tikv_index_all_replica_bytes`, `block_sample_predicted_tikv_index_single_replica_bytes`, and `block_sample_mvcc_overhead_total_bytes` at task submission time. After the task succeeds, TiDB also logs fields such as `logical_index_kv_bytes`, `ingested_sst_bytes`, `ingested_sst_bytes_source`, and `ingested_sst_bytes_reliable`. +For [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) tasks executed by the DXF, TiDB performs a TiKV disk space precheck before task submission. This precheck applies to both local sort and Global Sort execution paths. For more information, see [TiKV disk space precheck](/sql-statements/sql-statement-add-index.md#tikv-disk-space-precheck). ## Task scheduling
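Reviewer note on the `TiKV disk space precheck` section added to `sql-statement-add-index.md` in this series: the two thresholds and the `ON`/`OFF` behavior can be read as the small decision sketch below. This is only an illustration of the documented rules, not TiDB's implementation. The names `StoreSnapshot`, `insufficient_reasons`, and `should_reject` are hypothetical, and the sketch assumes the per-store predicted index size is already known and is the same for every store, which the patch text does not specify.

```python
from dataclasses import dataclass

# Thresholds taken from the documented rules, expressed as integer percentages.
CLUSTER_MIN_FREE_PERCENT = 20
STORE_MIN_FREE_PERCENT = 15


@dataclass(frozen=True)
class StoreSnapshot:
    """Hypothetical view of one TiKV store in the capacity snapshot."""
    store_id: int
    capacity_bytes: int
    available_bytes: int


def insufficient_reasons(stores, predicted_all_replica_bytes, predicted_per_store_bytes):
    """Return which documented conditions flag the disk space as insufficient."""
    reasons = []

    # Cluster rule: remaining capacity after the index must be at least 20% of total.
    total_capacity = sum(s.capacity_bytes for s in stores)
    total_available = sum(s.available_bytes for s in stores)
    cluster_remaining = total_available - predicted_all_replica_bytes
    if cluster_remaining * 100 < CLUSTER_MIN_FREE_PERCENT * total_capacity:
        reasons.append("cluster")

    # Store rule: every store must keep at least 15% of its own capacity.
    for s in stores:
        store_remaining = s.available_bytes - predicted_per_store_bytes
        if store_remaining * 100 < STORE_MIN_FREE_PERCENT * s.capacity_bytes:
            reasons.append(f"store {s.store_id}")

    return reasons


def should_reject(reasons, enforce, stats_are_pseudo):
    """Reject only when enforcement is ON and the prediction used non-pseudo statistics."""
    return bool(reasons) and enforce and not stats_are_pseudo
```

With three stores of 1000 bytes capacity and 400, 400, and 200 bytes available, a predicted all-replica size of 300 bytes, and 100 bytes per store, the cluster rule passes (700 remaining against a 600 floor) while the third store fails (100 remaining against a 150 floor). In that case the sketch rejects the job only when the variable is `ON` and the statistics are not pseudo. The 5-second timeout path, where the precheck is skipped entirely, is not modeled here.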