[Bug]: Duplicate process ID generated by multiple SnowflakeIdGenerator instances causing PRIMARY KEY conflict #4257

Description

@felixhzhu

What happened?

Multiple independent SnowflakeIdGenerator instances exist within the same AMS JVM, each maintaining its own sequence and lastTimestamp state. When two instances generate IDs within the same 10ms time window, they can produce identical IDs, causing "Duplicate entry ... for key 'PRIMARY'" errors on the table_process table.

Root Cause

Prior to the fix, there were two separate static SnowflakeIdGenerator instances:

  1. IcebergTableUtil.java:

    private static final SnowflakeIdGenerator snowflakeIdGenerator = new SnowflakeIdGenerator();

    Used in createOptimizingPlanner() to generate process IDs for optimizing processes.

  2. TableProcessMeta.java:

    private static final SnowflakeIdGenerator idGenerator = new SnowflakeIdGenerator();

    Used in createProcessMeta() to generate process IDs for maintenance processes (e.g., EXPIRE_SNAPSHOTS).

Both instances use the default machineId = 0. The sequence counter and lastTimestamp are instance-level fields (not static/shared), so each instance independently resets sequence = 0 upon entering a new 10ms time window.

Affects Versions

0.9.0-incubating

What table formats are you seeing the problem on?

No response

What engines are you seeing the problem on?

No response

How to reproduce

Instance A (IcebergTableUtil):
  timestamp = T, lastTimestamp ≠ T → enters else branch → sequence = 0
  Generated ID = (T << 13) | (0 << 8) | 0

Instance B (TableProcessMeta):
  timestamp = T, lastTimestamp ≠ T → enters else branch → sequence = 0
  Generated ID = (T << 13) | (0 << 8) | 0

Both IDs are identical → INSERT fails with PRIMARY KEY conflict

Note: the synchronized modifier on generateId() locks on this (the respective instance), so calls on the two instances are not mutually exclusive and do not share any state.
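The trace above can be reproduced deterministically with a simplified stand-in generator. This is only a sketch: SketchIdGenerator is a hypothetical class, not the actual Amoro SnowflakeIdGenerator, and the injected clock (which returns the 10ms window index directly) is an assumption added to make the collision repeatable. The bit layout (timestamp << 13, machineId << 8, sequence in the low 8 bits) is inferred from the expression in the trace.

```java
import java.util.function.LongSupplier;

public class SnowflakeCollisionDemo {

    /** Hypothetical, simplified generator; NOT the real Amoro implementation. */
    static final class SketchIdGenerator {
        private static final long MAX_SEQUENCE = (1L << 8) - 1; // assumed 8-bit sequence

        private final long machineId;
        private final LongSupplier clock; // assumed: returns the current 10ms window index
        private long lastTimestamp = -1L; // instance-level state, not shared
        private long sequence = 0L;       // instance-level state, not shared

        SketchIdGenerator(long machineId, LongSupplier clock) {
            this.machineId = machineId;
            this.clock = clock;
        }

        // The lock is taken on this instance only.
        synchronized long generateId() {
            long timestamp = clock.getAsLong();
            if (timestamp == lastTimestamp) {
                sequence++;
                if (sequence > MAX_SEQUENCE) {
                    // Waiting for the next window is omitted in this sketch.
                    throw new IllegalStateException("sequence exhausted in this window");
                }
            } else {
                sequence = 0L; // new window: each instance resets on its own
            }
            lastTimestamp = timestamp;
            return (timestamp << 13) | (machineId << 8) | sequence;
        }
    }

    /** Two separate instances with machineId = 0, both called in the same window. */
    static long[] twoInstancesSameWindow(long window) {
        SketchIdGenerator a = new SketchIdGenerator(0, () -> window);
        SketchIdGenerator b = new SketchIdGenerator(0, () -> window);
        return new long[] {a.generateId(), b.generateId()};
    }

    /** One shared instance called twice in the same window. */
    static long[] sharedInstanceSameWindow(long window) {
        SketchIdGenerator shared = new SketchIdGenerator(0, () -> window);
        return new long[] {shared.generateId(), shared.generateId()};
    }

    public static void main(String[] args) {
        long[] separate = twoInstancesSameWindow(12345L);
        long[] shared = sharedInstanceSameWindow(12345L);
        System.out.println("separate instances collide: " + (separate[0] == separate[1]));
        System.out.println("shared instance collides: " + (shared[0] == shared[1]));
    }
}
```

With separate instances both calls take the else branch and return the same value. With one shared instance the second call increments the sequence, so the IDs differ.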

Relevant log output

### Error updating database. Cause: java.sql.SQLIntegrityConstraintViolationException: 
Duplicate entry 'XXXXXXXXX' for key 'table_process.PRIMARY'

### The error may exist in org/apache/amoro/server/persistence/mapper/TableProcessMapper.java (inline)
### The error may involve org.apache.amoro.server.persistence.mapper.TableProcessMapper.insertProcess-Inline
### The error occurred while setting parameters

### SQL: INSERT INTO table_process (process_id, table_id, ...) VALUES (?, ?, ...)

Anything else

Fix

Consolidate all usages to a single global SnowflakeIdGenerator instance:

// SnowflakeIdGenerator.java
public static final SnowflakeIdGenerator INSTANCE = new SnowflakeIdGenerator();

Both IcebergTableUtil and TableProcessMeta now reference SnowflakeIdGenerator.INSTANCE.generateId() instead of maintaining their own instances. This ensures the synchronized lock and sequence counter work correctly across all callers within the same JVM.
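As a rough check of this approach, the sketch below has several threads call one shared instance and counts the distinct IDs. It uses a hypothetical SketchIdGenerator rather than the real Amoro class; the private constructor, the clamping of the clock to lastTimestamp, and the yield loop on sequence overflow are all assumptions of this sketch, not details confirmed by the issue.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SharedGeneratorSketch {

    /** Hypothetical, simplified generator; NOT the real Amoro implementation. */
    static final class SketchIdGenerator {
        // Single JVM-wide instance, mirroring the INSTANCE field proposed in the fix.
        static final SketchIdGenerator INSTANCE = new SketchIdGenerator();

        private static final long MAX_SEQUENCE = 255L; // assumed 8-bit sequence
        private long lastTimestamp = -1L;
        private long sequence = 0L;

        private SketchIdGenerator() {} // assumption: prevents further instances

        private static long currentWindow() {
            return System.currentTimeMillis() / 10; // 10ms windows
        }

        // All callers share one monitor and one sequence counter.
        synchronized long generateId() {
            // Clamp so a backwards clock step cannot reuse an older window.
            long timestamp = Math.max(currentWindow(), lastTimestamp);
            if (timestamp == lastTimestamp) {
                sequence++;
                if (sequence > MAX_SEQUENCE) {
                    // Sequence exhausted: wait for the next window.
                    while ((timestamp = currentWindow()) <= lastTimestamp) {
                        Thread.yield();
                    }
                    sequence = 0L;
                }
            } else {
                sequence = 0L;
            }
            lastTimestamp = timestamp;
            return (timestamp << 13) | sequence; // machineId = 0 contributes no bits
        }
    }

    /** Generates IDs from several threads and returns how many distinct IDs were seen. */
    static int countUniqueIds(int threads, int idsPerThread) {
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < idsPerThread; j++) {
                    ids.add(SketchIdGenerator.INSTANCE.generateId());
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        int unique = countUniqueIds(4, 500);
        System.out.println("all IDs unique: " + (unique == 4 * 500));
    }
}
```

Note that a single shared instance only protects callers inside one JVM. IDs generated by separate processes that all use machineId = 0 could still collide.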

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees

No one assigned

Labels

type:bug (Something isn't working)
