feat: add deterministic identity storage for indexed events#40
feat: add deterministic identity storage for indexed events#40Arome8240 wants to merge 1 commit into
Conversation
- Add IndexedEventEntity with contract ID, ledger number, transaction hash, event index - Create migration with uniqueness constraints and indexes for replay/debug queries - Implement IndexedEventRepository with deduplication via insertSafely() - Add indexer database configuration and environment variables - Update documentation and add tests Implements: Migrated from https://github.com/Fundable-Protocol/stellar_indexer/issues/6
|
@Arome8240 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits. You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀 |
📝 WalkthroughWalkthroughThe PR adds an ChangesIndexed event storage
Sequence Diagram(s)sequenceDiagram
participant IndexedEventRepository
participant TypeORMRepository as "Repository<IndexedEventEntity>"
participant PostgreSQL
IndexedEventRepository->>TypeORMRepository: create(event)
IndexedEventRepository->>TypeORMRepository: save(entity)
TypeORMRepository->>PostgreSQL: INSERT indexed_event
PostgreSQL-->>TypeORMRepository: 23505 on duplicate key
IndexedEventRepository->>TypeORMRepository: findOneBy(identity fields)
TypeORMRepository-->>IndexedEventRepository: existing row
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint install failed. For unrecoverable errors, disable the tool in CodeRabbit configuration. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/config/persistence/data-source.ts (1)
22-30: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winFail fast on invalid
DATABASE_PORTvalues.
DATABASE_PORTis already treated as required, so falling back to5432on parse failure just hides misconfiguration. A typo likeDATABASE_PORT=abccan silently point the app at the wrong Postgres instance instead of failing startup.🔧 Suggested fix
-const databasePort = Number(dbConfigs.DATABASE_PORT ?? 5432); +const databasePort = Number(dbConfigs.DATABASE_PORT); +if (!Number.isInteger(databasePort) || databasePort < 1 || databasePort > 65535) { + throw new Error(`Invalid DATABASE_PORT: ${dbConfigs.DATABASE_PORT}`); +} @@ - port: Number.isNaN(databasePort) ? 5432 : databasePort, + port: databasePort,Also applies to: 38-38
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/config/persistence/data-source.ts` around lines 22 - 30, Fail fast when parsing DATABASE_PORT instead of silently defaulting to 5432; the current config validation in data-source.ts already treats DATABASE_PORT as required, so update the port parsing/normalization logic used with dbConfigs.DATABASE_PORT to throw or stop startup on invalid values like non-numeric input. Make the change in the configuration path that builds the DataSource options so the app only proceeds when DATABASE_PORT is a valid port number, and keep the required-key check aligned with that behavior.
🧹 Nitpick comments (3)
src/config/persistence/data-source.ts (1)
18-18: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winImport
IndexedEventEntitythrough the public package surface.This deep relative import couples app boot to the monorepo source layout even though the DB surface is already exported publicly. Please verify the package entrypoint is available here and switch to it so this file depends on the published contract, not
indexer/common/src/....🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/config/persistence/data-source.ts` at line 18, The import for IndexedEventEntity is reaching into the monorepo source layout instead of using the public DB package surface. Update the data-source setup in this module to import IndexedEventEntity from the package entrypoint that already exports it, so the dependency is on the published contract rather than indexer/common/src/... .indexer/common/src/db/indexed-event.entity.ts (1)
21-28: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winKeep the entity metadata aligned with the migration.
The migration adds
processed_by,(contract_id, ledger_number), and(processed_by, ledger_number)indexes, but the entity metadata only declares the single-column set plus the dedupe key. That drift makes future schema-diff migrations noisy and under-documents the access paths the repository now relies on for domain/replay lookups.Suggested fix
`@Entity`("indexed_event") `@Index`("indexed_event_contract_id_idx", ["contractId"]) `@Index`("indexed_event_ledger_idx", ["ledgerNumber"]) `@Index`("indexed_event_tx_hash_idx", ["transactionHash"]) `@Index`("indexed_event_created_at_idx", ["createdAt"]) +@Index("indexed_event_processed_by_idx", ["processedBy"]) +@Index("indexed_event_contract_ledger_idx", ["contractId", "ledgerNumber"]) +@Index("indexed_event_domain_ledger_idx", ["processedBy", "ledgerNumber"]) `@Index`("indexed_event_dedupe_idx", ["contractId", "ledgerNumber", "transactionHash", "eventIndex"], { unique: true, }) export class IndexedEventEntity { @@ `@CreateDateColumn`({ type: "timestamp", name: "created_at", + precision: 3, default: () => "CURRENT_TIMESTAMP", }) createdAt: Date;Also applies to: 78-82
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indexer/common/src/db/indexed-event.entity.ts` around lines 21 - 28, The IndexedEvent entity metadata is missing the indexes introduced by the migration, so keep `@Entity`("indexed_event") and the existing `@Index` declarations in IndexedEventEntity aligned by adding the missing processedBy-related composite indexes alongside the current single-column and dedupe indexes. Update the entity’s index decorators to match the migration’s access paths, specifically the (contractId, ledgerNumber) and (processedBy, ledgerNumber) combinations, so future schema diffs stay clean and the repository’s lookup patterns are documented in the model.indexer/common/src/db/indexed-event.repository.test.ts (1)
17-168: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winCover the two replay query methods as well.
getByLedgerRange()andgetByContract()are part of the new public repository surface, but this suite never asserts their filter/order/limit wiring. That leaves the replay/debug paths unprotected, especially the domain filter and stable ordering behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indexer/common/src/db/indexed-event.repository.test.ts` around lines 17 - 168, Add tests for the new replay query methods in IndexedEventRepository: getByLedgerRange() and getByContract() are currently unverified, so their filter, order, and limit wiring should be asserted. Extend the IndexedEventRepository test suite with mocked createQueryBuilder expectations for these methods, checking the domain/contract filters, ledger range conditions, stable ordering, and any pagination/limit behavior they apply. Use the existing mockRepo and query builder pattern in the IndexedEventRepository describe block to cover the new public surface.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@indexer/common/package.json`:
- Around line 17-20: Add reflect-metadata to the dependencies in
indexer/common/package.json so the new ./db entrypoint can resolve the explicit
import used by data-source.ts. Update the dependencies list alongside typeorm
and ulidx, keeping the version pinned as requested, so consumers of the common
package do not hit a runtime ReferenceError when importing the database module.
In `@indexer/common/src/db/data-source.ts`:
- Line 61: The TLS configuration in the data source setup currently disables
certificate verification by default via the ssl settings, which weakens
transport security. Update the db connection logic in data-source.ts so the
DataSource/connection options no longer hardcode rejectUnauthorized: false;
instead, keep certificate validation enabled by default and only relax it behind
an explicit opt-in path using the useSsl configuration flow.
- Around line 85-97: Guard initializeIndexerDataSource against concurrent calls
by reusing a single in-flight initialization promise instead of creating a new
DataSource for each caller. Update the initializeIndexerDataSource function and
its indexerDataSource singleton handling so that if initialization is already in
progress, later calls await the same promise and return the same initialized
instance. Make sure the createIndexerDataSource and initialize() path only runs
once until completion.
In `@indexer/common/src/db/indexed-event.entity.ts`:
- Around line 1-10: The indexed event entity is exposing a bigint-backed
ledgerNumber as a number type even though the Postgres driver returns int8
values as strings at runtime. Update the IndexedEventEntity ledgerNumber column
definition to normalize values at the boundary by adding a TypeORM transformer
(or equivalent conversion logic) so reads and writes consistently use the same
type, and keep the change localized to the ledgerNumber field in
indexed-event.entity.ts.
In `@indexer/common/src/db/indexed-event.repository.ts`:
- Around line 81-86: The ordering in the IndexedEventRepository query builders
is not fully stable because getByLedgerRange and getByContract only sort by
ledgerNumber and eventIndex, which can collide across transactions in the same
ledger. Update the ordering in both methods to add a deterministic tie-breaker
after eventIndex, using an existing unique field such as transactionHash or a
persisted transaction position if available, and keep the same sort direction so
replay/debug results are consistent across runs.
---
Outside diff comments:
In `@src/config/persistence/data-source.ts`:
- Around line 22-30: Fail fast when parsing DATABASE_PORT instead of silently
defaulting to 5432; the current config validation in data-source.ts already
treats DATABASE_PORT as required, so update the port parsing/normalization logic
used with dbConfigs.DATABASE_PORT to throw or stop startup on invalid values
like non-numeric input. Make the change in the configuration path that builds
the DataSource options so the app only proceeds when DATABASE_PORT is a valid
port number, and keep the required-key check aligned with that behavior.
---
Nitpick comments:
In `@indexer/common/src/db/indexed-event.entity.ts`:
- Around line 21-28: The IndexedEvent entity metadata is missing the indexes
introduced by the migration, so keep `@Entity`("indexed_event") and the existing
`@Index` declarations in IndexedEventEntity aligned by adding the missing
processedBy-related composite indexes alongside the current single-column and
dedupe indexes. Update the entity’s index decorators to match the migration’s
access paths, specifically the (contractId, ledgerNumber) and (processedBy,
ledgerNumber) combinations, so future schema diffs stay clean and the
repository’s lookup patterns are documented in the model.
In `@indexer/common/src/db/indexed-event.repository.test.ts`:
- Around line 17-168: Add tests for the new replay query methods in
IndexedEventRepository: getByLedgerRange() and getByContract() are currently
unverified, so their filter, order, and limit wiring should be asserted. Extend
the IndexedEventRepository test suite with mocked createQueryBuilder
expectations for these methods, checking the domain/contract filters, ledger
range conditions, stable ordering, and any pagination/limit behavior they apply.
Use the existing mockRepo and query builder pattern in the
IndexedEventRepository describe block to cover the new public surface.
In `@src/config/persistence/data-source.ts`:
- Line 18: The import for IndexedEventEntity is reaching into the monorepo
source layout instead of using the public DB package surface. Update the
data-source setup in this module to import IndexedEventEntity from the package
entrypoint that already exports it, so the dependency is on the published
contract rather than indexer/common/src/... .
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f0c61c12-44de-4ac9-9b16-9c555f49ce17
📒 Files selected for processing (13)
.env.exampleindexer/INDEXER_GUIDELINES.mdindexer/common/package.jsonindexer/common/src/db/EXAMPLE_USAGE.mdindexer/common/src/db/data-source.tsindexer/common/src/db/index.tsindexer/common/src/db/indexed-event.entity.test.tsindexer/common/src/db/indexed-event.entity.tsindexer/common/src/db/indexed-event.repository.test.tsindexer/common/src/db/indexed-event.repository.tsindexer/common/src/index.tssrc/config/persistence/data-source.tssrc/migrations/CreateIndexedEventTable1704000000001.js
| }, | ||
| "dependencies": { | ||
| "typeorm": "^0.3.20", | ||
| "ulidx": "^2.4.1" |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Checking indexer/common/package.json dependencies:"
cat indexer/common/package.json | jq '.dependencies // {}'
echo
echo "Searching for direct import of reflect-metadata in indexer/common:"
rg -n --type=ts '^\s*import\s+["'\'']reflect-metadata["'\'']' indexer/common/src
echo
echo "Expected: dependencies includes \"reflect-metadata\" if imported by package entrypoints."Repository: Fundable-Protocol/Backend
Length of output: 484
Add reflect-metadata to package.json dependencies.
The file indexer/common/src/db/data-source.ts explicitly imports reflect-metadata. Since this file is exposed via the new ./db entrypoint in indexer/common/package.json, consumers using this module will encounter a runtime failure (ReferenceError) if reflect-metadata is not installed in their own node_modules.
// indexer/common/package.json
"dependencies": {
"reflect-metadata": "0.2.2",
"typeorm": "^0.3.20",
"ulidx": "^2.4.1"
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@indexer/common/package.json` around lines 17 - 20, Add reflect-metadata to
the dependencies in indexer/common/package.json so the new ./db entrypoint can
resolve the explicit import used by data-source.ts. Update the dependencies list
alongside typeorm and ulidx, keeping the version pinned as requested, so
consumers of the common package do not hit a runtime ReferenceError when
importing the database module.
| logging: process.env.NODE_ENV === "development", | ||
| entities: [IndexedEventEntity], | ||
| migrations: ["src/migrations/*.js"], | ||
| ...(useSsl ? { ssl: { rejectUnauthorized: false } } : {}), |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Do not disable TLS certificate verification by default.
Using rejectUnauthorized: false disables server cert validation and weakens DB transport security in production-like environments.
Suggested fix
- ...(useSsl ? { ssl: { rejectUnauthorized: false } } : {}),
+ ...(useSsl ? { ssl: true } : {}),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ...(useSsl ? { ssl: { rejectUnauthorized: false } } : {}), | |
| ...(useSsl ? { ssl: true } : {}), |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@indexer/common/src/db/data-source.ts` at line 61, The TLS configuration in
the data source setup currently disables certificate verification by default via
the ssl settings, which weakens transport security. Update the db connection
logic in data-source.ts so the DataSource/connection options no longer hardcode
rejectUnauthorized: false; instead, keep certificate validation enabled by
default and only relax it behind an explicit opt-in path using the useSsl
configuration flow.
| export async function initializeIndexerDataSource( | ||
| config: IndexerDatabaseConfig, | ||
| ): Promise<DataSource> { | ||
| validateIndexerDatabaseConfig(config); | ||
|
|
||
| if (indexerDataSource?.isInitialized) { | ||
| return indexerDataSource; | ||
| } | ||
|
|
||
| indexerDataSource = createIndexerDataSource(config); | ||
| await indexerDataSource.initialize(); | ||
|
|
||
| return indexerDataSource; |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Guard singleton initialization against concurrent calls.
Concurrent initializeIndexerDataSource() calls can race and create multiple DataSource instances before initialization completes.
Suggested fix
let indexerDataSource: DataSource | null = null;
+let indexerDataSourceInitPromise: Promise<DataSource> | null = null;
@@
export async function initializeIndexerDataSource(
config: IndexerDatabaseConfig,
): Promise<DataSource> {
validateIndexerDatabaseConfig(config);
-
+
if (indexerDataSource?.isInitialized) {
return indexerDataSource;
}
+ if (indexerDataSourceInitPromise) {
+ return indexerDataSourceInitPromise;
+ }
- indexerDataSource = createIndexerDataSource(config);
- await indexerDataSource.initialize();
-
- return indexerDataSource;
+ indexerDataSource = createIndexerDataSource(config);
+ indexerDataSourceInitPromise = indexerDataSource.initialize().then(() => indexerDataSource!);
+ try {
+ await indexerDataSourceInitPromise;
+ return indexerDataSource;
+ } finally {
+ indexerDataSourceInitPromise = null;
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export async function initializeIndexerDataSource( | |
| config: IndexerDatabaseConfig, | |
| ): Promise<DataSource> { | |
| validateIndexerDatabaseConfig(config); | |
| if (indexerDataSource?.isInitialized) { | |
| return indexerDataSource; | |
| } | |
| indexerDataSource = createIndexerDataSource(config); | |
| await indexerDataSource.initialize(); | |
| return indexerDataSource; | |
| let indexerDataSource: DataSource | null = null; | |
| let indexerDataSourceInitPromise: Promise<DataSource> | null = null; | |
| export async function initializeIndexerDataSource( | |
| config: IndexerDatabaseConfig, | |
| ): Promise<DataSource> { | |
| validateIndexerDatabaseConfig(config); | |
| if (indexerDataSource?.isInitialized) { | |
| return indexerDataSource; | |
| } | |
| if (indexerDataSourceInitPromise) { | |
| return indexerDataSourceInitPromise; | |
| } | |
| indexerDataSource = createIndexerDataSource(config); | |
| indexerDataSourceInitPromise = indexerDataSource.initialize().then(() => indexerDataSource!); | |
| try { | |
| await indexerDataSourceInitPromise; | |
| return indexerDataSource; | |
| } finally { | |
| indexerDataSourceInitPromise = null; | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@indexer/common/src/db/data-source.ts` around lines 85 - 97, Guard
initializeIndexerDataSource against concurrent calls by reusing a single
in-flight initialization promise instead of creating a new DataSource for each
caller. Update the initializeIndexerDataSource function and its
indexerDataSource singleton handling so that if initialization is already in
progress, later calls await the same promise and return the same initialized
instance. Make sure the createIndexerDataSource and initialize() path only runs
once until completion.
| import { | ||
| Entity, | ||
| Column, | ||
| PrimaryColumn, | ||
| CreateDateColumn, | ||
| Index, | ||
| BeforeInsert, | ||
| } from "typeorm"; | ||
| import { ulid } from "ulidx"; | ||
|
|
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major
Normalize ledgerNumber at the column boundary to ensure type consistency.
TypeORM (Postgres driver) returns SQL bigint (int8) values as strings by default to prevent precision loss. The current entity declares ledgerNumber as bigint, which will conflict with the actual runtime string value unless a transformer is applied.
Suggested fix
import {
Entity,
Column,
PrimaryColumn,
CreateDateColumn,
Index,
BeforeInsert,
} from "typeorm";
import { ulid } from "ulidx";
+
+const bigintTransformer = {
+ to: (value: bigint) => value.toString(),
+ from: (value: string) => BigInt(value),
+};
/**
* Entity representing a Soroban event that has been indexed.
* Provides deterministic identity for deduplication and replay safety.
*
* Required fields for event identity (per INDEXER_GUIDELINES.md):
* - Contract ID
* - Ledger number
* - Transaction hash
* - Event index or another deterministic event position
*/
`@Entity`("indexed_event")
@@
`@Column`("bigint", { name: "ledger_number", nullable: false })
+ `@Column`("bigint", {
+ name: "ledger_number",
+ nullable: false,
+ transformer: bigintTransformer,
+ })
+ ledgerNumber: bigint;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@indexer/common/src/db/indexed-event.entity.ts` around lines 1 - 10, The
indexed event entity is exposing a bigint-backed ledgerNumber as a number type
even though the Postgres driver returns int8 values as strings at runtime.
Update the IndexedEventEntity ledgerNumber column definition to normalize values
at the boundary by adding a TypeORM transformer (or equivalent conversion logic)
so reads and writes consistently use the same type, and keep the change
localized to the ledgerNumber field in indexed-event.entity.ts.
| const query = this.repo | ||
| .createQueryBuilder("event") | ||
| .where("event.ledgerNumber >= :startLedger", { startLedger }) | ||
| .andWhere("event.ledgerNumber <= :endLedger", { endLedger }) | ||
| .orderBy("event.ledgerNumber", "ASC") | ||
| .addOrderBy("event.eventIndex", "ASC"); |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Add a stable tie-breaker to the replay/debug ordering.
Both query methods sort by ledgerNumber and eventIndex, but eventIndex is only unique within a transaction. Two different transactions in the same ledger can therefore share the same sort keys, so getByLedgerRange() and getByContract() can return rows in an arbitrary order across runs.
Suggested fix
const query = this.repo
.createQueryBuilder("event")
.where("event.ledgerNumber >= :startLedger", { startLedger })
.andWhere("event.ledgerNumber <= :endLedger", { endLedger })
.orderBy("event.ledgerNumber", "ASC")
+ .addOrderBy("event.transactionHash", "ASC")
.addOrderBy("event.eventIndex", "ASC");
@@
const query = this.repo
.createQueryBuilder("event")
.where("event.contractId = :contractId", { contractId })
.orderBy("event.ledgerNumber", "DESC")
+ .addOrderBy("event.transactionHash", "DESC")
.addOrderBy("event.eventIndex", "DESC");If callers need true chain order within a ledger, this table still needs a persisted transaction position; transactionHash is only a deterministic tie-breaker, not canonical execution order.
Also applies to: 102-106
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@indexer/common/src/db/indexed-event.repository.ts` around lines 81 - 86, The
ordering in the IndexedEventRepository query builders is not fully stable
because getByLedgerRange and getByContract only sort by ledgerNumber and
eventIndex, which can collide across transactions in the same ledger. Update the
ordering in both methods to add a deterministic tie-breaker after eventIndex,
using an existing unique field such as transactionHash or a persisted
transaction position if available, and keep the same sort direction so
replay/debug results are consistent across runs.
Implements: Migrated from https://github.com/Fundable-Protocol/stellar_indexer/issues/6
closes #25
Summary by CodeRabbit