Maxim055/LanguageAnalizer

Language Analyzer

ASP.NET Core API + background workers + React UI that ingests text files and counts English (Latin), Russian (Cyrillic), and Armenian letters per file. Built with .NET 8, EF Core 8, PostgreSQL, and Vite + React 18.
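As a rough illustration of the per-script letter counting described above, here is a minimal TypeScript sketch. It is not the repository's C# LanguageClassifier. The code point ranges (English A-Z/a-z, the Russian alphabet, the core Armenian letters), the tie-breaking rule, and all names are assumptions made for this example.

```typescript
// Illustrative sketch only; ranges and names are assumptions, not the repo's C# code.
type ScriptName = "latin" | "cyrillic" | "armenian";
type LetterCounts = Record<ScriptName, number>;

// Classify one UTF-16 code unit. All ranges used here are in the BMP,
// so code units are sufficient. Returns null for anything else.
function classify(code: number): ScriptName | null {
  const isLatin =
    (code >= 0x41 && code <= 0x5a) || // A-Z
    (code >= 0x61 && code <= 0x7a); // a-z
  if (isLatin) return "latin";

  const isRussian =
    (code >= 0x0410 && code <= 0x044f) || // А-я
    code === 0x0401 || // Ё
    code === 0x0451; // ё
  if (isRussian) return "cyrillic";

  const isArmenian =
    (code >= 0x0531 && code <= 0x0556) || // capital letters
    (code >= 0x0561 && code <= 0x0587); // small letters
  if (isArmenian) return "armenian";

  return null; // digits, punctuation, whitespace, other scripts
}

function countLetters(text: string): LetterCounts {
  const counts: LetterCounts = { latin: 0, cyrillic: 0, armenian: 0 };
  for (let i = 0; i < text.length; i++) {
    const script = classify(text.charCodeAt(i));
    if (script !== null) counts[script] += 1;
  }
  return counts;
}

// Script with the strictly highest count; null when no letters were found.
// On a tie, the script listed first wins (an arbitrary choice for this sketch).
function dominantScript(counts: LetterCounts): ScriptName | null {
  const order: ScriptName[] = ["latin", "cyrillic", "armenian"];
  let best: ScriptName | null = null;
  let bestCount = 0;
  for (const script of order) {
    if (counts[script] > bestCount) {
      best = script;
      bestCount = counts[script];
    }
  }
  return best;
}
```

For example, `countLetters("Hello, мир!")` counts five Latin and three Cyrillic letters and ignores the punctuation and spaces.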

Repository layout

backend/                                 # .NET solution (run dotnet commands from here)
  LangAnalyzer.sln
  src/
    LangAnalyzer.Core.Contracts   # entities, enums, DTOs, ports, FluentResults error types
    LangAnalyzer.Core.Logic       # LanguageClassifier, FilePaginator, FileProcessor, FileManager
    LangAnalyzer.EF               # AnalyzerDbContext, EntityTypeConfiguration, Migrations
    LangAnalyzer.Api              # FilesController, Startup, Program
    LangAnalyzer.Worker           # FileProcessingWorker (claim + process loop)
    LangAnalyzer.SweeperWorker    # StaleLeaseSweeper (periodic reset of stale leases)
  tests/
    LangAnalyzer.Core.Logic.Tests
    LangAnalyzer.IntegrationTests
  docker-compose.yml                     # Postgres + Api + 2x Worker + SweeperWorker
  src/LangAnalyzer.{Api,Worker,SweeperWorker}/Dockerfile
web/                                     # React + Vite + TypeScript UI

backend/ and web/ are two independent build chains. dotnet commands run from backend/; npm commands run from web/.

Architecture

Three deployable .NET processes share the same DB and storage:

  • Api accepts uploads, persists files to ./Storage/uploads, inserts a Created row.
  • Worker claims one row at a time via an EF Core compare-and-swap (Where(Status == Created).ExecuteUpdateAsync(...)), paginates the file into char-bounded chunks, processes pages with Parallel.ForEachAsync(MaxDegreeOfParallelism = 10), heartbeats on every page, then marks completed.
  • SweeperWorker runs StaleLeaseSweeper on a PeriodicTimer. It resets any Processing row whose updated_at is older than StaleLeaseSeconds (default 20s) back to Created with all counters zeroed, so an abandoned file restarts from scratch, possibly on a different worker.
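
The worker's page loop can be pictured roughly as follows. This is a TypeScript sketch of the general technique (fixed-size pages processed with a bounded number of tasks in flight, plus a heartbeat per page), not a port of the C# FilePaginator or FileProcessor. The function names and the `countPage` and `heartbeat` callbacks are hypothetical.

```typescript
// Illustrative sketch; names and callbacks are assumptions for this example.

// Split text into pages of at most pageSize UTF-16 code units.
// A page boundary may fall inside a surrogate pair; that does not matter
// for scripts that live entirely in the BMP.
function paginate(text: string, pageSize: number): string[] {
  if (pageSize <= 0) throw new Error("pageSize must be positive");
  const pages: string[] = [];
  for (let start = 0; start < text.length; start += pageSize) {
    pages.push(text.slice(start, start + pageSize));
  }
  return pages;
}

// Run `work` over all items with at most `limit` calls in flight at once.
async function forEachBounded<T>(
  items: T[],
  limit: number,
  work: (item: T, index: number) => Promise<void>
): Promise<void> {
  let next = 0;
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      await work(items[index], index);
    }
  }
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) lanes.push(lane());
  await Promise.all(lanes);
}

// Process every page, sum the per-page results, and heartbeat after each page.
async function processFile(
  text: string,
  pageSize: number,
  limit: number,
  countPage: (page: string) => Promise<number>,
  heartbeat: (pagesCompleted: number) => void
): Promise<number> {
  const pages = paginate(text, pageSize);
  let total = 0;
  let completed = 0;
  await forEachBounded(pages, limit, async (page) => {
    const pageCount = await countPage(page); // await first, then add, to avoid lost updates
    total += pageCount;
    completed += 1;
    heartbeat(completed);
  });
  return total;
}
```

Awaiting the page result before adding it to `total` matters: writing `total += await countPage(page)` would read `total` before the await and could lose updates from other lanes.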

Horizontal scale: run N replicas of Worker and, typically, a single replica of SweeperWorker. The compare-and-swap claim makes worker replicas race-safe. The sweeper's ExecuteUpdateAsync is idempotent, so multiple sweeper replicas would also be safe, just wasteful.
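
To make the race-safety and idempotency claims concrete, the following TypeScript sketch models the table as an in-memory array and the guarded update as a function that returns the number of affected rows. It only mimics the shape of a `Where(...).ExecuteUpdateAsync(...)` call; in the real system the atomicity comes from the database executing a single UPDATE statement. The row shape and all function names are assumptions.

```typescript
// In-memory model for illustration; row fields and names are assumptions.
type FileStatus = "Created" | "Processing" | "Completed";

interface FileRow {
  id: number;
  status: FileStatus;
  workerId: string | null;
  createdAt: number; // milliseconds
  updatedAt: number; // milliseconds, refreshed by heartbeats
  pagesCompleted: number;
}

// Guarded update: change only rows that still match the guard,
// and report how many rows were changed.
function executeUpdate(
  rows: FileRow[],
  guard: (row: FileRow) => boolean,
  apply: (row: FileRow) => void
): number {
  let affected = 0;
  for (const row of rows) {
    if (guard(row)) {
      apply(row);
      affected += 1;
    }
  }
  return affected;
}

// FIFO candidate: the oldest row that is still Created.
function oldestCreated(rows: FileRow[]): FileRow | null {
  let oldest: FileRow | null = null;
  for (const row of rows) {
    if (row.status === "Created" && (oldest === null || row.createdAt < oldest.createdAt)) {
      oldest = row;
    }
  }
  return oldest;
}

// Compare-and-swap claim: succeeds only if the row is still Created.
function claim(rows: FileRow[], id: number, workerId: string, now: number): boolean {
  const affected = executeUpdate(
    rows,
    (row) => row.id === id && row.status === "Created",
    (row) => {
      row.status = "Processing";
      row.workerId = workerId;
      row.updatedAt = now;
    }
  );
  return affected === 1;
}

// Reset stale leases. Running it twice in a row changes nothing the second time.
function sweep(rows: FileRow[], now: number, staleLeaseMs: number): number {
  return executeUpdate(
    rows,
    (row) => row.status === "Processing" && now - row.updatedAt > staleLeaseMs,
    (row) => {
      row.status = "Created";
      row.workerId = null;
      row.pagesCompleted = 0;
      row.updatedAt = now;
    }
  );
}
```

If two workers pick the same candidate, the first `claim` returns true and the second returns false, because the guard no longer matches once the status has changed. The losing worker simply looks for another row.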

Prerequisites

You can run the backend either entirely in Docker (no .NET install needed) or directly on the host. Pick one path:

Docker path (simpler — recommended)

  • Docker Desktop with Compose v2

Host path

  • .NET 8 SDK
  • A locally-running PostgreSQL 16 (or any Postgres reachable from your machine) listening on port 6969, with a database called langanalyzer, user postgres, password postgres. Adjust the connection string in backend/src/LangAnalyzer.Api/appsettings.json (and the matching files for Worker / SweeperWorker) if your local setup differs.
  • dotnet-ef global tool (only if you want to run migrations manually): dotnet tool install --global dotnet-ef

Either path: Node 20+ and npm for the web UI.

Run with Docker (one command)

The compose file builds three .NET images and brings everything up: Postgres + API + two worker replicas + one sweeper. Migrations run automatically when the API container starts.

cd backend
docker compose up --build         # first time, builds the .NET images (~2-3 minutes)
docker compose up                 # subsequent runs use cached images

Once up:

  • API: http://localhost:5224 (Swagger at /swagger)
  • Postgres: localhost:6969 (user postgres, password postgres, database langanalyzer; DBeaver or any other client works)
  • Workers: 2 replicas (langanalyzer-worker-1, langanalyzer-worker-2)
  • Sweeper: 1 instance

Useful commands:

docker compose ps                 # confirm postgres healthy + 4 services running
docker compose logs -f api        # tail API logs
docker compose logs -f worker     # tail BOTH worker replicas
docker compose down               # stop everything (data persists)
docker compose down -v            # also wipe Postgres + uploads volumes (fresh start)

The web UI still runs on the host:

cd web
npm install
npm run dev                       # http://localhost:5173

It calls the API at http://localhost:5224, the same address used when the API runs on the host, so the UI doesn't need to know the API is in Docker.

Run on the host (without Docker)

All .NET commands run from backend/. Start your local PostgreSQL service first.

cd backend
dotnet build LangAnalyzer.sln

Apply migrations:

cd backend
dotnet ef database update \
  --project src/LangAnalyzer.EF \
  --startup-project src/LangAnalyzer.EF

Start the API (auto-migrates on startup as well):

cd backend
dotnet run --project src/LangAnalyzer.Api

Open Swagger at http://localhost:5224/swagger.

Start a file-processing worker (run this command in additional terminals to simulate replicas):

cd backend
dotnet run --project src/LangAnalyzer.Worker

Each worker process advertises itself as <machine>:<pid> via the worker_id column.

Start the stale-lease sweeper (one replica is normally enough):

cd backend
dotnet run --project src/LangAnalyzer.SweeperWorker

Start the web UI:

cd web
npm install
npm run dev    # http://localhost:5173

API

| Method | Route | Description |
|--------|-------|-------------|
| POST | /api/files (multipart file field) | Upload a text file; returns the new id. |
| GET | /api/files/{id}/status | Current status, pages_completed/total, error if any. Rate-limited (see below). |
| GET | /api/files/{id}/result | 200 with counts + dominant language when Completed; 409 otherwise. |

Sample upload:

curl -F "file=@./sample.txt" http://localhost:5224/api/files
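
After an upload, a client typically polls the status route and then fetches the result. The TypeScript sketch below shows one way to do that against the routes in the table. The JSON field names (`status`, `error`), the "Failed" status value, and the helper names are assumptions, since the response shapes are not spelled out here. The HTTP call and the sleep are injected so the sketch does not depend on a particular runtime.

```typescript
// Illustrative client sketch; response field names are assumptions.
interface HttpResponse {
  status: number;
  json(): Promise<unknown>;
}
type HttpGet = (url: string) => Promise<HttpResponse>;
type Sleep = (ms: number) => Promise<void>;

// Assumed shape of the status payload.
interface StatusBody {
  status: string;
  error?: string;
}

async function waitForResult(
  baseUrl: string,
  id: string,
  get: HttpGet,
  sleep: Sleep,
  pollMs: number = 2000,
  maxPolls: number = 100
): Promise<unknown> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const res = await get(`${baseUrl}/api/files/${id}/status`);
    if (res.status === 429) {
      // Rate-limited: wait and try again.
      await sleep(pollMs);
      continue;
    }
    if (res.status !== 200) throw new Error(`status request failed: ${res.status}`);

    const body = (await res.json()) as StatusBody;
    if (body.status === "Failed") throw new Error(body.error ?? "processing failed");
    if (body.status === "Completed") {
      const result = await get(`${baseUrl}/api/files/${id}/result`);
      if (result.status !== 200) throw new Error(`result request failed: ${result.status}`);
      return result.json();
    }
    await sleep(pollMs);
  }
  throw new Error("gave up waiting for the file to complete");
}
```

In a browser, `get` could be `(url) => fetch(url)` and `sleep` could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`. Polling every 2 seconds amounts to about 30 status requests per minute per file.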

Configuration

Each host reads from its own appsettings.json and appsettings.Development.json.

API + Worker:

{
  "ConnectionStrings": { "Postgres": "Host=localhost;..." },
  "Analyzer": {
    "UploadsRoot": "./Storage/uploads",
    "PageSize": 65536,
    "MaxDegreeOfParallelism": 10
  },
  "Worker": {
    "IdleBackoffSeconds": 2
  },
  "Cors": {
    "Origins": ["http://localhost:5173"]
  }
}

SweeperWorker:

{
  "ConnectionStrings": { "Postgres": "Host=localhost;..." },
  "Sweeper": {
    "IntervalSeconds": 5,
    "StaleLeaseSeconds": 20
  }
}
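
As a rough sanity check on how the two sweeper values interact, the TypeScript sketch below computes an upper bound on how long an abandoned row can sit in Processing after its last heartbeat. It assumes the sweeper fires at regular intervals and that staleness is a plain age comparison, and it ignores query time and clock skew. The interface and function names are hypothetical.

```typescript
// Back-of-the-envelope helper; names are assumptions for this example.
interface SweeperSettings {
  intervalSeconds: number;
  staleLeaseSeconds: number;
}

// A row becomes eligible staleLeaseSeconds after its last heartbeat,
// and the next sweeper tick arrives at most intervalSeconds later.
function worstCaseResetDelaySeconds(settings: SweeperSettings): number {
  return settings.staleLeaseSeconds + settings.intervalSeconds;
}

const defaults: SweeperSettings = { intervalSeconds: 5, staleLeaseSeconds: 20 };
const delaySeconds = worstCaseResetDelaySeconds(defaults); // 25 with the values above
```

The same reasoning suggests keeping the gap between a healthy worker's heartbeats well below StaleLeaseSeconds. Otherwise a slow but live worker could have its file reset underneath it.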

EF Core conventions

  • Every state transition uses ExecuteUpdateAsync with a guarding Where predicate (compare-and-swap). No raw SQL anywhere except what EF Core auto-generates in migrations.
  • The only SaveChanges call is the initial insert in FileManager (EF Core has no ExecuteInsert).
  • Snake_case column names are set explicitly via HasColumnName(...) in FileProcessingConfiguration.
  • Two indexes: (status, created_at) for the FIFO claim, (status, updated_at) for the stale-lease sweeper.

Add a new migration:

cd backend
dotnet ef migrations add <Name> \
  --project src/LangAnalyzer.EF \
  --startup-project src/LangAnalyzer.EF \
  --output-dir Migrations

Tests

cd backend
dotnet test LangAnalyzer.sln

  • LangAnalyzer.Core.Logic.Tests: unit tests for the language classifier.
  • LangAnalyzer.IntegrationTests: integration tests against a real AnalyzerDbContext over a SQLite in-memory database (no external dependencies). Currently covers parallel processing of two files by two concurrent workers.

Web UI

A React + Vite + TypeScript UI lives in web/. It uploads files to the API, polls each file's status every 2s, and lets you click "Get result" once a file is Completed. Failed files surface their error message inline.

cd web
npm install
npm run dev    # http://localhost:5173

Production bundle:

cd web
npm run build  # output in web/dist/

The UI talks to the API at VITE_API_BASE_URL (default http://localhost:5224). Override by copying web/.env.example to web/.env.local and editing the value. The API allows the UI's origin via the Cors:Origins block in backend/src/LangAnalyzer.Api/appsettings.json.

Rate limiting

GET /api/files/{id}/status is rate-limited to 60 requests / minute per remote IP using ASP.NET Core's built-in Microsoft.AspNetCore.RateLimiting (fixed-window). Excess calls receive 429 Too Many Requests. Upload and result endpoints are not rate-limited.

The policy is wired in backend/src/LangAnalyzer.Api/Startup.cs and applied selectively to the status action via [EnableRateLimiting(Startup.StatusPollRateLimitPolicy)] on backend/src/LangAnalyzer.Api/Controllers/FilesController.cs.
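
For readers unfamiliar with fixed-window limiting, here is a minimal TypeScript model of the idea: each key (here, a remote IP) gets a counter that resets when its window expires. This illustrates the technique only and is not how ASP.NET Core's limiter is implemented. In particular, starting a key's window at its first request is a simplification, and the class and method names are hypothetical.

```typescript
// Conceptual model of a fixed-window limiter; not ASP.NET Core's implementation.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();

  constructor(
    private readonly permitLimit: number,
    private readonly windowMs: number
  ) {}

  // Returns true if the request is allowed; false means the caller
  // should answer with 429 Too Many Requests.
  tryAcquire(key: string, nowMs: number): boolean {
    let current = this.windows.get(key);
    if (current === undefined || nowMs - current.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      current = { windowStart: nowMs, count: 0 };
      this.windows.set(key, current);
    }
    if (current.count >= this.permitLimit) return false;
    current.count += 1;
    return true;
  }
}

// 60 requests per minute per key, matching the policy described above.
const statusLimiter = new FixedWindowLimiter(60, 60_000);
```

A known property of fixed windows is that a client can send a burst at the end of one window and another at the start of the next, so the short-term rate can briefly exceed the nominal limit.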
