06 — Batch Operations UI
Ring: 2 (Retention), also usable by super_admin in Ring 1
Dependency: R1-1 (Auth), R1-2 (Cost Control)
Handbook: Ch. 229 (pipeline architecture); batch scripts already running (CLI)
Problem
- 4 batch scripts (cleaner, finder, bulk_fixer, processor) run only from the CLI.
- They require terminal access and cannot be used from the web UI.
- No progress tracking: the operator has to wait until the script finishes.
- Weak error handling: if a script crashes, it is unclear what happened.
- File upload (scraper) has only minimal security limits.
Decisions
D1: Access — 2-Phase Rollout
| Phase | Who Uses It | Reason |
|---|---|---|
| Phase 1 (Ring 1-2) | super_admin only | Not opened to users until security protocols are tested |
| Phase 2 (Ring 3+) | Enterprise plan — with feature-based roles | Roles with can_run_batch permission. After security is proven. |
Batch operations are DISABLED on the Free/Pro/Team plans (batch_operations: false in plan_limits).
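The access rule in D1 could be expressed as a single check, sketched below. Only super_admin, can_run_batch, batch_operations and plan_limits come from this document; the lowercase plan keys, the BatchUser shape, the function name canRunBatch, and the assumption that super_admin keeps access in Phase 2 are all hypothetical.

```typescript
// Hypothetical plan keys; the document only names the plans Free/Pro/Team/Enterprise.
type Plan = "free" | "pro" | "team" | "enterprise";

// Mirrors "batch_operations: false in plan_limits" for the non-Enterprise plans.
const PLAN_LIMITS: Record<Plan, { batch_operations: boolean }> = {
  free: { batch_operations: false },
  pro: { batch_operations: false },
  team: { batch_operations: false },
  enterprise: { batch_operations: true },
};

// Hypothetical user shape, for illustration only.
interface BatchUser {
  role: string;
  plan: Plan;
  permissions: string[];
}

// Phase 1: super_admin only. Phase 2: Enterprise plan plus the can_run_batch permission.
// Assumption: super_admin keeps access in Phase 2 as well.
function canRunBatch(user: BatchUser, phase: 1 | 2): boolean {
  if (user.role === "super_admin") return true;
  if (phase === 1) return false;
  return (
    PLAN_LIMITS[user.plan].batch_operations &&
    user.permissions.includes("can_run_batch")
  );
}
```

Keeping the phase as an explicit parameter makes the Phase 1 to Phase 2 switch a configuration change rather than a code change, which may or may not match how the rollout is actually toggled.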
D2: 4 Batch Operations → Web UI
| Script | API Endpoint | What It Does |
|---|---|---|
| cleaner.js | POST /api/batch/clean | Segment audit + IRRELEVANT cleanup |
| finder.js | POST /api/batch/headhunt | Bulk contact discovery |
| bulk_fixer.js | POST /api/batch/enrich | Bulk company enrichment |
| processor.js | POST /api/batch/discover | Keywords + countries → bulk discovery |
D3: SSE Streaming Progress
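Progress is streamed to the browser as Server-Sent Events, built with ReadableStream + TextEncoder (route handlers in the Next.js App Router can return a streamed Response natively).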
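A minimal sketch of the ReadableStream + TextEncoder approach follows. The BatchProgress payload, the event names progress/done/error, and the function names are assumptions for illustration; the document does not define the wire format. In a route.ts file, buildSseResponse would be called from the exported POST handler.

```typescript
// Hypothetical progress payload; the design notes do not specify its fields.
type BatchProgress = {
  processed: number;
  total: number;
  message?: string;
};

// One SSE frame: an "event:" line, a "data:" line, then a blank line.
// JSON.stringify never emits raw newlines, so the data stays on one line.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Turns an async sequence of progress updates into a byte stream of SSE frames.
function createProgressStream(
  steps: AsyncIterable<BatchProgress>,
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const step of steps) {
          controller.enqueue(encoder.encode(formatSseEvent("progress", step)));
        }
        controller.enqueue(encoder.encode(formatSseEvent("done", {})));
      } catch (err) {
        // Report the failure to the client instead of silently dropping the stream.
        const message = err instanceof Error ? err.message : String(err);
        controller.enqueue(encoder.encode(formatSseEvent("error", { message })));
      } finally {
        controller.close();
      }
    },
  });
}

// Wraps the stream in a Response with the SSE content type.
function buildSseResponse(steps: AsyncIterable<BatchProgress>): Response {
  return new Response(createProgressStream(steps), {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache, no-transform",
    },
  });
}
```

Sending a final error event addresses the "unclear what happened" problem from the CLI scripts: the UI can show the failure reason rather than an abruptly closed connection.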
D4: File Upload Security (Scraper)
Now (super_admin only):
| Rule | Value | Current |
|---|---|---|
| Max file size | 50MB | ✅ Exists (FIX-10) |
| Timeout | 60s | ✅ Exists (FIX-10) |
| Allowed file types | .pdf, .xlsx, .xls, .docx, .csv | ⚠️ CSV missing |
| Concurrent job limit | 1 (sufficient for super_admin) | ❌ Missing |
Phase 2 (before Enterprise launch):
| Rule | Value | When |
|---|---|---|
| File type whitelist (MIME + extension) | Strict check | Phase 2 |
| Virus/malware scan (ClamAV) | Every upload scanned | Phase 2 |
| Row limit per file | Max 10,000 records | Phase 2 |
| Concurrent job limit (per org) | 1 active job | Phase 2 |
| Quarantine folder | Suspicious files quarantined | Phase 2 |
In Phase 1 super_admin is the only user, so heavy security is unnecessary. Phase 2 security is implemented before Enterprise launch.
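The Phase 1 rules (size limit and allowed file types) could be checked with something like the sketch below. The function name validateUpload and the result shape are hypothetical, and reading "50MB" as 50 × 1024 × 1024 bytes is an assumption. The 60s timeout and the Phase 2 rules (MIME check, ClamAV scan, row limit) are not covered here.

```typescript
// Assumption: "50MB" means 50 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 50 * 1024 * 1024;

// Allowed types from the D4 table, including the CSV entry that is still to be added.
const ALLOWED_EXTENSIONS = new Set([".pdf", ".xlsx", ".xls", ".docx", ".csv"]);

// Hypothetical result shape: either ok, or a reason to show the user.
type UploadCheck = { ok: true } | { ok: false; reason: string };

// Extension-only check; it does not inspect file contents or MIME type.
function validateUpload(fileName: string, sizeBytes: number): UploadCheck {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(extension)) {
    return {
      ok: false,
      reason: `File type "${extension || "(none)"}" is not allowed`,
    };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: "File exceeds the 50MB limit" };
  }
  return { ok: true };
}
```

An extension check alone is easy to bypass by renaming a file, which is why the strict MIME + extension whitelist is listed separately for Phase 2.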
D5: Batch Job Tracking
Every batch job is recorded in the export_ai_ai_job_runs table:
- job_type: 'batch_clean' | 'batch_headhunt' | 'batch_enrich' | 'batch_discover'
- status: 'running' → 'done' | 'failed'
- Progress updated from within the SSE stream
- Cost: each AI call’s estimated_cost is summed
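The job lifecycle above can be sketched with an in-memory stand-in for the table. Only job_type, status and estimated_cost are taken from this document; the other fields (id, org_id, processed, total), the JobRunStore class and its method names are hypothetical. The sketch also shows how the "1 active job" limit from D4 could be derived from rows that are still in the running state.

```typescript
type JobType = "batch_clean" | "batch_headhunt" | "batch_enrich" | "batch_discover";
type JobStatus = "running" | "done" | "failed";

// Hypothetical row shape; only job_type, status and estimated_cost are from the design notes.
interface JobRun {
  id: number;
  org_id: string;
  job_type: JobType;
  status: JobStatus;
  processed: number;
  total: number;
  estimated_cost: number;
}

// In-memory stand-in for the job runs table, for illustration only.
class JobRunStore {
  private runs: JobRun[] = [];
  private nextId = 1;

  // Returns null when the org already has a running job (limit of 1 active job).
  start(orgId: string, jobType: JobType, total: number): JobRun | null {
    const busy = this.runs.some(
      (r) => r.org_id === orgId && r.status === "running",
    );
    if (busy) return null;
    const run: JobRun = {
      id: this.nextId++,
      org_id: orgId,
      job_type: jobType,
      status: "running",
      processed: 0,
      total,
      estimated_cost: 0,
    };
    this.runs.push(run);
    return run;
  }

  // Called once per processed item: bumps progress and adds that AI call's estimated cost.
  recordItem(run: JobRun, callCost: number): void {
    run.processed += 1;
    run.estimated_cost += callCost;
  }

  // Moves the job out of "running" so the org can start another one.
  finish(run: JobRun, ok: boolean): void {
    run.status = ok ? "done" : "failed";
  }
}
```

With a real database the busy check and the insert would need to happen atomically (for example in one transaction), otherwise two simultaneous requests could both pass the check.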
Architecture
Operations Page
API Route Structure
Current Code Impact
Status of Existing Scripts
CLI scripts WILL NOT be removed. API routes will be built with the scripts’ logic, but CLI versions are kept for dev/debug. Scripts may be deprecated later.
New Files
| File | Content |
|---|---|
| app/api/batch/clean/route.ts | SSE streaming batch cleanup |
| app/api/batch/discover/route.ts | SSE streaming bulk discovery |
| app/api/batch/enrich/route.ts | SSE streaming bulk enrichment |
| app/api/batch/headhunt/route.ts | SSE streaming bulk contact discovery |
| lib/batch/stream-helpers.ts | SSE encoder, progress event builder |
| lib/batch/job-guard.ts | Concurrent job check (1 per org) |
| app/operations/page.tsx | Operations UI (4 cards + form + progress) |
| app/operations/components/BatchProgressPanel.tsx | SSE consumer + progress bar |
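Because the batch endpoints are POST routes, the browser's EventSource API (which only issues GET requests) cannot consume them directly; the SSE consumer would likely read the response body with fetch instead. The sketch below shows one way to do that. SseParser, SseMessage and consumeBatchStream are hypothetical names, and the parser is simplified: it handles only "\n" line endings and the event/data fields.

```typescript
interface SseMessage {
  event: string;
  data: string;
}

// Incremental parser: feed decoded text chunks, get back the frames completed so far.
class SseParser {
  private buffer = "";

  feed(chunk: string): SseMessage[] {
    this.buffer += chunk;
    const messages: SseMessage[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const frame = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      let event = "message"; // default event name when no "event:" line is present
      const dataLines: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice(6).trim();
        } else if (line.startsWith("data:")) {
          dataLines.push(line.slice(5).replace(/^ /, ""));
        }
      }
      if (dataLines.length > 0) {
        messages.push({ event, data: dataLines.join("\n") });
      }
      boundary = this.buffer.indexOf("\n\n");
    }
    return messages;
  }
}

// Reads a POST response as a stream and hands each parsed frame to the callback.
async function consumeBatchStream(
  url: string,
  body: unknown,
  onMessage: (message: SseMessage) => void,
): Promise<void> {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok || !response.body) {
    throw new Error(`Batch request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const parser = new SseParser();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    for (const message of parser.feed(decoder.decode(value, { stream: true }))) {
      onMessage(message);
    }
  }
}
```

Buffering until a blank line is seen matters because network chunks do not align with SSE frames; a single frame may arrive split across two reads.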
Scraper Change
| File | Change |
|---|---|
| app/api/scraper/run/route.ts | CSV file type support to be added |
Future Decisions
FD-1: Batch Queue (Ring 4)
Background job queue instead of SSE (BullMQ/Redis). Job continues even if the user closes the page. Runs on a Hetzner worker VPS.
FD-2: Scheduled Batches (Ring 3+)
Scheduled batch operations — “run cleaner every Monday”. Cron job UI.
FD-3: Batch History & Replay (Ring 3)
List of past batch operations, results, re-run capability.
Atomic Tasks
| # | Task | Ring | Size |
|---|---|---|---|
| BATCH-1 | lib/batch/stream-helpers.ts — SSE encoder + progress builder | R2 | Small |
| BATCH-2 | lib/batch/job-guard.ts — concurrent job check | R2 | Small |
| BATCH-3 | POST /api/batch/clean — cleaner.js logic + SSE | R2 | Large |
| BATCH-4 | POST /api/batch/discover — processor.js logic + SSE | R2 | Large |
| BATCH-5 | POST /api/batch/enrich — bulk_fixer.js logic + SSE | R2 | Large |
| BATCH-6 | POST /api/batch/headhunt — finder.js logic + SSE | R2 | Large |
| BATCH-7 | app/operations/page.tsx — 4 cards + parameter form | R2 | Medium |
| BATCH-8 | BatchProgressPanel.tsx — SSE consumer + progress bar + cost display | R2 | Medium |
| BATCH-9 | Add CSV support to scraper | R2 | Small |
| BATCH-10 | Phase 2 security (ClamAV + MIME check + row limit) — before Enterprise | R3 | Medium |