[codex] fix large repository graph loading #732
JaysonAlbert wants to merge 1 commit into abhigyanpatwari:main
Conversation
|
Someone is attempting to deploy a commit to the NexusCore Team on Vercel. A member of the Team first needs to authorize it. |
|
@claude is this ready for production? Are there any performance issues or gaps in the implementation? Did we implement what we promised in the summary? Also, analyze the architectural fit. |
|
Claude finished @magyargergo's task in 2m 3s — View job

PR Review: Large Graph Streaming

TL;DR

The fix correctly addresses the root cause (`Invalid string length` when serializing a large graph as one JSON string).

✅ Summary Promises — Delivered
🔴 Production Blockers

1. No client-disconnect handling — the stream keeps running after the browser closes the connection.

```ts
// Missing in api.ts around line 508:
req.on('close', () => { /* need a way to abort streamGraphNdjson */ });
```

2. Session lock held for the entire stream duration — blocks all other DB queries.

```ts
// api.ts:512 — this locks the session for the full stream duration
await withLbugDb(lbugPath, async () => streamGraphNdjson(res, includeContent));
```

The solution would be to either (a) release the lock after the query is prepared and use a cursor approach, or (b) accept the limitation but document it clearly.

3. No backpressure — the return value of `res.write()` is never checked, so a slow client lets the outgoing buffer grow without bound.

```ts
// api.ts:194 — no backpressure
res.write(JSON.stringify({ type: 'node', data: ... }) + '\n');
```

🟡 Significant Gaps

4. Silent error swallowing in node table queries

```ts
// api.ts:201-203
} catch {
  // ignore empty tables
}
```

This catches all errors, not just "table empty" errors. A real DB failure mid-stream would be silently discarded, leaving the client with a partial graph and no error signal. It should at minimum check that the error is "table not found" before swallowing it.

5. Client fallback JSON path is dead code
6.

```ts
// backend-client.ts:455-456
const contentLength = response.headers.get('Content-Length');
const total = contentLength ? parseInt(contentLength, 10) : null;
```

NDJSON responses use chunked transfer encoding — no `Content-Length` header is present, so `total` is always `null` on the streamed path.

7.
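Blockers #1 and #3 above point at the same write loop, and one fix can cover both. The following is a minimal sketch, not the PR's actual code: `NdjsonSink` models just the parts of an HTTP response it relies on, and `streamNdjson`, `graphRows`, and the wiring names are invented for illustration.

```typescript
// Hedged sketch: an NDJSON writer that stops when the client disconnects
// and honors socket backpressure. Names here are assumptions, not the PR's API.
interface NdjsonSink {
  // Returns false when the internal buffer is full; caller should wait for 'drain'.
  write(chunk: string): boolean;
  once(event: 'drain', cb: () => void): void;
}

export async function streamNdjson(
  sink: NdjsonSink,
  rows: Iterable<object> | AsyncIterable<object>,
  signal: AbortSignal,
): Promise<void> {
  for await (const row of rows) {
    if (signal.aborted) return; // client went away: stop pulling rows
    const ok = sink.write(JSON.stringify(row) + '\n');
    if (!ok) {
      // Backpressure: pause until the socket buffer drains before writing more.
      await new Promise<void>((resolve) => sink.once('drain', resolve));
    }
  }
}

// Wiring sketch for the /api/graph handler (identifiers assumed):
//   const ac = new AbortController();
//   req.on('close', () => ac.abort());
//   await streamNdjson(res, graphRows, ac.signal);
```

Checking `signal.aborted` between rows means a disconnect stops the DB iteration promptly instead of streaming into a dead socket, and awaiting `'drain'` keeps memory bounded for slow clients.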
🟢 Architectural Assessment

The NDJSON-over-HTTP approach is a good architectural fit:

The main architectural concern is the session lock holding strategy: the single global session lock is held for the full duration of a stream, serializing every other DB query behind it.

🧪 Test Coverage Gaps
Verdict

Not production-ready as-is. The root-cause fix is correct and the approach is sound, but issues #1 (client disconnect) and #3 (backpressure) are genuine production risks for the large-repo use case this PR targets. Issue #2 (session lock) is an architectural concern that should be documented if not fixed. The rest are lower priority but worth tracking. |
|
@JaysonAlbert could you please look at these findings? |
I'll check it out |
CI Report

❌ Some checks failed

Pipeline Status

Test Results

✅ All 5486 tests passed · 92 test(s) skipped

Code Coverage
📋 View full run · Generated by CI |
|
Thank you for your contribution too! This looks very promising! |
|
We need to ensure green CI and increased test coverage. |
Summary

- Stream `/api/graph` responses as NDJSON instead of building one huge JSON string

Root Cause

Large indexed repositories can produce an `/api/graph` payload that is too large to serialize as a single JSON string in Node. That caused the backend to fail with `Invalid string length` before the browser could finish loading the repository.

Fix

- Add `streamQuery()` on the LadybugDB adapter so large result sets can be emitted row-by-row
- Add `stream=true` support on `/api/graph` and write node/relationship records as NDJSON
- Update `fetchGraph()` in `gitnexus-web` to consume NDJSON incrementally while preserving the old JSON fallback

Fixes #495.
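The client-side half of the fix — consuming NDJSON incrementally — needs a chunk-boundary-safe line splitter, because a network chunk can end in the middle of a record. The helper below is a sketch of that technique only; `makeNdjsonParser` is an invented name, not the PR's `fetchGraph()` code.

```typescript
// Hedged sketch: accumulate streamed text chunks and emit one parsed JSON
// record per complete line, holding back any trailing partial line.
export function makeNdjsonParser(): (chunk: string) => object[] {
  let buffer = '';
  return (chunk: string): object[] => {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the trailing partial line for the next chunk
    return lines.filter((l) => l.length > 0).map((l) => JSON.parse(l) as object);
  };
}
```

A `fetchGraph()`-style consumer would decode `response.body` chunks, feed them through this parser, and dispatch each `type: 'node'` / `type: 'relationship'` record as it arrives, falling back to the plain-JSON path when the server does not stream.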
Validation

- `cd gitnexus-web && npm test -- test/unit/server-connection.test.ts`
- `cd gitnexus && npm test -- test/unit/cors.test.ts`
- `cd gitnexus && npx tsc --noEmit`
- `cd gitnexus-web && npx tsc -b --noEmit`
- Loaded `libra-server-release-1.63.x` (338911 nodes / 1437718 relationships) through the streamed graph path