Server: Performance: Store update changes per-user, rather than per-item#14712
Server: Performance: Store update changes per-user, rather than per-item#14712personalizedrefrigerator wants to merge 159 commits intolaurent22:devfrom
Conversation
This commit modifies `populateDatabase.ts` to: - Be runnable from the api/debug (dev-mode-only) endpoint. - Create shares, in addition to users and items.
…into pr/server/denormalize-changes
…ng the same name as an enum imported by the same file
processing each page Note: This change should fix a potential issue flagged during code review.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/server/src/migrations/planned/regroup_changes.ts (1)
184-184: Prefer template literal for string construction.Based on learnings: In all TypeScript files across the repository, prefer template literals for string construction instead of string concatenation with
+.♻️ Suggested refactor
- logger.info('Migrating', await getTotal('changes') + await getTotal('changes_2'), 'changes... start', offset); + const totalChanges = await getTotal('changes') + await getTotal('changes_2'); + logger.info(`Migrating ${totalChanges} changes... start ${offset}`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server/src/migrations/planned/regroup_changes.ts` at line 184, Replace the string concatenation in the logger.info call with a template literal: update the logging expression that currently uses logger.info('Migrating', await getTotal('changes') + await getTotal('changes_2'), 'changes... start', offset) to a single template literal string that interpolates the totals from await getTotal('changes'), await getTotal('changes_2') and the offset so the message is constructed with backticks and ${...} placeholders while still calling logger.info with the final interpolated string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/server/src/migrations/planned/regroup_changes.ts`:
- Around line 190-200: The down migration's lookup of lastChange may be
undefined when changes_2 is empty, causing lastChange.counter to throw; modify
the down function (around the lastChange retrieval) to check for a missing
result (e.g., if (!lastChange) return or otherwise handle it) before using
lastChange.counter so the subsequent insert from db('changes_3') only runs when
a valid counter exists; update the logic in the down function referencing
lastChange, db('changes_2'), and db('changes_3') accordingly.
---
Nitpick comments:
In `@packages/server/src/migrations/planned/regroup_changes.ts`:
- Line 184: Replace the string concatenation in the logger.info call with a
template literal: update the logging expression that currently uses
logger.info('Migrating', await getTotal('changes') + await
getTotal('changes_2'), 'changes... start', offset) to a single template literal
string that interpolates the totals from await getTotal('changes'), await
getTotal('changes_2') and the offset so the message is constructed with
backticks and ${...} placeholders while still calling logger.info with the final
interpolated string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 51d58e67-d1de-4fb3-b79c-8faa660f35b2
📒 Files selected for processing (16)
packages/server/src/migrations/20260310123600_split_changes.tspackages/server/src/migrations/planned/regroup_changes.tspackages/server/src/models/ChangeModel/BaseChangeModel.tspackages/server/src/models/ChangeModel/ChangeModel.new.tspackages/server/src/models/ChangeModel/ChangeModel.old.tspackages/server/src/models/ChangeModel/ChangeModel.test.tspackages/server/src/models/ChangeModel/ChangeModel.tspackages/server/src/models/ItemModel.tspackages/server/src/models/ShareModel.tspackages/server/src/models/UserItemModel.tspackages/server/src/models/factory.tspackages/server/src/routes/api/items.tspackages/server/src/routes/api/shares.folder.test.tspackages/server/src/services/database/types.tspackages/server/src/tools/generateTypes.tspackages/tools/cspell/dictionary4.txt
✅ Files skipped from review due to trivial changes (5)
- packages/server/src/tools/generateTypes.ts
- packages/tools/cspell/dictionary4.txt
- packages/server/src/routes/api/shares.folder.test.ts
- packages/server/src/models/UserItemModel.ts
- packages/server/src/routes/api/items.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/server/src/models/ItemModel.ts
- packages/server/src/services/database/types.ts
- packages/server/src/migrations/20260310123600_split_changes.ts
Applies feedback from code review
There was a problem hiding this comment.
♻️ Duplicate comments (1)
packages/server/src/migrations/planned/regroup_changes.ts (1)
216-227:⚠️ Potential issue | 🟡 MinorHandle empty
changes_2table in down migration.If
changes_2is empty,lastCounterretrieval viagetLastPreMigrationCounterwill returnundefined.counter, causing a runtime error. This scenario could occur if the migration runs before any new changes are recorded.🐛 Proposed fix
export const down = async (db: DbConnection) => { - const lastCounter = await getLastPreMigrationCounter(db); + const lastChange = await db('changes_2') + .select('counter') + .orderBy('counter', 'desc') + .first(); - await db('changes_2').insert( - db('changes_3') - .select('*') - .where('counter', '>', lastCounter) - .orderBy('counter', 'asc'), - ); + if (lastChange) { + await db('changes_2').insert( + db('changes_3') + .select('*') + .where('counter', '>', lastChange.counter) + .orderBy('counter', 'asc'), + ); + } else { + // If changes_2 was empty, copy all changes_3 records + await db('changes_2').insert( + db('changes_3') + .select('*') + .orderBy('counter', 'asc'), + ); + } await db.schema.dropTable('changes_3'); };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server/src/migrations/planned/regroup_changes.ts` around lines 216 - 227, The down migration's use of getLastPreMigrationCounter() can return undefined and cause a runtime error when accessing lastCounter; update the down function to guard against an empty changes_2 by checking if lastCounter is null/undefined before using it — if undefined, perform the insert from changes_3 without the .where('counter','>', lastCounter) filter (i.e., copy all rows), otherwise use the existing where clause; reference the down function, getLastPreMigrationCounter, and tables changes_2/changes_3 when making this change.
🧹 Nitpick comments (1)
packages/server/src/migrations/planned/regroup_changes.ts (1)
200-200: Use template literal for string construction.Per coding guidelines, prefer template literals over string concatenation.
🔧 Suggested fix
- logger.info('Migrating', await getTotal('changes') + await getTotal('changes_2'), 'changes... start', offset); + const totalChanges = await getTotal('changes') + await getTotal('changes_2'); + logger.info(`Migrating ${totalChanges} changes... start ${offset}`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server/src/migrations/planned/regroup_changes.ts` at line 200, Replace the concatenated logger.info call with a single template literal string so it uses template syntax instead of passing mixed args; locate the line that calls logger.info in regroup_changes.ts (the call that references getTotal('changes'), getTotal('changes_2') and offset) and construct one template literal like `Migrating ${await getTotal('changes') + await getTotal('changes_2')} changes... start ${offset}` before passing it to logger.info.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/server/src/migrations/planned/regroup_changes.ts`:
- Around line 216-227: The down migration's use of getLastPreMigrationCounter()
can return undefined and cause a runtime error when accessing lastCounter;
update the down function to guard against an empty changes_2 by checking if
lastCounter is null/undefined before using it — if undefined, perform the insert
from changes_3 without the .where('counter','>', lastCounter) filter (i.e., copy
all rows), otherwise use the existing where clause; reference the down function,
getLastPreMigrationCounter, and tables changes_2/changes_3 when making this
change.
---
Nitpick comments:
In `@packages/server/src/migrations/planned/regroup_changes.ts`:
- Line 200: Replace the concatenated logger.info call with a single template
literal string so it uses template syntax instead of passing mixed args; locate
the line that calls logger.info in regroup_changes.ts (the call that references
getTotal('changes'), getTotal('changes_2') and offset) and construct one
template literal like `Migrating ${await getTotal('changes') + await
getTotal('changes_2')} changes... start ${offset}` before passing it to
logger.info.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 24606cab-acb3-4618-8d14-90d5c674fec7
📒 Files selected for processing (5)
packages/server/src/migrations/20260310123600_split_changes.tspackages/server/src/migrations/planned/regroup_changes.test.tspackages/server/src/migrations/planned/regroup_changes.tspackages/server/src/models/ChangeModel/ChangeModel.old.tspackages/server/src/models/ChangeModel/ChangeModel.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/server/src/models/ChangeModel/ChangeModel.new.ts`:
- Around line 30-59: The method changesForUserQuery currently always declares a
return type of Promise<Changes2[]> but returns a count shape when doCountQuery
is true; update the function signature to use TypeScript overloads so one
overload returns Promise<{ total: number }[]> when doCountQuery is true and the
other returns Promise<Changes2[]> when doCountQuery is false, implement a single
implementation matching both overloads, and mirror these overloads in the
abstract declaration on BaseChangeModel so all subclasses (including
ChangeModel) have accurate type contracts; ensure call sites compile and that
the runtime still returns the count rows when doCountQuery is true and Changes2
rows otherwise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5f6fe62d-28fc-44d1-9d08-695f43b06745
📒 Files selected for processing (1)
packages/server/src/models/ChangeModel/ChangeModel.new.ts
Summary
This pull request should:
deltaqueries in Joplin Server.May resolve #13655.
Follow-up to #13862.
IMPORTANT: This pull request makes
Updatechanges per-user. As such, in a share with 100 participants, a change to a single item results in 100 entries written to thechanges_2table on the server. Previously, just one entry would be written. This pull request will makechanges_2grow faster than it did before.Problems
changestable query speed #13655, the current delta query is slow in certain hard-to-reproduce cases.changestable is capped at just over two billion entries in PostgreSQL.changestable uses aserialfor its autoincrementingcounter, which is four bytes and has a maximum of 2147483647.Solution
changes_2table for new changes, stop updatingchanges.changes_2.changesis an archive of old, pre-migration changes.changescontains changes in the old format.changes_2contains changes in the new format.server_delta_sync.mdfor further details.bigserialinstead of aserialfor thechanges_2counter.changes_2, rather than per-item.deltaquery.server_delta_sync.mdfor further details.Notes
changes_2'scounteris set to start just after the last change inchanges. If new changes are written tochangesafter the migration, there may be issues with querying changes.migrate downstep involves copying and reformatting changes fromchanges_2to thechangestable. Ifchanges_2has a large number of entries, thismigrate downstep will be slow.Changes2, while old changes have typeChange. The database type generator currently usessqltsfor generating type names, which doesn't depluralizechanges_2.generateTypes.tscan be updated to use a different name (e.g.Change2) for the new change type.ChangeModel.oldandChangeModel.newcontain logic for interacting with the old and new changes tables, respectively.ChangeModelis responsible for bridging between the two.ChangeModel.newandChangeModel.oldwere copied and modified from the originalChangeModellogic. At present, some logic (e.g. thecompressOldChangesimplementation) is roughly the same in both.ChangeModel.new'srecordChangeavoids usingBaseModel.saveand instead applies all changes at once. This fixes a performance issue in shares with hundreds of participants.In the above chart, "unoptimized" (
BaseItem.save-based change recording) has a much steeper slope than "optimized" (bulkdb().insert-based change recording). For example, in a share with 300 participants, it takes about 10 seconds to creating an item, write 10 update changes, then deleting it. However, if changes are saved with a bulk call todb().insert, the same process takes about 300 ms.Even with the optimization, writing a change now involves a call to
allShareUserIds, which will impact write performance.Comparison with #13862
changestochanges_2. This was a slow and very complicated migration.changes. The migration is much faster, sincechanges_2can be initially almost empty.Performance
Some performance testing has been done by:
populateDatabasedebug calls).benchmarkDeltaPerformancedebug command..csvfiles to a safe place.dev.benchmarkDeltaPerformance.Graphing the results from step 2 and step 5 results in:

This chart suggests that the new
changes_2-based approach is much faster. However, the chart in the "old format" case is generated using data from the "migrate down" logic. As a result, there are the same number of "Update" changes with the old and new format data. Realistically, there would be fewer "Update" changes with the old format.Update: For more up-to-date benchmarks, see #14712 (comment).
Testing
Fuzzing
This pull request has been tested with the sync fuzzer. To test the old/new changes table split, the server was stopped and the migration was run while running the sync fuzzer. This results in some changes in
changesand some inchanges_2for user accounts under test by the fuzzer.Steps:
devbranch.DB_CLIENT=pg yarn start --env devfrompackages/server.DB_CLIENT=pg yarn syncFuzzer startpr/server/denormalize-changesbranch.changes.changestable and 909 entries inchanges_2.changes_2(and still 1283 entries inchanges).Note
Most testing was done before 6f7d608. However, it has been verified that the sync fuzzer runs successfully from 646f0fb for >= 90 steps, and again from 36cdad9 that the fuzzer runs successfully for >= 150 steps.