fix: add MarkInfo and ViewerPreferences to accessible PDF output #14999
anuradha1304 wants to merge 1 commit into laurent22:dev from
The createAccessiblePdf function was generating PDFs without accessibility tags, despite the feature being marketed as creating accessible documents. pdf-lib does not set MarkInfo or ViewerPreferences by default, so screen readers could not identify the output as tagged. Fix: inject MarkInfo << /Marked true >> and ViewerPreferences into the PDF catalog after document creation, using pdf-lib's low-level context API. This matches what Electron's generateTaggedPDF flag does for the note export path. Fixes laurent22#14994
Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (3 passed)
🧹 Nitpick comments (1)
packages/lib/services/ocr/utils/createAccessiblePdf.ts (1)
1-2: Consider consolidating imports from the same module. Both lines import from `pdf-lib`. Merging them into a single import statement would be cleaner.
Proposed fix:
-import { PDFDocument, PDFFont, PDFPage, rgb, StandardFonts } from 'pdf-lib';
-import { PDFBool, PDFName } from 'pdf-lib';
+import { PDFBool, PDFDocument, PDFFont, PDFName, PDFPage, rgb, StandardFonts } from 'pdf-lib';
With this change, lists, headings, etc. still don't seem to be recognized and tagged as such. What are the accessibility benefits of this change by itself? Notes:
(Thank you for the pull request!)
You're right that without a structure tree the practical benefit is limited. The main value is that some assistive tools check for MarkInfo before attempting to parse anything; without it they skip the document entirely.
Thanks for the clarification! To help prevent future regressions, consider adding a code comment with an example of accessibility tools that will skip parsing the document. (A link to documentation for
Fixes #14994

Problem
The "Create accessible document" context menu option generates PDFs without accessibility tags. `pdf-lib` does not set `MarkInfo` or `ViewerPreferences` in the PDF catalog by default, so screen readers cannot identify the output as a tagged document, despite the feature being specifically for accessibility.

The note export path already handles this correctly via Electron's `generateTaggedPDF` flag in `InteropServiceHelper.ts`. This fix brings the OCR-based accessible PDF path to the same standard.

Fix
Inject `MarkInfo << /Marked true >>` and `ViewerPreferences` into the PDF catalog using `pdf-lib`'s low-level context API, immediately before saving the document.

Changes
File: `packages/lib/services/ocr/utils/createAccessiblePdf.ts`
Change: Added MarkInfo and ViewerPreferences to PDF catalog
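
For readers who have not used `pdf-lib`'s low-level API, the catalog injection described above could look roughly like the sketch below. The PR's actual diff is not shown in this thread, so the helper name `markAsTagged` and the choice of `DisplayDocTitle` as the viewer preference are illustrative assumptions, not the author's code.

```typescript
import { PDFDocument, PDFName } from 'pdf-lib';

// Hypothetical helper (name is an assumption): adds the two catalog entries
// discussed in this PR to an in-memory document before it is saved.
export const markAsTagged = (pdfDoc: PDFDocument): void => {
	const { catalog, context } = pdfDoc;

	// Writes /MarkInfo << /Marked true >> into the document catalog.
	// context.obj() converts the literal into a PDFDict containing a PDFBool.
	catalog.set(PDFName.of('MarkInfo'), context.obj({ Marked: true }));

	// Assumption: the PR does not say which viewer preferences it sets.
	// DisplayDocTitle is used here only as a plausible example.
	catalog.set(PDFName.of('ViewerPreferences'), context.obj({ DisplayDocTitle: true }));
};

// Usage sketch: build a one-page document, mark it, then serialize it.
export const buildExample = async (): Promise<Uint8Array> => {
	const pdfDoc = await PDFDocument.create();
	pdfDoc.setTitle('Example');
	pdfDoc.addPage();
	markAsTagged(pdfDoc);
	return pdfDoc.save();
};
```

Note that this only sets catalog flags. It does not create a structure tree, so headings and lists remain untagged.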
Note
This adds the required catalog entries for tagged PDFs. A fully PDF/UA compliant document would additionally require a structure tree, which is beyond the scope of this fix and the current pdf-lib capabilities.

Test Plan
- `pdfinfo`, `pdfid.py`, or Adobe Acrobat's preflight)
- `MarkInfo` dictionary is present in the PDF catalog with `/Marked true`
- `ViewerPreferences` is present
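
The last two checks could also be automated by reloading the generated bytes with `pdf-lib` itself. The following is a sketch under that assumption. The helper name `hasTaggedCatalogEntries` is hypothetical and not part of the PR, and it only checks that the entries exist, not that the document is usable by assistive technology.

```typescript
import { PDFBool, PDFDict, PDFDocument, PDFName } from 'pdf-lib';

// Hypothetical helper (not part of the PR): returns true when the saved PDF's
// catalog contains /MarkInfo with /Marked true and a /ViewerPreferences dictionary.
export const hasTaggedCatalogEntries = async (pdfBytes: Uint8Array): Promise<boolean> => {
	const pdfDoc = await PDFDocument.load(pdfBytes);

	// lookupMaybe returns undefined when the key is absent.
	const markInfo = pdfDoc.catalog.lookupMaybe(PDFName.of('MarkInfo'), PDFDict);
	const marked = markInfo?.lookupMaybe(PDFName.of('Marked'), PDFBool);
	const viewerPrefs = pdfDoc.catalog.lookupMaybe(PDFName.of('ViewerPreferences'), PDFDict);

	return marked?.asBoolean() === true && viewerPrefs !== undefined;
};
```

A unit test could call this on the output of `createAccessiblePdf` and assert that the result is `true`, which would guard against the regression discussed in the conversation.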