With this patch, all the paths components are collected in the worker until a path
operation is met (i.e., stroke, fill, ...).
Then in the canvas a Path2D is created and will replace the path data transfered from the worker,
this way when rescaling, the Path2D can be reused.
In term of performances, using Path2D is very slightly improving speed when scaling the canvas.
- Most of the these are only used in the `src/display/api.js` file, and this leads to slightly shorter code.
- A number of unit-tests need a `BaseCanvasFactory`-instance, however that one is available through the `PDFDocumentProxy`-instance nowadays.
- For other unit-tests the remaining necessary default Factory-definitions can be moved into the `test/unit/test_utils.js` file.
This patch makes a clear separation between the way to draw and the editing stuff.
It adds a class DrawEditor which should be extended in order to create new drawing tools.
As an example, the ink tool has been rewritten in order to use it.
The `renderForms` parameter pre-dates the introduction of the general `intent` parameter, which means that we're now effectively passing the same state twice to these `getOperatorList` methods.
Instead of sending to the main thread an array of Objects for a list of points (or quadpoints),
we'll send just a basic float buffer.
It should slightly improve performances (especially when cloning the data) and use slightly less memory.
- Check that the `filename` is actually a string, before parsing it further.
- Use proper "shadowing" in the `filename` getter.
- Add a bit more validation of the data in `pickPlatformItem`.
- Last, but not least, return both the original `filename` and the (path stripped) variant needed in the display-layer and viewer.
The PDF specification states that empty dash arrays, i.e. arrays with
zero elements, are in fact valid. In that case the dash array simply
corresponds to a solid, unbroken line. However, this case was erroneously
being flagged as invalid and therefore the annotation was not drawn
because its width was set to zero. This commit fixes the issue by
allowing dash arrays to have a length of zero.
This unit-test is now failing in up to date versions of Node.js respectively Chromium-browsers, since `CompressionStream` no longer produces consistent data across all environments/browsers.
However logging the compressed TypedArray produced by `writeStream`, with Firefox respectively Chrome, and then feeding *both* of those TypedArray as input to `DecompressionStream` produced the same (correct) result in both browsers.
Hence the *exact* output of `CompressionStream` shouldn't matter, as long as we're able to successfully decompress it when the resulting PDF document is opened with the PDF.js library, and the unit-test is thus extended to check this.
Starting with Chrome 120.0.6099.109 (shipped with Puppeteer 21.8.0+) the
unit test fails in Chrome as well. The issue is tracked in #17399, but
for now we'll only run the unit test in Firefox so we can continue to
update Puppeteer while also still having a browser in which it runs,
until we figure out why the behavior of `CompressionStream` changed.
It isn't really a fix for the mentioned bug but it slightly improve things.
In reducing the memory use, the time spent in the GC is reduced either.
The algorithm to compute the bounding box is the same as before but it has just
been rewritten to be more efficient.
It seems this unit-test now fails consistently in "all" up-to-date Node.js versions. We should probably try and understand why, but for now just disable it to get passing CI tests.
It seems this unit-test started failing in Node.js version 20.10 as well. We should probably try and understand why, but for now just disable it to get passing CI tests.
Hopefully this will allow us to catch bugs in new Node.js versions earlier, rather than having to wait for bug reports.
Given that `CompressionStream` is (currently) only potentially used when saving a *modified* PDF document, which is unlikely to be a common use-case in Node.js environments, let's just disable the affected unit-test for now.
Over time the amount of "document level" data potentially needed during parsing of Annotations have increased a fair bit, which means that we currently need to ensure that a bunch of data is available for each individual Annotation.
Given that this data is "constant" for a PDF document we can instead create (and cache) it lazily, only when needed, *before* starting to parse the Annotations on a page. This way the parsing of individual Annotations should become slightly less asynchronous, which really cannot hurt.
An additional benefit of these changes is that we can reduce the number of parameters that need to be explicitly passed around in the annotation-code, which helps overall readability in my opinion.
One potential drawback of these changes is that the `AnnotationFactory.create` method no longer handles "everything" on its own, however given how few call-sites there are I don't think that's too much of a problem.
The existing code is unable to *correctly* extract the color from the appearance-stream when the ColorSpace-data is "complex". To reproduce this:
- Open `freetexts.pdf` in the viewer.
- Note the purple color of the "Hello World from Preview" annotation.
- Enable any of the Editors.
- Note how the relevant annotation is now black.
When there was a rotation, the generated bbox was wrong because of an inversion
between width and height.
This patch aims to fix this issue in re-writing the FreeText code generation
to have something similar to what Acrobat does.
And fix the name of the font which wasn't the correct one when calling the
evaluator.
- it'll help to be able to move popups on screen to let the user read the text
- popups won't inherit some properties from their parent:
- the popup can be misrendered if for example the parent has a clip-path property.
- add an outline to the popup when the parent is focused.
- hide a popup when it's clicked.
On my computer, it takes few tenths of a second to load a local font.
Since a font can be used several times in a document, the cache will
improve performances.
The initial CMap support was added in PR 4259 using the "raw" Adobe files, however they were quickly deemed to be unnecessarily large. As a result PR 4470 introduced the more compact "binary" CMap format, with both of those PRs being included in the very same release (version `0.8.1334`) .
Please note that we've thus never shipped anything *except* the "binary" CMap files with the PDF library, and furthermore note that we've not even once updated the CMap files since they were originally added almost nine years ago.
Requiring users to remember that `cMapPacked = true` is necessary, in addition to setting the `cMapUrl` parameter, in order for CMap loading to work feels like a less than ideal API.
Hence this patch, which suggests that we simply let `cMapPacked` default to `true` now.
- For text fields
* when printing, we generate a fake font which contains some widths computed thanks to
an OffscreenCanvas and its method measureText.
In order to avoid to have to layout the glyphs ourselves, we just render all of them
in one call in the showText method in using the system sans-serif/monospace fonts.
* when saving, we continue to create the appearance streams if the fonts contain the char
but when a char is missing, we just set, in the AcroForm dict, the flag /NeedAppearances
to true and remove the appearance stream. This way, we let the different readers handle
the rendering of the strings.
- For FreeText annotations
* when printing, we use the same trick as for text fields.
* there is no need to save an appearance since Acrobat is able to infer one from the
Content entry.
An annotation doesn't have to be in the text flow, hence it's likely a bad idea
to insert its text in the text layer. But the text must be visible from a screen
reader point of view so it must somewhere in the DOM.
So with this patch, the text from a FreeText annotation is extracted and added in
a div in its HTML counterpart, and with the patch #15237 the text should be visible
and positioned relatively to the text flow.
To improve performance of the sidebar we use the page-canvases to generate the thumbnails whenever possible, since that avoids unnecessary re-rendering when the sidebar is open. This works generally well, however there's an old problem in PDF documents that contain interactive forms (when those are enabled): Note how the thumbnails become partially (or fully) blank, since those Annotations are not included in the OperatorList.[1]
We obviously want to keep using the `PDFThumbnailView.setImage`-method for most documents, however we need a way to skip it only for those pages that contain interactive forms.
As it turns out it's unfortunately not all that simple to tell, after the fact, from looking only at the OperatorList that some Annotations were skipped. While it might have been possible to try and infer that in the viewer, it'd not have been pretty considering that at the time when rendering finishes the annotationLayer has not yet been built.
The overall simplest solution that I could come up with, was instead to include a *summary* of the interactive form-state when doing the final "flushing" of the OperatorList and expose that information in the API.
---
[1] Some examples from our test-suite: `annotation-tx2.pdf` where the thumbnail is completely blank, and `bug1737260.pdf` where the thumbnail is missing the "buttons" found on the page.
- As in the annotation layer, use percent instead of pixels as unit;
- handle the rotation of the editor layer in allowing editing when rotation
angle is not zero;
- the different editors are rotated counterclockwise in order to be usable
when the main page is itself rotated;
- add support for saving/printing rotated editors.
- it aims to fix#14502 and bug 1721335;
- Acrobat and Pdfium do the same;
- it'll avoid to have truncated data when printed;
- change the factor to compute font size in using field height: lineHeight = 1.35*fontSize
- this is the value used by Acrobat.
- in order to not have truncated strings on the bottom, add few basic metrics for standard fonts.
- First step to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1737260;
- several interactive pdfs use the possibility to hide/show buttons to show different icons;
- render pushbuttons on their own canvas and then insert it the annotation_layer;
- update test/driver.js in order to convert canvases for pushbuttons into images.