1
0
Fork 0
mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-19 22:58:07 +02:00
Commit graph

1218 commits

Author SHA1 Message Date
Tim van der Meij
2f3bf6f07e
Don't ignore errors in the Jasmine suite start/end stages
Currently errors in `afterAll` are logged, but don't fail the tests.
This could cause new errors during test teardown to go by unnoticed.

Moreover, the integration test use a different reporting mechanism which
also handled errors differently (this is extra reason to do #12730).

This patch fixes the issues by consistently handling errors in
`suiteStarted` and `suiteDone` in both reporting mechanisms.

Fixes #18319.
2024-06-23 20:59:48 +02:00
bootleq
890c567eca Expose entireWord in updateFindControlState
Allow apps with supportsIntegratedFind to better monitor the find state.

A recognized use case is the Firefox findbar, its "not found" sound must
consider `entireWord` and only make noise when it is off.

See related implementation in
https://hg.mozilla.org/mozilla-central/rev/16b902cbcf26

This change can help if we have to move the implementation from cpp to jsm.
2024-06-21 13:12:59 +08:00
Calixte Denizet
c14c3cfc9f Improve date parsing in the js sandbox
If for example dd:mm is failing we just try with d:m which is equivalent
to the regex /d{1,2}:m{1,2}/. This way it allows the user to forget the
0 for the first days/months.
2024-06-14 17:21:50 +02:00
Calixte Denizet
6fa98ac99f [api-minor] Simplify how the list of points are structured
Instead of sending to the main thread an array of Objects for a list of points (or quadpoints),
we'll send just a basic float buffer.
It should slightly improve performances (especially when cloning the data) and use slightly less memory.
2024-05-30 15:36:15 +02:00
Jonas Jenwald
06334c97ef Improve the loadingParams functionality in the API
- Move the definition of the `loadingParams` Object, to simplify the code.

 - Add a unit-test, since none existed and the viewer depends on this functionality.
2024-05-24 09:26:40 +02:00
Jonas Jenwald
15b5808eee [api-minor] Re-factor the basic textLayer-functionality
This is very old code, and predates e.g. the introduction of JavaScript classes, which creates unnecessarily unwieldy code in the viewer.
By introducing a new `TextLayer` class in the API, similar to how e.g. the `AnnotationLayer` looks, we're able to keep most parameters on the class-instance itself. This removes the need to manually track them in the viewer, and simplifies the call-sites.

This also removes the `numTextDivs` parameter from the "textlayerrendered" event, since that's only added to support default-viewer functionality that no longer exists.

Finally we try, as far as possible, to polyfill the old `renderTextLayer` and `updateTextLayer` functions since they are exposed in the library API.
For *simple* invocations of `renderTextLayer` the behaviour should thus be the same, with only a warning printed in the console.
2024-05-17 14:20:20 +02:00
Tim van der Meij
4db843617f
Merge pull request #18047 from Snuffleupagus/issue-18042
Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up)
2024-05-15 15:40:18 +02:00
Tim van der Meij
6b237e3358
Implement a unit test for the BaseException class
The issue from #18003 hasn't been shown to be caused by PDF.js, but it
did surface that we don't have (direct) unit test coverage for the
`BaseException` class. This made it more difficult to prove that the
`stack` property was already available on exception instances, but more
importantly it caused the CI to be green even though the suggested
change would have caused the `stack` property to disappear.

To avoid future regressions, for e.g. similar changes or a rewrite from
a closure to a proper class, this commit introduces a dedicated unit
test for `BaseException` that asserts that our exception instances
indeed expose all expected properties.
2024-05-14 20:21:42 +02:00
Jonas Jenwald
c5f92437f7 Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up)
For images that failed to decode once we want to avoid a pointless round-trip to the main-thread, which could otherwise happen for globally cached images.
2024-05-14 13:58:36 +02:00
Jonas Jenwald
6d523c316c [api-minor] Include the document /Lang attribute in the textContent-data
- These changes will allow a simpler way of implementing PR 17770.

 - The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done *once* per PDF document (and most PDFs don't included this data).

 - This makes the /Lang attribute *directly available* in the `textLayer`, which has the following advantages:
    - We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer).

    - Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770.
      *Please note:* This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).
2024-05-14 12:44:41 +02:00
Jonas Jenwald
c0b5d93ef4
Merge pull request #18052 from Snuffleupagus/textLayer-only-ReadableStream
Restore broken functionality and simplify the implementation in `src/display/text_layer.js`
2024-05-14 12:30:27 +02:00
Jonas Jenwald
049848ba00 Unify the ReadableStream and TextContent code-paths in src/display/text_layer.js
The only reason that this code still accepts `TextContent` is for backward-compatibility purposes, so we can simplify the implementation by always using a `ReadableStream` internally.
2024-05-07 13:03:57 +02:00
Jonas Jenwald
2643570364 [api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245)
*Please note:* This removes top level await from the GENERIC builds of the PDF.js library.

Despite top level await being supported in all modern browsers/environments, note [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await#browser_compatibility), it seems that many frameworks and build-tools unfortunately have trouble with it.
Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this.

Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment.
The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the `getDocument`-function. Unfortunately that doesn't work though, since that's a *synchronous* function that we cannot change without breaking "the world".

Hence we instead await the load of Node.js packages/polyfills together with the `PDFWorker` initialization, since that's the *first point* of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen *until* the worker has been initialized.
Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point.
This new pattern for accessing Node.js packages/polyfills will also require some care during development *and* importantly reviewing, to ensure that no new top level await is added in the main code-base.
2024-05-06 23:20:03 +02:00
Jonas Jenwald
a790f2df5d [api-minor] Remove the unused onlyStripPath option from the getFilenameFromUrl helper function 2024-05-03 08:29:41 +02:00
Jonas Jenwald
2b69fb76ac [api-minor] Improve the FileSpec implementation
- Check that the `filename` is actually a string, before parsing it further.
 - Use proper "shadowing" in the `filename` getter.
 - Add a bit more validation of the data in `pickPlatformItem`.
 - Last, but not least, return both the original `filename` and the (path stripped) variant needed in the display-layer and viewer.
2024-05-01 18:02:05 +02:00
Jonas Jenwald
bf4e36d1b5 [api-minor] Expose the /Desc-attribute of file attachments in the viewer (issue 18030)
In the viewer this will be displayed in the `title` of the hyperlink, which is probably the best we can do here given how the viewer is implemented.
2024-05-01 09:02:11 +02:00
Calixte Denizet
45fa867577 Allow to insert several annotations under the same parent in the structure tree
While testing stamp insertion with the added pdf, I noticed that the tags using a MCID
weren't considered when trying to attach an annotation to it.
2024-04-24 16:23:05 +02:00
Tim van der Meij
bda98b91cb
Merge pull request #17967 from Snuffleupagus/eventBus-signal
Add `signal`-support in the `EventBus`, and utilize it in the viewer (PR 17964 follow-up)
2024-04-23 15:55:59 +02:00
Jonas Jenwald
9e80c6d228
Merge pull request #17978 from Snuffleupagus/pr-17428-followup
Extend the globally cached image main-thread copying to "complex" images as well (PR 17428 follow-up)
2024-04-22 16:46:23 +02:00
Tim van der Meij
335d8394cd
Merge pull request #17979 from Snuffleupagus/image-errors-shorter-msg
[api-minor] Remove the image-related error message prefixes
2024-04-22 15:35:10 +02:00
Jonas Jenwald
912b57b95d [api-minor] Remove the image-related error message prefixes
Other custom errors, based on `BaseException`, do not use such a format.
2024-04-20 12:51:45 +02:00
Jonas Jenwald
702ee7b1e1 Add signal-support in the EventBus, and utilize it in the viewer (PR 17964 follow-up)
This mimics the `signal` option that's available for `addEventListener`, see [MDN](https://developer.mozilla.org/en-US/docs/Web/API/EventTarget/addEventListener#signal).
2024-04-20 12:00:58 +02:00
Jonas Jenwald
91898e5923 Extend the globally cached image main-thread copying to "complex" images as well (PR 17428 follow-up)
In PR 17428 this functionality was limited to "larger" images, to not affect performance negatively. However it turns out that it's also beneficial to consider more "complex" images, regardless of their size, that contain /SMask or /Mask data; see issue 11518.
2024-04-20 11:10:09 +02:00
Calixte Denizet
901d995a7e Correctly update the xref table when an annotation is deleted 2024-04-18 21:27:39 +02:00
Calixte Denizet
52ea2333b3 Remove the tag for missing font subset when trying to find a substitution
Fixes #17929.
2024-04-11 20:34:28 +02:00
Tim van der Meij
d01a0bd0c8
Fix annotation border style parsing by handling empty dash arrays
The PDF specification states that empty dash arrays, i.e. arrays with
zero elements, are in fact valid. In that case the dash array simply
corresponds to a solid, unbroken line. However, this case was erroneously
being flagged as invalid and therefore the annotation was not drawn
because its width was set to zero. This commit fixes the issue by
allowing dash arrays to have a length of zero.
2024-04-08 16:34:27 +02:00
Tim van der Meij
2e5282928f
Merge pull request #17854 from Snuffleupagus/rm-PromiseCapability
[api-minor] Replace the `PromiseCapability` with  `Promise.withResolvers()`
2024-04-02 15:21:43 +02:00
Jonas Jenwald
e4d0e84802 [api-minor] Replace the PromiseCapability with Promise.withResolvers()
This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does *almost* the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers

The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular:
 - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state.
 - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used.
 - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however *not* a problem since the value of a Promise cannot be changed once fulfilled or rejected.
 - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them:
     - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is *removed* on its first (valid) invocation.
     - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks.
 - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and *test-only* code.

---
[1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).
2024-04-01 11:42:37 +02:00
Calixte Denizet
136c1faa7f Display outlines even if one has no title
Fixes #17856.
2024-03-29 21:30:24 +01:00
Jonas Jenwald
0d039937f9 Add better support for /Launch actions with /FileSpec dictionaries (issue 17846) 2024-03-26 20:15:48 +01:00
Jonas Jenwald
0022310b9c
Merge pull request #17706 from Snuffleupagus/Node-Fetch-API
[api-minor] Use the Fetch API, when supported, to load PDF documents in Node.js environments
2024-03-19 11:04:28 +01:00
Jonas Jenwald
eded037d06 [api-minor] Use the Fetch API, when supported, to load PDF documents in Node.js environments
Given that modern Node.js versions now implement support for a fair number of "browser" APIs, we can utilize the standard Fetch API to load PDF documents that are specified via http/https URLs.

Please find compatibility information at:
 - https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility
 - https://nodejs.org/dist/latest-v18.x/docs/api/globals.html#fetch
 - https://developer.mozilla.org/en-US/docs/Web/API/Response#browser_compatibility
 - https://nodejs.org/dist/latest-v18.x/docs/api/globals.html#response
2024-02-21 22:38:42 +01:00
Jonas Jenwald
90b2664622 Add better validation for the "PREFERENCE" kind AppOptions
Given that the "PREFERENCE" kind is used e.g. to generate the preference-list for the Firefox PDF Viewer, those options need to be carefully validated.
With this patch we'll now check this unconditionally in development mode, during testing, and when creating the preferences in the gulpfile.
2024-02-20 18:38:15 +01:00
Calixte Denizet
2133da166e When updating, write the xref table in the same format as the previous one (bug 1878916)
The specs are unclear about what kind of xref table format must be used.
In checking the validity of some pdfs in the preflight tool from Acrobat
we can guess that having the same format is the correct way to do.
The pdf in the mentioned bug, after having been changed, wasn't correctly
displayed in neither Chrome nor Acrobat: it's now fixed.
2024-02-13 14:14:37 +01:00
Jonas Jenwald
37e98e39f6 Skip any whitespace after the first object in linearized PDFs (issue 17665)
This way the code is now consistent with the non-linearized branch in the `PDFDocument.startXRef` getter.
2024-02-12 22:05:36 +01:00
Jonas Jenwald
19ef3e367b Tweak the issue 11878 unit-test parsing time check (PR 17428 follow-up)
This unit-test has been failing occasionally in Chrome and Node.js, hence we tweak the parsing time check to reduce the likelihood of that happening.
2024-02-12 12:31:55 +01:00
Jonas Jenwald
5732faee1e Prevent duplicate names in unit/integration tests
Having identical names for different test-cases may result in less helpful output, which we can avoid with the use of the ESLint Jasmine plugin.
This patch enables the rules at the `branch` level, to limit the amount/scope of the changes slightly. (We could thus make this rule more strict in the future, if that's deemed useful.)

Please refer to:
 - https://github.com/tlvince/eslint-plugin-jasmine/blob/master/docs/rules/no-spec-dupes.md
 - https://github.com/tlvince/eslint-plugin-jasmine/blob/master/docs/rules/no-suite-dupes.md
2024-02-11 11:45:09 +01:00
Jonas Jenwald
6da9448f6c Remove the web-com import map (PR 17588 follow-up)
With the changes in PR 17588 we're already importing the relevant code via the `web/app.js` file.
2024-02-07 16:33:27 +01:00
Jonas Jenwald
97c2ce9da0 Ensure that GenericL10n works if the locale files cannot be loaded
- Ensure that localization works in the GENERIC viewer, even if the necessary locale files cannot be loaded.
   This was the behaviour prior to the introduction of Fluent, and it seems worthwhile to keep that (especially since we already bundle the en-US strings anyway).

 - Let the `GenericL10n`-implementation use the *bundled* en-US strings directly when no language is provided.

 - Remove the `NullL10n`-implementation, and simply fallback to `GenericL10n`, to reduce the maintenance burden of viewer-components localization.

 - Indirectly, given the previous point, stop exporting `NullL10n` in the viewer-components since it's now removed.
   Note that it was never really intended to be used directly and only existed as a fallback.

*Please note:* This doesn't affect the Firefox PDF Viewer, thanks to the use of import maps.
2024-01-31 14:07:11 +01:00
Jonas Jenwald
384291234d Re-enable the should compress and save text unit-test (issue 17399)
This unit-test is now failing in up to date versions of Node.js respectively Chromium-browsers, since `CompressionStream` no longer produces consistent data across all environments/browsers.
However logging the compressed TypedArray produced by `writeStream`, with Firefox respectively Chrome, and then feeding *both* of those TypedArray as input to `DecompressionStream` produced the same (correct) result in both browsers.

Hence the *exact* output of `CompressionStream` shouldn't matter, as long as we're able to successfully decompress it when the resulting PDF document is opened with the PDF.js library, and the unit-test is thus extended to check this.
2024-01-28 14:31:07 +01:00
Tim van der Meij
94309edc9a
Disable the "should compress and save text" unit test in Chrome too
Starting with Chrome 120.0.6099.109 (shipped with Puppeteer 21.8.0+) the
unit test fails in Chrome as well. The issue is tracked in #17399, but
for now we'll only run the unit test in Firefox so we can continue to
update Puppeteer while also still having a browser in which it runs,
until we figure out why the behavior of `CompressionStream` changed.
2024-01-27 20:34:30 +01:00
Jonas Jenwald
5dd25b6e80 Re-factor DefaultExternalServices into a regular class, without static methods
The `DefaultExternalServices` code, which is used to provide build-specific functionality, is very old. This results in a pattern where we first initialize `PDFViewerApplication.externalServices` and then *override* it for the different builds.

By converting `DefaultExternalServices` into a "regular" class, and leveraging import maps, we can directly initialize the correct instance depending on the build.
2024-01-27 12:07:15 +01:00
Jonas Jenwald
d1080e785a Remove the createPreferences method from DefaultExternalServices
Given the simplicity of the `createPreferences` method, we can leverage import maps to directly initialize the correct `Preferences`-instance depending on the build.
2024-01-27 11:38:42 +01:00
Jonas Jenwald
1698991ae2 Remove the createDownloadManager method from DefaultExternalServices
Given the simplicity of the `createDownloadManager` method, we can leverage import maps to directly initialize the correct `DownloadManager`-instance depending on the build.
2024-01-27 11:38:36 +01:00
Calixte Denizet
7f2428a77e Reduce memory use and improve perfs when computing the bounding box of a bezier curve (bug 1875547)
It isn't really a fix for the mentioned bug but it slightly improve things.
In reducing the memory use, the time spent in the GC is reduced either.
The algorithm to compute the bounding box is the same as before but it has just
been rewritten to be more efficient.
2024-01-24 23:41:14 +01:00
Jonas Jenwald
f9a384d711 Enable the arrow-body-style ESLint rule
This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation.

Please see https://eslint.org/docs/latest/rules/arrow-body-style
2024-01-21 16:20:55 +01:00
Jonas Jenwald
9dfe9c552c Use shorter arrow functions where possible
For arrow functions that are both simple and short, we can avoid using explicit `return` to shorten them even further without hurting readability.

For the `gulp mozcentral` build-target this reduces the overall size of the output by just under 1 kilo-byte (which isn't a lot but still can't hurt).
2024-01-21 10:13:12 +01:00
Calixte Denizet
5732c0c54a Use the original value of a field when propagating event (fixes #17540)
And avoid to not format a field when the value is 0.
2024-01-19 22:13:51 +01:00
Jonas Jenwald
f8e3c79cb5
Merge pull request #17537 from Snuffleupagus/rm-isArrayBuffer
Remove the `isArrayBuffer` helper function
2024-01-19 15:37:02 +01:00
Jonas Jenwald
b37536c38c Remove the isArrayBuffer helper function
This old helper function can now be replaced with `ArrayBuffer.isView()` and/or `instanceof ArrayBuffer` checks, as needed depending on the situation.
2024-01-19 14:10:52 +01:00