1
0
Fork 0
mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-23 16:48:08 +02:00
Commit graph

6699 commits

Author SHA1 Message Date
Calixte Denizet
4bf7787084 Simplify saving added/modified annotations.
Having this map to collect the different changes will allow to know if some objects have already been modified.
2024-11-12 10:59:38 +01:00
calixteman
2ad8782428
Merge pull request #19023 from calixteman/issue19022
Apply gradient when stroking text
2024-11-11 17:15:50 +01:00
Calixte Denizet
79e1f155ac Apply gradient when stroking text
It fixes #19022.

I noticed that the glyph contours weren't correct (for T and x) and
because we forgot to close the contour.
2024-11-11 15:53:07 +01:00
Jonas Jenwald
5524216c23
Merge pull request #19015 from Snuffleupagus/@napi-rs/canvas
[api-minor] Replace the `canvas` package with `@napi-rs/canvas`
2024-11-10 19:45:01 +01:00
Jonas Jenwald
0b864ee7d5 Shorten the Page.prototype.userUnit getter slightly 2024-11-10 16:30:07 +01:00
Jonas Jenwald
9b62f2e7d1 Polyfill ImageData in Node.js environments
Given that `ImageData` has been supported for many years in all browsers, see [MDN](https://developer.mozilla.org/en-US/docs/Web/API/ImageData#browser_compatibility), we have a `typeof` check that's only necessary in Node.js environments.
Since the `@napi-rs/canvas` package provides that functionality, we can thus add an `ImageData` polyfill which allows us to ever so slightly simplify the code.
2024-11-09 18:51:32 +01:00
Jonas Jenwald
86f943ca03 [api-minor] Replace the canvas package with @napi-rs/canvas
The `@napi-rs/canvas` package has fewer dependencies, which should *hopefully* make installing and using it easier for `pdfjs-dist` end-users. (Over the years we've seen, repeatedly, that `canvas` can be difficult to install successfully.)
Furthermore, this package includes more functionality (such as `Path2D`) which reduces the overall number of dependencies in the PDF.js project.

One point to note is that `@napi-rs/canvas` is a fair bit newer than `canvas`, and has a lot fewer users, however looking at the commit history it does seem to be actively maintained.

Note that I've successfully tested the [Node.js examples](https://github.com/mozilla/pdf.js/tree/master/examples/node), in particular the `pdf2png` one, with this patch applied and things appear to work fine.

Please see:
 - https://www.npmjs.com/package/@napi-rs/canvas
 - https://github.com/Brooooooklyn/canvas
2024-11-09 18:51:29 +01:00
Pascal Maximilian Bremer
6d7157a875 Fix Typo:XFATemplate class Para Styling paddingight => paddingRight 2024-11-06 12:04:55 +01:00
Calixte Denizet
d59f9648a9 Simplify toRomanNumerals function 2024-11-05 22:35:35 +01:00
Jonas Jenwald
fdfcfbc351
Merge pull request #19005 from Snuffleupagus/core_utils-shorten
Shorten a few helper functions in `src/core/core_utils.js`
2024-11-05 21:46:44 +01:00
Jonas Jenwald
c78eebbace
Merge pull request #19007 from Snuffleupagus/issue-18986
Try to improve handling of missing trailer dictionaries in `XRef.indexObjects` (issue 18986)
2024-11-05 21:45:27 +01:00
Andrii Vitiv
824a619a2a
Fix error on empty response headers
Fixes https://github.com/mozilla/pdf.js/issues/18957

https://github.com/mozilla/pdf.js/pull/18682 introduced a regression that causes the following error:

```
Uncaught TypeError: Failed to construct 'Headers': Invalid name
    at PDFNetworkStreamFullRequestReader._onHeadersReceived (pdf.mjs:10214:29)
    at NetworkManager.onStateChange (pdf.mjs:10103:22)
```

The mentioned PR replaced a call to `getResponseHeader()` with `getAllResponseHeaders()` without handling cases where it may return null or an empty string. Quote from the [docs](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/getAllResponseHeaders#return_value):

> Returns:
>
>A string representing all of the response's headers (except those whose field name is Set-Cookie) separated by CRLF, or null if no response has been received. If a network error happened, an empty string is returned.

Run the following code and observe the error in the console. Note that the URL is intentionally set to an invalid value to simulate network error

```js
<script src="//mozilla.github.io/pdf.js/build/pdf.mjs" type="module"></script>

<script type="module">
  var url = 'blob:';

  pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs';

  var loadingTask = pdfjsLib.getDocument(url);
  loadingTask.promise
    .then((pdf) => console.log('PDF loaded'))
    .catch((reason) => console.error(reason));

</script>
```
2024-11-05 21:52:50 +02:00
Jonas Jenwald
e92a929a58 Try to improve handling of missing trailer dictionaries in XRef.indexObjects (issue 18986)
The problem with the referenced PDF document has nothing to do with invalid dates, as the issue seems to suggest, but rather with the fact that it has neither an XRef table nor a trailer dictionary.
Given that crucial parts of the internal document structure is missing, you might argue that it's not really a PDF document.

In an attempt to support this kind of corruption, we'll simply iterate through all (previously found) XRef entries and pick one that *might* be a valid /Root dictionary.
There's obviously no guarantee that this works, and it might not be fast in larger PDF documents, but at least it cannot be any worse than *immediately* throwing `InvalidPDFException` as we previously did here.

*Please note:* I'm totally fine with this patch being rejected, since it's somewhat questionable if we should actually attempt to support "PDF documents" with this level of corruption.
2024-11-05 18:19:26 +01:00
Jonas Jenwald
2c90eee5a8 Shorten a few helper functions in src/core/core_utils.js
In a few cases we can ever so slightly shorten the code without negatively impacting the readability.
2024-11-05 13:58:00 +01:00
Jonas Jenwald
9269fb9be2 Remove the BaseFullReader and BaseRangeReader classes in the src/display/node_stream.js file
After the previous patch these base-classes are only extended once each and they can thus be combined with the final classes.
2024-11-03 16:18:12 +01:00
Jonas Jenwald
cbf0ca71bf [api-minor] Only support the Fetch API for "remote" PDF documents in Node.js environments
The Fetch API has been supported since Node.js version 18, see https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility
2024-11-03 16:18:10 +01:00
Jonas Jenwald
c7407230c1 [api-minor] Load Node.js packages/polyfills with process.getBuiltinModule
This allows *synchronous* loading of Node.js modules and (indirectly) packages, thus simplifying the code a fair bit.
2024-11-03 16:13:58 +01:00
Tim van der Meij
3634dab10c
Merge pull request #18988 from Snuffleupagus/split-dom-factory
Move the various DOM-factories into their own files
2024-11-02 19:06:47 +01:00
Tim van der Meij
e930f3030c
Merge pull request #18992 from Snuffleupagus/getPdfManager-inline-flushChunks
Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread
2024-11-02 18:58:29 +01:00
Jonas Jenwald
e5485108ec
Merge pull request #18990 from Snuffleupagus/ensure-structTree-serializable
Ensure that serializing of StructTree-data cannot fail during loading
2024-11-02 15:17:10 +01:00
Jonas Jenwald
2145a7b9ca Use the hexNumbers structure in the stringToUTF16HexString helper
We can re-use the `hexNumbers` structure here, since that allows us to directly lookup the hexadecimal values and shortens the code.
2024-11-02 15:00:32 +01:00
Jonas Jenwald
196f7d7df1 Inline the flushChunks helper function, used in getPdfManager on the worker-thread
- This helper function has only a single call-site, and the function is fairly short.

 - It'll only be invoked if range requests are *disabled*, or if the entire PDF manages to load *before* the headers are resolved (which is very unlikely).
   Hence, by default, this helper function is not invoked.

 - By inlining the code we're able to utilize the existing error-handling at the call-site, rather than having to duplicate it, which further reduces the size of this code.

Finally, while slightly unrelated, this patch also adds optional chaining in one spot in the file (PR 16424 follow-up).
2024-11-02 11:06:30 +01:00
Jonas Jenwald
b26dc19392 Ensure that serializing of StructTree-data cannot fail during loading
I discovered that doing skip-cache re-reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf would *intermittently* cause (some of) the AnnotationLayers to break with errors printed in the console (see below).

In hindsight this bug is really obvious, however it took me quite some time to find it, since the `StructTreePage.prototype.serializable` getter will lookup various data and all of those cases can fail during loading when streaming and/or range requests are being used.

Finally, to prevent any future errors, ensure that the viewer won't break in these sort of situations.

```
Uncaught (in promise)
Object { message: "Missing data [19098296, 19098297)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19098296, 19098297)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55

\#renderAnnotationLayer: "UnknownErrorException: Missing data [17552729, 17552730)". viewer.mjs:8737:15

Uncaught (in promise)
Object { message: "Missing data [17552729, 17552730)", name: "UnknownErrorException", details: "MissingDataException: Missing data [17552729, 17552730)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55
```
2024-11-01 17:43:59 +01:00
Jonas Jenwald
4e12906061 Move the various DOM-factories into their own files
- Over time the number and size of these factories have increased, especially the `DOMFilterFactory` class, and this split should thus aid readability/maintainability of the code.

 - By introducing a couple of new import maps we can avoid bundling the `DOMCMapReaderFactory`/`DOMStandardFontDataFactory` classes in the Firefox PDF Viewer, since they are dead code there given that worker-thread fetching is always being used.

 - This patch has been successfully tested, by running `$ ./mach test toolkit/components/pdfjs/`, in a local Firefox artifact-build.

*Note:* This patch reduces the size of the `gulp mozcentral` output by `1.3` kilo-bytes, which isn't a lot but still cannot hurt.
2024-11-01 13:31:28 +01:00
Jonas Jenwald
7572382c7a Change the "FetchBuiltInCMap"/"FetchStandardFontData" message-handlers to be asynchronous
This way we can directly throw Errors, rather than having to "manually" return rejected Promises, which is ever so slightly shorter.

Also, since `useWorkerFetch` is always true in MOZCENTRAL builds these message-handlers should not be invoked there.
2024-10-31 09:29:11 +01:00
Jonas Jenwald
f013c39b9f
Merge pull request #18978 from Snuffleupagus/toHexUtil-simplify
Re-factor the `toHexUtil` helper (PR 17862 follow-up)
2024-10-29 17:10:36 +01:00
calixteman
f142fb8c28
Merge pull request #18972 from calixteman/refactor_highlight
[Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing
2024-10-29 17:09:03 +01:00
Jonas Jenwald
db1238aae3 Re-factor the toHexUtil helper (PR 17862 follow-up)
We can re-use the `hexNumbers` structure, since that allows us to directly lookup the hexadecimal values and shortens the code.
2024-10-29 16:35:44 +01:00
Jonas Jenwald
9870099e90
Merge pull request #18977 from Snuffleupagus/api-ReaderHeadersReady-simplify
Simplify the "ReaderHeadersReady" message-handler in the API
2024-10-29 15:46:31 +01:00
Calixte Denizet
5a9607b2ad [Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing
One goal is to make the code for drawing with the Ink tool similar to the one to free highlighting:
it doesn't really make sense to have so different ways to do almost the same thing.

When the zoom level is high, it'll avoid to create a too big canvas covering all the page which consume
more memory, makes the drawing very slow and the overall user xp pretty bad.

A second goal is to be able to easily implement more drawing tools where we would just have to implement
how to draw from the pointer coordinates.
2024-10-29 15:41:08 +01:00
Jonas Jenwald
afb4813d1c Simplify the "ReaderHeadersReady" message-handler in the API
We can convert the handler to an `async` function, which removes the need to create a temporary Promise here.
Given the age of this code it shouldn't hurt to simplify it a little bit.
2024-10-29 14:59:39 +01:00
Jonas Jenwald
8f47d06d07 Add helper functions to allow using new Uint8Array methods
This allows using the new methods in browsers that support them, e.g. Firefox 133+, while still providing fallbacks where necessary; see https://github.com/tc39/proposal-arraybuffer-base64

*Please note:* These are not actual polyfills, but only implements what we need in the PDF.js code-base. Eventually this patch should be reverted, once support is generally available.
2024-10-29 10:22:35 +01:00
Jonas Jenwald
bfc645bab1 Introduce some Uint8Array.fromBase64 and Uint8Array.prototype.toBase64 usage in the main code-base
See https://github.com/tc39/proposal-arraybuffer-base64
2024-10-29 10:22:35 +01:00
Jonas Jenwald
f9fc477080 Improve the implementation of the PDFDocument.fingerprints-getter
- Add explicit `length` validation of the /ID entries. Given the `EMPTY_FINGERPRINT` constant we're already *implicitly* assuming a particular length.

 - Move the constants into the `fingerprints`-getter, since they're not used anywhere else.

 - Replace the `hexString` helper function with the standard `Uint8Array.prototype.toHex` method; see https://github.com/tc39/proposal-arraybuffer-base64
2024-10-29 10:22:35 +01:00
Jonas Jenwald
48a18585f2 Allow StreamsSequenceStream to skip sub-streams that are not actual Streams (issue 18973)
This extends PR 13796 to also handle the case where sub-streams contain invalid data, i.e. anything that isn't a Stream, however please note that in these cases there's no guarantee that we'll render the page "correctly".

Note that Adobe Reader, i.e. the PDF reference implementation, cannot render the last page of the referenced PDF document.
2024-10-29 09:36:08 +01:00
Jonas Jenwald
ee812b5df2 [Editor] Utilize Fluent "better" when localizing the AltText
Currently we manually localize and update the DOM-elements of the AltText-button, and it seems nicer to utilize Fluent "properly" for that task.
This can be achieved by introducing an explicit `span`-element on the AltText-button (similar to e.g. the regular toolbar-buttons), and adding a few more l10n-strings, since that allows just setting the `data-l10n-id`-attribute on all the relevant DOM-elements.

Finally, note how we no longer need to localize any strings eagerly when initializing the various editors.
2024-10-28 17:19:02 +01:00
Calixte Denizet
b649b6f8dd Use a BMP decoder when resizing an image
The image decoding won't block the main thread any more.
For now, it isn't enabled for Chrome because issue6741.pdf leads to a crash.
2024-10-28 14:09:52 +01:00
calixteman
2bee3af0ee
Merge pull request #18967 from calixteman/bug1910431
Make util.scand a bit more flexible with dates which don't match the given format (bug 1910431)
2024-10-28 09:36:34 +01:00
Calixte Denizet
230d7f9229 Make util.scand a bit more flexible with dates which don't match the given format (bug 1910431) 2024-10-27 19:19:06 +01:00
Tim van der Meij
5418060bbc
Merge pull request #18951 from Snuffleupagus/CMap-isCompressed
[api-minor] Remove the `CMapCompressionType` enumeration
2024-10-27 14:42:00 +01:00
Tim van der Meij
b5805caacd
Merge pull request #18965 from Snuffleupagus/_goodSquareLength-static
Re-factor the `ImageResizer._goodSquareLength` definition
2024-10-27 14:38:43 +01:00
Jonas Jenwald
8a2b95418a Re-factor the ImageResizer._goodSquareLength definition
Move the `ImageResizer._goodSquareLength` definition into the class itself, since the current position shouldn't be necessary, and also convert it into an actually private field.
2024-10-27 11:03:04 +01:00
Calixte Denizet
d114f71feb Always fill the mask with the backdrop color
It fixes #18956.

In the patch #18029, for performance reasons and because I thought it was useless, I deliberately chose to not fill the mask
with the backdrop color when it's full black: it was a bad idea.
So in this patch we always add the backdrop color to the mask.
2024-10-26 14:14:51 +02:00
Jonas Jenwald
b048420d21 [api-minor] Remove the CMapCompressionType enumeration
After the binary CMap format had been added there were also some ideas about *maybe* providing other formats, see [here](https://github.com/mozilla/pdf.js/pull/8064#issuecomment-279730182), however that was over seven years ago and we still only use binary CMaps.
Hence it now seems reasonable to simplify the relevant code by removing `CMapCompressionType` and instead just use a boolean to indicate the type of the built-in CMaps.
2024-10-24 11:08:16 +02:00
Jonas Jenwald
50c291eb33 Unconditionally cache built-in CMaps on the worker-thread
Given that we've not shipped, nor used, anything except binary CMaps for years let's just cache them unconditionally (since that's a tiny bit less code).
2024-10-24 10:15:09 +02:00
calixteman
1ad09779f1
Merge pull request #18910 from calixteman/image_decoder1
Use ImageDecoder in order to decode jpeg images (bug 1901223)
2024-10-23 13:54:07 +02:00
Calixte Denizet
b6c4f0b69e Use ImageDecoder in order to decode jpeg images (bug 1901223) 2024-10-23 10:42:01 +02:00
Tim van der Meij
1e07b87bb6
Merge pull request #18933 from Snuffleupagus/base-factory-fetchData
Change the `BaseCMapReaderFactory` fetch-helper to return a `Uint8Array`
2024-10-22 20:03:50 +02:00
Jonas Jenwald
236c8d862e Re-factor how we handle missing, corrupt, or empty font-file entries
This improves the fixes for e.g. issue 9462 and 18941 slightly and allows better fallback behaviour for non-standard fonts.
2024-10-22 17:07:12 +02:00
Jonas Jenwald
63b34114b1 Fallback to a standard font if a font-file entry doesn't contain a Stream (issue 18941)
The PDF document is clearly corrupt, since it has /FontFile2 entries that are Dictionaries which obviously isn't correct.
While there's obviously no guarantee that things will look perfect this way, actually rendering the text at all should be an improvement in general.
2024-10-22 11:51:28 +02:00