pdf.js

mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-23 16:48:08 +02:00

Author	SHA1	Message	Date
Calixte Denizet	4bf7787084	Simplify saving added/modified annotations. Having this map to collect the different changes makes it possible to know whether some objects have already been modified.	2024-11-12 10:59:38 +01:00
calixteman	2ad8782428	Merge pull request #19023 from calixteman/issue19022 Apply gradient when stroking text	2024-11-11 17:15:50 +01:00
Calixte Denizet	79e1f155ac	Apply gradient when stroking text It fixes #19022. I noticed that the glyph contours weren't correct (for T and x) because we forgot to close the contour.	2024-11-11 15:53:07 +01:00
Jonas Jenwald	5524216c23	Merge pull request #19015 from Snuffleupagus/@napi-rs/canvas [api-minor] Replace the `canvas` package with `@napi-rs/canvas`	2024-11-10 19:45:01 +01:00
Jonas Jenwald	0b864ee7d5	Shorten the `Page.prototype.userUnit` getter slightly	2024-11-10 16:30:07 +01:00
Jonas Jenwald	9b62f2e7d1	Polyfill `ImageData` in Node.js environments Given that `ImageData` has been supported for many years in all browsers, see [MDN](https://developer.mozilla.org/en-US/docs/Web/API/ImageData#browser_compatibility), we have a `typeof` check that's only necessary in Node.js environments. Since the `@napi-rs/canvas` package provides that functionality, we can thus add an `ImageData` polyfill which allows us to ever so slightly simplify the code.	2024-11-09 18:51:32 +01:00
Jonas Jenwald	86f943ca03	[api-minor] Replace the `canvas` package with `@napi-rs/canvas` The `@napi-rs/canvas` package has fewer dependencies, which should hopefully make installing and using it easier for `pdfjs-dist` end-users. (Over the years we've seen, repeatedly, that `canvas` can be difficult to install successfully.) Furthermore, this package includes more functionality (such as `Path2D`) which reduces the overall number of dependencies in the PDF.js project. One point to note is that `@napi-rs/canvas` is a fair bit newer than `canvas`, and has a lot fewer users, however looking at the commit history it does seem to be actively maintained. Note that I've successfully tested the [Node.js examples](https://github.com/mozilla/pdf.js/tree/master/examples/node), in particular the `pdf2png` one, with this patch applied and things appear to work fine. Please see: - https://www.npmjs.com/package/@napi-rs/canvas - https://github.com/Brooooooklyn/canvas	2024-11-09 18:51:29 +01:00
Pascal Maximilian Bremer	6d7157a875	Fix Typo:XFATemplate class Para Styling paddingight => paddingRight	2024-11-06 12:04:55 +01:00
Calixte Denizet	d59f9648a9	Simplify toRomanNumerals function	2024-11-05 22:35:35 +01:00
Jonas Jenwald	fdfcfbc351	Merge pull request #19005 from Snuffleupagus/core_utils-shorten Shorten a few helper functions in `src/core/core_utils.js`	2024-11-05 21:46:44 +01:00
Jonas Jenwald	c78eebbace	Merge pull request #19007 from Snuffleupagus/issue-18986 Try to improve handling of missing trailer dictionaries in `XRef.indexObjects` (issue 18986)	2024-11-05 21:45:27 +01:00
Andrii Vitiv	824a619a2a	Fix error on empty response headers Fixes https://github.com/mozilla/pdf.js/issues/18957 https://github.com/mozilla/pdf.js/pull/18682 introduced a regression that causes the following error: ``` Uncaught TypeError: Failed to construct 'Headers': Invalid name at PDFNetworkStreamFullRequestReader._onHeadersReceived (pdf.mjs:10214:29) at NetworkManager.onStateChange (pdf.mjs:10103:22) ``` The mentioned PR replaced a call to `getResponseHeader()` with `getAllResponseHeaders()` without handling cases where it may return null or an empty string. Quote from the [docs](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/getAllResponseHeaders#return_value): > Returns: > >A string representing all of the response's headers (except those whose field name is Set-Cookie) separated by CRLF, or null if no response has been received. If a network error happened, an empty string is returned. Run the following code and observe the error in the console. Note that the URL is intentionally set to an invalid value to simulate network error ```js <script src="//mozilla.github.io/pdf.js/build/pdf.mjs" type="module"></script> <script type="module"> var url = 'blob:'; pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs'; var loadingTask = pdfjsLib.getDocument(url); loadingTask.promise .then((pdf) => console.log('PDF loaded')) .catch((reason) => console.error(reason)); </script> ```	2024-11-05 21:52:50 +02:00
Jonas Jenwald	e92a929a58	Try to improve handling of missing trailer dictionaries in `XRef.indexObjects` (issue 18986) The problem with the referenced PDF document has nothing to do with invalid dates, as the issue seems to suggest, but rather with the fact that it has neither an XRef table nor a trailer dictionary. Given that crucial parts of the internal document structure are missing, you might argue that it's not really a PDF document. In an attempt to support this kind of corruption, we'll simply iterate through all (previously found) XRef entries and pick one that might be a valid /Root dictionary. There's obviously no guarantee that this works, and it might not be fast in larger PDF documents, but at least it cannot be any worse than immediately throwing `InvalidPDFException` as we previously did here. Please note: I'm totally fine with this patch being rejected, since it's somewhat questionable if we should actually attempt to support "PDF documents" with this level of corruption.	2024-11-05 18:19:26 +01:00
Jonas Jenwald	2c90eee5a8	Shorten a few helper functions in `src/core/core_utils.js` In a few cases we can ever so slightly shorten the code without negatively impacting the readability.	2024-11-05 13:58:00 +01:00
Jonas Jenwald	9269fb9be2	Remove the `BaseFullReader` and `BaseRangeReader` classes in the `src/display/node_stream.js` file After the previous patch these base-classes are only extended once each and they can thus be combined with the final classes.	2024-11-03 16:18:12 +01:00
Jonas Jenwald	cbf0ca71bf	[api-minor] Only support the Fetch API for "remote" PDF documents in Node.js environments The Fetch API has been supported since Node.js version 18, see https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility	2024-11-03 16:18:10 +01:00
Jonas Jenwald	c7407230c1	[api-minor] Load Node.js packages/polyfills with `process.getBuiltinModule` This allows synchronous loading of Node.js modules and (indirectly) packages, thus simplifying the code a fair bit.	2024-11-03 16:13:58 +01:00
Tim van der Meij	3634dab10c	Merge pull request #18988 from Snuffleupagus/split-dom-factory Move the various DOM-factories into their own files	2024-11-02 19:06:47 +01:00
Tim van der Meij	e930f3030c	Merge pull request #18992 from Snuffleupagus/getPdfManager-inline-flushChunks Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread	2024-11-02 18:58:29 +01:00
Jonas Jenwald	e5485108ec	Merge pull request #18990 from Snuffleupagus/ensure-structTree-serializable Ensure that serializing of StructTree-data cannot fail during loading	2024-11-02 15:17:10 +01:00
Jonas Jenwald	2145a7b9ca	Use the `hexNumbers` structure in the `stringToUTF16HexString` helper We can re-use the `hexNumbers` structure here, since that allows us to directly lookup the hexadecimal values and shortens the code.	2024-11-02 15:00:32 +01:00
Jonas Jenwald	196f7d7df1	Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread - This helper function has only a single call-site, and the function is fairly short. - It'll only be invoked if range requests are disabled, or if the entire PDF manages to load before the headers are resolved (which is very unlikely). Hence, by default, this helper function is not invoked. - By inlining the code we're able to utilize the existing error-handling at the call-site, rather than having to duplicate it, which further reduces the size of this code. Finally, while slightly unrelated, this patch also adds optional chaining in one spot in the file (PR 16424 follow-up).	2024-11-02 11:06:30 +01:00
Jonas Jenwald	b26dc19392	Ensure that serializing of StructTree-data cannot fail during loading I discovered that doing skip-cache reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf would intermittently cause (some of) the AnnotationLayers to break with errors printed in the console (see below). In hindsight this bug is really obvious, however it took me quite some time to find it, since the `StructTreePage.prototype.serializable` getter will look up various data and all of those cases can fail during loading when streaming and/or range requests are being used. Finally, to prevent any future errors, ensure that the viewer won't break in these sorts of situations. ``` Uncaught (in promise) Object { message: "Missing data [19098296, 19098297)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19098296, 19098297)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" } viewer.mjs:8801:55 #renderAnnotationLayer: "UnknownErrorException: Missing data [17552729, 17552730)". viewer.mjs:8737:15 Uncaught (in promise) Object { message: "Missing data [17552729, 17552730)", name: "UnknownErrorException", details: "MissingDataException: Missing data [17552729, 17552730)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" } viewer.mjs:8801:55 ```	2024-11-01 17:43:59 +01:00
Jonas Jenwald	4e12906061	Move the various DOM-factories into their own files - Over time the number and size of these factories have increased, especially the `DOMFilterFactory` class, and this split should thus aid readability/maintainability of the code. - By introducing a couple of new import maps we can avoid bundling the `DOMCMapReaderFactory`/`DOMStandardFontDataFactory` classes in the Firefox PDF Viewer, since they are dead code there given that worker-thread fetching is always being used. - This patch has been successfully tested, by running `$ ./mach test toolkit/components/pdfjs/`, in a local Firefox artifact-build. Note: This patch reduces the size of the `gulp mozcentral` output by `1.3` kilo-bytes, which isn't a lot but still cannot hurt.	2024-11-01 13:31:28 +01:00
Jonas Jenwald	7572382c7a	Change the "FetchBuiltInCMap"/"FetchStandardFontData" message-handlers to be asynchronous This way we can directly throw Errors, rather than having to "manually" return rejected Promises, which is ever so slightly shorter. Also, since `useWorkerFetch` is always true in MOZCENTRAL builds these message-handlers should not be invoked there.	2024-10-31 09:29:11 +01:00
Jonas Jenwald	f013c39b9f	Merge pull request #18978 from Snuffleupagus/toHexUtil-simplify Re-factor the `toHexUtil` helper (PR 17862 follow-up)	2024-10-29 17:10:36 +01:00
calixteman	f142fb8c28	Merge pull request #18972 from calixteman/refactor_highlight [Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing	2024-10-29 17:09:03 +01:00
Jonas Jenwald	db1238aae3	Re-factor the `toHexUtil` helper (PR 17862 follow-up) We can re-use the `hexNumbers` structure, since that allows us to directly lookup the hexadecimal values and shortens the code.	2024-10-29 16:35:44 +01:00
Jonas Jenwald	9870099e90	Merge pull request #18977 from Snuffleupagus/api-ReaderHeadersReady-simplify Simplify the "ReaderHeadersReady" message-handler in the API	2024-10-29 15:46:31 +01:00
Calixte Denizet	5a9607b2ad	[Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing One goal is to make the code for drawing with the Ink tool similar to the code for free highlighting: it doesn't really make sense to have such different ways to do almost the same thing. When the zoom level is high, it'll avoid creating an overly large canvas covering the whole page, which consumes more memory, makes drawing very slow, and makes the overall user experience pretty bad. A second goal is to be able to easily implement more drawing tools, where we would just have to implement how to draw from the pointer coordinates.	2024-10-29 15:41:08 +01:00
Jonas Jenwald	afb4813d1c	Simplify the "ReaderHeadersReady" message-handler in the API We can convert the handler to an `async` function, which removes the need to create a temporary Promise here. Given the age of this code it shouldn't hurt to simplify it a little bit.	2024-10-29 14:59:39 +01:00
Jonas Jenwald	8f47d06d07	Add helper functions to allow using new `Uint8Array` methods This allows using the new methods in browsers that support them, e.g. Firefox 133+, while still providing fallbacks where necessary; see https://github.com/tc39/proposal-arraybuffer-base64 Please note: These are not actual polyfills; they only implement what we need in the PDF.js code-base. Eventually this patch should be reverted, once support is generally available.	2024-10-29 10:22:35 +01:00
Jonas Jenwald	bfc645bab1	Introduce some `Uint8Array.fromBase64` and `Uint8Array.prototype.toBase64` usage in the main code-base See https://github.com/tc39/proposal-arraybuffer-base64	2024-10-29 10:22:35 +01:00
Jonas Jenwald	f9fc477080	Improve the implementation of the `PDFDocument.fingerprints`-getter - Add explicit `length` validation of the /ID entries. Given the `EMPTY_FINGERPRINT` constant we're already implicitly assuming a particular length. - Move the constants into the `fingerprints`-getter, since they're not used anywhere else. - Replace the `hexString` helper function with the standard `Uint8Array.prototype.toHex` method; see https://github.com/tc39/proposal-arraybuffer-base64	2024-10-29 10:22:35 +01:00
Jonas Jenwald	48a18585f2	Allow `StreamsSequenceStream` to skip sub-streams that are not actual Streams (issue 18973) This extends PR 13796 to also handle the case where sub-streams contain invalid data, i.e. anything that isn't a Stream, however please note that in these cases there's no guarantee that we'll render the page "correctly". Note that Adobe Reader, i.e. the PDF reference implementation, cannot render the last page of the referenced PDF document.	2024-10-29 09:36:08 +01:00
Jonas Jenwald	ee812b5df2	[Editor] Utilize Fluent "better" when localizing the AltText Currently we manually localize and update the DOM-elements of the AltText-button, and it seems nicer to utilize Fluent "properly" for that task. This can be achieved by introducing an explicit `span`-element on the AltText-button (similar to e.g. the regular toolbar-buttons), and adding a few more l10n-strings, since that allows just setting the `data-l10n-id`-attribute on all the relevant DOM-elements. Finally, note how we no longer need to localize any strings eagerly when initializing the various editors.	2024-10-28 17:19:02 +01:00
Calixte Denizet	b649b6f8dd	Use a BMP decoder when resizing an image The image decoding won't block the main thread any more. For now, it isn't enabled for Chrome because issue6741.pdf leads to a crash.	2024-10-28 14:09:52 +01:00
calixteman	2bee3af0ee	Merge pull request #18967 from calixteman/bug1910431 Make util.scand a bit more flexible with dates which don't match the given format (bug 1910431)	2024-10-28 09:36:34 +01:00
Calixte Denizet	230d7f9229	Make util.scand a bit more flexible with dates which don't match the given format (bug 1910431)	2024-10-27 19:19:06 +01:00
Tim van der Meij	5418060bbc	Merge pull request #18951 from Snuffleupagus/CMap-isCompressed [api-minor] Remove the `CMapCompressionType` enumeration	2024-10-27 14:42:00 +01:00
Tim van der Meij	b5805caacd	Merge pull request #18965 from Snuffleupagus/_goodSquareLength-static Re-factor the `ImageResizer._goodSquareLength` definition	2024-10-27 14:38:43 +01:00
Jonas Jenwald	8a2b95418a	Re-factor the `ImageResizer._goodSquareLength` definition Move the `ImageResizer._goodSquareLength` definition into the class itself, since the current position shouldn't be necessary, and also convert it into an actually private field.	2024-10-27 11:03:04 +01:00
Calixte Denizet	d114f71feb	Always fill the mask with the backdrop color It fixes #18956. In the patch #18029, for performance reasons and because I thought it was useless, I deliberately chose to not fill the mask with the backdrop color when it's full black: it was a bad idea. So in this patch we always add the backdrop color to the mask.	2024-10-26 14:14:51 +02:00
Jonas Jenwald	b048420d21	[api-minor] Remove the `CMapCompressionType` enumeration After the binary CMap format had been added there were also some ideas about maybe providing other formats, see [here](https://github.com/mozilla/pdf.js/pull/8064#issuecomment-279730182), however that was over seven years ago and we still only use binary CMaps. Hence it now seems reasonable to simplify the relevant code by removing `CMapCompressionType` and instead just use a boolean to indicate the type of the built-in CMaps.	2024-10-24 11:08:16 +02:00
Jonas Jenwald	50c291eb33	Unconditionally cache built-in CMaps on the worker-thread Given that we've not shipped, nor used, anything except binary CMaps for years let's just cache them unconditionally (since that's a tiny bit less code).	2024-10-24 10:15:09 +02:00
calixteman	1ad09779f1	Merge pull request #18910 from calixteman/image_decoder1 Use ImageDecoder in order to decode jpeg images (bug 1901223)	2024-10-23 13:54:07 +02:00
Calixte Denizet	b6c4f0b69e	Use ImageDecoder in order to decode jpeg images (bug 1901223)	2024-10-23 10:42:01 +02:00
Tim van der Meij	1e07b87bb6	Merge pull request #18933 from Snuffleupagus/base-factory-fetchData Change the `BaseCMapReaderFactory` fetch-helper to return a `Uint8Array`	2024-10-22 20:03:50 +02:00
Jonas Jenwald	236c8d862e	Re-factor how we handle missing, corrupt, or empty font-file entries This improves the fixes for e.g. issue 9462 and 18941 slightly and allows better fallback behaviour for non-standard fonts.	2024-10-22 17:07:12 +02:00
Jonas Jenwald	63b34114b1	Fallback to a standard font if a font-file entry doesn't contain a Stream (issue 18941) The PDF document is clearly corrupt, since it has /FontFile2 entries that are Dictionaries which obviously isn't correct. While there's obviously no guarantee that things will look perfect this way, actually rendering the text at all should be an improvement in general.	2024-10-22 11:51:28 +02:00
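
Commit 824a619a2a above describes `getAllResponseHeaders()` returning null (no response received) or an empty string (network error), which then broke construction of a `Headers` object. The guard can be sketched roughly as follows. This is an illustration only, not the code from that commit: `parseRawHeaders` is a hypothetical helper name, and the sketch assumes the standard `Headers` class is available globally (browsers, or Node.js 18 and later).

```javascript
// Hypothetical helper, NOT the actual pdf.js implementation.
// Converts the raw string returned by XMLHttpRequest.getAllResponseHeaders()
// into a Headers object, tolerating null and "".
function parseRawHeaders(rawHeaders) {
  const headers = new Headers();
  if (!rawHeaders) {
    // null: no response received; "": a network error happened.
    return headers;
  }
  // Header lines are separated by CRLF.
  for (const line of rawHeaders.trim().split(/[\r\n]+/)) {
    const index = line.indexOf(":");
    if (index <= 0) {
      continue; // Skip lines without a "name: value" shape.
    }
    const name = line.slice(0, index).trim();
    const value = line.slice(index + 1).trim();
    if (!name) {
      continue; // An empty name would make Headers.append throw.
    }
    headers.append(name, value);
  }
  return headers;
}
```

With this shape, a failed request simply yields an empty `Headers` object instead of a `TypeError`, and the caller can decide how to report the underlying network failure.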


6699 commits
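
Commits 2145a7b9ca and db1238aae3 in the log above mention re-using a `hexNumbers` structure to look up hexadecimal values directly. A minimal sketch of that lookup-table idea follows. The identifiers `hexNumbers` and `stringToUTF16HexString` come from the commit messages, but the bodies shown here are assumptions for illustration and may differ from the actual PDF.js source.

```javascript
// Assumed shape of the lookup table: index i holds the two-digit,
// lowercase hexadecimal string for the byte value i (0..255).
const hexNumbers = Array.from(Array(256).keys(), n =>
  n.toString(16).padStart(2, "0")
);

// Illustrative version of the helper: encodes each UTF-16 code unit
// as four hex digits (high byte first) using direct table lookups.
function stringToUTF16HexString(str) {
  const buf = [];
  for (let i = 0, ii = str.length; i < ii; i++) {
    const char = str.charCodeAt(i);
    buf.push(hexNumbers[(char >> 8) & 0xff], hexNumbers[char & 0xff]);
  }
  return buf.join("");
}
```

Precomputing the 256 strings once trades a small amount of memory for avoiding a `toString(16)` call and padding on every byte.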