pdf.js

mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-19 14:48:08 +02:00

Author	SHA1	Message	Date
Andrii Vitiv	824a619a2a	Fix error on empty response headers Fixes https://github.com/mozilla/pdf.js/issues/18957 https://github.com/mozilla/pdf.js/pull/18682 introduced a regression that causes the following error: ``` Uncaught TypeError: Failed to construct 'Headers': Invalid name at PDFNetworkStreamFullRequestReader._onHeadersReceived (pdf.mjs:10214:29) at NetworkManager.onStateChange (pdf.mjs:10103:22) ``` The mentioned PR replaced a call to `getResponseHeader()` with `getAllResponseHeaders()` without handling cases where it may return null or an empty string. Quote from the [docs](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/getAllResponseHeaders#return_value): > Returns: > >A string representing all of the response's headers (except those whose field name is Set-Cookie) separated by CRLF, or null if no response has been received. If a network error happened, an empty string is returned. Run the following code and observe the error in the console. Note that the URL is intentionally set to an invalid value to simulate network error ```js <script src="//mozilla.github.io/pdf.js/build/pdf.mjs" type="module"></script> <script type="module"> var url = 'blob:'; pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs'; var loadingTask = pdfjsLib.getDocument(url); loadingTask.promise .then((pdf) => console.log('PDF loaded')) .catch((reason) => console.error(reason)); </script> ```	2024-11-05 21:52:50 +02:00
Jonas Jenwald	e92a929a58	Try to improve handling of missing trailer dictionaries in `XRef.indexObjects` (issue 18986) The problem with the referenced PDF document has nothing to do with invalid dates, as the issue seems to suggest, but rather with the fact that it has neither an XRef table nor a trailer dictionary. Given that crucial parts of the internal document structure are missing, you might argue that it's not really a PDF document. In an attempt to support this kind of corruption, we'll simply iterate through all (previously found) XRef entries and pick one that might be a valid /Root dictionary. There's obviously no guarantee that this works, and it might not be fast in larger PDF documents, but at least it cannot be any worse than immediately throwing `InvalidPDFException` as we previously did here. Please note: I'm totally fine with this patch being rejected, since it's somewhat questionable if we should actually attempt to support "PDF documents" with this level of corruption.	2024-11-05 18:19:26 +01:00
Jonas Jenwald	2c90eee5a8	Shorten a few helper functions in `src/core/core_utils.js` In a few cases we can ever so slightly shorten the code without negatively impacting the readability.	2024-11-05 13:58:00 +01:00
Jonas Jenwald	f2fb3b95ce	Add helper functions to load image blob/bitmap data in `test/unit/api_spec.js` This avoids repeating the same code multiple times, and as part of the changes we'll also utilize existing PDF.js helpers more.	2024-11-04 14:09:34 +01:00
Jonas Jenwald	e4a5bd9555	Bump library version to `4.9`	2024-11-04 10:37:35 +01:00
Jonas Jenwald	cefd1ebcd2	Merge pull request #18959 from Snuffleupagus/Node-20 [api-minor] Update the minimum supported Node.js version to 20, and only support the Fetch API for "remote" PDF documents in Node.js	2024-11-04 10:29:55 +01:00
Jonas Jenwald	9269fb9be2	Remove the `BaseFullReader` and `BaseRangeReader` classes in the `src/display/node_stream.js` file After the previous patch these base-classes are only extended once each and they can thus be combined with the final classes.	2024-11-03 16:18:12 +01:00
Jonas Jenwald	cbf0ca71bf	[api-minor] Only support the Fetch API for "remote" PDF documents in Node.js environments The Fetch API has been supported since Node.js version 18, see https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility	2024-11-03 16:18:10 +01:00
Jonas Jenwald	c7407230c1	[api-minor] Load Node.js packages/polyfills with `process.getBuiltinModule` This allows synchronous loading of Node.js modules and (indirectly) packages, thus simplifying the code a fair bit.	2024-11-03 16:13:58 +01:00
Jonas Jenwald	4f01cdef18	[api-minor] Update the minimum supported Node.js version to 20 This patch updates the minimum supported environments as follows: - Node.js 20, which was released on 2023-04-18 and has now entered the "Maintenance"-phase; see https://github.com/nodejs/release#release-schedule Furthermore, note also that Node.js 18 will fairly soon reach EOL.	2024-11-03 16:13:55 +01:00
Tim van der Meij	35673d3e6e	Merge pull request #19001 from timvandermeij/integration-test-scripting-uppercase Fix the "must convert input to uppercase" scripting integration test	2024-11-03 15:47:45 +01:00
Tim van der Meij	3adf8b6be0	Fix the "must convert input to uppercase" scripting integration test This integration test fails intermittently because we're not (correctly) awaiting the sandbox actions. The `27R` field in `issue14862.pdf` triggers sandbox events for every typing action, but for the backspace and "a" character typing actions we weren't awaiting the sandbox trip at all, and for other places we weren't awaiting it fully (causing some characters to be missed in the assertion). This commit fixes the issues by using the appropriate helper functions, similar to what we did in PR #18399. Not only is this shorter in terms of code, but it also fixed the near-permafail for this test with newer versions of Puppeteer.	2024-11-03 15:08:55 +01:00
Tim van der Meij	20fbb4d661	Merge pull request #19000 from timvandermeij/types-node Install and use the most recent Node types for the types tests	2024-11-03 14:41:06 +01:00
Tim van der Meij	ccfaf20ee2	Install and use the most recent Node types for the types tests The types tests run in Node.js and therefore use Node types for e.g. builtins. However, we didn't explicitly indicate this in `tsconfig.json` (see [1] for more information and [2] for the PR where we found this). Moreover, we didn't explicitly install the most recent version of `@types/node` which implicitly made us fall back to version 14.14.45 (because that was installed as a dependency of other modules) whereas much newer versions are available and we need those after changes in Node.js (see [3] for more information and [4] for the PR where we found this). This commit fixes both issues by explicitly installing and using the most recent Node.js types, which should also avoid future issues with the types tests. [1] https://github.com/TypeStrong/ts-node/issues/1012 [2] https://github.com/mozilla/pdf.js/pull/18237 [3] https://stackoverflow.com/questions/78790943/in-typescript-5-6-buffer-is-not-assignable-to-arraybufferview-or-uint8arr [4] https://github.com/mozilla/pdf.js/pull/18959	2024-11-03 13:40:55 +01:00
Tim van der Meij	c1bcb46b3b	Merge pull request #18999 from Snuffleupagus/toBase64Util-unittest Use the `toBase64Util` helper function in the unit-tests	2024-11-03 13:04:13 +01:00
Jonas Jenwald	f78a8f3c54	Use the `toBase64Util` helper function in the unit-tests	2024-11-03 11:25:19 +01:00
Tim van der Meij	5f77b907eb	Merge pull request #18997 from Snuffleupagus/Node-enable-Blob-unittest Enable the 'gets PDF filename from query string appended to "blob:" URL' unit-test in Node.js	2024-11-03 11:11:36 +01:00
Tim van der Meij	7ae21ece4a	Merge pull request #18998 from Snuffleupagus/Node-enable-XFA-alt-unittest Enable the "should have an alt attribute from toolTip" unit-test in Node.js	2024-11-03 11:10:54 +01:00
Tim van der Meij	3755d680e4	Merge pull request #18995 from timvandermeij/updates Update dependencies and translations to the most recent versions	2024-11-03 11:09:15 +01:00
Jonas Jenwald	faf9e32ecb	Enable the "should have an alt attribute from toolTip" unit-test in Node.js Despite the pending-message mentioning "Image", this appears to be another case where the code actually depends on [`Blob`](https://developer.mozilla.org/en-US/docs/Web/API/Blob/Blob#browser_compatibility); note `cf3ca8b5bc/src/core/xfa/template.js (L3453)`	2024-11-03 00:15:44 +01:00
Jonas Jenwald	15fbee158c	Enable the 'gets PDF filename from query string appended to "blob:" URL' unit-test in Node.js The necessary functionality has been supported in Node.js for quite some time now, please see: - https://developer.mozilla.org/en-US/docs/Web/API/Blob/Blob#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL_static#browser_compatibility	2024-11-02 23:53:03 +01:00
Tim van der Meij	3854ab5efd	Update translations to the most recent versions	2024-11-02 20:21:59 +01:00
Tim van der Meij	3cd906829e	Update dependencies to the most recent versions	2024-11-02 20:20:58 +01:00
Tim van der Meij	cf3ca8b5bc	Merge pull request #18994 from timvandermeij/bump Bump the stable version in `pdfjs.config`	2024-11-02 19:59:41 +01:00
Tim van der Meij	627a588336	Bump the stable version in `pdfjs.config`	2024-11-02 19:56:10 +01:00
Tim van der Meij	3634dab10c	Merge pull request #18988 from Snuffleupagus/split-dom-factory Move the various DOM-factories into their own files	2024-11-02 19:06:47 +01:00
Tim van der Meij	e930f3030c	Merge pull request #18992 from Snuffleupagus/getPdfManager-inline-flushChunks Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread	2024-11-02 18:58:29 +01:00
Jonas Jenwald	e5485108ec	Merge pull request #18990 from Snuffleupagus/ensure-structTree-serializable Ensure that serializing of StructTree-data cannot fail during loading	2024-11-02 15:17:10 +01:00
Jonas Jenwald	aa4839ed0f	Merge pull request #18993 from Snuffleupagus/stringToUTF16HexString-hexNumbers Use the `hexNumbers` structure in the `stringToUTF16HexString` helper	2024-11-02 15:15:47 +01:00
Jonas Jenwald	2145a7b9ca	Use the `hexNumbers` structure in the `stringToUTF16HexString` helper We can re-use the `hexNumbers` structure here, since that allows us to directly lookup the hexadecimal values and shortens the code.	2024-11-02 15:00:32 +01:00
Jonas Jenwald	196f7d7df1	Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread - This helper function has only a single call-site, and the function is fairly short. - It'll only be invoked if range requests are disabled, or if the entire PDF manages to load before the headers are resolved (which is very unlikely). Hence, by default, this helper function is not invoked. - By inlining the code we're able to utilize the existing error-handling at the call-site, rather than having to duplicate it, which further reduces the size of this code. Finally, while slightly unrelated, this patch also adds optional chaining in one spot in the file (PR 16424 follow-up).	2024-11-02 11:06:30 +01:00
Jonas Jenwald	b26dc19392	Ensure that serializing of StructTree-data cannot fail during loading I discovered that doing skip-cache re-reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf would intermittently cause (some of) the AnnotationLayers to break with errors printed in the console (see below). In hindsight this bug is really obvious, however it took me quite some time to find it, since the `StructTreePage.prototype.serializable` getter will lookup various data and all of those cases can fail during loading when streaming and/or range requests are being used. Finally, to prevent any future errors, ensure that the viewer won't break in these sort of situations. ``` Uncaught (in promise) Object { message: "Missing data [19098296, 19098297)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19098296, 19098297)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" } viewer.mjs:8801:55 \#renderAnnotationLayer: "UnknownErrorException: Missing data [17552729, 17552730)". viewer.mjs:8737:15 Uncaught (in promise) Object { message: "Missing data [17552729, 17552730)", name: "UnknownErrorException", details: "MissingDataException: Missing data [17552729, 17552730)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" } viewer.mjs:8801:55 ```	2024-11-01 17:43:59 +01:00
Jonas Jenwald	4e12906061	Move the various DOM-factories into their own files - Over time the number and size of these factories have increased, especially the `DOMFilterFactory` class, and this split should thus aid readability/maintainability of the code. - By introducing a couple of new import maps we can avoid bundling the `DOMCMapReaderFactory`/`DOMStandardFontDataFactory` classes in the Firefox PDF Viewer, since they are dead code there given that worker-thread fetching is always being used. - This patch has been successfully tested, by running `$ ./mach test toolkit/components/pdfjs/`, in a local Firefox artifact-build. Note: This patch reduces the size of the `gulp mozcentral` output by `1.3` kilo-bytes, which isn't a lot but still cannot hurt.	2024-11-01 13:31:28 +01:00
Tim van der Meij	06f3b2d0a6	Merge pull request #18983 from Snuffleupagus/api-FetchBuiltInCMap-FetchStandardFontData-async Change the "FetchBuiltInCMap"/"FetchStandardFontData" message-handlers to be asynchronous	2024-10-31 20:30:11 +01:00
Jonas Jenwald	3ed438aef5	Merge pull request #18979 from Snuffleupagus/L10n-#elements-lazy-init Don't initialize `L10n.#elements` eagerly since it's unused in MOZCENTRAL builds	2024-10-31 11:03:24 +01:00
Jonas Jenwald	7572382c7a	Change the "FetchBuiltInCMap"/"FetchStandardFontData" message-handlers to be asynchronous This way we can directly throw Errors, rather than having to "manually" return rejected Promises, which is ever so slightly shorter. Also, since `useWorkerFetch` is always true in MOZCENTRAL builds these message-handlers should not be invoked there.	2024-10-31 09:29:11 +01:00
Jonas Jenwald	cdd4b052f9	Don't initialize `L10n.#elements` eagerly since it's unused in MOZCENTRAL builds It's not necessary to manually start translation in the Firefox PDF Viewer, and doing so would even cause problems there (see issue 17142).	2024-10-30 15:20:44 +01:00
Jonas Jenwald	f013c39b9f	Merge pull request #18978 from Snuffleupagus/toHexUtil-simplify Re-factor the `toHexUtil` helper (PR 17862 follow-up)	2024-10-29 17:10:36 +01:00
calixteman	f142fb8c28	Merge pull request #18972 from calixteman/refactor_highlight [Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing	2024-10-29 17:09:03 +01:00
Jonas Jenwald	db1238aae3	Re-factor the `toHexUtil` helper (PR 17862 follow-up) We can re-use the `hexNumbers` structure, since that allows us to directly lookup the hexadecimal values and shortens the code.	2024-10-29 16:35:44 +01:00
Jonas Jenwald	9870099e90	Merge pull request #18977 from Snuffleupagus/api-ReaderHeadersReady-simplify Simplify the "ReaderHeadersReady" message-handler in the API	2024-10-29 15:46:31 +01:00
Calixte Denizet	5a9607b2ad	[Editor] Refactor the free highlight stuff in order to be able to use the code for more general drawing One goal is to make the code for drawing with the Ink tool similar to the code for free highlighting: it doesn't really make sense to have such different ways of doing almost the same thing. When the zoom level is high, this avoids creating an overly large canvas covering the whole page, which consumes more memory, makes drawing very slow, and makes the overall user experience pretty bad. A second goal is to be able to easily implement more drawing tools where we would just have to implement how to draw from the pointer coordinates.	2024-10-29 15:41:08 +01:00
Jonas Jenwald	25cf4add05	Merge pull request #17862 from Snuffleupagus/fingerprints-toHex Improve the implementation of the `PDFDocument.fingerprints`-getter	2024-10-29 15:34:02 +01:00
Jonas Jenwald	afb4813d1c	Simplify the "ReaderHeadersReady" message-handler in the API We can convert the handler to an `async` function, which removes the need to create a temporary Promise here. Given the age of this code it shouldn't hurt to simplify it a little bit.	2024-10-29 14:59:39 +01:00
Jonas Jenwald	8f47d06d07	Add helper functions to allow using new `Uint8Array` methods This allows using the new methods in browsers that support them, e.g. Firefox 133+, while still providing fallbacks where necessary; see https://github.com/tc39/proposal-arraybuffer-base64 Please note: These are not actual polyfills; they only implement what we need in the PDF.js code-base. Eventually this patch should be reverted, once support is generally available.	2024-10-29 10:22:35 +01:00
Jonas Jenwald	bfc645bab1	Introduce some `Uint8Array.fromBase64` and `Uint8Array.prototype.toBase64` usage in the main code-base See https://github.com/tc39/proposal-arraybuffer-base64	2024-10-29 10:22:35 +01:00
Jonas Jenwald	f9fc477080	Improve the implementation of the `PDFDocument.fingerprints`-getter - Add explicit `length` validation of the /ID entries. Given the `EMPTY_FINGERPRINT` constant we're already implicitly assuming a particular length. - Move the constants into the `fingerprints`-getter, since they're not used anywhere else. - Replace the `hexString` helper function with the standard `Uint8Array.prototype.toHex` method; see https://github.com/tc39/proposal-arraybuffer-base64	2024-10-29 10:22:35 +01:00
Jonas Jenwald	3a85479c67	Merge pull request #18974 from Snuffleupagus/issue-18973 Allow `StreamsSequenceStream` to skip sub-streams that are not actual Streams (issue 18973)	2024-10-29 10:21:35 +01:00
Jonas Jenwald	48a18585f2	Allow `StreamsSequenceStream` to skip sub-streams that are not actual Streams (issue 18973) This extends PR 13796 to also handle the case where sub-streams contain invalid data, i.e. anything that isn't a Stream, however please note that in these cases there's no guarantee that we'll render the page "correctly". Note that Adobe Reader, i.e. the PDF reference implementation, cannot render the last page of the referenced PDF document.	2024-10-29 09:36:08 +01:00
Jonas Jenwald	93961e2802	Merge pull request #18971 from Snuffleupagus/AltText-fluent [Editor] Utilize Fluent "better" when localizing the AltText	2024-10-28 20:53:06 +01:00
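
Commit 824a619a2a above describes `getAllResponseHeaders()` returning null or an empty string, which then breaks construction of a `Headers` object. The following is only a minimal sketch of that kind of guard, not the code from the actual patch: the helper name `parseRawHeaders` is hypothetical, and it assumes an environment where the standard `Headers` class is a global (browsers, Node.js 18+).

```js
// Hypothetical helper (not part of PDF.js): turn the raw string returned by
// XMLHttpRequest.getAllResponseHeaders() into a Headers object, tolerating
// the null / empty-string cases described in the commit message.
function parseRawHeaders(rawHeaders) {
  const headers = new Headers();
  if (!rawHeaders) {
    // null (no response received) or "" (network error): nothing to parse.
    return headers;
  }
  for (const line of rawHeaders.trim().split(/[\r\n]+/)) {
    const idx = line.indexOf(":");
    if (idx <= 0) {
      continue; // Skip lines without a "name: value" shape.
    }
    const name = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (!name) {
      continue; // An empty name would make Headers.append() throw.
    }
    headers.append(name, value);
  }
  return headers;
}
```

With this sketch, `parseRawHeaders(xhr.getAllResponseHeaders())` yields an empty `Headers` object on a network error instead of throwing.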


20014 commits
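
Commits 2145a7b9ca and db1238aae3 in the list above mention re-using a `hexNumbers` structure to look up hexadecimal values directly. As a rough illustration of that lookup-table idea, here is a sketch under stated assumptions: the table definition and the two function names below are written for this example and are not copied from the PDF.js source.

```js
// Assumed shape of the lookup table: index n holds the two-digit,
// lower-case hexadecimal representation of the byte value n.
const hexNumbers = Array.from(Array(256).keys(), n =>
  n.toString(16).padStart(2, "0")
);

// Illustrative helper: convert a byte array to a hex string via the table.
function bytesToHex(bytes) {
  return Array.from(bytes, b => hexNumbers[b]).join("");
}

// Illustrative helper: encode each UTF-16 code unit of a string as four
// hex digits (high byte first), again using only table lookups.
function stringToUTF16Hex(str) {
  const buf = [];
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    buf.push(hexNumbers[(code >> 8) & 0xff], hexNumbers[code & 0xff]);
  }
  return buf.join("");
}
```

The table trades 256 small strings of memory for avoiding a `toString(16)` and padding step on every byte.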