mirrors/pdf.js - Forgejo

mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-25 17:48:07 +02:00

Author	SHA1	Message	Date
Jonas Jenwald	ec1a05c104	Add missing `startWorkerTask` calls in the "SaveDocument" handler Without these calls we'll not actually wait for saving to complete when document destruction runs; compare with other `WorkerTask`-usage in this file. While I cannot imagine that this has caused any problems for library users, the code is however not technically correct as-is.	2024-12-21 14:22:18 +01:00
Jonas Jenwald	ede589dd6e	Shorten the `WorkerMessageHandler` class a little bit - Use `this` in all scopes where that's possible, to avoid having to spell out `WorkerMessageHandler` everywhere. - Inline the `isMessagePort` helper function, since there's only a single call-site. - Use a static initialization block to move more code into the `WorkerMessageHandler` class itself.	2024-11-30 14:07:16 +01:00
Jonas Jenwald	8ec399d7e1	Convert the `getPdfManager` function to be asynchronous This is fairly old code, and by making the function `async` we can handle initialization errors "automatically" without the need for try-catch statements.	2024-11-22 17:49:43 +01:00
Jonas Jenwald	2c0cc48d1b	Replace the `forEach` method in `Dict` with "proper" iteration support	2024-11-17 12:45:32 +01:00
Calixte Denizet	4bf7787084	Simplify saving added/modified annotations. Having this map to collect the different changes makes it possible to know if some objects have already been modified.	2024-11-12 10:59:38 +01:00
Jonas Jenwald	196f7d7df1	Inline the `flushChunks` helper function, used in `getPdfManager` on the worker-thread - This helper function has only a single call-site, and the function is fairly short. - It'll only be invoked if range requests are disabled, or if the entire PDF manages to load before the headers are resolved (which is very unlikely). Hence, by default, this helper function is not invoked. - By inlining the code we're able to utilize the existing error-handling at the call-site, rather than having to duplicate it, which further reduces the size of this code. Finally, while slightly unrelated, this patch also adds optional chaining in one spot in the file (PR 16424 follow-up).	2024-11-02 11:06:30 +01:00
Calixte Denizet	3103deaa44	Fix missing annotation parent by using the one from the Fields entry Fixes #15096.	2024-10-04 20:00:19 +02:00
Tim van der Meij	c77b97daff	Update the JS/CSS files for the new Prettier/Stylelint versions	2024-07-13 16:29:47 +02:00
Jonas Jenwald	a4ffc1066c	Move the internal API/Worker `isEditing`-state into `RenderingIntentFlag` In hindsight this seems like a better idea, since it avoids the need to manually pass `isEditing` around as a boolean value. Note that `RenderingIntentFlag` is internal functionality, not exposed in the official API, which means that it can be extended and modified as necessary.	2024-07-04 23:34:30 +02:00
Calixte Denizet	64635f3b35	[api-minor][Editor] When switching to editing mode, redraw pages containing editable annotations Right now, editable annotations are using their own canvas when they're drawn, but it induces several issues: - if the annotation has to be composed with the page then the canvas must be correctly composed with its parent. That means we should move the canvas under canvasWrapper and we should extract composing info from the drawing instructions... Currently it's the case with highlight annotations. - we use some extra memory for those canvas even if the user will never edit them, which is the case for example when opening a pdf in Fenix. So with this patch, all the editable annotations are drawn on the canvas. When the user switches to editing mode, then the pages with some editable annotations are redrawn but without them: they'll be replaced by their counterpart in the annotation editor layer.	2024-07-02 14:11:40 +02:00
Jonas Jenwald	f6cd03955b	[api-minor] Move the page reference/number caching into the API Rather than having to handle this manually throughout the viewer, this functionality can instead be moved into the API which simplifies the code slightly.	2024-04-29 18:54:06 +02:00
Jonas Jenwald	e4d0e84802	[api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()` This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does almost the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular: - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state. - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used. - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however not a problem since the value of a Promise cannot be changed once fulfilled or rejected. - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them: - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is removed on its first (valid) invocation. - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks. - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and test-only code. 
--- [1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).	2024-04-01 11:42:37 +02:00
Calixte Denizet	2133da166e	When updating, write the xref table in the same format as the previous one (bug 1878916) The specs are unclear about what kind of xref table format must be used. In checking the validity of some pdfs in the preflight tool from Acrobat we can guess that having the same format is the correct way to do it. The pdf in the mentioned bug, after having been changed, wasn't correctly displayed in either Chrome or Acrobat: it's now fixed.	2024-02-13 14:14:37 +01:00
Jonas Jenwald	37e98e39f6	Skip any whitespace after the first object in linearized PDFs (issue 17665) This way the code is now consistent with the non-linearized branch in the `PDFDocument.startXRef` getter.	2024-02-12 22:05:36 +01:00
Calixte Denizet	f2196f7803	StructParents entry isn't required on pages with no tagged contents (bug 1855641)	2023-09-28 14:23:10 +02:00
Calixte Denizet	a8573d4e1b	[Editor] Add the ability to create/update the structure tree when saving a pdf containing newly added annotations (bug 1845087) When there is no tree, the tags for the new annotations are just put under the root element. When there is a tree, we insert the new tags at the right place by using the value of structTreeParentId (added in PR #16916).	2023-09-16 18:34:58 +02:00
Jonas Jenwald	ff96c413d3	Use `await` even more in the "SaveDocument" worker-thread handler Given that the function is already asynchronous we can make use of `await` even more and reduce the amount of indentation a little bit.	2023-09-16 13:06:48 +02:00
Jonas Jenwald	50937a3539	Ensure that the entire PDF document is loaded before we begin saving it When I started looking at PR 16938 it occurred to me that some of the new structTree-methods are synchronously accessing certain dictionary-data (not used during "normal" structTree-parsing), which may not be generally safe since everything in a dictionary could be a reference (and the relevant data may not have been loaded yet). Rather than suggesting that we make all those new methods even more asynchronous, to me the overall simplest and safest solution is to ensure that the entire PDF document has been loaded before we begin saving it. In practice this shouldn't really affect "performance" of saving noticeably, since it's always depended on the entire PDF document being downloaded. Finally note that with the exception of the PDF document possibly not having been fully downloaded when saving is triggered, all other "global" document properties are pretty much guaranteed to already be available at this point.	2023-09-12 13:26:57 +02:00
Jonas Jenwald	64e8557fb5	[api-minor] Deprecate the `PDFDocumentProxy.getJavaScript` method This method is very old, however with the exception of the auto-print hack (when scripting is disabled) in the viewer it's never actually been used. Most likely the idea with `PDFDocumentProxy.getJavaScript` was that it'd be useful if scripting support was added, however it turned out that it was a bit too simplistic and instead a number of new methods were added for the scripting use-cases.	2023-08-01 09:02:05 +02:00
Calixte Denizet	33fdec1392	Don't replace Acroform dictionary if nothing has changed when saving (bug 1844572)	2023-07-22 17:51:06 +02:00
Jonas Jenwald	88524bf9ae	Don't reset temporary XRef-entries during saving (PR 16392 follow-up) Please note: I'm not aware of any bugs caused by this, however that might be more luck than anything else. In PR 16392 the `incrementalUpdate` function, and all of its various helpers, were made asynchronous. However the call-site in `src/core/worker.js` wasn't updated, which means that we currently reset temporary XRef-entries while saving is ongoing.	2023-07-20 15:49:59 +02:00
Jonas Jenwald	3a886e7264	Move the `isNodeJS`-helper into the `src/shared/util.js` file With the changes in the previous patch the `isNodeJS`-helper no longer needs to live in its own file, which helps get rid of a closure in the built files.	2023-07-17 16:42:25 +02:00
Calixte Denizet	599b9498f2	[Editor] Add support for printing/saving newly added Stamp annotations In order to minimize the size of a saved pdf, we generate only one image and use a reference in each annotation using it. When printing, it's slightly different since we have to render each page independently but we use the same image within a page.	2023-06-26 15:47:05 +02:00
Calixte Denizet	71479fdd21	[Editor] Avoid duplicated entries in the Annot array when saving an existing and modified annotation	2023-06-15 22:02:10 +02:00
Jonas Jenwald	1753e321cd	Remove the compatibility checks in `WorkerMessageHandler.createDocumentHandler` For some time these checks have only targeted Node.js environments, since the features in question exist in all supported browsers (even when a `legacy`-build is used). Now that we've updated the minimum supported Node.js version to 18, a number of polyfills are thus (finally) no longer necessary in that environment. Hence for certain basic functionality, such as e.g. text-extraction, it's now possible to use either a modern- or a `legacy`-build of the PDF.js library in Node.js environments. Please note: For e.g. canvas-rendering in Node.js environments it's still necessary to use a `legacy`-build, since that functionality requires various polyfills.	2023-05-07 13:43:19 +02:00
Jonas Jenwald	ed8be6f882	[api-minor] Update the minimum supported Node.js version to 18 This patch updates the minimum supported environments as follows: - Node.js 18, which was released on 2022-04-19; see https://en.wikipedia.org/wiki/Node.js#Releases Note also that Node.js 16 will soon reach EOL, and thus no longer receive any security updates.	2023-05-07 13:43:19 +02:00
Jonas Jenwald	d950b91c4e	Introduce some logical assignment in the `src/core/` folder	2023-04-29 13:49:37 +02:00
Jonas Jenwald	317abd6d07	Change the `createPromiseCapability` helper function into a `PromiseCapability` class This is not only slightly more compact, but it also simplifies the handling of the `settled` getter.	2023-04-29 13:43:24 +02:00
Tim van der Meij	c9359957e6	Merge pull request #16305 from Snuffleupagus/PDFJSDev-skip-PRODUCTION Remove the `PRODUCTION` build-target	2023-04-22 14:53:30 +02:00
Calixte Denizet	117bbf7cd9	[api-minor] Don't normalize the text used in the text layer. Some Arabic chars like \ufe94 could be searched in a pdf, hence it must be normalized when creating the search query. So to avoid duplicating the normalization code, everything is moved into the find controller. The previous code to normalize text was using NFKC but with a hardcoded map, hence it has been replaced by the use of normalize("NFKC") (it helps to reduce the bundle size by 30kb). While playing with this \ufe94 char, I noticed that the bidi algorithm wasn't taking into account some RTL unicode ranges, the generated font wasn't embedding the mapping for this char and the unicode ranges in the OS/2 table weren't up-to-date. When normalized, some chars can be replaced by several ones, which resulted in some extra chars in the text layer. To avoid any regression, when copying some text from the text layer, a copied string is normalized (NFKC) before being put in the clipboard (it works like this in either Acrobat or Chrome).	2023-04-17 14:31:23 +02:00
Jonas Jenwald	804aa896a7	Stop using the `PRODUCTION` build-target in the JavaScript code This special build-target is very old, and was introduced with the first pre-processor that only uses comments to enable/disable code. When the new pre-processor was added `PRODUCTION` effectively became redundant, at least in JavaScript code, since `typeof PDFJSDev === "undefined"` checks now do the same thing. This patch proposes that we remove `PRODUCTION` from the JavaScript code, since that simplifies the conditions and thus improves readability in many cases. Please note: There's not, nor has there ever been, any gulp-task that set `PRODUCTION = false` during building.	2023-04-17 12:04:34 +02:00
Jonas Jenwald	edd13895dd	Limit the `Path2D`-checks in the worker-thread to Node.js (PR 16238 follow-up, issue 16289) The changes in PR 16238 were intended specifically for Node.js environments, however they accidentally applied to older browsers as well. Please note: In up-to-date browsers `Path2D` is available in Workers, which should be connected to the introduction of `OffscreenCanvas`.	2023-04-14 11:51:11 +02:00
Tim van der Meij	13f2426aab	Merge pull request #16238 from Snuffleupagus/update-Node-compat-check Update the Node.js compatibility-check in the worker-thread	2023-04-01 14:20:33 +02:00
Jonas Jenwald	57a307d0cd	Update the Node.js compatibility-check in the worker-thread Please note: In Node.js environments a `legacy`-build must be used since only those versions include any polyfills. Previously we'd only check if `ReadableStream` is natively supported, however since Node.js version 18 that's now been implemented; please see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility Hence we'll also check for the availability of `Path2D`, since that's browser-specific functionality not expected to be available in Node.js environments; please see https://developer.mozilla.org/en-US/docs/Web/API/Path2D#browser_compatibility	2023-03-30 18:36:15 +02:00
Jonas Jenwald	5063a6f2a9	[api-minor] Remove the `disableCombineTextItems` option Please note: This parameter has never been used within the PDF.js library/viewer itself, and it was only ever added for backwards compatibility reasons. This parameter was added in PR 7475, over six years ago, to try and optionally maintain the previous default text-extraction behaviour. However as part of the general text-extraction improvements in PR 13257, almost two years ago, the `disableCombineTextItems` functionality was accidentally "broken" in various ways. Note how the only (very basic) unit-test was updated in a way that doesn't really make sense, since generally speaking you'd expect that using the option should result in more (or at least the same number of) text-items. Furthermore there's also the recent issue 16209, where the option causes almost all textContent to be concatenated together. Hence this patch proposes that we simply remove the `disableCombineTextItems` option since it's essentially unused/untested functionality, as evident from the fact that it took almost two years for someone to notice that it's broken.	2023-03-30 14:23:38 +02:00
Calixte Denizet	2d0f30a67c	Use the position of the previous xref stream if any when saving a pdf (bug 1823296)	2023-03-21 19:27:24 +01:00
Calixte Denizet	3a21423386	[Acroform] Use the full path to find the node in the XFA datasets where to store the value I noticed several 'Path not found' errors because of a field called #subform[2]. From the XFA specs, the hash is used for a class of elements in the template tree. When we're looking for a node in the datasets tree, it doesn't make sense to search for a class. Hence the path element starting with a hash are just skipped.	2023-02-23 12:09:39 +01:00
Tim van der Meij	22618213c7	Merge pull request #16040 from Snuffleupagus/arrayBuffersToBytes Re-factor the `arraysToBytes` helper function (PR 16032 follow-up)	2023-02-12 11:47:57 +01:00
Jonas Jenwald	6d4d402a78	Move the `arrayBuffersToBytes` helper function into the worker-thread Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the built `pdf.js` and `pdf.worker.js` files.	2023-02-11 21:34:37 +01:00
Jonas Jenwald	18042163ce	Improve the consistency between the `LocalPdfManager`/`NetworkPdfManager` constructor Currently these classes take a bunch of parameters (somewhat randomly ordered), probably because this is very old code that's been extended over the years. Hence this patch changes the constructors to use parameter-objects instead, which improves consistency and (slightly) reduces the amount of code as well. Please note: Also removes the `msgHandler`-property on these classes, since I cannot find a single call-site that accesses it.	2023-02-11 13:39:52 +01:00
Jonas Jenwald	14b0e8c0b6	Ensure that "GetAnnotations" errors are propagated to the main-thread (PR 15267 follow-up) With the changes in PR 15267 we're now accidentally swallowing "GetAnnotations" errors, rather than propagating them to the main-thread as intended.	2023-02-10 12:18:35 +01:00
Jonas Jenwald	c56f25409d	Re-factor the `arraysToBytes` helper function (PR 16032 follow-up) Currently this helper function only has two call-sites, and both of them only pass in `ArrayBuffer` data. Given how it's implemented there's a couple of code-paths that are completely unused (e.g. the "string" one), and in particular the intended fast-paths don't actually work. This patch re-factors and simplifies the helper function, and it'll no longer accept anything except `ArrayBuffer` data (hence why it's also re-named). Note that at the time when `arraysToBytes` was added we still supported browsers without TypedArray functionality, and we'd then simulate them using regular Arrays.	2023-02-10 10:26:35 +01:00
Jonas Jenwald	5ba596786c	Change `WorkerTasks`, in `WorkerMessageHandler.createDocumentHandler`, to use a Set This is a tiny bit more compact, thanks to the `Set.prototype.delete` method.	2023-02-09 22:01:16 +01:00
Jonas Jenwald	96d338e437	Reduce usage of the `arrayByteLength` helper function We're using this helper function when reading data from the [`PDFWorkerStreamReader.read`](`a49d1d1615/src/core/worker_stream.js (L90-L98)`) and [`PDFWorkerStreamRangeReader.read`](`a49d1d1615/src/core/worker_stream.js (L122-L128)`) methods, and as can be seen they always return `ArrayBuffer` data. Hence we can simply get the `byteLength` directly, and don't need to use the helper function. Note that at the time when `arrayByteLength` was added we still supported browsers without TypedArray functionality, and we'd then simulate them using regular Arrays.	2023-02-09 15:50:38 +01:00
Jonas Jenwald	70d362f22c	Remove an unnecessary variable in `getPdfManager`, in the `src/core/worker.js` file Another tiny piece of clean-up, since adding a `catch`-handler to a Promise shouldn't require an intermediate variable.	2022-11-17 15:31:41 +01:00
Jonas Jenwald	a2a200175f	Remove unnecessary function names in the `src/core/worker.js` file Currently some functions in this file have names while others don't, and in a few cases the names are no longer entirely accurate. For the relevant functions there should really be no need to name them, and if memory serves this was originally done since browsers (many years ago) didn't always handle anonymous functions correctly in stack traces.	2022-11-17 15:12:48 +01:00
Calixte Denizet	3ca03603c2	[Annotation] Fix printing/saving for annotations containing some non-ascii chars and with no fonts to handle them (bug 1666824) - For text fields * when printing, we generate a fake font which contains some widths computed thanks to an OffscreenCanvas and its method measureText. In order to avoid to have to layout the glyphs ourselves, we just render all of them in one call in the showText method in using the system sans-serif/monospace fonts. * when saving, we continue to create the appearance streams if the fonts contain the char but when a char is missing, we just set, in the AcroForm dict, the flag /NeedAppearances to true and remove the appearance stream. This way, we let the different readers handle the rendering of the strings. - For FreeText annotations * when printing, we use the same trick as for text fields. * there is no need to save an appearance since Acrobat is able to infer one from the Content entry.	2022-11-10 19:05:39 +01:00
Jonas Jenwald	caef47a0cf	Remove the `PdfManager.onLoadedStream` method (PR 15616 follow-up) After the clean-up in PR 15616, the `PdfManager.onLoadedStream` method now only has a single call-site. Hence why this patch suggests that we remove this method and replace it with an optional parameter in `PdfManager.requestLoadedStream` instead. By making the new behaviour opt-in, we'll thus not change any existing call-site.	2022-10-29 14:42:17 +02:00
Jonas Jenwald	bcffbf74f3	Let the `PdfManager.requestLoadedStream` method return the stream This is very old code, and it could thus do with some simplification. Note how in the `src/core/worker.js` file we're combining both the `PdfManager.requestLoadedStream` and `PdfManager.onLoadedStream` methods in order to access the stream-data. This seems unnecessary, and it's simple enough to always let the `PdfManager.requestLoadedStream` method return the stream-data as well.	2022-10-24 17:00:48 +02:00
Jonas Jenwald	f2f0a1e871	[api-minor] Stop sending "UnsupportedFeature" from the worker-thread GetOperatorList-handling This code was added all the way back in PR 6698, almost seven years ago, for backwards compatibility reasons. At this point in time, it seems that we can remove that since: - We have more fine-grained "UnsupportedFeature" reporting elsewhere in the worker-thread code nowadays. - The GetOperatorList-handling is now using `ReadableStream`s, which means that errors are being forwarded to the main-thread anyway. - We're also no longer displaying a notification-bar, in the built-in Firefox PDF Viewer, for any of these "UnsupportedFeature" messages.	2022-10-13 11:46:17 +02:00
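
Commit e4d0e84802 in the list above replaces the custom `PromiseCapability` class with `Promise.withResolvers()` and notes that the few call-sites needing the old `settled` getter can track that state manually. The following is a minimal sketch of that idea, not the PDF.js implementation: `createTrackedCapability` is a hypothetical helper, and the manual fallback is only an assumption standing in for the `core-js` polyfill used in `legacy` builds.

```javascript
// Hypothetical helper (not PDF.js code): wraps Promise.withResolvers() and
// tracks the `settled` state by hand.
function createTrackedCapability() {
  // Assumption: where Promise.withResolvers is unavailable, build the same
  // { promise, resolve, reject } shape manually.
  const capability =
    typeof Promise.withResolvers === "function"
      ? Promise.withResolvers()
      : (() => {
          let resolve, reject;
          const promise = new Promise((res, rej) => {
            resolve = res;
            reject = rej;
          });
          return { promise, resolve, reject };
        })();

  const tracked = {
    promise: capability.promise,
    settled: false,
    resolve(value) {
      tracked.settled = true;
      capability.resolve(value);
    },
    reject(reason) {
      tracked.settled = true;
      capability.reject(reason);
    },
  };
  return tracked;
}

const cap = createTrackedCapability();
const settledBefore = cap.settled;
cap.resolve("first");
// Resolving again is harmless: a fulfilled Promise keeps its first value.
cap.resolve("second");
const settledAfter = cap.settled;
```

The second `resolve` call illustrates the point made in the commit message: settling a Promise more than once is not an error, which is why several of the `settled`-checks could simply be removed.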

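Commit 117bbf7cd9 in the list above moves text normalization into the find controller and relies on the built-in `normalize("NFKC")`. The sketch below only illustrates the two effects the message describes, a presentation form such as `\ufe94` folding to its base letter and one character expanding into several. `normalizeForSearch` and `includesNormalized` are hypothetical names, not functions from the PDF.js find controller.

```javascript
// Hypothetical helpers (not the PDF.js find controller): normalize both the
// page text and the query with NFKC before comparing them.
function normalizeForSearch(text) {
  return text.normalize("NFKC");
}

function includesNormalized(pageText, query) {
  return normalizeForSearch(pageText).includes(normalizeForSearch(query));
}

// U+FE94 (ARABIC LETTER TEH MARBUTA FINAL FORM) folds to the base letter U+0629.
const normalizedTehMarbuta = normalizeForSearch("\ufe94");

// U+FB01 (LATIN SMALL LIGATURE FI) expands into two characters, "f" and "i",
// which is how normalized text can end up longer than the original.
const normalizedLigature = normalizeForSearch("\ufb01");

// A query typed with plain letters matches text that contains the ligature.
const found = includesNormalized("of\ufb01ce", "office");
```

Because normalization can change string lengths, any offsets computed on the normalized text would need to be mapped back to the original text before highlighting; that mapping is outside the scope of this sketch.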