pdf.js

mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-23 16:48:08 +02:00

Author	SHA1	Message	Date
Calixte Denizet	196affd8e0	Fix decoding of JPX images having an alpha channel When an image has a non-zero SMaskInData it means that the image has an alpha channel. With JPX images, the colorspace isn't required (by spec) so when we don't have it, the JPX decoder will handle the conversion in RGBA format.	2024-06-03 20:08:11 +02:00
Calixte Denizet	9654ad570a	Decompress images using DecompressionStream when possible Getting images is already asynchronous, so we can take this opportunity to use DecompressionStream (which is async too) to decompress images.	2024-06-02 14:00:05 +02:00
Calixte Denizet	6fa98ac99f	[api-minor] Simplify how the list of points is structured Instead of sending to the main thread an array of Objects for a list of points (or quadpoints), we'll send just a basic float buffer. It should slightly improve performance (especially when cloning the data) and use slightly less memory.	2024-05-30 15:36:15 +02:00
Tim van der Meij	ee545930ea	Merge pull request #18171 from Snuffleupagus/move-pendingTextLayers Don't register a pending `TextLayer` until `render` is invoked (PR 18104 follow-up)	2024-05-28 15:37:51 +02:00
Jonas Jenwald	f2e7eee00e	Don't register a pending `TextLayer` until `render` is invoked (PR 18104 follow-up) After the re-factoring in PR 18104 there's now a theoretical risk that a pending `TextLayer` is never removed, which we can avoid by not registering it until `render` is invoked. Note that this doesn't affect the viewer or tests, but if a third-party user calls `new TextLayer(...)` without a following call of either the `render`- or `cancel`-method we'd block global clean-up without this patch.	2024-05-26 18:38:40 +02:00
Jonas Jenwald	27436d52b2	Reduce indentation when parsing new annotations in `getOperatorList` This code has, over the years, become more complex and less indentation generally helps readability.	2024-05-25 12:00:44 +02:00
Jonas Jenwald	ce52ce063e	Change `parsingType3Font` to a getter (PR 14448 follow-up) We can easily "compute" `parsingType3Font` from the `type3FontRefs`-value, and thus avoid having to separately track two related properties.	2024-05-25 10:46:12 +02:00
Jonas Jenwald	c349ac3a5d	Skip the temporary variable when calling `#findStreamLength` (PR 18125 follow-up)	2024-05-25 10:38:32 +02:00
Jonas Jenwald	17e09e5478	Merge pull request #18159 from Snuffleupagus/loadingParams-test Improve the `loadingParams` functionality in the API	2024-05-24 23:21:24 +02:00
Jonas Jenwald	cfcb700ecc	Prevent XRef errors from breaking font loading (bug 1898802) Note that the referenced file is trivially corrupt, since it contains two PDF documents placed in the same file which doesn't make sense (and isn't how a PDF document should be updated). However it's still a good idea to ensure that `loadFont` is able to handle errors when resolving References, since that allows us to invoke the existing fallback font handling.	2024-05-24 21:37:35 +02:00
Jonas Jenwald	06334c97ef	Improve the `loadingParams` functionality in the API - Move the definition of the `loadingParams` Object, to simplify the code. - Add a unit-test, since none existed and the viewer depends on this functionality.	2024-05-24 09:26:40 +02:00
Jonas Jenwald	3afa9bfc42	Improve /Page validation for linearized documents (issue 18138) The referenced PDF document contains corrupt linearization-data, that doesn't point to the first page as intended.	2024-05-22 12:04:02 +02:00
Jonas Jenwald	2a52fda11b	Merge pull request #17770 from Aditi-1400/fix-issue-16843 Add language attribute to canvas	2024-05-21 21:35:43 +02:00
calixteman	5da2894278	Merge pull request #18136 from calixteman/ml_stamp [Editor] Pass a buffer instead of a blob url to the ML api	2024-05-21 18:24:26 +02:00
Calixte Denizet	b20ddff300	[Editor] Pass a buffer instead of a blob url to the ML api	2024-05-21 17:07:03 +02:00
Calixte Denizet	2369e40d2e	[Editor] Update popup position and contents after a FreeText has been edited	2024-05-21 16:54:10 +02:00
Aditi	9edca0a5ed	Add `lang` attribute to canvas element Fixes issue #16843. In certain cases, the text layer was misaligned due to a difference between the `lang` attribute of the viewer and the canvas. This commit addresses the problem by adding the `lang` attribute to the canvas. The issue was caused because PDF.js uses serif/sans-serif fonts to generate the text layer and relies on system fonts. The difference in the `lang` attribute led to different fonts being picked, causing the misalignment.	2024-05-21 19:41:24 +05:30
Jonas Jenwald	57014d0d13	Support corrupt PDF documents that contain "endsteam" commands (issue 18122) This patch also re-factors the findStreamLength-helper to avoid even more code duplication.	2024-05-21 13:38:17 +02:00
Jonas Jenwald	9ee7c07b83	Merge pull request #18104 from Snuffleupagus/TextLayer-class [api-minor] Re-factor the basic textLayer-functionality	2024-05-21 12:28:28 +02:00
Jonas Jenwald	59637c1fa8	Merge pull request #18115 from Snuffleupagus/freeze-evaluatorOptions Freeze `evaluatorOptions` in the src/core/pdf_manager.js file	2024-05-21 12:19:04 +02:00
Jonas Jenwald	440b4b6eeb	Support charCodes larger than 32-bit in `adjustMapping` (issue 18117) This also required changing the initial `charCodeToGlyphId`-data to an Object, which seems generally correct since it's consistent with existing code in the `src\core\{cff_font, type1_font}.js` files.	2024-05-20 12:13:55 +02:00
Jonas Jenwald	3cd6c6c0e6	Freeze `evaluatorOptions` in the src/core/pdf_manager.js file Given that these options are passed from the API we don't want to accidentally modify them.	2024-05-18 15:16:12 +02:00
Jonas Jenwald	15b5808eee	[api-minor] Re-factor the basic textLayer-functionality This is very old code, and predates e.g. the introduction of JavaScript classes, which creates unnecessarily unwieldy code in the viewer. By introducing a new `TextLayer` class in the API, similar to how e.g. the `AnnotationLayer` looks, we're able to keep most parameters on the class-instance itself. This removes the need to manually track them in the viewer, and simplifies the call-sites. This also removes the `numTextDivs` parameter from the "textlayerrendered" event, since that's only added to support default-viewer functionality that no longer exists. Finally we try, as far as possible, to polyfill the old `renderTextLayer` and `updateTextLayer` functions since they are exposed in the library API. For simple invocations of `renderTextLayer` the behaviour should thus be the same, with only a warning printed in the console.	2024-05-17 14:20:20 +02:00
Jonas Jenwald	d8e0fca609	Don't invoke `cleanupTextLayer` when there are pending textLayers Please note: This doesn't really affect the viewer, but may affect the library API if multiple PDF documents are opened in parallel. Since we clean-up "global" textLayer-data when destroying a PDF document, this means that other active PDFs could potentially break by invoking `cleanupTextLayer` unconditionally. Note that textLayer rendering is an asynchronous task, and we thus need to ensure those are all finished before running clean-up.	2024-05-17 08:52:10 +02:00
Jonas Jenwald	d5f3829f91	Actually disable `TextLayerRenderTask.prototype.#processItems` when `MAX_TEXT_DIVS_TO_RENDER` is reached (PR 18089 follow-up) I broke this accidentally in PR 18089, sorry about that! Note that since `#processItems` is private we can no longer just "replace" the method as was done in PR 18052.	2024-05-16 11:48:11 +02:00
Tim van der Meij	4db843617f	Merge pull request #18047 from Snuffleupagus/issue-18042 Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up)	2024-05-15 15:40:18 +02:00
Jonas Jenwald	6b171540b7	Initialize the `networkStream` synchronously in `getDocument` This is fairly old code, and at some point the need for this to be asynchronous disappeared.	2024-05-14 17:04:25 +02:00
Jonas Jenwald	cbb8748a22	Inline the `_fetchDocument` helper function in `getDocument` This function has been modified a number of times over the years, and at this point it's small/simple enough that we can just inline the code instead.	2024-05-14 16:29:41 +02:00
Jonas Jenwald	036fd11ad7	Improve the `TextLayerRenderTask` implementation - Change all possible semi-private methods into properly private ones. Note that this code is old enough to predate standard classes. - Move the `appendText` helper function into `TextLayerRenderTask`, as a private method, to avoid having to manually pass in the scope. - Simplify `#layoutText` by directly passing in all necessary data. This is possible after the changes in PR 18052.	2024-05-14 14:10:17 +02:00
Jonas Jenwald	c5f92437f7	Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up) For images that failed to decode once we want to avoid a pointless round-trip to the main-thread, which could otherwise happen for globally cached images.	2024-05-14 13:58:36 +02:00
Jonas Jenwald	6d523c316c	[api-minor] Include the document /Lang attribute in the textContent-data - These changes will allow a simpler way of implementing PR 17770. - The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done once per PDF document (and most PDFs don't include this data). - This makes the /Lang attribute directly available in the `textLayer`, which has the following advantages: - We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer). - Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770. Please note: This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).	2024-05-14 12:44:41 +02:00
Jonas Jenwald	c0b5d93ef4	Merge pull request #18052 from Snuffleupagus/textLayer-only-ReadableStream Restore broken functionality and simplify the implementation in `src/display/text_layer.js`	2024-05-14 12:30:27 +02:00
Jonas Jenwald	298d72133e	Merge pull request #18051 from Snuffleupagus/NodePackages [api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245)	2024-05-14 11:43:57 +02:00
Jonas Jenwald	761abc7cc3	Merge pull request #18066 from Snuffleupagus/rm-FontFaceObject-ignoreErrors Remove the `ignoreErrors` option from the `FontFaceObject` class	2024-05-14 09:49:08 +02:00
Tim van der Meij	0347e59b99	Merge pull request #18061 from Snuffleupagus/api-report-Stats Slightly re-factor how the viewer initializes debug-only functionality	2024-05-13 19:38:59 +02:00
Jonas Jenwald	4aee67227e	Remove the unused `Font.prototype.spaceWidth` getter (PR 13424 follow-up) This getter became unused in PR 13424, well over two years ago, and apparently none of us noticed that.	2024-05-11 11:50:51 +02:00
Jonas Jenwald	5f6f1686b5	Remove the `ignoreErrors` option from the `FontFaceObject` class - The `stopAtErrors` API option, which is the inverse of the "internal" `ignoreErrors` option, is explicitly documented as applying to parsing (i.e. the worker-thread) while the `FontFaceObject` class is used during rendering (i.e. the main-thread); see `b6765403a1/src/display/api.js (L164-L167)` - A glyph that fails in the `FontRendererFactory`, on the worker-thread, will already cause (overall) parsing to stop when `ignoreErrors === false` hence checking the option on the main-thread as well seems redundant; see `b6765403a1/src/core/evaluator.js (L4527-L4533)` - Removing this option simplifies the code, and slightly reduces the number of options that we need to handle in the main-thread code.	2024-05-11 10:18:23 +02:00
Jonas Jenwald	5e50479ac6	Use more object destructuring in the "commonobj" handler in the API	2024-05-11 09:44:10 +02:00
Jonas Jenwald	4a8d742592	Move the reporting of page `Stats` into the API This avoids having to add a couple of event listeners in the viewer, when debugging is enabled, and is consistent with the existing handling of `FontInspector` and `StepperManager` in the API.	2024-05-11 09:42:05 +02:00
Jonas Jenwald	8d86e18a32	Restore the `MAX_TEXT_DIVS_TO_RENDER` limit in the textLayer This limit is currently completely non-functional, since the check happens after the entire textLayer has been parsed and appended to the DOM. It seems that this has been accidentally broken ever since the introduction of `ReadableStream` support. The reason that this hasn't caused noticeable textLayer-related performance issues in practice is probably because we nowadays manage to coalesce the textLayer into fewer overall DOM elements, whereas years ago many PDF documents ended up with one DOM element per glyph. By moving this check, and thus restoring the functionality, we're also able to remove the `render` helper function and simplify the code.	2024-05-07 13:04:00 +02:00
Jonas Jenwald	30840e411e	Ensure that the textLayer `styleCache` is always cleared, even on failure By also moving it to the `TextLayerRenderTask`-instance, we can avoid a bit of manual parameter passing.	2024-05-07 13:04:00 +02:00
Jonas Jenwald	049848ba00	Unify the `ReadableStream` and `TextContent` code-paths in `src/display/text_layer.js` The only reason that this code still accepts `TextContent` is for backward-compatibility purposes, so we can simplify the implementation by always using a `ReadableStream` internally.	2024-05-07 13:03:57 +02:00
Jonas Jenwald	2643570364	[api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245) Please note: This removes top level await from the GENERIC builds of the PDF.js library. Despite top level await being supported in all modern browsers/environments, note [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await#browser_compatibility), it seems that many frameworks and build-tools unfortunately have trouble with it. Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this. Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment. The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the `getDocument`-function. Unfortunately that doesn't work though, since that's a synchronous function that we cannot change without breaking "the world". Hence we instead await the load of Node.js packages/polyfills together with the `PDFWorker` initialization, since that's the first point of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen until the worker has been initialized. Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point. This new pattern for accessing Node.js packages/polyfills will also require some care during development and importantly reviewing, to ensure that no new top level await is added in the main code-base.	2024-05-06 23:20:03 +02:00
Jonas Jenwald	9b41bfc374	Introduce helper functions for parsing /Matrix and /BBox arrays	2024-05-03 22:37:50 +02:00
Jonas Jenwald	52f7ff155d	Validate even more dictionary properties This checks primarily Arrays, but also some other properties, that we'll end up sending (sometimes indirectly) to the main-thread.	2024-05-03 22:37:14 +02:00
Jonas Jenwald	1b811ac113	Merge pull request #18034 from Snuffleupagus/FileSpec-filename-stripPath [api-minor] Improve the `FileSpec` implementation	2024-05-03 09:03:17 +02:00
Jonas Jenwald	a790f2df5d	[api-minor] Remove the unused `onlyStripPath` option from the `getFilenameFromUrl` helper function	2024-05-03 08:29:41 +02:00
Jonas Jenwald	c419c8333b	Merge pull request #18037 from Snuffleupagus/validate-more-widths Add even more validation of width-data (PR 18017 follow-up)	2024-05-02 14:41:02 +02:00
Jonas Jenwald	6c05f8b381	Add even more validation of width-data (PR 18017 follow-up) I missed this case in PR 18017, sorry about that.	2024-05-02 11:24:15 +02:00
calixteman	33732ff2cb	Merge pull request #18035 from calixteman/rm_max_group_size Remove the limit used to decide if a group canvas must be upscaled or not	2024-05-01 20:14:28 +02:00
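
Commit 6fa98ac99f above describes sending a list of points to the main thread as a basic float buffer rather than as an array of Objects. The following is a minimal sketch of that general idea, not the PDF.js implementation: the helper names `packPoints` and `unpackPoints`, the `{ x, y }` object shape, and the choice of `Float32Array` are all assumptions made for illustration.

```javascript
// Hypothetical helpers; the names and the { x, y } shape are illustrative
// assumptions and are not taken from the PDF.js code-base.

// Pack [{ x, y }, ...] into one typed array laid out as [x0, y0, x1, y1, ...].
function packPoints(points) {
  const buffer = new Float32Array(points.length * 2);
  for (let i = 0; i < points.length; i++) {
    buffer[2 * i] = points[i].x;
    buffer[2 * i + 1] = points[i].y;
  }
  return buffer;
}

// Read the flat buffer back as { x, y } objects, two floats per point.
function unpackPoints(buffer) {
  const points = [];
  for (let i = 0; i < buffer.length; i += 2) {
    points.push({ x: buffer[i], y: buffer[i + 1] });
  }
  return points;
}

const packed = packPoints([
  { x: 1, y: 2 },
  { x: 3.5, y: 4.25 },
]);
console.log(packed.length); // two floats per point
console.log(unpackPoints(packed));
```

A single typed array is one contiguous block of numbers, so cloning it (for example when passing it through `postMessage`) avoids walking one Object per point, which is consistent with the performance and memory reasoning given in the commit message.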


6462 commits