Fixes https://github.com/mozilla/pdf.js/issues/18957

PR https://github.com/mozilla/pdf.js/pull/18682 introduced a regression that causes the following error:
```
Uncaught TypeError: Failed to construct 'Headers': Invalid name
at PDFNetworkStreamFullRequestReader._onHeadersReceived (pdf.mjs:10214:29)
at NetworkManager.onStateChange (pdf.mjs:10103:22)
```
The mentioned PR replaced a call to `getResponseHeader()` with `getAllResponseHeaders()` without handling cases where it may return null or an empty string. Quote from the [docs](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/getAllResponseHeaders#return_value):
> Returns:
>
> A string representing all of the response's headers (except those whose field name is Set-Cookie) separated by CRLF, or null if no response has been received. If a network error happened, an empty string is returned.
Run the following code and observe the error in the console. Note that the URL is intentionally set to an invalid value to simulate a network error:
```html
<script src="//mozilla.github.io/pdf.js/build/pdf.mjs" type="module"></script>
<script type="module">
var url = 'blob:';
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs';
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise
.then((pdf) => console.log('PDF loaded'))
.catch((reason) => console.error(reason));
</script>
```
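For reference, a guard along these lines (a minimal sketch, not the actual fix) avoids passing `null` or an empty string to the `Headers` constructor:

```js
// Minimal sketch, not the actual patch: only parse the headers when
// getAllResponseHeaders() actually returned something.
function parseResponseHeaders(xhr) {
  const headers = new Headers();
  const rawHeaders = xhr.getAllResponseHeaders();
  if (!rawHeaders) {
    // `null` if no response has been received, "" on a network error.
    return headers;
  }
  for (const line of rawHeaders.trim().split(/[\r\n]+/)) {
    const index = line.indexOf(": ");
    if (index > 0) {
      headers.append(line.slice(0, index), line.slice(index + 2));
    }
  }
  return headers;
}
```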
The problem with the referenced PDF document has nothing to do with invalid dates, as the issue seems to suggest, but rather with the fact that it has neither an XRef table nor a trailer dictionary.
Given that crucial parts of the internal document structure are missing, you might argue that it's not really a PDF document.
In an attempt to support this kind of corruption, we'll simply iterate through all (previously found) XRef entries and pick one that *might* be a valid /Root dictionary.
There's obviously no guarantee that this works, and it might not be fast in larger PDF documents, but at least it cannot be any worse than *immediately* throwing `InvalidPDFException` as we previously did here.
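Conceptually the fallback looks something like the sketch below; the entry-iteration and dictionary helpers are illustrative stand-ins rather than the actual XRef internals:

```js
// Illustrative sketch only; `getAllEntryRefs`, `Dict` and `isName` stand in
// for the real XRef/primitives helpers.
function findFallbackRoot(xref) {
  for (const ref of xref.getAllEntryRefs()) {
    let obj;
    try {
      obj = xref.fetch(ref);
    } catch {
      continue; // Skip entries that cannot be parsed at all.
    }
    // A /Root (catalog) dictionary should have /Type /Catalog and a /Pages entry.
    if (obj instanceof Dict && isName(obj.get("Type"), "Catalog") && obj.has("Pages")) {
      return obj; // This *might* be a usable /Root dictionary.
    }
  }
  throw new InvalidPDFException("No valid /Root dictionary could be found.");
}
```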
*Please note:* I'm totally fine with this patch being rejected, since it's somewhat questionable if we should actually attempt to support "PDF documents" with this level of corruption.
This patch updates the minimum supported environments as follows:
- Node.js 20, which was released on 2023-04-18 and has now entered the "Maintenance"-phase; see https://github.com/nodejs/release#release-schedule
Note also that Node.js 18 will reach EOL fairly soon.
This integration test fails intermittently because we're not
(correctly) awaiting the sandbox actions. The `27R` field in
`issue14862.pdf` triggers sandbox events for every typing action, but
for the backspace and "a" character typing actions we weren't awaiting
the sandbox trip at all, and in other places we weren't awaiting it
fully (causing some characters to be missed in the assertion).
This commit fixes the issues by using the appropriate helper functions,
similar to what we did in PR #18399. Not only is this shorter in terms
of code, but it also fixes the near-permafail for this test with newer
versions of Puppeteer.
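For illustration, the awaited pattern now looks roughly like this; `getSelector` and `waitForSandboxTrip` are assumptions about the shared test helpers, not verified names:

```js
// Rough illustration only; `getSelector` and `waitForSandboxTrip` are assumed
// shared test helpers rather than verified names.
it("must wait for the sandbox on every typing action", async () => {
  await page.click(getSelector("27R"));
  await page.keyboard.press("Backspace");
  await waitForSandboxTrip(page); // Wait for the sandbox round-trip to finish.
  await page.type(getSelector("27R"), "a");
  await waitForSandboxTrip(page); // ...again, before reading back the field value.
});
```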
- This helper function has only a single call-site, and the function is fairly short.
- It'll only be invoked if range requests are *disabled*, or if the entire PDF manages to load *before* the headers are resolved (which is very unlikely).
Hence, by default, this helper function is not invoked.
- By inlining the code we're able to utilize the existing error-handling at the call-site, rather than having to duplicate it, which further reduces the size of this code.
Finally, while slightly unrelated, this patch also adds optional chaining in one spot in the file (PR 16424 follow-up).
I discovered that skip-cache reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf would *intermittently* cause (some of) the AnnotationLayers to break, with errors printed in the console (see below).
In hindsight this bug is really obvious; however, it took me quite some time to find, since the `StructTreePage.prototype.serializable` getter will look up various data and all of those lookups can fail during loading when streaming and/or range requests are being used.
Finally, to prevent any future errors, ensure that the viewer won't break in this sort of situation.
```
Uncaught (in promise)
Object { message: "Missing data [19098296, 19098297)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19098296, 19098297)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55
#renderAnnotationLayer: "UnknownErrorException: Missing data [17552729, 17552730)". viewer.mjs:8737:15
Uncaught (in promise)
Object { message: "Missing data [17552729, 17552730)", name: "UnknownErrorException", details: "MissingDataException: Missing data [17552729, 17552730)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55
```
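The gist of the viewer-side hardening is captured by a sketch like the following (not the actual viewer code):

```js
// Hedged sketch, not the actual viewer code: never let a failed structure-tree
// lookup break rendering of the annotation layer.
async function getStructTreeSafely(pdfPage) {
  try {
    return await pdfPage.getStructTree();
  } catch (reason) {
    console.warn(`getStructTree failed: "${reason}".`);
    return null;
  }
}
```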
- Over time the number and size of these factories have increased, especially the `DOMFilterFactory` class, and this split should thus aid readability/maintainability of the code.
- By introducing a couple of new import maps we can avoid bundling the `DOMCMapReaderFactory`/`DOMStandardFontDataFactory` classes in the Firefox PDF Viewer, since they are dead code there given that worker-thread fetching is always being used.
- This patch has been successfully tested, by running `$ ./mach test toolkit/components/pdfjs/`, in a local Firefox artifact-build.
*Note:* This patch reduces the size of the `gulp mozcentral` output by `1.3` kilobytes, which isn't a lot but still cannot hurt.
This way we can directly throw Errors, rather than having to "manually" return rejected Promises, which is ever so slightly shorter.
Also, since `useWorkerFetch` is always true in MOZCENTRAL builds these message-handlers should not be invoked there.
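Generic before/after illustration (the handler name and fetch call are made up for the example):

```js
// Before: manually returning a rejected Promise from the message-handler.
handler.on("FetchExample", data => {
  if (!data.url) {
    return Promise.reject(new Error("No url provided."));
  }
  return fetch(data.url).then(response => response.arrayBuffer());
});

// After: an `async` handler can simply throw, and the rejection is implicit.
handler.on("FetchExample", async data => {
  if (!data.url) {
    throw new Error("No url provided.");
  }
  const response = await fetch(data.url);
  return response.arrayBuffer();
});
```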
One goal is to make the code for drawing with the Ink tool similar to the code used for free highlighting:
it doesn't really make sense to have such different ways of doing almost the same thing.
When the zoom level is high, this avoids creating an overly large canvas covering the whole page, which consumes
more memory, makes drawing very slow and makes the overall user experience pretty bad.
A second goal is to be able to easily implement more drawing tools, where we would only have to define
how to draw from the pointer coordinates.
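A hedged sketch of the intended shape (the class names are illustrative, not the actual editor classes):

```js
// Illustrative only: a new drawing tool just describes how pointer coordinates
// become a drawable path; everything else is shared.
class DrawingTool {
  onPointerDown(x, y) {}
  onPointerMove(x, y) {}
  onPointerUp() {}
}

class InkTool extends DrawingTool {
  #points = [];

  onPointerDown(x, y) {
    this.#points = [[x, y]];
  }

  onPointerMove(x, y) {
    this.#points.push([x, y]);
  }

  onPointerUp() {
    return this.#points; // The accumulated free-hand path.
  }
}
```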
We can convert the handler to an `async` function, which removes the need to create a temporary Promise here.
Given the age of this code it shouldn't hurt to simplify it a little bit.
This allows using the new methods in browsers that support them, e.g. Firefox 133+, while still providing fallbacks where necessary; see https://github.com/tc39/proposal-arraybuffer-base64
*Please note:* These are not actual polyfills, since they only implement what we need in the PDF.js code-base. Eventually this patch should be reverted, once support is generally available.
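What such a helper might look like, as a hedged sketch (not the actual PDF.js code):

```js
// Hedged sketch, not the actual PDF.js helpers: prefer the native method when
// it exists, otherwise fall back to a minimal hand-rolled implementation.
function toHex(bytes) {
  if (typeof Uint8Array.prototype.toHex === "function") {
    return bytes.toHex(); // Available in e.g. Firefox 133+.
  }
  let hex = "";
  for (const byte of bytes) {
    hex += byte.toString(16).padStart(2, "0");
  }
  return hex;
}

console.log(toHex(new Uint8Array([0xde, 0xad, 0xbe, 0xef]))); // "deadbeef"
```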
- Add explicit `length` validation of the /ID entries. Given the `EMPTY_FINGERPRINT` constant we're already *implicitly* assuming a particular length.
- Move the constants into the `fingerprints`-getter, since they're not used anywhere else.
- Replace the `hexString` helper function with the standard `Uint8Array.prototype.toHex` method; see https://github.com/tc39/proposal-arraybuffer-base64
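Sketched out, the resulting check might look roughly like this; the length constant and entry handling are illustrative assumptions rather than the actual code:

```js
// Illustrative sketch only; the constant value and entry handling here are
// assumptions, not the actual PDF.js implementation.
const EXPECTED_ID_LENGTH = 16; // Assumed: /ID entries are commonly 16 bytes (MD5-sized).

function fingerprintFromIdEntry(idBytes) {
  // Validate the length explicitly, instead of relying on what the
  // EMPTY_FINGERPRINT constant only implies.
  if (!(idBytes instanceof Uint8Array) || idBytes.length !== EXPECTED_ID_LENGTH) {
    return null;
  }
  return idBytes.toHex(); // Replaces the old `hexString` helper.
}
```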
This extends PR 13796 to also handle the case where sub-streams contain invalid data, i.e. anything that isn't a Stream; however, please note that in these cases there's no guarantee that we'll render the page "correctly".
Note that Adobe Reader, i.e. the PDF reference implementation, cannot render the last page of the referenced PDF document.
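A hedged sketch of the idea; the class and helper names (`BaseStream`, `StreamsSequenceStream`, `warn`) are used illustratively here:

```js
// Hedged sketch only: when a page's content is an array of sub-streams, drop
// any entry that isn't actually a Stream rather than aborting rendering.
function getCombinedContentStream(subStreams) {
  const validStreams = subStreams.filter(stream => {
    if (stream instanceof BaseStream) {
      return true;
    }
    warn(`Ignoring invalid content sub-stream: "${stream}".`);
    return false;
  });
  return new StreamsSequenceStream(validStreams);
}
```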