[api-minor] Convert Catalog.getPageDict to an asynchronous method

Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix a existing issue. Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended. In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software do I'd prefer to fix this pro-actively (rather than waiting for a bug report). With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered. In addition to the above, this patch also makes the following notable changes: - Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seem unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`. - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods. In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid. For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly. - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
2025-04-20 15:18:08 +02:00 · 2021-12-24 13:46:35 +01:00 · 2021-12-24 13:46:35 +01:00 · b513c64d9d
commit b513c64d9d
parent 41dab8e7b6
3 changed files with 102 additions and 120 deletions
--- a/test/unit/api_spec.js
+++ b/test/unit/api_spec.js
@ -622,9 +622,7 @@ describe("api", function () {
        expect(false).toEqual(true);
      } catch (reason) {
        expect(reason instanceof UnknownErrorException).toEqual(true);
-        expect(reason.message).toEqual(
-          "Page dictionary kids object is not an array."
-        );
+        expect(reason.message).toEqual("Illegal character: 41");
      }
      try {
        await pdfDocument2.getPage(1);
@ -633,9 +631,7 @@ describe("api", function () {
        expect(false).toEqual(true);
      } catch (reason) {
        expect(reason instanceof UnknownErrorException).toEqual(true);
-        expect(reason.message).toEqual(
-          "Page dictionary kids object is not an array."
-        );
+        expect(reason.message).toEqual("End of file inside array.");
      }

      await Promise.all([loadingTask1.destroy(), loadingTask2.destroy()]);