mirror of
https://github.com/mozilla/pdf.js.git
synced 2025-04-20 15:18:08 +02:00
Re-factor searching for incomplete objects in XRef.indexObjects (issue 15803)
When trying to find incomplete objects, i.e. those missing the "endobj"-string at the end, there are unfortunately a number of possible operators that we need to check for. Otherwise we could miss e.g. the "trailer" at the end of a corrupt PDF document, which is why the referenced document didn't work.

Currently we do all searching on the "raw" bytes of the PDF document, for efficiency; however, this doesn't really work when we need to check for *multiple* potential command-strings. To keep the complexity manageable we'll instead use regular expressions here, but we can at least avoid creating lots of substrings thanks to the `RegExp.lastIndex` property, which is well supported across browsers according to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex#browser_compatibility

Note that this repeated regular expression usage could perhaps be slightly less efficient than the old code; however, this method is only invoked for corrupt PDF documents.
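
The `RegExp.lastIndex` idea described above can be illustrated with a minimal sketch. This is not the code from this commit: the helper name `findTerminator`, the constant `TERMINATOR_REGEX`, and the exact keyword list in the pattern are assumptions chosen for illustration. It only shows how a global regular expression can start searching at an arbitrary offset of one large string, so that several keywords are checked in a single pass without calling `substring`.

```javascript
// Hypothetical sketch, NOT the actual pdf.js implementation.
// The keyword list below is an assumption for illustration only.
const TERMINATOR_REGEX = /\b(endobj|\d+\s+\d+\s+obj|xref|trailer)\b/g;

// Returns the first terminator keyword found at or after `start`,
// or null if none is found. No substring of `str` is created.
function findTerminator(str, start) {
  // With the `g` flag, `exec` begins matching at `lastIndex`.
  TERMINATOR_REGEX.lastIndex = start;
  const match = TERMINATOR_REGEX.exec(str);
  if (!match) {
    // A failed `exec` resets `lastIndex` to 0 automatically.
    return null;
  }
  return { keyword: match[1], index: match.index };
}

// Usage: an object that is missing its "endobj", followed by a trailer.
const data = "1 0 obj\n<< /Type /Catalog >>\ntrailer\n<< /Root 1 0 R >>";
const found = findTerminator(data, "1 0 obj".length);
console.log(found && found.keyword, found && found.index);
```

Setting `lastIndex` before every call keeps the shared regular expression object safe to reuse, at the cost of the function not being re-entrant; a sticky (`y`) flag would not work here, since the match is allowed to begin anywhere after `start`.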
parent 6a9a567670
commit 2fcf8bb5be
3 changed files with 39 additions and 39 deletions
1
test/pdfs/issue15803.pdf.link
Normal file
@@ -0,0 +1 @@
+https://github.com/mozilla/pdf.js/files/10200431/ocg.pdf
@@ -1761,6 +1761,15 @@
       "link": false,
       "type": "eq"
    },
+   { "id": "issue15803",
+      "file": "pdfs/issue15803.pdf",
+      "md5": "e501a4418d4ece5be7ce4e8acf029100",
+      "rounds": 1,
+      "link": true,
+      "lastPage": 1,
+      "type": "eq",
+      "annotations": true
+   },
    { "id": "issue9105_other",
       "file": "pdfs/issue9105_other.pdf",
       "md5": "4c8b9c2cceb9c5d621e1d50b3dc38efc",