mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-22 16:18:08 +02:00

Fix getTextContent evaluation to only apply TJ horizontal offsets using numeric items/args

While the array argument to TJ should only contain strings and numbers, other
unfortunate items are found in PDFs in the wild, e.g.:

[(Grandes) 0.0 Tc
-250.0 (Client\350les,) 0.0 Tc
-250.0 (Financements) 0.0 Tc
-250.0 (et) 0.0 Tc
-250.0 (March\351s) ] TJ

getOperatorList already properly ignores any non-string, non-numeric values in
TJ arrays; without this patch to getTextContent, returned text items can have
NaN widths due to calculations being applied to those non-numeric values.
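
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the pdf.js implementation: `isNum` is a stand-in for the helper named in the diff, `sumOffsets` and the `{ cmd: 'Tc' }` objects are hypothetical, and the offset formula is simplified to the per-thousand scaling that TJ numbers use.

```javascript
// Stand-in for the isNum helper referenced in the diff (assumed to be a
// plain typeof check).
function isNum(v) {
  return typeof v === 'number';
}

// Hypothetical helper: sums the horizontal adjustments of a parsed TJ array.
// Numbers in a TJ array are expressed in thousandths of a text space unit
// and are subtracted from the current position.
function sumOffsets(items, fontSize, guarded) {
  var advance = 0;
  for (var j = 0, jj = items.length; j < jj; j++) {
    if (typeof items[j] === 'string') {
      continue; // text runs do not contribute an explicit offset here
    }
    if (guarded && !isNum(items[j])) {
      continue; // the patched behaviour: skip non-numeric, non-string items
    }
    // Without the guard, a non-numeric item coerces to NaN and poisons the sum.
    advance += -items[j] * fontSize / 1000;
  }
  return advance;
}

// The array from the commit message, with each stray "Tc" operator modelled
// as an opaque object (an assumption about how the parser represents it).
var tc = { cmd: 'Tc' };
var items = ['Grandes', 0.0, tc, -250.0, 'Client\u00e8les,', 0.0, tc,
             -250.0, 'Financements', 0.0, tc, -250.0, 'et', 0.0, tc,
             -250.0, 'March\u00e9s'];

var unguarded = sumOffsets(items, 10, false);
var guarded = sumOffsets(items, 10, true);
console.log(Number.isNaN(unguarded), guarded);
```

With the guard enabled, only the numeric entries contribute to the total; without it, a single stray operator is enough to turn the accumulated width into NaN.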
Chas Emerick 2016-10-13 07:47:17 -04:00
parent 8c5b925547
commit 85c52f1fd6
4 changed files with 79 additions and 1 deletions


@@ -1531,7 +1531,7 @@ var PartialEvaluator = (function PartialEvaluatorClosure() {
 for (var j = 0, jj = items.length; j < jj; j++) {
   if (typeof items[j] === 'string') {
     buildTextContentItem(items[j]);
-  } else {
+  } else if (isNum(items[j])) {
     ensureTextContentItem();
     // PDF Specification 5.3.2 states: