mirror of https://github.com/mozilla/pdf.js.git synced 2025-04-26 01:58:06 +02:00

[api-minor] Remove the disableCombineTextItems option

*Please note:* This parameter has never been used within the PDF.js library/viewer itself, and it was only ever added for backwards compatibility reasons.

This parameter was added in PR 7475, over six years ago, to try and optionally maintain the previous *default* text-extraction behaviour.
However, as part of the general text-extraction improvements in PR 13257, almost two years ago, the `disableCombineTextItems` functionality was accidentally "broken" in various ways. Note how the only (very basic) unit-test was updated in a way that doesn't really make sense, since generally speaking you'd expect that using the option should result in *more* (or at least the same number of) text-items. Furthermore, there's also the recent issue 16209, where the option causes almost all textContent to be concatenated together.

Hence this patch proposes that we simply remove the `disableCombineTextItems` option, since it's essentially unused/untested functionality, as is evident from the fact that it took almost two years for someone to notice that it's broken.
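
For third-party callers that still pass the option, the following is a minimal sketch of one way to drop it before calling `getTextContent`. The helper name `stripRemovedTextContentOptions` is hypothetical and not part of PDF.js; the sketch only assumes that remaining parameters, such as `includeMarkedContent`, should be forwarded unchanged.

```javascript
// Hypothetical helper (NOT part of PDF.js): removes the no-longer-supported
// `disableCombineTextItems` key from a caller-supplied parameter object and
// returns a new object holding only the remaining parameters.
function stripRemovedTextContentOptions(params = {}) {
  const { disableCombineTextItems, ...rest } = params;
  if (disableCombineTextItems !== undefined) {
    // Let the caller know that the option no longer has any effect.
    console.warn(
      "disableCombineTextItems was removed from getTextContent; ignoring it."
    );
  }
  return rest;
}

// Usage sketch, assuming `page` is a page proxy obtained elsewhere:
//   const textContent = await page.getTextContent(
//     stripRemovedTextContentOptions({
//       includeMarkedContent: true,
//       disableCombineTextItems: true,
//     })
//   );
```

The helper returns a fresh object instead of mutating its input, so option objects shared with other code are left untouched.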
Jonas Jenwald 2023-03-30 13:36:42 +02:00
parent 09da8026b6
commit 5063a6f2a9
6 changed files with 11 additions and 41 deletions


@@ -741,7 +741,7 @@ class WorkerMessageHandler {
 });
 handler.on("GetTextContent", function (data, sink) {
-const pageIndex = data.pageIndex;
+const { pageIndex, includeMarkedContent } = data;
 pdfManager.getPage(pageIndex).then(function (page) {
 const task = new WorkerTask("GetTextContent: page " + pageIndex);
@@ -755,8 +755,7 @@ class WorkerMessageHandler {
 handler,
 task,
 sink,
-includeMarkedContent: data.includeMarkedContent,
-combineTextItems: data.combineTextItems,
+includeMarkedContent,
 })
 .then(
 function () {