How to correctly extract text from a pdf using pdf.js

生来不讨喜 2020-12-15 10:30

I'm new to ES6 and Promises. I'm trying to use pdf.js to extract the text from all pages of a PDF file into a string array. When extraction is done, I want to parse the array som

4 answers
  •  有刺的猬
    2020-12-15 10:49

    If you use the PDFViewer component, here is my solution that doesn't involve any promise or asynchrony:

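    // Concatenates the text of every page held by a PDFViewer instance.
    // Assumes each page's text layer has already been rendered.
    function getDocumentText(viewer) {
        let text = '';
        // getPageView() takes a zero-based page index.
        for (let i = 0; i < viewer.pagesCount; i++) {
            const { textContentItemsStr } = viewer.getPageView(i).textLayer;
            for (const item of textContentItemsStr) {
                text += item;
            }
        }
        return text;
    }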
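If you are not using the PDFViewer component, the string array asked for in the question can probably be built with promises instead. The following is only a minimal sketch, not a definitive implementation. The function name `extractPageTexts` is hypothetical. `pdf` is assumed to be the document object that pdf.js resolves from `pdfjsLib.getDocument(...).promise`. Joining the items of a page with a single space is also an assumption.

```javascript
// Sketch: collect the text of every page into an array of strings.
// `pdf` is assumed to be a pdf.js document object (numPages, getPage);
// loading the document is deliberately left out of this example.
async function extractPageTexts(pdf) {
    const pagePromises = [];
    // pdf.js page numbers are 1-based.
    for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
        pagePromises.push(
            pdf.getPage(pageNumber)
                .then((page) => page.getTextContent())
                // Each text item carries its string in `str`; the space
                // separator is an assumption, adjust it to your layout.
                .then((content) => content.items.map((item) => item.str).join(' '))
        );
    }
    // Promise.all preserves page order and resolves once every page is done.
    return Promise.all(pagePromises);
}
```

Parsing can then start once the returned promise resolves, for example in `extractPageTexts(pdf).then((texts) => { /* parse texts here */ })`.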
    
