Extract PDF text by coordinates

半阙折子戏 2021-02-04 20:03

I'd like to know whether there is a PDF library for Microsoft .NET that can extract text from a region of a page specified by coordinates.

For example (in pseudo-code):

<         


        
6 answers
  •  忘了有多久
    2021-02-04 20:44

    This should work with iTextSharp 5.x:

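    // Requires: using System; using iTextSharp.text; using iTextSharp.text.pdf.parser;
    // The region is given in PDF user-space units (points), with the origin
    // at the lower-left corner of the page: (llx, lly) to (urx, ury).
    Rectangle region = new Rectangle(llx, lly, urx, ury);
    RenderFilter[] filters = { new RegionTextRenderFilter(region) };

    // Only text that falls inside the region reaches the wrapped extraction strategy.
    ITextExtractionStrategy strategy =
        new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filters);

    // i is the 1-based page number.
    string result = PdfTextExtractor.GetTextFromPage(pdfReader, i, strategy);
    Console.WriteLine(result);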
    
