Skip adding empty tables to PDF when parsing XHTML using iTextSharp

Asked by 耶瑟儿~ on 2020-12-21 19:09

iTextSharp throws an error when you attempt to create a PdfPTable with 0 columns.

I have a requirement to take XHTML that is generated using an XSLT transformation an

1 Answer
  • 2020-12-21 19:39

    You should be able to handle that scenario by writing your own tag processor, subclassing iTextSharp.tool.xml.html.AbstractTagProcessor. To make your life even easier, you can instead subclass the existing, more specific iTextSharp.tool.xml.html.table.Table and override End so that a table with no content is skipped instead of being passed to the base class:

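    //Namespaces used by the snippets in this answer
    using System.Collections.Generic;
    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using iTextSharp.tool.xml;
    using iTextSharp.tool.xml.parser;
    using iTextSharp.tool.xml.pipeline;
    using iTextSharp.tool.xml.pipeline.css;
    using iTextSharp.tool.xml.pipeline.end;
    using iTextSharp.tool.xml.pipeline.html;
    
    public class TableTagProcessor : iTextSharp.tool.xml.html.table.Table {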
    
        public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent) {
            //See if we've got anything to work with
            if (currentContent.Count > 0) {
                //If so, let our parent class worry about it
                return base.End(ctx, tag, currentContent);
            }
    
            //Otherwise return an empty list which should make everyone happy
            return new List<IElement>();
        }
    }
    

    Unfortunately, if you want to use a custom tag processor, you can't use the shortcut XMLWorkerHelper class. Instead, you'll need to parse the HTML into elements yourself and add them to your document. To do that, you'll need an implementation of iTextSharp.tool.xml.IElementHandler, which you can create like this:

    public class SampleHandler : iTextSharp.tool.xml.IElementHandler {
        //Generic list of elements
        public List<IElement> elements = new List<IElement>();
        //Add the supplied item to the list
        public void Add(IWritable w) {
            if (w is WritableElement) {
                elements.AddRange(((WritableElement)w).Elements());
            }
        }
    }
    

    You can use the above with the following code, which includes sample HTML containing an empty table:

    //Hold everything in memory
    using (var ms = new MemoryStream()) {
    
        //Create new PDF document 
        using (var doc = new Document()) {
            using (var writer = PdfWriter.GetInstance(doc, ms)) {
    
                doc.Open();
    
                //Sample HTML
                string html = "<table><tr><td>Hello</td></tr></table><table></table>";
    
                //Create an instance of our element helper
                var XhtmlHelper = new SampleHandler();
    
                //Begin pipeline
                var htmlContext = new HtmlPipelineContext(null);
    
                //Get the default tag processor
                var tagFactory = iTextSharp.tool.xml.html.Tags.GetHtmlTagProcessorFactory();
    
                //Add an instance of our new processor
                tagFactory.AddProcessor(new TableTagProcessor(), new string[] { "table" });
    
                //Bind the above to the HTML context part of the pipeline
                htmlContext.SetTagFactory(tagFactory);
    
                //Get the default CSS handler and create some boilerplate pipeline stuff
                var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
                //Our IElementHandler is plugged in at the end of the pipeline
                var pipeline = new CssResolverPipeline(cssResolver,
                    new HtmlPipeline(htmlContext,
                        new ElementHandlerPipeline(XhtmlHelper, null)));
    
                //The worker dispatches commands to the pipeline stuff above
                var worker = new XMLWorker(pipeline, true);
    
                //Create a parser with the worker listed as the dispatcher
                var parser = new XMLParser();
                parser.AddListener(worker);
    
                //Finally, parse our HTML directly.
                using (TextReader sr = new StringReader(html)) {
                    parser.Parse(sr);
                }
    
                //The above did not touch our document. Instead, all "proper" elements are stored in our helper class XhtmlHelper
                foreach (var element in XhtmlHelper.elements) {
                    //Add these to the main document
                    doc.Add(element);
                }
    
                doc.Close();
    
            }
        }
    }
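    If you would rather keep using the XMLWorkerHelper shortcut, a different technique from the custom tag processor above is to strip empty tables from the XHTML before it is parsed. The following is only a minimal sketch under stated assumptions: EmptyTableFilter and StripEmptyTables are hypothetical names (not part of iTextSharp), the input is assumed to be well-formed XML with a single root element, and "empty" is assumed to mean a table with no td or th cells, which may not match your data exactly. It uses only System.Xml.Linq from the .NET base class library.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of iTextSharp.
public static class EmptyTableFilter {
    // Removes every <table> below the root that has no <td> or <th> descendant.
    // Assumption: xhtml is well-formed XML with one root element and no
    // undeclared entities (for example &nbsp;), otherwise Parse will throw.
    public static string StripEmptyTables(string xhtml) {
        XElement root = XElement.Parse(xhtml);

        // Compare LocalName so that namespaced XHTML elements match as well
        var emptyTables = root.Descendants()
            .Where(e => e.Name.LocalName == "table")
            .Where(t => !t.Descendants().Any(d =>
                d.Name.LocalName == "td" || d.Name.LocalName == "th"))
            .ToList(); // Materialize the matches before mutating the tree

        foreach (XElement table in emptyTables) {
            table.Remove();
        }

        return root.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main() {
        // Same sample as above, wrapped in a single root element
        string html = "<body><table><tr><td>Hello</td></tr></table><table></table></body>";
        Console.WriteLine(StripEmptyTables(html));
        // prints <body><table><tr><td>Hello</td></tr></table></body>
    }
}
```

    The cleaned string could then be handed to whichever parsing route you already use. Note that this sketch does not handle the case where the root element itself is an empty table.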
    