This was a pretty interesting problem, so +1 to the question.
The first step was to look up whether iTextSharp XML Worker supports the HTML td tag. The mappings can be found in the source in iTextSharp.tool.xml.html.Tags. There you find that td is mapped to iTextSharp.tool.xml.html.table.TableData, which makes the job of implementing a custom tag processor a little easier: all we need to do is inherit from the class and override End():
using System.Collections.Generic;
using System.Linq;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html.table;

public class TableDataProcessor : TableData
{
    /*
     * a **very** simple implementation of the CSS writing-mode property:
     * https://developer.mozilla.org/en-US/docs/Web/CSS/writing-mode
     */
    bool HasWritingMode(IDictionary<string, string> attributeMap)
    {
        string style;
        // Trim each declaration so "a:b; writing-mode:..." (space after ';') still matches.
        return attributeMap.TryGetValue("style", out style)
            && style.Split(';')
                    .Any(x => x.Trim().StartsWith("writing-mode:"));
    }

    public override IList<IElement> End(
        IWorkerContext ctx,
        Tag tag,
        IList<IElement> currentContent)
    {
        // Let the stock TableData processor build the cell first.
        var cells = base.End(ctx, tag, currentContent);
        var attributeMap = tag.Attributes;
        if (HasWritingMode(attributeMap))
        {
            var pdfPCell = (PdfPCell) cells[0];
            // **always** 'sideways-lr'
            pdfPCell.Rotation = 90;
        }
        return cells;
    }
}
As noted in the inline comments, this is a very simple implementation for your specific needs. You'll need to add extra logic to support any other writing-mode CSS property value, and include any sanity checks.
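As a rough sketch of what that extra logic might look like, the helper below parses a style attribute and maps a few writing-mode values to a rotation angle. The class name WritingModeRotation, the method FromStyle, and the value-to-angle table are all assumptions made for illustration, not part of iTextSharp. The mapping assumes the cell rotation is counterclockwise (consistent with 90 producing 'sideways-lr' above) and that Latin text is being rotated.

```csharp
using System;
using System.Collections.Generic;

public static class WritingModeRotation
{
    // Assumed mapping of writing-mode values to counterclockwise degrees.
    // Adjust to taste; only 'sideways-lr' => 90 comes from the answer itself.
    private static readonly Dictionary<string, int> Rotations =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "horizontal-tb", 0 },
            { "sideways-lr", 90 },
            { "sideways-rl", 270 },
            { "vertical-rl", 270 },
            { "vertical-lr", 270 },
        };

    // Returns the rotation for the first writing-mode declaration found,
    // or null when the style has none or the value is not recognized.
    public static int? FromStyle(string style)
    {
        if (string.IsNullOrWhiteSpace(style)) return null;

        foreach (var declaration in style.Split(';'))
        {
            // Split into at most two parts: property name and value.
            var parts = declaration.Split(new[] { ':' }, 2);
            if (parts.Length != 2) continue;
            if (!parts[0].Trim().Equals("writing-mode", StringComparison.OrdinalIgnoreCase))
                continue;

            int degrees;
            return Rotations.TryGetValue(parts[1].Trim(), out degrees)
                ? degrees
                : (int?)null;
        }
        return null;
    }
}
```

Inside End(), the hard-coded rotation could then be replaced by something along the lines of: read the style attribute, call WritingModeRotation.FromStyle(style), and set pdfPCell.Rotation only when the result has a value.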
UPDATE
Based on the comment left by @Daniel, it's not clear how to add custom CSS when converting the HTML to PDF. First the updated HTML:
string XHTML = @"
Table with Vertical Text
Table without Vertical Text
";
Then a small snippet of custom CSS:
string CSS = @"
body {font-size: 12px;}
table {border-collapse:collapse; margin:8px;}
.light-yellow {background-color:#ffff99;}
td {border:1px solid #ccc;padding:4px;}
";
The slightly difficult part is the extra setup: you can't use the simple out-of-the-box XMLWorkerHelper.GetInstance().ParseXHtml() commonly seen here at SO. Here's a simple helper method that should get you started:
public void ConvertHtmlToPdf(string xHtml, string css)
{
    using (var stream = new FileStream(OUTPUT_FILE, FileMode.Create))
    {
        using (var document = new Document())
        {
            var writer = PdfWriter.GetInstance(document, stream);
            document.Open();

            // instantiate custom tag processor and add to `HtmlPipelineContext`.
            var tagProcessorFactory = Tags.GetHtmlTagProcessorFactory();
            tagProcessorFactory.AddProcessor(
                new TableDataProcessor(),
                new string[] { HTML.Tag.TD }
            );
            var htmlPipelineContext = new HtmlPipelineContext(null);
            htmlPipelineContext.SetTagFactory(tagProcessorFactory);

            var pdfWriterPipeline = new PdfWriterPipeline(document, writer);
            var htmlPipeline = new HtmlPipeline(htmlPipelineContext, pdfWriterPipeline);

            // get an ICssResolver and add the custom CSS
            var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
            cssResolver.AddCss(css, "utf-8", true);
            var cssResolverPipeline = new CssResolverPipeline(
                cssResolver, htmlPipeline
            );

            var worker = new XMLWorker(cssResolverPipeline, true);
            var parser = new XMLParser(worker);
            using (var stringReader = new StringReader(xHtml))
            {
                parser.Parse(stringReader);
            }
        }
    }
}
Instead of rehashing an explanation of the example code above, see the documentation (iText removed documentation, linked to Wayback Machine) to get a better idea of why you need to set up the parser that way.
Also note:
- XML Worker does not support all CSS2/CSS3 properties, so you may need to experiment to see what works and what doesn't, depending on how closely you want the PDF to match the HTML displayed in the browser.
- The HTML snippet removed the p tag, since the style can be applied directly to the td tag.
- The inline width property matters. If omitted, the columns will have variable widths matching what they would be if the text had been rendered horizontally.
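To make the last two notes concrete, here is a hypothetical fragment of the kind of markup they describe: the writing-mode style sits directly on the td, alongside an inline width. The markup, the class name SampleMarkup, and the helper CountVerticalCells are illustrative assumptions, not the HTML used in this answer. The helper only checks that the fragment is well-formed XML (which XML Worker needs) and counts the cells the custom processor would rotate.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SampleMarkup
{
    // Hypothetical XHTML: style applied directly to td, with an inline width.
    public const string Xhtml = @"
<table>
  <tr>
    <td style=""writing-mode:sideways-lr;width:40px;"">Vertical text</td>
    <td class=""light-yellow"">Horizontal text</td>
  </tr>
</table>";

    // Parses the fragment (throws if it is not well-formed) and counts
    // td elements whose inline style declares writing-mode.
    public static int CountVerticalCells(string xhtml)
    {
        return XDocument.Parse(xhtml)
            .Descendants("td")
            .Count(td => ((string)td.Attribute("style") ?? "")
                .Split(';')
                .Any(d => d.Trim().StartsWith(
                    "writing-mode:", StringComparison.OrdinalIgnoreCase)));
    }
}
```

A string like SampleMarkup.Xhtml could be passed as the xHtml argument of ConvertHtmlToPdf, together with the custom CSS shown earlier.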
Tested with iTextSharp and XML Worker version 5.5.9.