I am using wkhtmltopdf.exe (version 0.12.0 final) to generate PDF files from HTML files, driven from .NET C#.
My problem is getting JavaScript, stylesheets and images to work when the HTML specifies only relative paths. Right now it works if I use absolute paths, but not with relative paths, which makes the whole HTML generation a bit too complicated. I have boiled what I do down to the following example:
    string CMDPATH = @"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe";
    string HTML = string.Format(
        "<div><img src=\"{0}\" /></div><div><img src=\"{1}\" /></div><div>{2}</div>",
        "./sohlogo.png",
        "./ACLASS.jpg",
        DateTime.Now.ToString());
    WriteFile(HTML, "test.html");

    Process p;
    ProcessStartInfo psi = new ProcessStartInfo();
    psi.FileName = CMDPATH;
    psi.UseShellExecute = false;
    psi.WorkingDirectory = AppDomain.CurrentDomain.BaseDirectory;
    psi.CreateNoWindow = true;
    psi.RedirectStandardInput = true;
    psi.RedirectStandardOutput = true;
    psi.RedirectStandardError = true;
    // "-q - -": quiet mode, read the HTML from stdin, write the PDF to stdout
    psi.Arguments = "-q - -";
    p = Process.Start(psi);

    // Feed the HTML to wkhtmltopdf through stdin
    StreamWriter stdin = p.StandardInput;
    stdin.AutoFlush = true;
    stdin.Write(HTML);
    stdin.Dispose();

    // Collect the generated PDF from stdout
    MemoryStream pdfstream = new MemoryStream();
    CopyStream(p.StandardOutput.BaseStream, pdfstream);
    p.StandardOutput.Close();
    pdfstream.Position = 0;
    WriteFile(pdfstream, "test.pdf");

    p.WaitForExit(10000);
    int test = p.ExitCode;
    p.Dispose();
I have tried relative paths like "./sohlogo.png" and simply "sohlogo.png". Both display correctly in the browser via the HTML file, but neither works in the PDF file. There is no data in the error stream.
The following command line works like a charm with the relative paths:
    "c:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe" test.html test.pdf
I could really use some input at this stage, so any help is much appreciated!
Just for reference, the WriteFile and CopyStream methods look like this:
    public static void WriteFile(MemoryStream stream, string path)
    {
        using (FileStream writer = new FileStream(path, FileMode.Create))
        {
            byte[] bytes = stream.ToArray();
            writer.Write(bytes, 0, bytes.Length);
            writer.Flush();
        }
    }

    public static void WriteFile(string text, string path)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            writer.WriteLine(text);
            writer.Flush();
        }
    }

    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[32768];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
    }
EDIT: My workaround for Neo Nguyen.
I could not get this to work with relative paths, so instead I wrote a method that prepends a root path to every path. It solves my problem, so maybe it will solve yours:
    /// <summary>
    /// Prepends basedir to every path x in src="x" or href="x" in the input HTML text
    /// </summary>
    /// <param name="html">the initial html</param>
    /// <param name="basedir">the basedir to prepend</param>
    /// <returns>the new html</returns>
    public static string MakeRelativePathsAbsolute(string html, string basedir)
    {
        string pathpattern = "(?:href=[\"']|src=[\"'])(.*?)[\"']";
        // SM20140214: tested that both Chrome and wkhtmltopdf.exe understand "C:\Dir\..\image.png" and "C:\Dir\.\image.png"
        // Path.Combine("C:/
        html = Regex.Replace(html, pathpattern, new MatchEvaluator((match) =>
        {
            string newpath = UrlEncode(Path.Combine(basedir, match.Groups[1].Value));
            if (!string.IsNullOrEmpty(match.Groups[1].Value))
            {
                // Swap the captured path for the encoded, basedir-prefixed path
                string result = match.Groups[0].Value.Replace(match.Groups[1].Value, newpath);
                return result;
            }
            else
            {
                // Empty src/href: nothing to prepend
                return UrlEncode(match.Groups[0].Value);
            }
        }));
        return html;
    }

    private static string UrlEncode(string url)
    {
        // Quick and dirty: only escape the characters that caused trouble
        url = url.Replace(" ", "%20").Replace("#", "%23");
        return url;
    }
I tried different System.Uri.Escape*** methods such as System.Uri.EscapeDataString(), but they ended up encoding the URL too aggressively for wkhtmltopdf to understand it. For lack of time I just wrote the quick and dirty UrlEncode above.
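As a possible alternative to hand-rolled escaping, here is a minimal sketch that lets System.Uri build a file:/// URL from the combined path through its AbsoluteUri property. The class name FileUrls and the method ToFileUrl are hypothetical, and whether wkhtmltopdf 0.12.0 accepts the resulting URL for every path is an assumption that would need testing.

```csharp
using System;
using System.IO;

public static class FileUrls
{
    // Hypothetical helper: combines basedir with a relative path and returns
    // an absolute file:/// URL. System.Uri handles the escaping (for example,
    // a space becomes %20) while leaving the path separators intact.
    public static string ToFileUrl(string basedir, string relativePath)
    {
        // GetFullPath also collapses "." and ".." segments
        string fullPath = Path.GetFullPath(Path.Combine(basedir, relativePath));
        return new Uri(fullPath).AbsoluteUri;
    }
}
```

For example, on Windows FileUrls.ToFileUrl(@"C:\Dir", "my image.png") should give file:///C:/Dir/my%20image.png, which could then stand in for newpath in MakeRelativePathsAbsolute.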
Looking quickly, I think the trouble might be with

    psi.WorkingDirectory = AppDomain.CurrentDomain.BaseDirectory;

I think that is where the paths are pointing at. I'm assuming that

    "c:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe" test.html test.pdf

working means that your image referenced inside test.html as src="mlp.png" is at c:\Program Files\wkhtmltopdf\bin\mlp.png, right? I think that it works because your image file is in the same folder as wkhtmltopdf... so try setting the WorkingDirectory to that directory and see what happens.
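One way to test this idea while sidestepping stdin is sketched below. It assumes the command-line behaviour reported in the question (passing test.html as a file makes relative paths work) and writes the HTML to a temporary file in the folder that holds the images, while still reading the PDF from stdout. The names PdfFromFile, BuildStartInfo, Render and assetDir are hypothetical, and stderr is deliberately not redirected here.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class PdfFromFile
{
    // Builds the start info for: wkhtmltopdf -q "<htmlFileName>" -
    // The HTML comes from a file inside assetDir; the PDF goes to stdout.
    public static ProcessStartInfo BuildStartInfo(string exePath, string assetDir, string htmlFileName)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            WorkingDirectory = assetDir,
            Arguments = "-q \"" + htmlFileName + "\" -",
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardOutput = true
        };
    }

    // Writes the HTML next to the assets, runs wkhtmltopdf on that file,
    // and returns the PDF bytes read from stdout.
    public static byte[] Render(string exePath, string assetDir, string html)
    {
        string htmlFileName = Guid.NewGuid().ToString("N") + ".html";
        string htmlPath = Path.Combine(assetDir, htmlFileName);
        File.WriteAllText(htmlPath, html);
        try
        {
            using (Process p = Process.Start(BuildStartInfo(exePath, assetDir, htmlFileName)))
            using (MemoryStream pdf = new MemoryStream())
            {
                // Drain stdout before waiting so a full pipe cannot block the child
                p.StandardOutput.BaseStream.CopyTo(pdf);
                p.WaitForExit();
                return pdf.ToArray();
            }
        }
        finally
        {
            // Remove the temporary HTML file whether or not rendering succeeded
            File.Delete(htmlPath);
        }
    }
}
```

With the HTML sitting in the same folder as sohlogo.png and ACLASS.jpg, a call such as PdfFromFile.Render(CMDPATH, AppDomain.CurrentDomain.BaseDirectory, HTML) would mirror the command line that the question reports as working.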
Source: https://stackoverflow.com/questions/21775572/wkhtmltopdf-relative-paths-in-html-with-redirected-in-out-streams-wont-work