Well this is probably kind of a silly question but I\'m wondering if there\'s any way to have the generated markup in Jekyll to preserve the indentation of the Liquid-tag. World
We can accomplish this by writing a custom Liquid filter to tidy the html, and then doing {{content | tidy }}
to include the html.
A quick search suggests that the ruby tidy gem may not be maintained but that nokogiri is the way to go. This will of course mean installing the nokogiri gem.
See advice on writing liquid filters, and Jekyll example filters.
An example might look something like this: in _plugins
, add a script called tidy-html.rb containing:
require 'nokogiri'
module TextFilter
def tidy(input)
desired = Nokogiri::HTML::DocumentFragment.parse(input).to_html
end
end
Liquid::Template.register_filter(TextFilter)
(Untested)
Using a Liquid Filter
I managed to make this work using a liquid filter. There are a few caveats:
Your input must be clean. I had some curly quotes and non-printable chars that looked like whitespace in a few files (copypasta from Word or some such) and was seeing "Invalid byte sequence in UTF-8" as a Jekyll error.
It could break some things. I was using <i class="icon-file"></i>
icons from twitter bootstrap. It replaced the empty tag with <i class="icon-file"/>
and bootstrap did not like that. Additionally, it screws up the octopress {% codeblock %}
s in my content. I didn't really look into why.
While this will clean the output of a liquid variable such as {{ content }}
it does not actually solve the problem in the original post, which is to indent the html in context of the surrounding html. This will provide well formatted html, but as a fragment that will not be indented relative to tags above the fragment. If you want to format everything in context, use the Rake task instead of the filter.
-
require 'rubygems'
require 'json'
require 'nokogiri'
require 'nokogiri-pretty'
module Jekyll
module PrettyPrintFilter
def pretty_print(input)
#seeing some ASCII-8 come in
input = input.encode("UTF-8")
#Parsing with nokogiri first cleans up some things the XSLT can't handle
content = Nokogiri::HTML::DocumentFragment.parse input
parsed_content = content.to_html
#Unfortunately nokogiri-pretty can't use DocumentFragments...
html = Nokogiri::HTML parsed_content
pretty = html.human
#...so now we need to remove the stuff it added to make valid HTML
output = PrettyPrintFilter.strip_extra_html(pretty)
output
end
def PrettyPrintFilter.strip_extra_html(html)
#type declaration
html = html.sub('<?xml version="1.0" encoding="ISO-8859-1"?>','')
#second <html> tag
first = true
html = html.gsub('<html>') do |match|
if first == true
first = false
next
else
''
end
end
#first </html> tag
html = html.sub('</html>','')
#second <head> tag
first = true
html = html.gsub('<head>') do |match|
if first == true
first = false
next
else
''
end
end
#first </head> tag
html = html.sub('</head>','')
#second <body> tag
first = true
html = html.gsub('<body>') do |match|
if first == true
first = false
next
else
''
end
end
#first </body> tag
html = html.sub('</body>','')
html
end
end
end
Liquid::Template.register_filter(Jekyll::PrettyPrintFilter)
Using a Rake task
I use a task in my rakefile to pretty print the output after the jekyll site has been generated.
require 'nokogiri'
require 'nokogiri-pretty'
desc "Pretty print HTML output from Jekyll"
task :pretty_print do
#change public to _site or wherever your output goes
html_files = File.join("**", "public", "**", "*.html")
Dir.glob html_files do |html_file|
puts "Cleaning #{html_file}"
file = File.open(html_file)
contents = file.read
begin
#we're gonna parse it as XML so we can apply an XSLT
html = Nokogiri::XML(contents)
#the human() method is from nokogiri-pretty. Just an XSL transform on the XML.
pretty_html = html.human
rescue Exception => msg
puts "Failed to pretty print #{html_file}: #{msg}"
end
#Yep, we're overwriting the file. Potentially destructive.
file = File.new(html_file,"w")
file.write(pretty_html)
file.close
end
end