How to use the “about:” protocol of HTML5 in XSLT processors

Submitted by 故事扮演 on 2019-12-21 05:29:07

Question


The HTML5 draft specifies (at the moment at least) that the URI about:legacy-compat can be used for documents that rely on an XML-conforming doctype (which <!DOCTYPE html> isn't).

So I happen to have a bundle of HTML5-validating XML files that start with:

<!DOCTYPE html SYSTEM "about:legacy-compat">

Unfortunately, when I use such an XHTML5 document with an XSLT processor like Xalan or Saxon, it naturally tries to resolve the (unresolvable) URI.

Is there any way to make them ignore the URI or faux-resolve it under the hood? The attempt to resolve it happens early in these documents, so, for example, Saxon's -dtd:off switch has no effect here.

Edit: The low-level approach sed -n '2,$p' <htmlfile> | otherapp unfortunately only works until I start using the XSLT document() function to load another XHTML5 file.

Edit 2: I played around with XML catalogs and got them to work with both Saxon and Xalan. However, I then always get a

java.net.MalformedURLException: unknown protocol: about

Well, it's not surprising, but how can I circumvent this? The URL should never be parsed, just thrown away.


Answer 1:


Save this Java file as $somepath/foo/about/Handler.java:

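package foo.about;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Handles "about:" URLs. With -Djava.protocol.handler.pkgs=foo the JVM
// looks this class up by name: <pkg>.<protocol>.Handler, i.e. foo.about.Handler.
public class Handler extends URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        return new URLConnection(url) {

            @Override
            public void connect() throws IOException {
                // Nothing to connect to; just mark the connection as open.
                connected = true;
            }

            @Override
            public InputStream getInputStream() throws IOException {
                // Serve a minimal stub DTD instead of resolving the URI.
                // ByteArrayInputStream replaces the deprecated StringBufferInputStream.
                return new ByteArrayInputStream(
                        "<!ELEMENT html ANY>".getBytes(StandardCharsets.UTF_8));
            }
        };
    }
}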

Now go to $somepath and compile it:

javac foo/about/Handler.java

Add the following arguments to the JVM when calling Saxon:

-Djava.protocol.handler.pkgs=foo -cp "$somepath"

Here is a modified shell script (for a *nix system, but it is very similar for Windows):

#!/bin/sh

exec java -Djava.protocol.handler.pkgs=foo -classpath /usr/share/java/saxonb.jar:"$somepath" net.sf.saxon.Transform "$@"

If that doesn't work, you may want to adapt your local saxonb-xslt script instead.
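To sanity-check the handler logic in isolation before wiring it into Saxon, something like the following single-file sketch may help. It is only an illustration: AboutHandlerCheck, StubHandler and readStubDtd are hypothetical names, and the handler is passed to the URL constructor directly instead of being discovered through java.protocol.handler.pkgs, so it exercises the stub DTD but not the system property lookup. It assumes Java 9 or later (for readAllBytes).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical single-file stand-in for foo.about.Handler (names are assumptions).
class AboutHandlerCheck {

    // Mirrors the handler above: every about: URL yields a stub DTD.
    static class StubHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    connected = true;
                }

                @Override
                public InputStream getInputStream() {
                    return new ByteArrayInputStream(
                            "<!ELEMENT html ANY>".getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    // Opens about:legacy-compat through the stub handler and returns what it serves.
    static String readStubDtd() throws IOException {
        // Supplying the handler explicitly bypasses the java.protocol.handler.pkgs lookup.
        URL url = new URL(null, "about:legacy-compat", new StubHandler());
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readStubDtd());
    }
}
```

If this prints the stub DTD but Saxon still reports "unknown protocol: about", the likely culprit is the classpath or the -Djava.protocol.handler.pkgs value rather than the handler itself.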



Source: https://stackoverflow.com/questions/6917514/how-to-use-the-about-protocol-of-html5-in-xslt-processors
