Question
The HTML5 draft specifies (at least at the moment) that the URI about:legacy-compat
can be used for documents that rely on an XML-conforming doctype (which <!DOCTYPE html>
isn't).
So I happen to have a bundle of HTML5-validating XML files that start with:
<!DOCTYPE html SYSTEM "about:legacy-compat">
Unfortunately, when I feed such an XHTML5 document to an XSLT processor like Xalan or Saxon, it naturally tries to resolve the (unresolvable) URI.
Is there any way to make these processors ignore the URI or fake-resolve it under the hood? The attempt to resolve it happens early in these documents, so Saxon's -dtd:off
switch, for example, has no effect here.
Edit: The low-level approach sed -n '2,$p' <htmlfile> | otherapp
unfortunately only works until I start using the document()
XSLT function to load another XHTML5 file.
Edit 2: I played around with XML catalogs and got them to work with both Saxon and Xalan. However, I then always get a
java.net.MalformedURLException: unknown protocol: about
That's not surprising, but how can I circumvent it? The URL should never be parsed, just thrown away.
Answer 1:
Save this Java file as $somepath/foo/about/Handler.java:
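package foo.about;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Handler for the "about" protocol. The class must be named Handler and live in
// the package foo.about so that -Djava.protocol.handler.pkgs=foo can locate it.
public class Handler extends URLStreamHandler {
    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        return new URLConnection(url) {
            @Override
            public void connect() throws IOException {
                connected = true; // nothing to connect to
            }

            @Override
            public InputStream getInputStream() throws IOException {
                // Serve a stub DTD in place of the unresolvable about:legacy-compat.
                // ByteArrayInputStream replaces the deprecated StringBufferInputStream.
                return new ByteArrayInputStream(
                        "<!ELEMENT html ANY>".getBytes(StandardCharsets.UTF_8));
            }
        };
    }
}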
Now go to $somepath and compile it:
javac foo/about/Handler.java
Add the following arguments to the JVM when calling Saxon:
-Djava.protocol.handler.pkgs=foo -cp "$somepath"
Here is a modified shell script (for *nix systems, but it is very similar on Windows):
#!/bin/sh
exec java -Djava.protocol.handler.pkgs=foo -classpath /usr/share/java/saxonb.jar:"$somepath" net.sf.saxon.Transform "$@"
If it doesn't work, you may want to adapt your local saxonb-xslt script instead.
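If the JVM arguments cannot be changed (for example, when the transformer is embedded in a larger application), the same idea can probably be applied by registering the handler programmatically. The following is only a minimal sketch, not part of the original answer: the class name AboutHandlerFactoryDemo and the helpers install() and fetch() are made up for illustration, and it uses the standard URL.setURLStreamHandlerFactory call instead of the java.protocol.handler.pkgs property.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: registers an "about" handler without JVM flags.
final class AboutHandlerFactoryDemo {
    static final String STUB_DTD = "<!ELEMENT html ANY>";
    private static final AtomicBoolean INSTALLED = new AtomicBoolean(false);

    // Same behaviour as foo.about.Handler: every about: URL yields a stub DTD.
    static final class AboutHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    connected = true; // nothing to connect to
                }

                @Override
                public InputStream getInputStream() {
                    return new ByteArrayInputStream(
                            STUB_DTD.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    // The JVM accepts a factory only once, so guard against repeated calls.
    static void install() {
        if (INSTALLED.compareAndSet(false, true)) {
            // Returning null for other protocols falls back to the built-in handlers.
            URL.setURLStreamHandlerFactory(
                    protocol -> "about".equals(protocol) ? new AboutHandler() : null);
        }
    }

    // Helper for the demo: read whatever the registered handler serves for a URL.
    static String fetch(String spec) {
        try (InputStream in = URI.create(spec).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        install();
        System.out.println(fetch("about:legacy-compat"));
    }
}
```

In embedding code, install() would have to run before the first document is parsed. Since the factory can be set only once per JVM, this approach does not fit if another library already installs its own factory; in that case the -Djava.protocol.handler.pkgs route above remains the option.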
Source: https://stackoverflow.com/questions/6917514/how-to-use-the-about-protocol-of-html5-in-xslt-processors