SGML to XML conversion

夕颜 2020-12-30 15:11

I have the following sample SGML data from my .sgm file, and I want to convert it to XML.



xyz
<         


        
4 Answers
  • 2020-12-30 15:42

    Maybe you can use the osx SGML to XML converter. It is part of the OpenSP package (based on SP, originally written by James Clark).

    • http://openjade.sourceforge.net/doc/index.htm
    • http://www.jclark.com/sp/index.htm
  • 2020-12-30 16:05

    Others have already given some good advice. Here's one way of putting it all together by first converting the input SGML to well-formed XML and then using XSLT to transform that to the exact format you need.

    Converting your SGML to well-formed XML

    The osx tool from the OpenSP package suggested by mzjn is a good tool for this. Since your SGML markup omits end tags, you need to have a DTD from which the correct nesting of elements can be determined. If you don't have a DTD, you need to create one. For your example input, it could be as simple as this:

    <!ELEMENT toplevel o o (viewed)+>
    
    <!ELEMENT viewed - o (#PCDATA,cite)>
    <!ELEMENT cite - o (yr,pno)>
    <!ELEMENT yr - o (#PCDATA)>
    <!ELEMENT pno - o (#PCDATA)>
    
    <!ATTLIST pno cite CDATA #REQUIRED>
    

    You also need to add a proper doctype declaration to the beginning of your SGML file. Assuming your DTD is in the file viewed.dtd, the declaration would look like this:

    <!DOCTYPE toplevel SYSTEM "viewed.dtd" >
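
    The same step can be scripted. Below is a minimal Python sketch, assuming the root element name toplevel and the file name viewed.dtd from the example above. The helper name add_doctype is made up for illustration and is not part of any tool.

```python
def add_doctype(sgml_text, root="toplevel", dtd_file="viewed.dtd"):
    """Return sgml_text with a DOCTYPE declaration prepended.

    The text is returned unchanged if it already starts with a DOCTYPE
    declaration, so the helper can be applied more than once safely.
    The defaults mirror the example DTD and are assumptions.
    """
    if sgml_text.lstrip().upper().startswith("<!DOCTYPE"):
        return sgml_text
    return f'<!DOCTYPE {root} SYSTEM "{dtd_file}">\n{sgml_text}'


def add_doctype_to_file(path, encoding="utf-8"):
    """Rewrite the file at path in place with the declaration prepended."""
    with open(path, encoding=encoding) as handle:
        original = handle.read()
    with open(path, "w", encoding=encoding) as handle:
        handle.write(add_doctype(original))
```

    Calling add_doctype_to_file("input.sgm") before running the converter would then take care of the declaration. The encoding default is a guess and should match your actual file.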
    

    With this addition, you should now be able to use osx to convert the SGML to XML. (It won't be able to convert the processing instructions that start with a /, as those are not allowed in XML, and it will emit a warning about them.)

    osx input.sgm > input.xml
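
    If you need to run the conversion from a script, something like the following Python sketch might do. It assumes osx is on your PATH and uses only the invocation shown above (input file as the argument, XML on standard output). The function names are hypothetical.

```python
import subprocess


def osx_command(sgml_path):
    """Build the argument list for the plain `osx input.sgm` invocation."""
    return ["osx", sgml_path]


def convert_sgml_to_xml(sgml_path, xml_path):
    """Run osx on sgml_path and write its standard output to xml_path.

    Returns (exit_status, messages). The exit status is returned rather
    than raised, because osx may print warnings (for example about the
    processing instructions) while still producing usable output.
    """
    with open(xml_path, "wb") as out:
        result = subprocess.run(
            osx_command(sgml_path),
            stdout=out,
            stderr=subprocess.PIPE,
        )
    return result.returncode, result.stderr.decode(errors="replace")
```

    For example, status, messages = convert_sgml_to_xml("input.sgm", "input.xml") corresponds to the shell command above, with the warnings collected in messages instead of being printed.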
    

    Transforming the resulting XML to your desired format

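    For the above case, you could use something like the following XSLT stylesheet. By default, osx folds element and attribute names to upper case, which is why the templates match VIEWED, CITE and PNO/@CITE rather than the lower-case names used in the DTD: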

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes"/>
      <xsl:template match="VIEWED">
        <index1>
          <num viewed="{normalize-space(text())}"/>
          <heading>
            <xsl:value-of select="normalize-space(text())"/>
          </heading>
          <index-refs>
            <xsl:apply-templates select="CITE"/>
          </index-refs>
        </index1>
      </xsl:template>
    
      <xsl:template match="CITE">
        <link caseno="{PNO/@CITE}"/>
      </xsl:template>
    
    </xsl:stylesheet>
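
    If no XSLT processor is at hand, roughly the same mapping can be done with Python's standard library module xml.etree.ElementTree. That module does not run XSLT, so this is a re-implementation of the stylesheet's logic, not the stylesheet itself. The sample input is only a guess at what osx would produce for the DTD above: the text xyz comes from the question, while the YR and PNO values are made-up placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the osx output (upper-case names assumed).
SAMPLE = """<TOPLEVEL>
<VIEWED>xyz
<CITE>
<YR>2010</YR>
<PNO CITE="2010 abc 1">2010 abc 1</PNO>
</CITE>
</VIEWED>
</TOPLEVEL>"""


def transform(xml_text):
    """Return one <index1> element per <VIEWED> element in xml_text."""
    root = ET.fromstring(xml_text)
    results = []
    for viewed in root.iter("VIEWED"):
        # viewed.text is the text before the first child element; joining
        # the split words behaves roughly like normalize-space(text()).
        heading_text = " ".join((viewed.text or "").split())

        index1 = ET.Element("index1")
        ET.SubElement(index1, "num", {"viewed": heading_text})
        ET.SubElement(index1, "heading").text = heading_text

        refs = ET.SubElement(index1, "index-refs")
        for cite in viewed.findall("CITE"):
            # Like {PNO/@CITE}: the CITE attribute of the first PNO child,
            # or an empty string if it is missing.
            pno = cite.find("PNO")
            caseno = pno.get("CITE", "") if pno is not None else ""
            ET.SubElement(refs, "link", {"caseno": caseno})
        results.append(index1)
    return results


if __name__ == "__main__":
    for index1 in transform(SAMPLE):
        print(ET.tostring(index1, encoding="unicode"))
```

    The function returns a list instead of a single tree because the stylesheet also emits one index1 per VIEWED element without a wrapping root.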
    
  • 2020-12-30 16:05

    Why XSLT? I doubt you can map SGML to XML Infoset or XDM...

    I think you would be better off using the language made for this task: DSSSL (Document Style Semantics and Specification Language).

    DSSSL is the predecessor of XSLT. Its author is James Clark, and this is his site.

  • 2020-12-30 16:06

    Could the SGML-Reader, originally developed by Chris Lovett, help in solving this problem?
