XML vs comma delimited text files

面向向阳花 2021-02-20 08:25

Ok, I've read a couple of books on XML and written programs to spit it out and whatnot. But here's the question: both a comma-delimited file and an XML file are "human readable."

12 Answers
  • 2021-02-20 09:08

    XML describes its content and has a ton of supporting libraries in a variety of languages, but it can be bloated. If the receiving end of the CSV knows the layout and the data is tabular, I don't see anything wrong with CSV.

  • 2021-02-20 09:11

    Advantages

    A number of advantages XML has over CSV:

    • Hierarchical data organization
    • Automatic data validation (XML Schemas or DTDs)
    • Easily convert formats (using XSL)
    • Easy to identify relational structure
    • Can be used in combination with XML-RPC
    • Suitable for object persistence (marshalling)
    • Simplifies business-to-business communications
    • Helpful related technologies (XPath, DOM)
    • Tight integration with modern Web browsers
    • Extract, Transform, and Load (ETL) tools
    • Backwards file format compatibility (version attribute)
    • Digital signatures

    It completely depends on the problem domain and what you are trying to solve.

    Example

    Tight integration with Web browsers is something that many people miss when writing web pages. Consider the situation where you have a large data store of songs. Songs have artists, albums, beats per minute, and so forth. You could export the data to XML, write a simple XSL stylesheet to render the XML as XHTML, then point the browser at the XML page. The browser will render the XML as a web page.

    You cannot do that with CSV.

    Disadvantages

    Joel Spolsky has a great article on why XML is a poor choice as a complex data store: it is slow. (Unlike a database, which can retrieve previous or next records with a single CPU instruction, traversing records in an XML document is much slower.) Arguably, this could be considered an optimization problem, resolved by waiting 18 months for faster hardware. Thus:

    • Slower to parse than other formats
    • Syntactical redundancy can detract from readability
    • Document bloat could affect storage costs
    • Cannot easily model overlapping (non-hierarchical) data structures
    • Poorly designed XML file formats are not uncommon (in my experience; citation needed)

    Related Question

    See also: Why Should I Use A Human Readable File Format.

  • 2021-02-20 09:12

    It all depends on what you need to do. If you need more complexity in your data structures than a simple "flat" row structure can give (hierarchical data, for example), then XML is a great choice.

  • 2021-02-20 09:14

    XML also has complementary technologies surrounding it: the XML DOM, XPath, XSLT, and XSD (XML Schema).
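
    As a rough illustration of what those companion technologies buy you, here is a minimal sketch using the limited XPath subset in Python's standard `xml.etree.ElementTree`. The sample document, its element and attribute names, and the helper `songs_with_bpm` are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and attribute names are invented.
SONGS_XML = """
<library>
  <artist name="Artist A">
    <album title="First">
      <song bpm="120">Intro</song>
      <song bpm="95">Slow One</song>
    </album>
  </artist>
  <artist name="Artist B">
    <album title="Second">
      <song bpm="120">Opener</song>
    </album>
  </artist>
</library>
"""


def songs_with_bpm(xml_text, bpm):
    """Return the titles of all songs whose bpm attribute equals `bpm`."""
    root = ET.fromstring(xml_text)
    # ElementTree understands a subset of XPath, including
    # descendant search (.//) and [@attr='value'] predicates.
    return [song.text for song in root.findall(f".//song[@bpm='{bpm}']")]


titles = songs_with_bpm(SONGS_XML, 120)
print(titles)
```

    The query reaches through artists and albums in one expression; a CSV file has no equivalent of this kind of path-based lookup.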

  • 2021-02-20 09:16

    The fact that XML is human readable does not mean that it was designed to be read (or even edited) directly by humans.

    XML has a nice set of properties that make it a good choice in many cases, particularly when you have the human resources to handle the additional burden those properties inevitably bring: validation, a well-defined standard, a lot of tools, a very flexible architecture, and a natural mapping to the tree model that many programs use. Its human readability is an added value that simplifies debugging (try debugging a binary file...), inspection, and small changes in trivial cases.

    CSV, on the other hand, is easy, quick, and linear, although many dialects exist and parsing it well is far from trivial (with the added problem that it looks trivial!). For most applications involving tables of data, CSV is the perfect choice.
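
    To make the "looks trivial" trap concrete, here is a small sketch comparing a naive comma split against Python's standard `csv` module. The sample row is invented; its second field contains both a comma and doubled-quote escapes.

```python
import csv
import io

# Hypothetical row: the second field contains a comma and escaped quotes.
line = 'Abbey Road,"Come Together, ""live"" take",82\r\n'

# Naive approach: splits on every comma, including the one inside the quotes.
naive = line.strip().split(",")

# csv module: honours quoting and the doubled-quote escape convention.
parsed = next(csv.reader(io.StringIO(line, newline="")))

print(len(naive), naive)
print(len(parsed), parsed)
```

    The naive split yields four fragments with stray quote characters, while the `csv` reader recovers the three intended fields. Other dialects (different delimiters, escape characters, or line endings) need their own reader settings.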

    In general, however, there are kinds of data you can represent in XML but not in CSV (a tree, for example). On the other hand, any data that can be represented in CSV can also be represented in XML, although the XML version is not guaranteed to be more efficient (in space, ease of parsing, etc.), and in practice it often is not. It's a matter of the "degrees of freedom" of your format: XML has more degrees of freedom, CSV has fewer. The hype behind XML is also related to this fact.
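
    A minimal sketch of that one-way relationship, assuming a tiny made-up table: every CSV row becomes a `<row>` element with one child per column, and the byte counts show the cost. The function `csv_to_xml` and the tag names `rows` and `row` are hypothetical choices for this example, and it only works when the column headers are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical table; the header row supplies the column names.
CSV_TEXT = "title,artist,bpm\nIntro,Artist A,120\nOpener,Artist B,120\n"


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text to an XML string, one element per row and per cell.

    Assumes every column header is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = csv_to_xml(CSV_TEXT)
print(xml_text)
print(len(CSV_TEXT.encode("utf-8")), "bytes as CSV")
print(len(xml_text.encode("utf-8")), "bytes as XML")
```

    The XML output repeats every column name twice per cell, which is where the extra size comes from. Going the other way (arbitrary XML to CSV) has no such mechanical recipe.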

    Don't fall victim to the hammer syndrome: when you have a hammer (XML), everything looks like a nail (something that you have to solve with XML). Reality is much more varied and nuanced. XML is cool, but it's not the answer to every problem.

  • 2021-02-20 09:16

    I like to think of the primary distinction in this case as this: XML is TREE-based, while CSV is TABLE-based.

    That is, you can nest and re-nest and omit and generally make a complex TREE structure in XML, whereas you can only make simple 2D tables in CSV.
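
    Here is a rough sketch of what happens when you squeeze a tree into a table, using Python's standard `xml.etree.ElementTree`. The sample document and the `flatten` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document: one album has two songs, the other has none.
LIBRARY_XML = """
<library>
  <artist name="Artist A">
    <album title="First">
      <song>Intro</song>
      <song>Slow One</song>
    </album>
    <album title="Empty Album"/>
  </artist>
</library>
"""


def flatten(xml_text):
    """Walk the tree and emit one (artist, album, song) row per song."""
    rows = []
    root = ET.fromstring(xml_text)
    for artist in root.findall("artist"):
        for album in artist.findall("album"):
            songs = album.findall("song")
            if not songs:
                # The tree simply omits the child; a table needs a placeholder.
                rows.append((artist.get("name"), album.get("title"), None))
            for song in songs:
                rows.append((artist.get("name"), album.get("title"), song.text))
    return rows


rows = flatten(LIBRARY_XML)
for row in rows:
    print(row)
```

    The flat rows repeat the artist and album on every line and need an empty cell for the album without songs, whereas the tree states each parent once and just leaves out what is missing.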
