Building an HTML Diff/Patch Algorithm

日久生厌 2021-02-04 03:48

A description of what I'm trying to accomplish:

  • Take two HTML documents as input (handling N documents is not essential)
  • Standardize the HTML format
  • Diff the two documents
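
The three steps above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a full HTML diff/patch algorithm: it assumes that "standardizing" means lowercasing tag names, sorting attributes, and collapsing whitespace, and it diffs a flat token stream rather than the document tree. The names `normalize_html` and `diff_html` are hypothetical helpers introduced here.

```python
import difflib
from html.parser import HTMLParser


class _Normalizer(HTMLParser):
    """Flatten HTML into a list of canonical tokens, one per tag or text run."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        # Sort attributes by name so their order never shows up as a change.
        parts = [tag]
        for name, value in sorted(attrs, key=lambda kv: kv[0]):
            parts.append(name if value is None else f'{name}="{value}"')
        self.tokens.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        self.tokens.append(f"</{tag}>")

    def handle_data(self, data):
        # Collapse runs of whitespace and drop whitespace-only text.
        text = " ".join(data.split())
        if text:
            self.tokens.append(text)


def normalize_html(html):
    """Step 2 (assumed meaning): return the canonical token list for a document."""
    parser = _Normalizer()
    parser.feed(html)
    parser.close()
    return parser.tokens


def diff_html(old, new):
    """Step 3: unified diff over the normalized tokens of two documents."""
    return list(
        difflib.unified_diff(
            normalize_html(old),
            normalize_html(new),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in diff_html("<p>one</p>", "<P>two</P>"):
        print(line)
```

Because the diff runs on tokens rather than on a tree, a moved subtree shows up as a deletion plus an insertion. A tree-aware algorithm would be needed to report moves as such.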
3 Answers
  •  情话喂你
    2021-02-04 04:25

    I know this question is about Python, but you could take a look at 3DM, an XML 3-way merging and differencing tool (the default implementation is in Java). Here is the actual paper describing the algorithm it uses: http://www.cs.hut.fi/~ctl/3dm/thesis.pdf, and here is the link to the site.

    The drawback is that you have to clean up the document first so that it can be parsed as XML.
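
To illustrate that drawback, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. It is not the 3DM algorithm; it only shows that ordinary HTML can fail an XML parse until it is cleaned up, and how two well-formed documents can then be compared as trees. `parse_as_xml` and `trees_equal` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def parse_as_xml(markup):
    """Return the root element, or None if the markup is not well-formed XML."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


def trees_equal(a, b):
    """Compare two elements recursively, ignoring surrounding whitespace in text."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(trees_equal(x, y) for x, y in zip(a, b))


if __name__ == "__main__":
    # An unclosed <br> is valid HTML but not well-formed XML.
    print(parse_as_xml("<p>line one<br>line two</p>") is None)
    # After cleanup (<br/>), the same content parses.
    print(parse_as_xml("<p>line one<br/>line two</p>") is not None)
```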
