Finding a DOI in a document or page

悲&欢浪女 2021-01-29 21:43

The DOI system places basically no useful limitations on what constitutes a reasonable identifier. However, being able to pull DOIs out of PDFs, web pages, etc. is quite useful.

7 Answers
  • 2021-01-29 22:20

    This is a really old and answered question, but here's another potential substitute.

    \b10\.(\d+\.*)+[\/](([^\s\.])+\.*)+\b

    This assumes that white space is not part of the DOI.

    Haven't tested this for false positives, but it seems to be able to find all the edge cases mentioned on this page.
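A minimal sketch of how the pattern above could be tried out, assuming Python's standard `re` module. The pattern is copied verbatim from the answer; the names `DOI_PATTERN` and `find_dois` and the sample string are made up for illustration and are not part of the original answer.

```python
import re

# Pattern copied verbatim from the answer; only the name DOI_PATTERN is ours.
DOI_PATTERN = re.compile(r"\b10\.(\d+\.*)+[\/](([^\s\.])+\.*)+\b")


def find_dois(text):
    """Return every substring of text matched by DOI_PATTERN, in order."""
    # finditer + group(0) yields the whole match; findall would return the
    # contents of the capture groups instead of the full DOI.
    return [m.group(0) for m in DOI_PATTERN.finditer(text)]


# Hypothetical sample input with two DOIs, one followed by a full stop.
sample = "See doi:10.1000/xyz123 and https://doi.org/10.1016/j.cell.2020.01.001."
print(find_dois(sample))
```

Because the pattern ends in `\b`, a trailing full stop or closing bracket after the DOI is left out of the match. The nested quantifiers can backtrack heavily on long digit runs that never reach a slash, so this is probably best kept to short inputs or checked for performance before running over large documents.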
