The DOI system places basically no useful limitations on what constitutes a reasonable identifier. However, being able to pull DOIs out of PDFs, web pages, etc. is quite useful.
Here is my go at it:
(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)
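For anyone who wants to try it quickly, here is a minimal sketch in Python using the standard `re` module. The names `DOI_PATTERN`, `find_dois`, and the `sample` string are mine and purely illustrative; the pattern itself is unchanged.

```python
import re

# The pattern from above, compiled as-is. The single capture group
# wraps the whole DOI, so findall() returns the full matched strings.
DOI_PATTERN = re.compile(r'(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)')


def find_dois(text):
    """Return every substring of `text` that looks like a DOI (hypothetical helper)."""
    return DOI_PATTERN.findall(text)


# Illustrative input: one bare DOI and one inside an HTML attribute.
sample = (
    'See doi:10.1007.10/978-3-642-28108-2_19 and '
    '<a href="https://doi.org/10.1000/182">link</a>'
)

for doi in find_dois(sample):
    print(doi)
```

Note that the suffix part (`[^\s"<>]+`) only stops at whitespace, quotes, and angle brackets, so a DOI that ends a sentence will probably pick up the trailing period or comma. Stripping trailing punctuation afterwards is one possible workaround, at the risk of damaging the rare DOI that really ends that way.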
Valid edge cases that this pattern handles, but that others seem to fail on:
10.1007.10/978-3-642-28108-2_19
(fictitious example, see @Ju9OR's comment) Also, it correctly discards some false-positive (X|HT)ML content such as: