Extract IBAN from text with Python

ぐ巨炮叔叔 submitted on 2021-02-08 15:10:28

Question


I want to extract IBAN numbers from text with Python. The challenge is that an IBAN can be written in so many ways, with spaces between the numbers, that I find it difficult to translate this into a useful regex pattern.

I have written a demo version which tries to match all German and Austrian IBAN numbers from text.

^DE([0-9a-zA-Z]\s?){20}$

I have seen similar questions on Stack Overflow. However, the combination of the different ways to write IBAN numbers and the need to extract them from running text makes my problem very difficult to solve.

Hope you can help me with that!


Answer 1:


You can use

\b(?:DE|AT)(?:\s?[0-9a-zA-Z]){18}(?:(?:\s?[0-9a-zA-Z]){2})?\b

See the regex demo. Details:

  • \b - word boundary
  • (?:DE|AT) - DE or AT
  • (?:\s?[0-9a-zA-Z]){18} - eighteen occurrences of an optional whitespace and then an alphanumeric char
  • (?:(?:\s?[0-9a-zA-Z]){2})? - an optional occurrence of two sequences of an optional whitespace and an alphanumeric char
  • \b - word boundary.
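A minimal sketch of how this pattern might be applied from Python with the standard re module. The function name extract_ibans and the sample sentence are illustrative assumptions, not part of the original answer; the two IBANs are the commonly published example numbers for Germany and Austria.

```python
import re

# Pattern from the answer above: DE or AT, then 18 alphanumeric characters
# (each optionally preceded by one whitespace character), then optionally
# two more, delimited by word boundaries.
IBAN_RE = re.compile(
    r"\b(?:DE|AT)(?:\s?[0-9a-zA-Z]){18}(?:(?:\s?[0-9a-zA-Z]){2})?\b"
)


def extract_ibans(text):
    """Return every match with its internal whitespace removed."""
    return [re.sub(r"\s", "", m.group()) for m in IBAN_RE.finditer(text)]


sample = "Pay to DE89 3704 0044 0532 0130 00, or to AT61 1904 3002 3457 3201."
print(extract_ibans(sample))
# ['DE89370400440532013000', 'AT611904300234573201']
```

One thing to watch: because the pattern accepts letters, a short word that directly follows an Austrian IBAN can be absorbed by the optional two-character tail. For example, "AT611904300234573201 by" matches in full, including "by".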



Answer 2:


Country   ISO country code   Check digits   Bank code   Account number
Germany   2a                 2n             8n          10n
Austria   2a                 2n             5n          11n

Note: a - alphabetic characters (letters only), n - numeric characters (digits only)

So the main difference is really the length in digits: after the country code, a German IBAN has 20 digits and an Austrian one has 18. That means you could try:

\b(?:DE(?:\s*\d){20}|AT(?:\s*\d){18})\b(?!\s*\d)

See the online demo.


  • \b - Word-boundary.
  • (?: - Open 1st non-capturing group.
    • DE - Match uppercase "DE" literally.
    • (?: - Open 2nd non-capturing group.
      • \s*\d - Zero or more whitespace characters up to a single digit.
      • ){20} - Close 2nd non-capturing group and match it 20 times.
    • | - Or:
    • AT - Match uppercase "AT" literally.
    • (?: - Open 3rd non-capturing group.
      • \s*\d - Zero or more whitespace characters up to a single digit.
      • ){18} - Close 3rd non-capturing group and match it 18 times.
    • ) - Close 1st non-capturing group.
  • \b - Word-boundary.
  • (?!\s*\d) - Negative lookahead to prevent any trailing digits.

It does show that the Austrian IBAN numbers in your sample are invalid. If you wish to extract them up to the point where they would still be valid, I guess you can remove the trailing \b(?!\s*\d)
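As a rough illustration, the digit-only pattern could be used from Python as sketched below. The helper name find_ibans and the sample text are assumptions made for this example; the sample deliberately appends one extra digit to the Austrian number so that the negative lookahead rejects it.

```python
import re

# Pattern from this answer: DE followed by 20 digits or AT followed by 18
# digits, with any amount of whitespace allowed before each digit, and no
# further digit allowed after the match.
IBAN_RE = re.compile(r"\b(?:DE(?:\s*\d){20}|AT(?:\s*\d){18})\b(?!\s*\d)")


def find_ibans(text):
    """Return every match with all whitespace stripped out."""
    return ["".join(m.group().split()) for m in IBAN_RE.finditer(text)]


sample = "IBAN: DE89 3704 0044 0532 0130 00 and AT61 1904 3002 3457 3201 9"
print(find_ibans(sample))
# ['DE89370400440532013000']
```

The Austrian number is skipped here only because of the trailing " 9". Without that extra digit, "AT61 1904 3002 3457 3201" would be returned as well.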



Source: https://stackoverflow.com/questions/65735039/extract-iban-from-text-with-python
