In the pattern attribute of an input tag, I am using the following regular expression for validation of the US Federal Tax ID field.
pattern=\"^([07][1-7]|1[0-6]
I have written a regex that satisfies the requirements:
pattern="^((?!11-1111111)(?!22-2222222)(?!33-3333333)(?!44-4444444)(?!55-5555555)(?!66-6666666)(?!77-7777777)(?!88-8888888)(?!99-9999999)(?!12-3456789)(?!00-[0-9]{7})([0-9]{2}-[0-9]{7}))*$"
It is not an efficient one, though. Does anyone have a more efficient regex?
Tim linked to the format description of Federal Tax Identification Numbers (EINs), which says that the first two digits can be any of 83 prefixes. The remaining digits can be anything. Since 83 of the 100 possible prefixes are valid, it is in fact easier to write a negative pattern that rejects the other 17: 00, 07, 08, 09, 17, 18, 19, 28, 29, 49, 69, 70, 78, 79, 89, 96, 97:
^(?!00|[01][789]|2[89]|[46]9|7[089]|89|9[67])\d\d-\d{7}$
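As a quick sanity check, the prefix blacklist can be exercised outside the browser. The sketch below uses Python's `re` module purely for illustration (the `pattern` attribute uses JavaScript regex syntax, but the lookahead and `\d` constructs used here should behave the same way). The pattern also rejects the 00 prefix, as the question's own pattern does, and the function name `has_valid_prefix` is made up for this example.

```python
import re

# Negative lookahead over the excluded two-digit prefixes, then the NN-NNNNNNN shape.
# 00 is included among the excluded prefixes (the question's pattern rejects it too).
EIN_PREFIX_CHECK = re.compile(
    r"^(?!00|[01][789]|2[89]|[46]9|7[089]|89|9[67])\d\d-\d{7}$",
    re.ASCII,  # keep \d limited to 0-9
)


def has_valid_prefix(ein: str) -> bool:
    """Return True if ein looks like NN-NNNNNNN and its prefix is not excluded."""
    return EIN_PREFIX_CHECK.fullmatch(ein) is not None


# Count how many of the 100 possible prefixes pass; this should agree with
# the 83 valid prefixes mentioned in the format description.
accepted = sum(has_valid_prefix(f"{p:02d}-1234567") for p in range(100))
print(accepted)

for sample in ["12-3456789", "07-1234567", "69-0000000", "123456789"]:
    print(sample, has_valid_prefix(sample))
```

Counting the accepted prefixes is a cheap way to notice when the character classes in the lookahead drift out of sync with the list in the text.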
But this doesn't catch EINs like 11-1111111 or 22-2222222. You presumably want to catch these because they resemble the kind of filler that people flood-fill a form field with. You can catch repeated patterns like this:
^(\d)\1-\1{7}$
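To make the backreference explicit: group 1 captures the first digit, and every `\1` demands that same digit again. A minimal sketch, again in Python for illustration only, with hypothetical names; the second pattern shows one way the check could be folded into a format pattern as a negative lookahead, which is an assumption about how you would combine them rather than a requirement.

```python
import re

# (\d) captures the first digit; \1 and \1{7} require that same digit 8 more times.
REPEATED_DIGIT = re.compile(r"^(\d)\1-\1{7}$", re.ASCII)

# Same check used as a negative lookahead in front of the plain NN-NNNNNNN shape.
NOT_REPEATED = re.compile(r"^(?!(\d)\1-\1{7}$)\d\d-\d{7}$", re.ASCII)


def looks_flood_filled(ein: str) -> bool:
    """Return True for inputs such as 11-1111111 or 22-2222222."""
    return REPEATED_DIGIT.fullmatch(ein) is not None


def is_plausible(ein: str) -> bool:
    """Return True for well-formed input that is not a single repeated digit."""
    return NOT_REPEATED.fullmatch(ein) is not None


for sample in ["11-1111111", "22-2222222", "11-1111112", "12-3456789"]:
    print(sample, looks_flood_filled(sample), is_plausible(sample))
```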
But be careful here: you cannot be sure that they're not valid. Every one of 11-1111111, 22-2222222, 33-3333333, etc. is valid according to the definition, since 11, 22, 33, etc. are all valid campus code prefixes.
So you'll eventually deny access to someone whose EIN is legitimate.
The drawback of having a whitelist of EIN prefixes is that your software now depends on an external list of numbers, and your package manager will not notify you when legislation changes. Since it is very hard to predict the lifetime of software, this program might eventually reject EINs that become valid in the future. You'll have to weigh some costs here; a really short regex is perhaps not the best thing you can achieve.
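One way to weigh those costs, sketched here under assumptions: keep the regex responsible only for the shape, and keep the prefix list as plain data that can be updated without touching the pattern. The names `EXCLUDED_PREFIXES` and `check_ein` are hypothetical, and the set below is just the excluded prefixes discussed above plus 00; it is not an authoritative or current list.

```python
import re

# Assumed data: excluded prefixes from the discussion above, plus 00.
# Keeping this separate from the regex makes a future update a one-line change.
EXCLUDED_PREFIXES = frozenset({
    "00", "07", "08", "09", "17", "18", "19", "28", "29",
    "49", "69", "70", "78", "79", "89", "96", "97",
})

# The regex only checks the NN-NNNNNNN shape and captures the prefix.
EIN_SHAPE = re.compile(r"(\d\d)-\d{7}", re.ASCII)


def check_ein(ein: str, excluded: frozenset = EXCLUDED_PREFIXES) -> str:
    """Classify ein as 'malformed', 'unknown prefix', or 'ok'."""
    match = EIN_SHAPE.fullmatch(ein)
    if match is None:
        return "malformed"
    if match.group(1) in excluded:
        return "unknown prefix"
    return "ok"


for sample in ["12-3456789", "07-1234567", "1234"]:
    print(sample, check_ein(sample))
```

Returning a distinct result for an unknown prefix also leaves room to warn rather than reject outright, which limits the damage if the list goes stale.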