RegEx for matching UK Postcodes

广开言路 2020-11-22 01:38

I'm after a regex that will validate a full complex UK postcode only within an input string. All of the uncommon postcode forms must be covered as well as the usual. For in

30 answers
  •  慢半拍i (original poster)
    2020-11-22 01:46

    Through empirical testing and observation, as well as confirming with https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation, here is my version of a Python regex that correctly parses and validates a UK postcode:

    UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'

    This regex is simple and uses named capture groups. It does not enforce every rule for legal UK postcodes; it only checks which positions hold letters and which hold digits.
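
    To make the named groups and the "positions only" caveat concrete, here is a minimal sketch. The group names are assumed to be the four names used by the m.group(...) calls in this answer's code, and 'QQ11QQ' is a made-up string chosen only to show that no list of real postcode areas is consulted.

```python
import re

# Pattern from this answer, split across lines for readability.
# Group names are assumed from the m.group(...) calls in the answer's code.
UK_POSTCODE_REGEX = (
    r'(?P<postcode_area>[A-Z]{1,2})'
    r'(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))'
    r'(?P<sector>[0-9])'
    r'(?P<postcode>[A-Z]{2})'
)

# An example postcode from the tests splits into its four named parts.
parts = re.match(UK_POSTCODE_REGEX, 'EC1A1BB').groupdict()
print(parts)

# Only letter/digit positions are checked, so a made-up string is accepted too.
made_up = re.match(UK_POSTCODE_REGEX, 'QQ11QQ')
print(made_up is not None)

# The character classes are upper-case only, so lower-case input is rejected.
lowercase = re.match(UK_POSTCODE_REGEX, 'ec1a1bb')
print(lowercase is None)
```

    If stricter validation is needed, the structural match could be followed by a lookup against an authoritative list of postcodes, which this regex does not attempt.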

    Here is how I would use it in code:

    import re
    from dataclasses import dataclass


    @dataclass
    class UKPostcode:
        postcode_area: str
        district: str
        sector: str  # kept as the matched string, e.g. '1'
        postcode: str
    
        # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
        # Original author of this regex: @jontsai
        # NOTE TO FUTURE DEVELOPER:
        # Verified through empirical testing and observation, as well as confirming with the Wiki article
        # If this regex fails to capture all valid UK postcodes, then I apologize, for I am only human.
        UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'
    
        @classmethod
        def from_postcode(cls, postcode):
            """Parses a string into a UKPostcode
    
            Returns a UKPostcode or None
            """
            # fullmatch (rather than match) rejects trailing characters, e.g. 'EC1A1BBX'
            m = re.fullmatch(cls.UK_POSTCODE_REGEX, postcode.replace(' ', ''))
    
            if m:
                uk_postcode = UKPostcode(
                    postcode_area=m.group('postcode_area'),
                    district=m.group('district'),
                    sector=m.group('sector'),
                    postcode=m.group('postcode')
                )
            else:
                uk_postcode = None
    
            return uk_postcode
    
    
    def parse_uk_postcode(postcode):
        """Wrapper for UKPostcode.from_postcode
        """
        uk_postcode = UKPostcode.from_postcode(postcode)
        return uk_postcode
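
    from_postcode above only strips spaces before matching, so lower-case or tab-separated input would not match. A minimal sketch of a pre-processing step follows; normalize_postcode is a hypothetical helper, not part of the original answer, and it assumes the same four group names. It returns the postcode in the conventional "outward inward" layout (area plus district, then sector plus the final two letters), or None.

```python
import re
from typing import Optional

# Same structural pattern as in this answer; group names assumed as above.
UK_POSTCODE_REGEX = (
    r'(?P<postcode_area>[A-Z]{1,2})'
    r'(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))'
    r'(?P<sector>[0-9])'
    r'(?P<postcode>[A-Z]{2})'
)


def normalize_postcode(raw: str) -> Optional[str]:
    """Hypothetical helper: return 'OUTWARD INWARD' in upper case, or None."""
    # Remove every whitespace character and upper-case the rest.
    compact = re.sub(r'\s+', '', raw).upper()
    # fullmatch requires the whole string to fit the pattern.
    m = re.fullmatch(UK_POSTCODE_REGEX, compact)
    if m is None:
        return None
    outward = m.group('postcode_area') + m.group('district')
    inward = m.group('sector') + m.group('postcode')
    return f'{outward} {inward}'


print(normalize_postcode(' ec1a 1bb '))
print(normalize_postcode('M11AE'))
print(normalize_postcode('EC1A1BBX'))
```

    The normalized string can then be passed to parse_uk_postcode, which already removes the single space.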
    

    Here are unit tests:

    import pytest


    @pytest.mark.parametrize(
        'postcode, expected', [
            # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
            (
                'EC1A1BB',
                UKPostcode(
                    postcode_area='EC',
                    district='1A',
                    sector='1',
                    postcode='BB'
                ),
            ),
            (
                'W1A0AX',
                UKPostcode(
                    postcode_area='W',
                    district='1A',
                    sector='0',
                    postcode='AX'
                ),
            ),
            (
                'M11AE',
                UKPostcode(
                    postcode_area='M',
                    district='1',
                    sector='1',
                    postcode='AE'
                ),
            ),
            (
                'B338TH',
                UKPostcode(
                    postcode_area='B',
                    district='33',
                    sector='8',
                    postcode='TH'
                )
            ),
            (
                'CR26XH',
                UKPostcode(
                    postcode_area='CR',
                    district='2',
                    sector='6',
                    postcode='XH'
                )
            ),
            (
                'DN551PT',
                UKPostcode(
                    postcode_area='DN',
                    district='55',
                    sector='1',
                    postcode='PT'
                )
            )
        ]
    )
    def test_parse_uk_postcode(postcode, expected):
        uk_postcode = parse_uk_postcode(postcode)
        assert uk_postcode == expected
    
