How to get correct list position in multi-byte string using preg_match
Question

I am currently matching HTML using this code:

    preg_match('/<\/?([a-z]+)[^>]*>|&#?[a-zA-Z0-9]+;/u', $html, $match, PREG_OFFSET_CAPTURE, $position)

It matches everything perfectly; however, if the string contains a multibyte character, that character is counted as 2 characters in the returned position. For example, the returned $match array would look something like this:

    array
      0 =>
        array
          0 => string '<br />' (length=6)
          1 => int 132
      1 =>
        array
          0 => string 'br' (length=2)
          1 => int 133

The real number for the <br /> match is
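One possible approach, sketched under assumptions: the offsets returned with PREG_OFFSET_CAPTURE are byte offsets, even with the /u modifier, so a character position can be derived by counting the characters in the bytes that precede the match. The helper name byteOffsetToCharOffset and the sample string below are hypothetical, the input is assumed to be valid UTF-8, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper (not from the original post): converts a byte offset
// reported by preg_match() into a character offset. Assumes $subject is
// valid UTF-8 and that the mbstring extension is loaded.
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    // substr() operates on bytes, so this is exactly the text before the match;
    // mb_strlen() then counts that prefix in characters rather than bytes.
    return mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
}

// Sample input (an assumption for illustration): "\u{00E9}" is 2 bytes in UTF-8.
$html = "caf\u{00E9}<br />";

preg_match('/<\/?([a-z]+)[^>]*>|&#?[a-zA-Z0-9]+;/u', $html, $match, PREG_OFFSET_CAPTURE);

$byteOffset = $match[0][1];                               // offset in bytes
$charOffset = byteOffsetToCharOffset($html, $byteOffset); // offset in characters
```

Note that the fifth argument of preg_match() ($position in the question) is also measured in bytes, so when resuming a search it is the byte offset, not the converted character offset, that should be passed back in.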