Is there any way to find (even a best guess) the "printed" length of a string in Python? E.g. 'potaa\bto' is 8 characters in len
but only 6 characters wide when printed.
At least for ANSI TTY escape sequences, this works:
import re
strip_ANSI_pat = re.compile(r"""
\x1b # literal ESC
\[ # literal [
[;\d]* # zero or more digits or semicolons
[A-Za-z] # a letter
""", re.VERBOSE).sub
def strip_ANSI(s):
    # strip_ANSI_pat is the compiled pattern's bound .sub method
    return strip_ANSI_pat("", s)

s = 'potato\x1b[01;32mpotato\x1b[0;0mpotato'
print(s, len(s))
s1 = strip_ANSI(s)
print(s1, len(s1))
Prints:
potato[01;32mpotato[0;0mpotato 32
potatopotatopotato 18
For backspaces \b or vertical tabs or \r vs \n -- it depends how and where it is printed, no?
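For a best guess, one option is to simulate a very simple terminal cursor. The sketch below is only that: `printed_width` is a hypothetical helper, and it assumes that \b moves the cursor one column left, \r returns it to column 0, \n starts a fresh line at column 0, and every other non-printable character (tabs included) takes no space. Real terminals may behave differently.

```python
def printed_width(s):
    """Best-guess width of the widest line `s` occupies on a simple terminal."""
    col = 0        # current cursor column
    line_end = 0   # rightmost column written on the current line
    widest = 0     # widest completed line so far
    for ch in s:
        if ch == "\b":
            col = max(col - 1, 0)       # assumed: backspace never moves past column 0
        elif ch == "\r":
            col = 0                     # carriage return: back to the start of the line
        elif ch == "\n":
            widest = max(widest, line_end)
            col = line_end = 0          # assumed: newline also resets the column
        elif ch.isprintable():
            col += 1
            line_end = max(line_end, col)
        # any other control character is assumed to take no space
    return max(widest, line_end)


s = 'potaa\bto'
print(len(s), printed_width(s))  # 8 6
```

Escape sequences are not handled here, so strip them first (for example with strip_ANSI above) before measuring.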
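The printed length of a string depends on the type of the string.
Normal strings in Python 2.x are byte strings, often UTF-8 encoded, so len() returns the number of bytes, which is larger than the number of characters whenever the text contains non-ASCII characters. Convert the string to unicode and len() returns the number of characters instead. With that, formatting works: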
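value = 'abcäöücdf'                        # Python 2 byte string, UTF-8 encoded
len_value = len(value)                     # number of bytes
len_uvalue = len(unicode(value, 'utf-8'))  # number of characters
# self['size'] comes from the surrounding code (not shown here);
# widen the field by the byte/character difference so ljust() pads correctly
size = self['size'] + len_value - len_uvalue
print value[:min(len(value), size)].ljust(size)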
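For comparison, here is a small Python 3 sketch of the same byte-versus-character difference. In Python 3, str is already a character string, so decoding is only needed when you hold bytes. `char_len` is a hypothetical helper name, and the UTF-8 default is an assumption about the data.

```python
def char_len(value, encoding="utf-8"):
    """Count characters rather than bytes; decode first if given bytes."""
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return len(value)


text = "abcäöücdf"
raw = text.encode("utf-8")      # each of ä, ö, ü takes two bytes in UTF-8

print(len(raw))                 # 12 (bytes)
print(char_len(raw))            # 9  (characters)
print(len(text.ljust(12)))      # 12: ljust pads a str by characters, no correction needed
```

Even then, the character count is still only an approximation of the printed width: combining characters and double-width characters are counted as one column each.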
The bash shell had exactly the same need, in order to know when the user's typed input wraps to the next line, in the presence of non-printable characters in the prompt string. Their solution was to not even try: instead, they require that anyone setting a prompt string put \[ and \] around non-printing portions of the prompt. The printed length is calculated to be the length of the string, with these special sequences and all text between them filtered out. (The special sequences are omitted on output, of course.)
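The same convention is easy to borrow in Python. The following is a rough sketch of the idea, not bash's actual implementation: the function names are made up for illustration, and it assumes the markers are the literal two-character sequences \[ and \] and are never nested.

```python
import re

# A \[ ... \] pair and everything between the markers (non-greedy).
_NONPRINTING = re.compile(r"\\\[.*?\\\]", re.DOTALL)


def prompt_printed_length(prompt):
    """Length of the prompt with the marked non-printing spans filtered out."""
    return len(_NONPRINTING.sub("", prompt))


def prompt_for_output(prompt):
    """The text actually written out: markers dropped, their contents kept."""
    return prompt.replace("\\[", "").replace("\\]", "")


# Colour codes wrapped in markers, visible text left bare.
prompt = "\\[\x1b[01;32m\\]user@host\\[\x1b[0m\\]$ "

print(prompt_printed_length(prompt))       # 11, i.e. len("user@host$ ")
print(repr(prompt_for_output(prompt)))     # escape codes kept, markers gone
```

The design choice is the same as bash's: the caller, who knows which parts are invisible, marks them, so the code never has to guess how a terminal interprets a given sequence.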