Filtering out ANSI escape sequences

Anonymous (unverified), submitted on 2019-12-03 01:57:01

Question:

I have a Python script that interprets a trace of data written to stdout and read from stdin. The problem is that this data is riddled with ANSI escapes I don't care about. The escapes are JSON encoded, so they look like "\033[A" and "\033]0;". I don't need to interpret the codes, but I do need to know how many characters each one contains (note that the first sequence is 6 characters long and the second is 7). Is there a straightforward way to filter these codes out of the strings I have?

Answer 1:

The complete regexp for Control Sequences (aka ANSI Escape Sequences) is

/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/

Refer to ECMA-48 Section 5.4 and ANSI escape code
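
For use from Python, the pattern above could be translated roughly as follows. This is only a sketch: the names CSI_RE and strip_csi are made up for illustration, and it assumes the string holds real escape bytes (ESC, 0x1B) rather than the literal text "\033" from the question. Note that it covers Control Sequences only, so an "ESC ]" sequence such as the question's second example is left untouched.

```python
import re

# CSI introducer (0x9B, or ESC followed by '['), then parameter bytes
# (0x30-0x3F), intermediate bytes (0x20-0x2F) and one final byte (0x40-0x7E).
CSI_RE = re.compile(r'(?:\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')


def strip_csi(s):
    """Remove control sequences from s (hypothetical helper name)."""
    return CSI_RE.sub('', s)


sample = 'plain \x1b[31mred\x1b[0m \x1b[2Kdone \x1b]0;title'
print(repr(strip_csi(sample)))  # 'plain red done \x1b]0;title'
```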



Answer 2:

Another variant:

import re

def strip_ansi_codes(s):
    """
    >>> import blessings
    >>> term = blessings.Terminal()
    >>> foo = 'hidden'+term.clear_bol+'foo'+term.color(5)+'bar'+term.color(255)+'baz'
    >>> repr(strip_ansi_codes(foo))
    u'hiddenfoobarbaz'
    """
    # The final class is [mK]; the original [m|K] also matched a literal '|'.
    return re.sub(r'\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?', '', s)


Answer 3:

#!/usr/bin/env python
import os
import re
import sys

# ESC [ followed by digits/semicolons and a single final letter.
ansi_pattern = r'\033\[((?:\d|;)*)([a-zA-Z])'
ansi_eng = re.compile(ansi_pattern)


def strip_escape(string=''):
    # Collect the matches first, then cut them out back to front so that
    # the remaining offsets stay valid while the string shrinks.
    matches = list(ansi_eng.finditer(string))
    matches.reverse()
    for match in matches:
        string = string[:match.start()] + string[match.end():]
    return string


if __name__ == '__main__':
    fname = os.path.basename(__file__)
    if len(sys.argv) > 1:
        # The last command-line argument is the file to clean.
        with open(sys.argv[-1], 'r') as fd:
            for line in fd:
                print(strip_escape(line).rstrip())
    else:
        USAGE = '%s ' % fname
        print(USAGE)


Answer 4:

This worked for me:

re.sub(r'\x1b\[[\d;]+m', '', s)
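
A quick check of what this pattern does and does not cover may help. The sketch below assumes real escape bytes in the input, and the names SGR_RE, colored and cursor_up are only illustrative. The pattern requires a trailing "m", so it handles color/attribute (SGR) sequences but leaves other sequences, such as cursor movement, in place.

```python
import re

# Same pattern as above, precompiled for reuse.
SGR_RE = re.compile(r'\x1b\[[\d;]+m')

colored = '\x1b[1;32mok\x1b[0m'
cursor_up = 'up\x1b[A'

print(repr(SGR_RE.sub('', colored)))    # 'ok'
print(repr(SGR_RE.sub('', cursor_up)))  # 'up\x1b[A' (not an SGR sequence)
```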


Answer 5:

It's far from perfect, but this regex may get you somewhere:

import re

text = r'begin \033[A middle \033]0; end'
print(re.sub(r'\\[0-9]+(\[|\])[0-9]*;?[A-Z]?', '', text))

It already removes your two examples correctly.
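
Since the question also asks how many characters each escape occupies, the same regex could be used with finditer to report the length of every match instead of deleting it. This is a minimal sketch under the assumption that the input contains the literal text "\033" as in the question; LITERAL_ESCAPE and escape_lengths are hypothetical names.

```python
import re

# Same pattern as above, applied to the literal (JSON-style) escape text.
LITERAL_ESCAPE = re.compile(r'\\[0-9]+(\[|\])[0-9]*;?[A-Z]?')


def escape_lengths(text):
    """Return (matched text, length) for every escape found in text."""
    return [(m.group(0), len(m.group(0)))
            for m in LITERAL_ESCAPE.finditer(text)]


text = r'begin \033[A middle \033]0; end'
print(escape_lengths(text))  # [('\\033[A', 6), ('\\033]0;', 7)]
```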



Answer 6:

FWIW, this Python regex seemed to work for me. I don't actually know if it's accurate, but empirically it seems to work:

r'\\033[\[\]]([0-9]{1,2}([;@][0-9]{0,2})*)*[mKP]?'
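
One way to probe the accuracy is to run the pattern against the two examples from the question. In the sketch below (ANSWER6_RE is just an illustrative name), the second example is removed completely, but the final "A" of the first one is left behind, because "A" is not in the closing [mKP] class.

```python
import re

# The pattern from this answer, unchanged.
ANSWER6_RE = re.compile(r'\\033[\[\]]([0-9]{1,2}([;@][0-9]{0,2})*)*[mKP]?')

text = r'begin \033[A middle \033]0; end'
result = ANSWER6_RE.sub('', text)
print(repr(result))  # 'begin A middle  end'
```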

