Split a string based on a regex in Python

旧巷少年郎 2021-01-23 22:16

I have a string like the one below:

"‘‘Apple’’ It is create by Steve Jobs (He was fired and get hired) ‘‘Microsoft’’ Bill Gates was the richest man in the world ‘‘Oracle


        
1 Answer
  •  傲寒 (original poster)
    2021-01-23 23:00

    One option is to use re.findall with the following pattern:

    ‘‘(.*?)’’ (.*?)(?= ‘‘|$)
    

    For each match found in the input, this captures the company name and its description in separate groups. The lazy (.*?) for the description stops at the lookahead (?= ‘‘|$), which succeeds either just before the opening ‘‘ of the next entry or at the end of the input, so the next entry's name is not consumed.

    import re

    inp = "‘‘Apple’’ It is create by Steve Jobs (He was fired and get hired) ‘‘Microsoft’’ Bill Gates was the richest man in the world ‘‘Oracle’’ It is a database company"
    # With two capture groups, findall returns one (company, description) tuple per entry.
    matches = re.findall('‘‘(.*?)’’ (.*?)(?= ‘‘|$)', inp)
    companyList = [row[0] for row in matches]      # first group: company names
    descriptionList = [row[1] for row in matches]  # second group: descriptions
    print(companyList)
    print(descriptionList)
    

    This prints:

    ['Apple', 'Microsoft', 'Oracle']
    ['It is create by Steve Jobs (He was fired and get hired)',
     'Bill Gates was the richest man in the world', 'It is a database company']
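
    Since the question asks about splitting, here is a rough sketch of the same idea using re.split instead of re.findall. It relies on the fact that re.split keeps the text of capturing groups in its result. The names parts, company_list and description_list are only illustrative, and the sketch assumes every entry has the form ‘‘name’’ followed by a description, as in the sample input.

```python
import re

inp = "‘‘Apple’’ It is create by Steve Jobs (He was fired and get hired) ‘‘Microsoft’’ Bill Gates was the richest man in the world ‘‘Oracle’’ It is a database company"

# Split on each quoted name. Because the name is in a capturing group,
# re.split keeps it in the result; \s* swallows the surrounding spaces.
parts = re.split(r'\s*‘‘(.*?)’’\s*', inp)

# parts[0] is the (empty) text before the first name; after that,
# names and descriptions alternate.
company_list = parts[1::2]
description_list = parts[2::2]

print(company_list)
print(description_list)
```

    This should produce the same two lists as the findall version. If the input could contain text before the first ‘‘name’’, that text would end up in parts[0] and would need separate handling.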
    
