How can I break a string into nested tokens?

旧时难觅i 2021-01-24 11:32

I have strings made up of Boolean terms and equations, like so:

x=1 AND (x=2 OR x=3) AND NOT (x=4 AND x=5) AND (x=5) AND y=1

I would like to break up

2 answers
  •  被撕碎了的回忆
    2021-01-24 12:29

    I suppose you could do something like this:

    operators = ["AND NOT", "AND"]
    sepChar = ":"
    yourInputString = yourInputString.replace("(","").replace(")","") # remove the parentheses (this discards the grouping)
    
    # Replace your operators with the separator character
    for op in operators:
        yourInputString = yourInputString.replace(op,sepChar)
    
    # output of your string so far
    # yourInputString
    # 'x=1 : x=2 OR x=3 : x=4 : x=5 : x=5 : y=1'
    
    # Create a list by splitting on the separator character and trimming spaces
    operationsList = [term.strip() for term in yourInputString.split(sepChar)]
    
    # operationsList
    # ['x=1', 'x=2 OR x=3', 'x=4', 'x=5', 'x=5', 'y=1']
    
    # For the second result, let's do another operation list:
    operators2 = ["OR"]
    output = []
    
    # Split each term on the remaining operators, always producing a list
    for term in operationsList:
        parts = [term]
        for operator in operators2:
            parts = [p.strip() for part in parts for p in part.split(operator)]
        output.append(parts)
    
    # output:
    # [['x=1'], ['x=2', 'x=3'], ['x=4'], ['x=5'], ['x=5'],['y=1']]
    
    

    In this case, I used ":" as the separator character, but you can change it to any character that does not occur in your input. Please let me know if this helps!
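
    The same flat split can also be sketched with re.split, which avoids the placeholder character and trims whitespace in one step. This is only a sketch under assumptions: the helper name split_terms is hypothetical, and the operators are assumed to be the upper-case words AND, AND NOT and OR separated by whitespace. Like the snippet above, it drops the parentheses and therefore loses the grouping.

```python
import re

def split_terms(expression: str) -> list:
    """Split a flat Boolean expression on AND / AND NOT, then on OR."""
    # Drop parentheses, as the snippet above does; this discards grouping.
    flat = expression.replace("(", "").replace(")", "")
    # \b keeps AND / OR from matching inside longer words such as "ORDER".
    groups = re.split(r"\s*\bAND(?:\s+NOT)?\b\s*", flat)
    return [re.split(r"\s*\bOR\b\s*", group) for group in groups]

expression = "x=1 AND (x=2 OR x=3) AND NOT (x=4 AND x=5) AND (x=5) AND y=1"
print(split_terms(expression))
# [['x=1'], ['x=2', 'x=3'], ['x=4'], ['x=5'], ['x=5'], ['y=1']]
```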

    Edit

    For an approach that keeps the parenthesis nesting, I came up with this:

    import re
    operators = ["AND NOT","AND","OR"]
    
    # Substitute parentheses with brackets and wrap the whole expression in one list
    yourInputString = "[" + yourInputString.replace("(","[").replace(")","]") + "]"
    
    # yourInputString
    # "[x=1 AND [x=2 OR x=3] AND NOT [x=4 AND x=5] AND [x=5] AND y=1]"
    
    # Replace your operators with commas ("AND NOT" must come before "AND")
    for op in operators:
        yourInputString = yourInputString.replace(op,",")
    
    # yourInputString
    # "[x=1 , [x=2 , x=3] , [x=4 , x=5] , [x=5] , y=1]"
    
    # Find matches like x=5 so they can be substituted with 'x=5'
    compiler = re.compile(r"[xyz]=\d")
    matches = compiler.findall(yourInputString)
    
    # matches
    # ['x=1', 'x=2', 'x=3', 'x=4', 'x=5', 'x=5', 'y=1']
    
    # Keep unique values only, so no term gets quoted twice (sorted for a stable order)
    matches = sorted(set(matches))
    
    # matches
    # ['x=1', 'x=2', 'x=3', 'x=4', 'x=5', 'y=1']
    
    # Replace your matches to add quotes to each element
    for match in matches:
        yourInputString = yourInputString.replace(match,f"'{match}'")
    
    
    # yourInputString
    # "['x=1' , ['x=2' , 'x=3'] , ['x=4' , 'x=5'] , ['x=5'] , 'y=1']"
    
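    # Here is the special move: convert your text into a list.
    # ast.literal_eval only parses literals, so it is safer than eval on untrusted input.
    import ast
    myList = ast.literal_eval(yourInputString)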
    
    # myList
    # ['x=1', ['x=2', 'x=3'], ['x=4', 'x=5'], ['x=5'], 'y=1']
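
    The nesting can also be built directly, without rewriting the string and evaluating it. The following is a minimal sketch, not a full Boolean-expression parser: the names TOKEN, OPERATORS and parse_nested are hypothetical, terms are assumed to contain no spaces or parentheses, and the operators themselves are discarded just as in the bracket-substitution approach above. A stack tracks which list is currently being filled, so each "(" opens a new sub-list and each ")" closes it.

```python
import re

# Parentheses, the three operators, or any run of non-space, non-paren characters.
TOKEN = re.compile(r"\(|\)|\bAND\s+NOT\b|\bAND\b|\bOR\b|[^\s()]+")
OPERATORS = {"AND", "OR", "AND NOT", "NOT"}

def parse_nested(expression: str) -> list:
    """Return the terms as a nested list that mirrors the parentheses."""
    stack = [[]]  # stack[-1] is the list currently being filled
    for match in TOKEN.finditer(expression):
        token = " ".join(match.group().split())  # normalise "AND   NOT"
        if token == "(":
            group = []
            stack[-1].append(group)
            stack.append(group)
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        elif token not in OPERATORS:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]

expression = "x=1 AND (x=2 OR x=3) AND NOT (x=4 AND x=5) AND (x=5) AND y=1"
print(parse_nested(expression))
# ['x=1', ['x=2', 'x=3'], ['x=4', 'x=5'], ['x=5'], 'y=1']
```

    Unlike the string-rewriting version, this raises ValueError on unbalanced parentheses and handles deeper nesting such as "(a=1 AND (b=2 OR c=3))".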
    

    Let me know if that helped! Best!
