Parsing Selector struct with alternating tokens using Boost Spirit X3

深忆病人 2021-01-27 04:04

I am trying to parse the following struct:

struct Selector {
    std::string element;
    std::string id;
    std::vector<std::string> classes;
};
        
2 answers
  •  小鲜肉 (original poster)
    2021-01-27 04:34

    This may not be what you are looking for (if so, please tell me and I will delete the answer), but for parsing this simple you need neither Boost nor Spirit.

    A simple regex is enough to split the given string into tokens. We can observe the following:

    • An "element" name starts at the beginning of the line and is a string of alphanumeric characters (the \w used below also matches underscores).
    • The "id" always starts with a hash #
    • The class names always start with a dot .

    So, we can form a single regex to match those 3 types of tokens.

    ((^\w+)|[\.#]\w+)
    

    You may look here for an explanation of the regex.
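
    To make the behaviour of this regex concrete, here is a minimal sketch that only performs the tokenizing step. The helper name tokenize is my own and purely illustrative; it simply collects every match of the regex shown above.

    ```cpp
    #include <regex>
    #include <string>
    #include <vector>

    // Hypothetical helper (illustration only): returns every substring of
    // `selector` that the regex from this answer matches, in order.
    std::vector<std::string> tokenize(const std::string& selector) {
        static const std::regex re{ R"(((^\w+)|[\.#]\w+))" };
        std::vector<std::string> tokens;
        // sregex_iterator walks over all non-overlapping matches
        for (std::sregex_iterator it(selector.begin(), selector.end(), re), end; it != end; ++it)
            tokens.push_back(it->str());
        return tokens;
    }
    ```

    With this sketch, tokenize("div#main.wide.dark") gives div, #main, .wide and .dark. Note that tokenize(".a,b#c") gives only .a and #c: the b after the comma matches none of the three token types and is skipped without any error.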

    Then we can write a simple program that reads the selectors, splits each one into tokens, and assigns those to the Selector struct.

    Please see the following example. This should give you an idea on how it could be done.

    #include <iostream>
    #include <string>
    #include <vector>
    #include <sstream>
    #include <regex>
    #include <cctype>
    #include <utility>
    
    struct Selector {
        std::string element;
        std::string id;
        std::vector<std::string> classes;
    };
    
    std::stringstream inputFileStream{ R"(element1#id1.class11.class12.class13.class14
    element2#id2.class21.class22
    #id3.class31.class32.class33.class34.class35
    .class41.class42,class43#id4
    .class51#id5.class52.class53.class54.class55.class56
    )"};
    
    //std::regex re{R"(([\.#]?\w+))"};
    std::regex re{ R"(((^\w+)|[\.#]\w+))" };
    
    int main() {
    
        std::vector<Selector> selectors{};
    
        // Read all lines of the source file
        for (std::string line{}; std::getline(inputFileStream, line); ) {
    
            // Split the line with selector string into tokens
            std::vector<std::string> tokens(std::sregex_token_iterator(line.begin(), line.end(), re), {});
    
            // Here we will store the one single selector
            Selector tempSelector{};
    
            // Go through all tokens and check the type of each
            for (const std::string& token : tokens) {
    
                // Depending on the token's leading character, store it in the matching struct field
                if (token[0] == '#') tempSelector.id = token.substr(1);
                else if (token[0] == '.') tempSelector.classes.emplace_back(token.substr(1));
                // \w also matches '_', so accept it as the first character of an element name
                else if (std::isalnum(static_cast<unsigned char>(token[0])) || token[0] == '_') tempSelector.element = token;
                else std::cerr << "\n*** Error: Invalid token found: " << token << "\n";
            }
            // Add the new selector to the vector of selectors
            selectors.push_back(std::move(tempSelector));
        }
    
    
        // Show debug output
        for (const Selector& s : selectors) {
            std::cout << "\n\nSelector\n\tElement:\t" << s.element << "\n\tID:\t\t" << s.id << "\n\tClasses:\t";
            for (const std::string& c : s.classes)
                std::cout << c << " ";
        }
        std::cout << "\n\n";
    
        return 0;
    }
    

    Of course we could do a more sophisticated regex with some additional checking.
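
    As one possible direction for such additional checking, here is a rough sketch. Everything in it beyond the token regex is an assumption of mine: the helper name parseSelector is hypothetical, the whole-line pattern is my guess at the intended grammar (an optional element name followed by any number of .class or #id parts), and treating a second id as an error is a design choice, not something the question states.

    ```cpp
    #include <regex>
    #include <string>
    #include <utility>
    #include <vector>

    struct Selector {
        std::string element;
        std::string id;
        std::vector<std::string> classes;
    };

    // Hypothetical helper (illustration only): fills `out` and returns true if
    // `line` is a well-formed selector under the assumed grammar; returns false
    // and leaves `out` untouched otherwise.
    bool parseSelector(const std::string& line, Selector& out) {
        // Assumed whole-line grammar: optional element name, then ".class" / "#id" parts
        static const std::regex whole{ R"(\w*(?:[.#]\w+)*)" };
        // Same token regex as used earlier in this answer
        static const std::regex token{ R"(((^\w+)|[\.#]\w+))" };

        // Reject empty lines and lines with characters outside the grammar (e.g. a comma)
        if (line.empty() || !std::regex_match(line, whole)) return false;

        Selector result{};
        bool idSeen = false;
        for (std::sregex_iterator it(line.begin(), line.end(), token), end; it != end; ++it) {
            const std::string t = it->str();
            if (t[0] == '#') {
                if (idSeen) return false;        // assumption: at most one id per selector
                result.id = t.substr(1);
                idSeen = true;
            } else if (t[0] == '.') {
                result.classes.push_back(t.substr(1));
            } else {
                result.element = t;              // only possible for the first token
            }
        }
        out = std::move(result);
        return true;
    }
    ```

    With this variant, a line such as .class41.class42,class43#id4 is rejected as a whole instead of silently losing class43, which may or may not be what you want.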
