Parsing Selector struct with alternating tokens using Boost Spirit X3

深忆病人 2021-01-27 04:04

I am trying to parse the following struct:

struct Selector {
    std::string element;
    std::string id;
    std::vector<std::string> classes;
};

2 Answers
  • 2021-01-27 04:34

    If this is not what you are looking for, let me know and I will delete the answer. But for parsing this simple, you need neither Boost nor Spirit.

    A simple regex is enough to split the given string into tokens. We can observe the following:

    • An "element" name starts at the beginning of the line and is a string of alphanumeric characters.
    • The "id" always starts with a hash #.
    • The class names always start with a dot .

    So, we can form a single regex to match those 3 types of tokens.

    ((^\w+)|[\.#]\w+)
    

    In this regex, ^\w+ matches an element name at the start of the line, while [\.#]\w+ matches a token introduced by a hash or a dot.

    Then we can write a simple program that reads the selectors, splits each one into tokens, and assigns those tokens to the Selector struct.

    Please see the following example. It should give you an idea of how this could be done.

    #include <iostream>
    #include <vector>
    #include <regex>
    #include <sstream>
    #include <string>
    #include <iterator>
    #include <cctype>
    
    struct Selector {
        std::string element;
        std::string id;
        std::vector<std::string> classes;
    };
    
    std::stringstream inputFileStream{ R"(element1#id1.class11.class12.class13.class14
    element2#id2.class21.class22
    #id3.class31.class32.class33.class34.class35
    .class41.class42,class43#id4
    .class51#id5.class52.class53.class54.class55.class56
    )"};
    
    //std::regex re{R"(([\.#]?\w+))"};
    std::regex re{ R"(((^\w+)|[\.#]\w+))" };
    
    int main() {
    
        std::vector<Selector> selectors{};
    
        // Read all lines of the source file
        for (std::string line{}; std::getline(inputFileStream, line); ) {
    
            // Split the line with selector string into tokens
            std::vector<std::string> tokens(std::sregex_token_iterator(line.begin(), line.end(), re), {});
    
            // Here we will store the one single selector
            Selector tempSelector{};
    
            // Go through all tokens and check the type of each one
            for (const std::string& token : tokens) {
    
                // Depending on the leading character, store the token in the matching struct field
                if (token[0] == '#') tempSelector.id = token.substr(1);
                else if (token[0] == '.') tempSelector.classes.emplace_back(token.substr(1));
                else if (std::isalnum(static_cast<unsigned char>(token[0]))) tempSelector.element = token;
                else std::cerr << "\n*** Error: Invalid token found: " << token << "\n";
            }
            // Add the new selector to the vector of selectors
            selectors.push_back(std::move(tempSelector));
        }
    
    
        // Show debug output
        for (const Selector& s : selectors) {
            std::cout << "\n\nSelector\n\tElement:\t" << s.element << "\n\tID:\t\t" << s.id << "\n\tClasses:\t";
            for (const std::string& c : s.classes)
                std::cout << c << " ";
        }
        std::cout << "\n\n";
    
        return 0;
    }
    

    Of course we could do a more sophisticated regex with some additional checking.

  • 2021-01-27 04:38

    I've written similar answers before:

    • Parsing CSS with Boost.Spirit X3 (a treasure trove for more complete CSS parsing in both Qi and X3)
    • Using boost::spirit to parse named parameters in any order (Qi and X3 in the comments)
    • Boost Spirit x3: parse into structs
    • Combining rules at runtime and returning rules

    I don't think you can directly fusion-adapt. Although if you are very motivated (e.g. you already have the adapted structs) you could make some generic helpers off that.

    To be fair, a little bit of restructuring in your code seems pretty nice to me, already. Here's my effort to make it more elegant/convenient. I'll introduce a helper macro just like BOOST_FUSION_ADAPT_XXX, but not requiring any Boost Fusion.

    Let's Start With The AST

    As always, I like to start with the basics. Understanding the goal is half the battle:

    namespace Ast {
        using boost::optional;
    
        struct Selector {
            // These selectors always 
            //  - start with 1 or no elements, 
            //  - could contain 1 or no ids, and
            //  - could contain 0 to n classes.
            optional<std::string> element;
            optional<std::string> id;
            std::vector<std::string> classes;
    
            friend std::ostream& operator<<(std::ostream& os, Selector const&s) {
                if  (s.element.has_value()) os << s.element.value();
                if  (s.id.has_value())      os << "#" << s.id.value();
                for (auto& c : s.classes)   os << "." << c;
                return os;
            }
        };
    }
    

    Note that I fixed the optionality of some parts to reflect real life.

    You could use this to detect repeat-initialization of element/id fields.

    Magic Sauce (see below)

    #include "propagate.hpp"
    DEF_PROPAGATOR(Selector, id, element, classes)
    

    We'll dig into this later. Suffice it to say it generates the semantic actions that you had to tediously write.

    Main dish

    Now, we can simplify the parser rules a lot, and run the tests:

    int main() {
        auto name        = as<std::string>[x3::alpha >> *x3::alnum];
        auto idRule      = "#" >> name;
        auto classesRule = +("." >> name);
    
        auto selectorRule
            = x3::rule<class TestClass, Ast::Selector>{"selectorRule"}
            = +( name        [ Selector.element ]
               | idRule      [ Selector.id ]
               | classesRule [ Selector.classes ]
               )
            ;
    
        for (std::string const& input : {
                "element#id.class1.class2.classn",
                "element#id.class1",
                ".class1#id.class2.class3",
                "#id.class1.class2",
                ".class1.class2#id",
            })
        {
            Ast::Selector sel;
            std::cout << std::quoted(input) << " -->\n";
            if (x3::parse(begin(input), end(input), selectorRule >> x3::eoi, sel)) {
                std::cout << "\tSuccess: " << sel << "\n";
            } else {
                std::cout << "\tFailed\n";
            }
        }
    }
    

    See it Live On Wandbox, printing:

    "element#id.class1.class2.classn" -->
        Success: element#id.class1.class2.classn
    "element#id.class1" -->
        Success: element#id.class1
    ".class1#id.class2.class3" -->
        Success: #id.class1.class2.class3
    "#id.class1.class2" -->
        Success: #id.class1.class2
    ".class1.class2#id" -->
        Success: #id.class1.class2
    

    The Magic

    Now, how did I generate those actions? Using a little bit of Boost Preprocessor:

    #define MEM_PROPAGATOR(_, T, member) \
        Propagators::Prop<decltype(std::mem_fn(&T::member))> member { std::mem_fn(&T::member) };
    
    #define DEF_PROPAGATOR(type, ...) \
        struct type##S { \
            using T = Ast::type; \
            BOOST_PP_SEQ_FOR_EACH(MEM_PROPAGATOR, T, BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__)) \
        } static const type {};
    

    Now, you might see that it defines static const variables named like the Ast types.

    You're free to call this macro in another namespace, say namespace Actions { }

    The real magic is Propagators::Prop<F> which has a bit of dispatch to allow for container attributes and members. Otherwise it just relays to x3::traits::move_to:

    namespace Propagators {
        template <typename F>
        struct Prop {
            F f;
            template <typename Ctx>
            auto operator()(Ctx& ctx) const {
                return dispatch(x3::_attr(ctx), f(x3::_val(ctx)));
            }
          private:
            template <typename Attr, typename Dest>
            static inline void dispatch(Attr& attr, Dest& dest) {
                call(attr, dest, is_container(attr), is_container(dest));
            }
    
            template <typename T>
            static auto is_container(T const&)           { return x3::traits::is_container<T>{}; }
            static auto is_container(std::string const&) { return boost::mpl::false_{}; }
    
            // tags for dispatch
            using attr_is_container = boost::mpl::true_;
            using attr_is_scalar    = boost::mpl::false_;
            using dest_is_container = boost::mpl::true_;
            using dest_is_scalar    = boost::mpl::false_;
    
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_scalar, dest_is_scalar) {
                x3::traits::move_to(attr, dest);
            }
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_scalar, dest_is_container) {
                dest.insert(dest.end(), attr);
            }
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_container, dest_is_container) {
                dest.insert(dest.end(), attr.begin(), attr.end());
            }
        };
    }
    

    BONUS

    A lot of the complexity in the propagator type is from handling container attributes. However, you don't actually need any of that:

    auto name = as<std::string>[x3::alpha >> *x3::alnum];
    
    auto selectorRule
        = x3::rule<class selector_, Ast::Selector>{"selectorRule"}
        = +( name        [ Selector.element ]
           | '#' >> name [ Selector.id ]
           | '.' >> name [ Selector.classes ]
           )
        ;
    

    Is more than enough, and the propagation helper can be simplified to:

    namespace Propagators {
        template <typename F> struct Prop {
            F f;
            template <typename Ctx>
            auto operator()(Ctx& ctx) const {
                return call(x3::_attr(ctx), f(x3::_val(ctx)));
            }
          private:
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest) {
                x3::traits::move_to(attr, dest);
            }
            template <typename Attr, typename Elem>
            static inline void call(Attr& attr, std::vector<Elem>& dest) {
                dest.insert(dest.end(), attr);
            }
        };
    }
    

    As you can see, removing the tag dispatch simplifies the helper considerably.

    See the simplified version Live On Wandbox again.

    FULL LISTING

    For posterity on this site:

    • test.cpp

      //#define BOOST_SPIRIT_X3_DEBUG
      #include <boost/optional.hpp>
      #include <boost/spirit/home/x3.hpp>
      #include <iomanip>
      #include <iostream>
      #include <string>
      #include <vector>
      
      namespace x3 = boost::spirit::x3;
      
      namespace Ast {
          using boost::optional;
      
          struct Selector {
              // These selectors always 
              //  - start with 1 or no elements, 
              //  - could contain 1 or no ids, and
              //  - could contain 0 to n classes.
              optional<std::string> element;
              optional<std::string> id;
              std::vector<std::string> classes;
      
              friend std::ostream& operator<<(std::ostream& os, Selector const&s) {
                  if  (s.element.has_value()) os << s.element.value();
                  if  (s.id.has_value())      os << "#" << s.id.value();
                  for (auto& c : s.classes)   os << "." << c;
                  return os;
              }
          };
      }
      
      #include "propagate.hpp"
      DEF_PROPAGATOR(Selector, id, element, classes)
      
      #include "as.hpp"
      int main() {
          auto name = as<std::string>[x3::alpha >> *x3::alnum];
      
          auto selectorRule
              = x3::rule<class selector_, Ast::Selector>{"selectorRule"}
              = +( name        [ Selector.element ]
                 | '#' >> name [ Selector.id ]
                 | '.' >> name [ Selector.classes ]
                 )
              ;
      
          for (std::string const& input : {
                  "element#id.class1.class2.classn",
                  "element#id.class1",
                  ".class1#id.class2.class3",
                  "#id.class1.class2",
                  ".class1.class2#id",
              })
          {
              Ast::Selector sel;
              std::cout << std::quoted(input) << " -->\n";
              if (x3::parse(begin(input), end(input), selectorRule >> x3::eoi, sel)) {
                  std::cout << "\tSuccess: " << sel << "\n";
              } else {
                  std::cout << "\tFailed\n";
              }
          }
      }
      
    • propagate.hpp

      #pragma once
      #include <boost/preprocessor/cat.hpp>
      #include <boost/preprocessor/seq/for_each.hpp>
      #include <boost/preprocessor/variadic/to_seq.hpp>
      #include <functional>
      #include <vector>
      
      namespace Propagators {
          template <typename F> struct Prop {
              F f;
              template <typename Ctx>
              auto operator()(Ctx& ctx) const {
                  return call(x3::_attr(ctx), f(x3::_val(ctx)));
              }
            private:
              template <typename Attr, typename Dest>
              static inline void call(Attr& attr, Dest& dest) {
                  x3::traits::move_to(attr, dest);
              }
              template <typename Attr, typename Elem>
              static inline void call(Attr& attr, std::vector<Elem>& dest) {
                  dest.insert(dest.end(), attr);
              }
          };
      }
      
      #define MEM_PROPAGATOR(_, T, member) \
          Propagators::Prop<decltype(std::mem_fn(&T::member))> member { std::mem_fn(&T::member) };
      
      #define DEF_PROPAGATOR(type, ...) \
          struct type##S { \
              using T = Ast::type; \
              BOOST_PP_SEQ_FOR_EACH(MEM_PROPAGATOR, T, BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__)) \
          } static const type {};
      
    • as.hpp

      #pragma once
      #include <boost/spirit/home/x3.hpp>
      
      namespace {
          template <typename T>
          struct as_type {
              template <typename...> struct tag{};
              template <typename P>
              auto operator[](P p) const {
                  return boost::spirit::x3::rule<tag<T,P>, T> {"as"}
                         = p;
              }
          };
      
          template <typename T>
              static inline const as_type<T> as = {};
      }
      