Boost Karma generator for composition of classes

名媛妹妹 2021-01-07 08:51

I have the following class diagram:

There are some unused classes like BinaryOperator, but my real code needs them so I want to keep them also in t

2 Answers
  • 2021-01-07 08:55

    Thanks, the fact is that json is only one of many formats that I must to use, some format is proprietary and there are no libraries, so I want to use an uniform way for all. I've decided to use json for the question because is known to community more than, for example, asciimath or other formats created by us – Jepessen 9 hours ago

    This changes nothing about my recommendation. If anything, it really emphasizes that you don't want arbitrary restrictions imposed.

    The problems with Karma

    • Karma is an "inline" DSL for statically generated generators. They work well for statically typed things. Your AST uses dynamic polymorphism.

      That removes any chance of writing a succinct generator barring the use of many, complicated semantic actions. I don't remember writing many explicit answers related to Karma, but the problems with both dynamic polymorphism and semantic actions are much the same on the Qi side:

      • How can I use polymorphic attributes with boost::spirit::qi parsers?
      • Boost Spirit: "Semantic actions are evil"?

      The key drawbacks all apply, except obviously that AST creation is not happening, so the performance effect of allocations is less severe than with Qi parsers.

      However, the same logic still stands: Karma generators are statically combined for efficiency, and your dynamic type hierarchy precludes most of that efficiency. In other words, you are not the target audience for Karma.

    • Karma has another structural limitation that will bite here, regardless of the way your AST is designed: it's (very) hard to make use of stateful rules to do pretty printing.

      This is, for me, a key reason to practically never use Karma. Even if pretty printing isn't a goal, you can still get similar mileage by generating output while visiting the AST using Boost Fusion directly (we used this in our project to generate different versions of OData XML and JSON representations of API types for use in RESTful APIs).

      Granted, there are some stateful generating tasks that have custom directives built into Karma, and sometimes they hit the sweet spot for rapid prototyping, e.g.

      • Writing a Boost ublas matrix to a text file
      • Though it gets unwieldy quickly: Inconsistent Generator directive column behavior in boost karma. Many quirks arise due to the way sub-generators are "counted" w.r.t. the columns[] directive.
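
    To make the first point concrete, here is a minimal sketch (not part of the original answer) of what a statically typed AST looks like in plain C++17. The names Num, Call, Node and describe are made up for illustration, and std::variant stands in for boost::variant. Because the set of node types is closed and known at compile time, dispatch needs no dynamic_cast, which is the kind of data Karma works well with:

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical statically typed AST: every possible node type is listed
// in the variant, so the compiler knows the full set of alternatives.
struct Num  { int value; };
struct Call { std::string name; std::vector<int> args; };
using Node = std::variant<Num, Call>;

// One overload per alternative; the right one is selected by std::visit
// without any dynamic_cast or virtual function.
struct Describe {
    std::string operator()(Num const& n) const {
        return "Num(" + std::to_string(n.value) + ")";
    }
    std::string operator()(Call const& c) const {
        return "Call(" + c.name + "/" + std::to_string(c.args.size()) + ")";
    }
};

std::string describe(Node const& n) { return std::visit(Describe{}, n); }
```

    Here describe(Node{Num{4}}) returns "Num(4)". A class hierarchy like the one in the question has no such closed list of alternatives, so some conversion step is needed before a statically composed generator can use it.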

    Let's Do It Anyways

    Because I'm not a masochist, I'll borrow a concept from the other answer: creating an intermediate representation that suits Karma a lot better.

    In this sample the intermediate representation can be exceedingly simple, but I suspect your other requirements like "for example, asciimath or other formats created by us" will require a more detailed design.

    ///////////////////////////////////////////////////////////////////////////////
    // A simple intermediate representation
    #include <boost/variant.hpp>
    #include <string>
    #include <vector>
    namespace output_ast {
        struct Function;
        struct Value;
        using Expression = boost::variant<Function, Value>;
    
        using Arguments = std::vector<Expression>;
    
        struct Value    { std::string name, value; };
        struct Function { std::string name; Arguments args; };
    }
    

    Firstly, because we're going to use Karma, we do need to actually adapt the intermediate representation:

    #include <boost/fusion/include/struct.hpp>
    BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
    BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)
    

    A generator

    Here's the simplest generator I can think of, give or take two things:

    • I have tweaked it for considerable time to get some "readable" format. It gets simpler if you remove all insignificant whitespace.
    • I opted not to store redundant information (such as the static "type" representation) in the intermediate representation. Storing it would simplify things slightly, mostly by making the type rule more similar to name and value.

    namespace karma_json {
        namespace ka = boost::spirit::karma;
    
        template <typename It>
        struct Generator : ka::grammar<It, output_ast::Expression()> {
            Generator() : Generator::base_type(expression) {
                expression = function|value;
    
                function
                    = "{\n  " << ka::delimit(",\n  ") 
                       [name << type(+"Function") ]
                    << arguments 
                    << "\n}"
                    ;
    
                arguments = "\"arguments\": [" << -(("\n  " << expression) % ",") << ']';
    
                value
                    = "{\n  " << ka::delimit(",\n  ") 
                        [name << type(+"Value") ]
                    << value_ 
                    << "\n}"
                    ;
    
                type   = "\"type\":\"" << ka::string(ka::_r1) << "\"";
                string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
                name   = "\"name\":" << string;
                value_ = "\"value\":" << string;
            }
    
          private:
            ka::rule<It, output_ast::Expression()> expression;
            ka::rule<It, output_ast::Function()> function;
            ka::rule<It, output_ast::Arguments()> arguments;
            ka::rule<It, output_ast::Value()> value;
            ka::rule<It, std::string()> string, name, value_;
            ka::rule<It, void(std::string)> type;
        };
    }
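
    The string rule is the densest line in that grammar, so here is a plain C++ sketch of what it emits. This is an illustration only; quote_like_rule is a made-up name and the function is not part of the answer's code. It wraps the text in double quotes and prefixes every '"' and '\' with a backslash:

```cpp
#include <cassert>
#include <string>

// Plain C++ counterpart of the Karma `string` rule: surround the text with
// double quotes and put a backslash in front of every '"' and '\'.
// Like the rule, it leaves control characters such as '\n' untouched.
std::string quote_like_rule(std::string const& in) {
    std::string out = "\"";
    for (char c : in) {
        if (c == '"' || c == '\\') out += '\\';  // first alternative of the rule
        out += c;                                // the character itself
    }
    out += '"';
    return out;
}
```

    For example, quote_like_rule("Plus") returns "Plus" including the surrounding double quotes. In the Karma version the same effect comes from the alternative: the first branch only succeeds for the two special characters, otherwise the plain ka::char_ branch emits the character unchanged.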
    

    Post Scriptum

    I was writing the simplified version for completeness and ran into this excellent demonstration of completely unobvious attribute handling quirks. The following (just stripping whitespace handling) does not work:

    function = '{' << ka::delimit(',') [name << type] << arguments << '}';
    value = '{' << ka::delimit(',') [name << type] << value_ << '}' ;
    

    You can read the error novel here in case you like drama. The problem is that the delimit[] block magically consolidates the attributes into a single string (huh). The error message reflects that the string attribute has not been consumed when e.g. starting the arguments generator.

    The most direct way to treat the symptom would be to break up the attribute, but there's no real way:

    function = '{' << ka::delimit(',') [name << ka::eps << type] << arguments << '}';
    value = '{' << ka::delimit(',') [name << ka::eps << type] << value_ << '}' ;
    

    No difference

    function = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << arguments << '}';
    value = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << value_ << '}' ;
    

    Would be nice if it actually worked. No amount of adding includes or replacing with incantations like ka::as<std::string>()[...] made the compilation error go away.²

    So, to just end this sob-story, we'll stoop to the mind-numbingly tedious:

    function = '{' << name << ',' << type << ',' << arguments << '}';
    arguments = "\"arguments\":[" << -(expression % ',') << ']';
    

    See the section labeled "Simplified Version" below for the live demo.

    Using it

    The shortest way to generate using that grammar is to create the intermediate representation:

    ///////////////////////////////////////////////////////////////////////////////
    // Expression -> output_ast
    struct serialization {
        static output_ast::Expression call(Expression const* e) {
            if (auto* f = dynamic_cast<Function const*>(e)) {
                output_ast::Arguments args;
                for (auto& a : f->m_arguments) args.push_back(call(a));
                return output_ast::Function { f->getName(), args };
            }
    
            if (auto* v = dynamic_cast<Value const*>(e)) {
                return output_ast::Value { v->getName(), v->getValue() };
            }
    
            return {}; // unhandled node type: default-constructed variant (an empty Function)
        }
    };
    
    auto to_output(Expression const* expression) {
        return serialization::call(expression);
    }
    

    And use that:

    using It = boost::spirit::ostream_iterator;
    std::cout << format(karma_json::Generator<It>{}, to_output(plus1));
    

    Full Demo

    Live On Wandbox¹

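    #include <boost/lexical_cast.hpp>
    #include <iostream>
    #include <string>
    #include <vector>
    
    struct Expression {
        virtual ~Expression() = default; // polymorphic base: safe to delete through Expression*
        virtual std::string getName() const = 0;
    };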
    
    struct Value : Expression {
        virtual std::string getValue() const = 0;
    };
    
    struct IntegerValue : Value {
        IntegerValue(int value) : m_value(value) {}
        virtual std::string getName() const override { return "IntegerValue"; }
        virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }
    
      private:
        int m_value;
    };
    
    struct Function : Expression {
        void addArgument(Expression *expression) { m_arguments.push_back(expression); }
        virtual std::string getName() const override { return m_name; }
    
      protected:
        std::vector<Expression *> m_arguments;
        std::string m_name;
    
        friend struct serialization;
    };
    
    struct Plus : Function {
        Plus() : Function() { m_name = "Plus"; }
    };
    
    ///////////////////////////////////////////////////////////////////////////////
    // A simple intermediate representation
    #include <boost/variant.hpp>
    namespace output_ast {
        struct Function;
        struct Value;
        using Expression = boost::variant<Function, Value>;
    
        using Arguments = std::vector<Expression>;
    
        struct Value    { std::string name, value; };
        struct Function { std::string name; Arguments args; };
    }
    
    #include <boost/fusion/include/struct.hpp>
    BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
    BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)
    
    #include <boost/spirit/include/karma.hpp>
    namespace karma_json {
        namespace ka = boost::spirit::karma;
    
        template <typename It>
        struct Generator : ka::grammar<It, output_ast::Expression()> {
            Generator() : Generator::base_type(expression) {
                expression = function|value;
    
                function
                    = "{\n  " << ka::delimit(",\n  ") 
                       [name << type(+"Function") ]
                    << arguments 
                    << "\n}"
                    ;
    
                arguments = "\"arguments\": [" << -(("\n  " << expression) % ",") << ']';
    
                value
                    = "{\n  " << ka::delimit(",\n  ") 
                        [name << type(+"Value") ]
                    << value_ 
                    << "\n}"
                    ;
    
                type   = "\"type\":\"" << ka::string(ka::_r1) << "\"";
                string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
                name   = "\"name\":" << string;
                value_ = "\"value\":" << string;
            }
    
          private:
            ka::rule<It, output_ast::Expression()> expression;
            ka::rule<It, output_ast::Function()> function;
            ka::rule<It, output_ast::Arguments()> arguments;
            ka::rule<It, output_ast::Value()> value;
            ka::rule<It, std::string()> string, name, value_;
            ka::rule<It, void(std::string)> type;
        };
    }
    
    ///////////////////////////////////////////////////////////////////////////////
    // Expression -> output_ast
    struct serialization {
        static output_ast::Expression call(Expression const* e) {
            if (auto* f = dynamic_cast<Function const*>(e)) {
                output_ast::Arguments args;
                for (auto& a : f->m_arguments) args.push_back(call(a));
                return output_ast::Function { f->getName(), args };
            }
    
            if (auto* v = dynamic_cast<Value const*>(e)) {
                return output_ast::Value { v->getName(), v->getValue() };
            }
    
            return {}; // unhandled node type: default-constructed variant (an empty Function)
        }
    };
    
    auto to_output(Expression const* expression) {
        return serialization::call(expression);
    }
    
    int main() {
        // Build expression 4 + 5 + 6 as 4 + (5 + 6)
        Function *plus1 = new Plus();
        Function *plus2 = new Plus();
        Value *iv4 = new IntegerValue(4);
        Value *iv5 = new IntegerValue(5);
        Value *iv6 = new IntegerValue(6);
        plus2->addArgument(iv5);
        plus2->addArgument(iv6);
        plus1->addArgument(iv4);
        plus1->addArgument(plus2);
    
        // Generate the JSON string through the Karma grammar
        using It = boost::spirit::ostream_iterator;
        std::cout << format(karma_json::Generator<It>{}, to_output(plus1));
    }
    

    The Output

    The generator is not as readable/robust/functional as I'd like (there are quirks related to delimiters, there are issues when type contains characters that would need to be quoted, and there's no stateful indentation).

    The result doesn't look as expected, though it's valid JSON:

    {
      "name":"Plus",
      "type":"Function",
      "arguments": [
      {
      "name":"IntegerValue",
      "type":"Value",
      "value":"4"
    },
      {
      "name":"Plus",
      "type":"Function",
      "arguments": [
      {
      "name":"IntegerValue",
      "type":"Value",
      "value":"5"
    },
      {
      "name":"IntegerValue",
      "type":"Value",
      "value":"6"
    }]
    }]
    }
    

    Fixing it is... a nice challenge if you want to try it.

    The Simplified Version

    The simplified version, complete with attribute-handling workaround documented above:

    Live On Coliru

    namespace karma_json {
        namespace ka = boost::spirit::karma;
    
        template <typename It>
        struct Generator : ka::grammar<It, output_ast::Expression()> {
            Generator() : Generator::base_type(expression) {
                expression = function|value;
    
                function = '{' << name << ',' << type << ',' << arguments << '}';
                arguments = "\"arguments\":[" << -(expression % ',') << ']';
    
                value = '{' << name << ',' << type << ',' << value_ << '}' ;
    
                string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
                type   = "\"type\":" << string;
                name   = "\"name\":" << string;
                value_ = "\"value\":" << string;
            }
    
          private:
            ka::rule<It, output_ast::Expression()> expression;
            ka::rule<It, output_ast::Function()> function;
            ka::rule<It, output_ast::Arguments()> arguments;
            ka::rule<It, output_ast::Value()> value;
            ka::rule<It, std::string()> string, name, type, value_;
        };
    }
    

    Yields the following output:

    {"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"4"},{"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"5"},{"name":"IntegerValue","type":"Value","value":"6"}]}]}
    

    I'm inclined to think this has a much better cost/benefit ratio than the failed attempt at "pretty" formatting. But the real story here is that the maintenance cost is through the roof anyways.
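
    If indented output is still wanted, one option that keeps the grammar simple is to re-indent the compact text in a separate pass. The sketch below is an assumption-laden illustration, not something from the answer: reindent is a hypothetical helper, and it assumes its input is valid JSON without insignificant whitespace, such as the output shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Re-indent compact JSON text. Structural characters are only interpreted
// outside string literals; escape sequences inside strings are copied as-is.
std::string reindent(std::string const& compact, std::string const& step = "  ") {
    std::string out;
    std::size_t depth = 0;
    bool in_string = false;

    auto newline = [&] {
        out += '\n';
        for (std::size_t i = 0; i < depth; ++i) out += step;
    };

    for (std::size_t i = 0; i < compact.size(); ++i) {
        char c = compact[i];
        if (in_string) {
            out += c;
            if (c == '\\' && i + 1 < compact.size()) out += compact[++i]; // keep the escaped char
            else if (c == '"') in_string = false;
            continue;
        }
        switch (c) {
            case '"': in_string = true; out += c; break;
            case '{': case '[': {
                char closer = (c == '{') ? '}' : ']';
                out += c;
                if (i + 1 < compact.size() && compact[i + 1] == closer) {
                    out += closer; ++i;          // keep empty containers on one line
                } else {
                    ++depth; newline();
                }
                break;
            }
            case '}': case ']':
                if (depth > 0) --depth;
                newline(); out += c;
                break;
            case ',': out += c; newline(); break;
            case ':': out += ": "; break;
            default:  out += c;
        }
    }
    return out;
}
```

    With this, the indentation state lives in two local variables instead of inside the generator rules, which is the kind of state that is awkward to thread through Karma.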


    ¹ Interestingly, Coliru exceeds its compilation time limit... This too could be an argument guiding your design decisions

    ² makes you wonder how many people actually use Karma day-to-day

  • 2021-01-07 09:11

    I'd advise against using Karma to generate JSON. I'd advise strongly against ADAPT_ADT (it's prone to very subtle UB bugs and it means you're trying to adapt something that wasn't designed for it. Just say no).

    Here's my take on it. Let's take the high road and be as unintrusive as possible. That means

    • We can't just overload operator<< to print json (because you may want to naturally print the expressions instead)
    • It also means that whatever function is responsible for generating the JSON doesn't

      • have to bother with json implementation details
      • have to bother with pretty formatting
    • Finally, I wouldn't want to intrude on the expression tree with anything JSON specific. The most that could be acceptable is an opaque friend declaration.


    A simple JSON facility:

    This might well be the most simplistic JSON representation, but it does the required subset and makes a number of smart choices (supporting duplicate properties, retaining property order for example):

    #include <boost/variant.hpp>
    #include <string>
    #include <utility>
    #include <vector>
    namespace json {
        // adhoc JSON rep
        struct Null {};
        using String = std::string;
    
        using Value = boost::make_recursive_variant<
            Null,
            String,
            std::vector<boost::recursive_variant_>,
            std::vector<std::pair<String, boost::recursive_variant_> >
        >::type;
    
        using Property = std::pair<String, Value>;
        using Object = std::vector<Property>;
        using Array = std::vector<Value>;
    }
    

    That's all. This is fully functional. Let's prove it.
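
    First, a quick standard-library-only sketch of why Object being a vector of pairs matters. The helper names are hypothetical and std::string stands in for json::Value to keep it short. A vector keeps properties in insertion order and tolerates repeated keys, whereas a std::map sorts the keys and collapses duplicates:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Property = std::pair<std::string, std::string>;

// Keys in the order a vector-of-pairs object would print them.
std::vector<std::string> keys_in_vector(std::vector<Property> const& object) {
    std::vector<std::string> keys;
    for (auto const& p : object) keys.push_back(p.first);
    return keys;
}

// Keys after the same properties went through a std::map:
// sorted by key, with each key appearing only once.
std::vector<std::string> keys_in_map(std::vector<Property> const& object) {
    std::map<std::string, std::string> m(object.begin(), object.end());
    std::vector<std::string> keys;
    for (auto const& p : m) keys.push_back(p.first);
    return keys;
}
```

    For the properties name, type, name the vector version yields all three keys in that order, while the map version yields only name and type.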


    Pretty Printing JSON

    Like with the Expression tree itself, let's not hardwire this, but instead create a pretty-printing IO manipulator:

    #include <iomanip>
    namespace json {
    
        // pretty print it
        struct pretty_io {
            using result_type = void;
    
            template <typename Ref>
            struct manip {
                Ref ref;
                friend std::ostream& operator<<(std::ostream& os, manip const& m) {
                    pretty_io{os,""}(m.ref);
                    return os;
                }
            };
    
            std::ostream& _os;
            std::string _indent;
    
            void operator()(Value const& v) const {
                boost::apply_visitor(*this, v);
            }
            void operator()(Null) const {
                _os << "null";
            }
            void operator()(String const& s) const {
                _os << std::quoted(s);
            }
            void operator()(Property const& p) const {
                _os << '\n' << _indent; operator()(p.first);
                _os << ": ";            operator()(p.second);
            }
            void operator()(Object const& o) const {
                pretty_io nested{_os, _indent+"  "};
                _os << "{";
                bool first = true;
                for (auto& p : o) { first||_os << ","; nested(p); first = false; }
                _os << "\n" << _indent << "}";
            }
            void operator()(Array const& o) const {
                pretty_io nested{_os, _indent+"  "};
                _os << "[\n" << _indent << "  ";
                bool first = true;
                for (auto& p : o) { first||_os << ",\n" << _indent << "  "; nested(p); first = false; }
                _os << "\n" << _indent << "]";
            }
        };
    
        Value to_json(Value const& v) { return v; }
    
        template <typename T, typename V = decltype(to_json(std::declval<T const&>()))>
        pretty_io::manip<V> pretty(T const& v) { return {to_json(v)}; }
    }
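
    One caveat about the String overload: std::quoted only escapes the quote character and the backslash, which is enough for this demo but not for arbitrary text, since JSON also requires control characters to be escaped. If that matters, something along these lines could replace it. This is a sketch with a made-up name (json_quote), and it treats the input as bytes without validating UTF-8:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Quote a string for JSON output: escapes '"', '\\' and all control
// characters below 0x20. Bytes >= 0x20 are copied through unchanged.
std::string json_quote(std::string const& in) {
    std::string out = "\"";
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {  // remaining control characters: \u00XX
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    out += '"';
    return out;
}
```

    Inside pretty_io the String overload would then write json_quote(s) to the stream instead of std::quoted(s).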
    

    The to_json function doubles as a handy ADL-enabled extension point; you can already use it now:

    std::cout << json::pretty("hello world"); // prints as a JSON String
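
    For readers less familiar with argument-dependent lookup, here is the extension mechanism in isolation. The namespaces mini and app and the function render are invented for this sketch, and std::string stands in for json::Value. The library calls to_json unqualified, so an overload declared next to a user type is picked up without the library knowing about it:

```cpp
#include <cassert>
#include <string>

namespace mini {
    // Identity overload, analogous to json::to_json(Value const&).
    inline std::string to_json(std::string const& s) { return s; }

    // The unqualified call lets argument-dependent lookup find overloads
    // in the namespace of T when the template is instantiated.
    template <typename T>
    std::string render(T const& v) { return "<" + to_json(v) + ">"; }
}

namespace app {  // hypothetical user code that mini knows nothing about
    struct Point { int x, y; };

    inline std::string to_json(Point const& p) {
        return std::to_string(p.x) + "," + std::to_string(p.y);
    }
}
```

    Calling mini::render(app::Point{1, 2}) returns "<1,2>" even though mini never mentions Point.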
    

    Connecting it up

    To make the following work:

    std::cout << json::pretty(plus1);
    

    All we need is the appropriate to_json overload. We could jot it all in there, but we might end up needing to "friend" a function named to_json and, worse still, forward declare types from the json namespace (json::Value at the very least). That's too intrusive. So, let's add another tiny indirection:

    auto to_json(Expression const* expression) {
        return serialization::call(expression);
    }
    

    The trick is to hide the JSON stuff inside an opaque struct that we can then befriend: struct serialization. The rest is straightforward:

    struct serialization {
        static json::Value call(Expression const* e) {
            if (auto* f = dynamic_cast<Function const*>(e)) {
                json::Array args;
                for (auto& a : f->m_arguments)
                    args.push_back(call(a));
                return json::Object {
                    { "name", f->getName() },
                    { "type", "Function" },
                    { "arguments", args },
                };
            }
    
            if (auto* v = dynamic_cast<Value const*>(e)) {
                return json::Object {
                    { "name", v->getName() },
                    { "type", "Value" },
                    { "value", v->getValue() },
                };
            }
    
            return {}; // Null in case we didn't implement a node type
        }
    };
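
    Stripped of the JSON details, the befriending trick looks like this. It is a minimal sketch with an invented class (Shelf); only the name serialization is taken from the answer. The domain class grants access to a struct it never defines, so it stays free of any output-format types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// The domain class names its friend but mentions no output format at all.
class Shelf {
  public:
    void add(std::string title) { m_titles.push_back(std::move(title)); }

  private:
    std::vector<std::string> m_titles;
    friend struct serialization;  // opaque here: defined later, possibly in another header
};

// All format-specific knowledge lives in the befriended struct.
struct serialization {
    static std::string call(Shelf const& s) {
        std::string out = "[";
        for (std::size_t i = 0; i < s.m_titles.size(); ++i) {
            if (i) out += ",";
            out += '"' + s.m_titles[i] + '"';  // no escaping: illustration only
        }
        return out + "]";
    }
};
```

    Swapping the output format then means providing a different definition of serialization, without touching Shelf.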
    

    Full Demo

    See it Live On Coliru

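    #include <boost/lexical_cast.hpp>
    #include <iostream>
    #include <iomanip>
    #include <string>
    #include <vector>
    
    struct Expression {
        virtual ~Expression() = default; // polymorphic base: safe to delete through Expression*
        virtual std::string getName() const = 0;
    };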
    
    struct Value : Expression {
        virtual std::string getValue() const = 0;
    };
    
    struct IntegerValue : Value {
        IntegerValue(int value) : m_value(value) {}
        virtual std::string getName() const override { return "IntegerValue"; }
        virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }
    
      private:
        int m_value;
    };
    
    struct Function : Expression {
        void addArgument(Expression *expression) { m_arguments.push_back(expression); }
        virtual std::string getName() const override { return m_name; }
    
      protected:
        std::vector<Expression *> m_arguments;
        std::string m_name;
    
        friend struct serialization;
    };
    
    struct Plus : Function {
        Plus() : Function() { m_name = "Plus"; }
    };
    
    ///////////////////////////////////////////////////////////////////////////////
    // A simple JSON facility
    #include <boost/variant.hpp>
    namespace json {
        // adhoc JSON rep
        struct Null {};
        using String = std::string;
    
        using Value = boost::make_recursive_variant<
            Null,
            String,
            std::vector<boost::recursive_variant_>,
            std::vector<std::pair<String, boost::recursive_variant_> >
        >::type;
    
        using Property = std::pair<String, Value>;
        using Object = std::vector<Property>;
        using Array = std::vector<Value>;
    }
    
    ///////////////////////////////////////////////////////////////////////////////
    // Pretty Print manipulator
    #include <iomanip>
    namespace json {
    
        // pretty print it
        struct pretty_io {
            using result_type = void;
    
            template <typename Ref>
            struct manip {
                Ref ref;
                friend std::ostream& operator<<(std::ostream& os, manip const& m) {
                    pretty_io{os,""}(m.ref);
                    return os;
                }
            };
    
            std::ostream& _os;
            std::string _indent;
    
            void operator()(Value const& v) const {
                boost::apply_visitor(*this, v);
            }
            void operator()(Null) const {
                _os << "null";
            }
            void operator()(String const& s) const {
                _os << std::quoted(s);
            }
            void operator()(Property const& p) const {
                _os << '\n' << _indent; operator()(p.first);
                _os << ": ";            operator()(p.second);
            }
            void operator()(Object const& o) const {
                pretty_io nested{_os, _indent+"  "};
                _os << "{";
                bool first = true;
                for (auto& p : o) { first||_os << ","; nested(p); first = false; }
                _os << "\n" << _indent << "}";
            }
            void operator()(Array const& o) const {
                pretty_io nested{_os, _indent+"  "};
                _os << "[\n" << _indent << "  ";
                bool first = true;
                for (auto& p : o) { first||_os << ",\n" << _indent << "  "; nested(p); first = false; }
                _os << "\n" << _indent << "]";
            }
        };
    
        Value to_json(Value const& v) { return v; }
    
        template <typename T, typename V = decltype(to_json(std::declval<T const&>()))>
        pretty_io::manip<V> pretty(T const& v) { return {to_json(v)}; }
    }
    
    ///////////////////////////////////////////////////////////////////////////////
    // Expression -> JSON
    struct serialization {
        static json::Value call(Expression const* e) {
            if (auto* f = dynamic_cast<Function const*>(e)) {
                json::Array args;
                for (auto& a : f->m_arguments)
                    args.push_back(call(a));
                return json::Object {
                    { "name", f->getName() },
                    { "type", "Function" },
                    { "arguments", args },
                };
            }
    
            if (auto* v = dynamic_cast<Value const*>(e)) {
                return json::Object {
                    { "name", v->getName() },
                    { "type", "Value" },
                    { "value", v->getValue() },
                };
            }
    
            return {}; // Null in case we didn't implement a node type
        }
    };
    
    auto to_json(Expression const* expression) {
        return serialization::call(expression);
    }
    
    int main() {
        // Build expression 4 + 5 + 6 as 4 + (5 + 6)
        Function *plus1 = new Plus();
        Function *plus2 = new Plus();
        Value *iv4 = new IntegerValue(4);
        Value *iv5 = new IntegerValue(5);
        Value *iv6 = new IntegerValue(6);
        plus2->addArgument(iv5);
        plus2->addArgument(iv6);
        plus1->addArgument(iv4);
        plus1->addArgument(plus2);
    
        // Generate the JSON string
    
        std::cout << json::pretty(plus1);
    }
    

    The output matches the one in your question exactly:

    {
      "name": "Plus",
      "type": "Function",
      "arguments": [
        {
          "name": "IntegerValue",
          "type": "Value",
          "value": "4"
        },
        {
          "name": "Plus",
          "type": "Function",
          "arguments": [
            {
              "name": "IntegerValue",
              "type": "Value",
              "value": "5"
            },
            {
              "name": "IntegerValue",
              "type": "Value",
              "value": "6"
            }
          ]
        }
      ]
    }
    