I know that this is an older question now, but I thought I'd throw my 2 pence in!
With boost you get the opportunity to I'm write some data validation in your classes; this is good because the data definition and the checks for validity are all in one place.
With GPB the best you can do is to put comments in the .proto file and hope against all hope that whoever is using it reads it, pays attention to it, and implements the validity checks themselves.
Needless to say this is unlikely and unreliable if your relying on someone else at the other end of a network stream to do this with the same vigour as oneself. Plus if the constraints on validity change, multiple code changes need to be planned, coordinated and done.
Thus I consider GPB to be inappropriate for developments where there is little opportunity to regularly meet and talk with all team members.
==EDIT==
The kind of thing I mean is this:
message Foo
{
int32 bearing = 1;
}
Now who's to say what the valid range of bearing
is? We can have
message Foo
{
int32 bearing = 1; // Valid between 0 and 359
}
But that depends on someone else reading this and writing code for it. For example, if you edit it and the constraint becomes:
message Foo
{
int32 bearing = 1; // Valid between -180 and +180
}
you are completely dependent on everyone who has used this .proto updating their code. That is unreliable and expensive.
At least with Boost serialisation you're distributing a single C++ class, and that can have data validity checks built right into it. If those constraints change, then no one else need do any work other than making sure they're using the same version of the source code as you.
Alternative
There is an alternative: ASN.1. This is ancient, but has some really, really, handy things:
Foo ::= SEQUENCE
{
bearing INTEGER (0..359)
}
Note the constraint. So whenever anyone consumes this .asn file, generates code, they end up with code that will automatically check that bearing
is somewhere between 0 and 359. If you update the .asn file,
Foo ::= SEQUENCE
{
bearing INTEGER (-180..180)
}
all they need to do is recompile. No other code changes are required.
You can also do:
bearingMin INTEGER ::= 0
bearingMax INTEGER ::= 360
Foo ::= SEQUENCE
{
bearing INTEGER (bearingMin..<bearingMax)
}
Note the <
. And also in most tools the bearingMin and bearingMax can appear as constants in the generated code. That's extremely useful.
Constraints can be quite elaborate:
Garr ::= INTEGER (0..10 | 25..32)
Look at Chapter 13 in this PDF; it's amazing what you can do;
Arrays can be constrained too:
Bar ::= SEQUENCE (SIZE(1..5)) OF Foo
Sna ::= SEQUENCE (SIZE(5)) OF Foo
Fee ::= SEQUENCE
{
boo SEQUENCE (SIZE(1..<6)) OF INTEGER (-180<..<180)
}
ASN.1 is old fashioned, but still actively developed, widely used (your mobile phone uses it a lot), and far more flexible than most other serialisation technologies. About the only deficiency that I can see is that there is no decent code generator for Python. If you're using C/C++, C#, Java, ADA then you are well served by a mixture of free (C/C++, ADA) and commercial (C/C++, C#, JAVA) tools.
I especially like the wide choice of binary and text based wireformats. This makes it extremely convenient in some projects. The wireformat list currently includes:
- BER (binary)
- PER (binary, aligned and unaligned. This is ultra bit efficient. For example, and INTEGER constrained between
0
and 15
will take up only 4 bits
on the wire)
- OER
- DER (another binary)
- XML (also XER)
- JSON (brand new, tool support is still developing)
plus others.
Note the last two? Yes, you can define data structures in ASN.1, generate code, and emit / consume messages in XML and JSON. Not bad for a technology that started off back in the 1980s.
Versioning is done differently to GPB. You can allow for extensions:
Foo ::= SEQUENCE
{
bearing INTEGER (-180..180),
...
}
This means that at a later date I can add to Foo
, and older systems that have this version can still work (but can only access the bearing
field).
I rate ASN.1 very highly. It can be a pain to deal with (tools might cost money, the generated code isn't necessarily beautiful, etc). But the constraints are a truly fantastic feature that has saved me a whole ton of heart ache time and time again. Makes developers whinge a lot when the encoders / decoders report that they've generated duff data.
Other links:
- Good intro
- Open source C/C++ compiler
- Open source compiler, does ADA too AFAIK
- Commercial, good
- Commercial, good
- Try it yourself online
Observations
To share data:
- Code first approaches (e.g. Boost serialisation) restrict you to the original language (e.g. C++), or force you to do a lot of extra work in another language
- Schema first is better, but
- A lot of these leave big gaps in the sharing contract (i.e. no constraints). GPB is annoying in this regard, because it is otherwise very good.
- Some have constraints (e.g. XSD, JSON), but suffer patchy tool support.
- For example, Microsoft's xsd.exe actively ignores constraints in xsd files (MS's excuse is truly feeble). XSD is good (from the constraints point of view), but if you cannot trust the other guy to use a good XSD tool that enforces them for him/her then the worth of XSD is diminished
- JSON validators are ok, but they do nothing to help you form the JSON in the first place, and aren't automatically called. There's no guarantee that someone sending you JSON message have run it through a validator. You have to remember to validate it yourself.
- ASN.1 tools all seem to implement the constraints checking.
So for me, ASN.1 does it. It's the one that is least likely to result in someone else making a mistake, because it's the one with the right features and where the tools all seemingly endeavour to fully implement those features, and it is language neutral enough for most purposes.
To be honest, if GPB added a constraints mechanism that'd be the winner. XSD is close but the tools are almost universally rubbish. If there were decent code generators of other languages, JSON schema would be pretty good.
If GPB had constraints added (note: this would not change any of the wire formats), that'd be the one I'd recommend to everyone for almost every purpose. Though ASN.1's uPER is very useful for radio links.