[GCC-XML] Xrtti.h

Bryan Ischo bji-gccxml at ischo.com
Thu Apr 26 02:50:17 EDT 2007


> Our differences, as far as I understand them:
>
> Your scheme,
> 1. Define CPP_Interface through class definitions and using basic types.
> 2. Use GCCXML to generate CPP_Interface_XML
> 3. Use CPP_Interface_XML to generate serialization code.

Yes, exactly.

> My scheme,
> 1. Define Basic Types and User Defined types in XSD.
> 2. a.  A parser would understand and generate the CPP or Python or Perl
> classes for class declarations
>     b. The parser would also generate Serialization methods which will be
> say in binary form.

Yes, I understand now what you are saying.  I realize that XSD is a
standard document format and that you will have the advantages that go
along with that.

> I understand your dislike of XML for serialization output, but I see
> no reason why a C++ class would win over XSD for interface specification.

That is a good point; I have also thought of this - why not use a separate
data type definition language (XSD based or other IDL type thing) and then
generate the C++ code (or Java code, Python code, etc) instead?

It's a difference between starting with C++ and getting serialization, or
starting with XSD/IDL and getting serializable C++.

I think of my approach as more like adding a missing feature to C++ - more
comprehensive reflection (Xrtti), and serialization.  Just like Java has
built-in serialization support, I had hoped to fill in the gaps for C++
and add the same features to it, in a way that as nearly as possible
seemed like a natural feature of the language.

It's just a matter of different approaches; I can certainly see the value
of what you propose, because it is language-agnostic and can be used to
generate bindings for any language at all.  On the other hand, to
accomplish this it has to define a new language of its own (the XSD
specification, IDL, whatever) that the developer has to manage.  I am
going with my approach because it means that a C++ developer has nothing
to do except write C++ classes as they normally would (with some caveats
that I discuss below), and they get serialization nearly "for free" with
Xrtti + XrttiSerial (or whatever I end up calling my serialization
library).

Probably the best of both worlds would be to have tools which:

1. Generate a common XSD format from C++ class definitions
2. Generate C++ class definitions from a common XSD format
3. Generate C++ serialization from the XSD format

Then, the developer could choose to start with the XSD and generate their
C++ classes (as in 2), or start with C++ classes and generate XSD (as in
1).  Either way, using (3), they end up with both C++ classes and the
means for serializing them.  And other languages could be supported too.

While I agree that this provides the most flexibility, and I will
certainly consider altering my serialization project in the future to use
an intermediate description language like your XSD, for the time being I
am going to do the simpler thing of just always starting with C++ and
using Xrtti and my serialization library to add serialization to any C++
code.  It solves the problem I am trying to solve, in a way that requires
minimal extra steps for the developer.

> In fact, in your scheme XML is indeed one of the steps before feeding
> your code generator.

Yes, sadly, this is true.  XML is, to me, such a pig of a document format,
that it is very hard to work with.  In the free software world, it is very
hard to find a reasonable XML parser.  As far as I can tell, expat is far
and away the most popular and widely used simple XML parser (one that
doesn't come attached to a huge framework of other software).  And expat
is so kludgy in its API (in my opinion) that it represents to me the problem
with XML: it is so hard to write parsers for, that no one bothers to write
good parsers for it.  XML parsers end up being just barely "good enough",
with the exhausted developer who has written the XML parser left with no
energy to make the parser really good, such as being able to parse XML
documents in fragments instead of one big chunk.

Anyway, gccxml generates XML (unfortunately) and my xrttigen program does
use expat to parse the XML, and I am glad that expat exists despite my
complaints because otherwise I'd have to write a parser for XML myself.

> I do have one tricky problem with defining interfaces: how do you handle
> conditional definitions?
>
> For example,
> class response
> {
>         bool status;
>         UserDefinedTypeSuccessRecord record;
> };
> Suppose we want UserDefinedTypeSuccessRecord to be valid, or serialized,
> only if status is set to true.

Unfortunately C++ has some real problems when it comes to serialization;
the language is a little too free-form to allow just any old C++ classes
to be serialized without programmer intervention.  In cases like you have
mentioned there are two options, as far as I can tell:

1. Provide mechanisms for the developer to put custom code in to guide
serialization in places where it cannot be automatically deduced from the
class structure (Boost and s11n and others work this way, requiring
*every* class to have custom serialization code written by hand by the
developer).

2. Require the developer to follow certain conventions when defining their
C++ classes so that the serializer can always do the right thing.

For my purposes, I will be using (2).  This means that for example if a
developer defines a class with an array:

class response
{
      int *idArray;
};

Then the developer will have to follow a specific convention for declaring
how many elements are in the idArray at the time that serialization is
done on the class; in my case, I will require that there be a member
called "idArray_count" which gives the number of elements.  So the
developer would be required to do:

class response
{
      int *idArray;
      int idArray_count;
};

The serializer would give an error and refuse to serialize any class which
doesn't do this.
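
As a rough sketch of how such a check could work: the code below walks a
small field table, looks for a "<name>_count" companion for every pointer
member, and refuses to serialize when it is missing.  The FieldInfo table,
responseFields(), findCountField() and serializeIntArrays() are all
hypothetical names made up for this illustration; the table is written by
hand here and is not the Xrtti API, and the output is just a flat list of
integers rather than any real wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// The class from the message, written as a struct so offsetof() is well defined.
struct Response {
    int *idArray;
    int idArray_count;
};

// Hypothetical, hand-written stand-in for reflection data (NOT the Xrtti API).
struct FieldInfo {
    std::string name;
    bool isPointer;      // true for an "int *" member, false for a plain int
    std::size_t offset;  // byte offset of the member within the object
};

// The table a reflection tool might produce for Response (assumed layout).
inline std::vector<FieldInfo> responseFields() {
    return {
        {"idArray", true, offsetof(Response, idArray)},
        {"idArray_count", false, offsetof(Response, idArray_count)},
    };
}

// Look up the "<arrayName>_count" companion member; nullptr if there is none.
inline const FieldInfo *findCountField(const std::vector<FieldInfo> &fields,
                                       const std::string &arrayName) {
    const std::string wanted = arrayName + "_count";
    for (const FieldInfo &f : fields)
        if (!f.isPointer && f.name == wanted) return &f;
    return nullptr;
}

// Emit every int array as [count, elements...]; throw if the convention is broken.
inline std::vector<std::int32_t> serializeIntArrays(const void *obj,
                                                    const std::vector<FieldInfo> &fields) {
    assert(obj != nullptr);
    const char *base = static_cast<const char *>(obj);
    std::vector<std::int32_t> out;
    for (const FieldInfo &f : fields) {
        if (!f.isPointer) continue;  // plain members are only used as counts here
        const FieldInfo *countField = findCountField(fields, f.name);
        if (countField == nullptr)
            throw std::runtime_error("cannot serialize: no member named " + f.name + "_count");
        int count = 0;
        const int *data = nullptr;
        std::memcpy(&count, base + countField->offset, sizeof count);
        std::memcpy(&data, base + f.offset, sizeof data);
        if (count < 0 || (count > 0 && data == nullptr))
            throw std::runtime_error("cannot serialize: bad count for " + f.name);
        out.push_back(count);
        out.insert(out.end(), data, data + count);
    }
    return out;
}
```

With this sketch, a Response whose idArray points at three ints and whose
idArray_count is 3 serializes to four integers (the count followed by the
elements), while a field table that lacks idArray_count makes
serializeIntArrays() throw instead of guessing the array length.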

This means that not every C++ class can be serialized, but the
requirements for supporting serialization are really pretty minimal and
not hard to implement.

I wish it were possible to automatically serialize every class but
unfortunately without developer support (option 1 that I outlined above)
this is not possible.  I would rather not complicate the developer's life
by making them write code to custom serialize classes; I think it is
cleaner and neater to simply require a few conventions that developers
must follow when defining serializable classes.

To bring this back to your example above, I would probably say that the
developer should instead define their class like this:

class response
{
      bool status;
      UserDefinedTypeSuccessRecord record[1];
      int record_count;
};

And the developer would have to make sure that record_count was 1 whenever
record was to be used.  The developer could even drop status in favor of
just record_count, which would be 1 on success and 0 on error, and serve
both purposes of indicating status, and indicating how many valid entries
are in the record array (zero or one).
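
A minimal sketch of how that optional-record convention might round-trip,
assuming a made-up two-field SuccessRecord in place of
UserDefinedTypeSuccessRecord and a flat list of integers as the output
format.  ResponseWithRecord, serializeResponse() and deserializeResponse()
are hypothetical names for this illustration only, and the code is written
by hand rather than generated from reflection data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for UserDefinedTypeSuccessRecord.
struct SuccessRecord {
    std::int32_t id;
    std::int32_t value;
};

// The status flag is dropped; record_count alone says whether record is valid.
struct ResponseWithRecord {
    SuccessRecord record[1];
    int record_count;  // 1 on success, 0 on error
};

// Write record_count first, then only as many records as it announces.
inline std::vector<std::int32_t> serializeResponse(const ResponseWithRecord &r) {
    if (r.record_count < 0 || r.record_count > 1)
        throw std::out_of_range("record_count must be 0 or 1");
    std::vector<std::int32_t> out;
    out.push_back(r.record_count);
    for (int i = 0; i < r.record_count; ++i) {
        out.push_back(r.record[i].id);
        out.push_back(r.record[i].value);
    }
    assert(out.size() == 1 + 2 * static_cast<std::size_t>(r.record_count));
    return out;
}

// Read the count back and fill in the record only when it is present.
inline ResponseWithRecord deserializeResponse(const std::vector<std::int32_t> &in) {
    if (in.empty() || in[0] < 0 || in[0] > 1)
        throw std::invalid_argument("bad record_count");
    if (in.size() != 1 + 2 * static_cast<std::size_t>(in[0]))
        throw std::invalid_argument("length does not match record_count");
    ResponseWithRecord r{};  // zero-initialize so an absent record is all zeros
    r.record_count = in[0];
    if (r.record_count == 1) {
        r.record[0].id = in[1];
        r.record[0].value = in[2];
    }
    return r;
}
```

In this sketch an error response (record_count of 0) serializes to a single
integer, and a success response to three, so the reader of the stream can
tell from the first value alone whether a record follows.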

> We can continue the conversation in private if we are boring the rest of
> the group :-) My id is shiva at qualcomm.com

This list gets very little traffic, and I don't think it's annoying too
many people.  If anyone wants us to go private, please speak up.  I will
not be offended :)

Thank you, and best wishes,
Bryan

------------------------------------------------------------------------
Bryan Ischo                bryan at ischo.com            2001 Mazda 626 GLX
Hamilton, New Zealand      http://www.ischo.com     RedHat Fedora Core 5
------------------------------------------------------------------------




