Kolab XML Format: Proposal for an XSD friendly update

Wed Oct 19 13:04:04 CEST 2011

On 19.10.2011 12:15, Florian v. Samson wrote:
> Hello Christian,
>

Hi Florian,

> On Tuesday, 18 October 2011 at 18:46:24, Christian Mollekopf wrote:
>>
>> Because the various implementations of the Kolab XML Format are
>> difficult to maintain and are very error prone, the idea of a library
>> to read/write the XML objects came up. Till and Volker from KDAB
>> pointed out that using databindings based on an XML Schema (XSD)
>> would be the ideal tool to develop such a library. The process of
>> writing this schema brought up several problems with the format which
>> I'm going to outline here.
>>
>> == Why do we need a schema ==
>>
>> The current format specification is not very explicit about some
>> details, which leaves those parts open to interpretation. A schema
>> would give us a much stricter specification, which also allows XML
>> files to be validated against the spec.
>
> IIRC, that was the reason why the Kolab developers created a Relax-NG
> schema 6 years ago:
>
> <http://kolab.org/cgi-bin/viewcvs-kolab.cgi/doc/kolab-formats/validation/>
> (BTW, this is linked on Kolab.org's front page.)
> It may be a useful starting point.
>

Yes, thanks. Relax-NG is indeed a lot more flexible when it comes to
writing a schema that validates XML files with an undefined order of
elements or unknown tags. This is because Relax-NG schemas don't have
to be deterministic, while XSD schemas do. For the same reason,
however, Relax-NG is not suitable for databindings.
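
To illustrate why a fixed order without unknown tags suits databindings,
here is a minimal Python sketch (not code from any Kolab library). Once
the order is fixed, each child element maps to exactly one field by
position. The element names and the Event class are simplified
assumptions for illustration, not the actual Kolab event definition.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical, simplified element list for illustration only;
# this is NOT the real Kolab event definition.
EVENT_FIELDS = ["uid", "summary", "start-date", "end-date"]


@dataclass
class Event:
    uid: str
    summary: str
    start_date: str
    end_date: str


def read_event_strict(xml_text):
    """Map child elements to fields by position.

    Any missing, reordered, or unknown child element is rejected,
    which is what a strict, ordered schema would enforce.
    """
    root = ET.fromstring(xml_text)
    tags = [child.tag for child in root]
    if tags != EVENT_FIELDS:
        raise ValueError(f"expected {EVENT_FIELDS}, got {tags}")
    # Empty elements have text None; store them as empty strings.
    return Event(*(child.text or "" for child in root))
```

With an unordered or open content model, a reader like this could no
longer rely on position and would need a lookup per tag plus some
catch-all for elements it does not know.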

>> ...
>>
>> == Conclusion ==
>>
>> Because of these reasons I propose to change the Kolab XML Format in
>> the following ways:
>>
>> - disallow unknown tags
>
> Not allowing any tags undefined by the Kolab format goes way too far,
> as it inhibits any client-specific extensions.
> E.g. for clients which have to use a different "native" format
> internally (evolution-kolab, Toltec, etc.) this is necessary for
> storing information which cannot be mapped to Kolab XML but must be
> retained.  Hence disallowing any unknown tags would either render
> these clients non-working or these wrapper-tags would have to be
> defined in the Kolab format.
> There are a couple of other reasons, some of which already have been
> pointed out, e.g. by Thomas.
>

This is clear, and yes these tags would have to go into the new 
specification.

>> - include now used unknown tags into the format
>
> Which ones?  All?  How do you plan to find out about them?  Reading
> all clients' source code?  How do you assess exhaustively which
> Kolab-XML capable clients exist?
>

Yes, all. For our standard clients we can do it ourselves by reading
the source code or similar.
Note that most of the values I've found so far in Kontact would really
be a valuable addition to the format.
For the clients of other vendors, I don't think it is unreasonable to
ask the vendors to tell us what they need for the next major upgrade
(that is, 3.0).
I don't see why we would have to assess which Kolab-XML clients exist.
We'll provide an upgrade path from the current specification (or rather,
for the part of it we can), and we'll provide an easy way to switch to
the new format (databinding). The rest is really up to the clients, if
they want to be Kolab 3.0 compatible.

>> - make the order of the elements a requirement
>
> Ack.  This has limits, though.
>

How does this have limits?

>> - introduce a Kolab namespace
>
> This is scarce use of namespaces, given that it introduces a new
> fundamental requirement for clients: support for XML namespaces.
>

As the bindings will handle all this, I don't see a real problem. As
said, with these changes the two formats are not compatible anyway, so
it is really Kolab Format 3.0. Namespaces are needed for various
applications and are generally good practice without any drawbacks, so
I don't see a reason not to introduce a namespace as part of the format
update.
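
As a rough sketch of what the binding layer would hide from a client,
the following Python snippet reads a namespaced document with the
standard library. The namespace URIs and the element names are
hypothetical placeholders; no namespace has been decided on yet.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, for illustration only.
KOLAB_NS = "http://example.org/kolab/3.0"
EXT_NS = "http://example.org/client-ext"

DOC = f"""<event xmlns="{KOLAB_NS}" xmlns:x="{EXT_NS}">
  <summary>Meeting</summary>
  <x:color>red</x:color>
</event>"""


def get_summary(xml_text):
    """Return the text of the Kolab-namespaced <summary>, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace-uri}localname".
    node = root.find(f"{{{KOLAB_NS}}}summary")
    return node.text if node is not None else None


def foreign_elements(xml_text):
    """List the tags of child elements outside the Kolab namespace."""
    root = ET.fromstring(xml_text)
    prefix = f"{{{KOLAB_NS}}}"
    return [child.tag for child in root if not child.tag.startswith(prefix)]
```

A client using the bindings would only ever call something like
get_summary(); the namespace handling stays inside the library. The
namespace also makes it unambiguous which elements belong to the Kolab
format and which do not.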

>> I believe we can significantly improve the Kolab XML format this way.
>
> IMHO that should not be an isolated goal: the Kolab XML format has no
> value in its own right, only Kolab clients utilising it have any
> practical value for customers and users.  Hence companies can only
> charge customers for working software, not for a nice storage format.
> Surely a well defined Kolab format can ease the job of Kolab client
> developers, but as discussed extensively in the past on this list, the
> developers of the various Kolab clients have vastly different
> requirements, constraints, aims, resources (primarily time and
> manpower) and vocality.
>

Agreed. But the inconsistencies in the Kolab XML format and the lack of
bindings are a real-world problem we're fighting with, and one which
costs money. I doubt that we're the only ones fighting with that, and
therefore believe that a better solution is in everyone's interest.
Also note that our current clients do not fully comply with the spec,
which leads to application interoperability problems. These are
real-world problems which cost manpower and money, and I deem the
proposed solution the best for everyone.

Cheers,
Christian

> Cheers
> 	Florian
>
> _______________________________________________
> Kolab-format mailing list
> Kolab-format at kolab.org
> https://kolab.org/mailman/listinfo/kolab-format