Background information for KEP #17
Christian Mollekopf
mollekopf at kolabsys.com
Mon Dec 12 14:34:33 CET 2011
Hey,
As a follow-up to the announcement of KEP #17 I'd like to provide you with
some more background information.
The approach used differs slightly from what I had outlined in my last mail
[1] in that the xCal/xCard based XML objects are now fully RFC
compliant. We hope to improve the sustainability of the format this way, as
well as to lower the adoption barrier for other projects.
The new specification will provide us with a normative, canonical storage
format, eliminating the interoperability problems of xCal/xCard. I believe
that this will be a solid base for us to build a server where interoperability
between different clients truly exists.
On this occasion I would also like to share this summary, explaining some of
the conclusions we've come to:
"When we speak about formats, we need to keep in mind that we have external
and internal formats. External formats are used for transport between
applications by different parties, each of which typically has its own
internal format. For example, Microsoft Exchange has a different internal
format from Lotus Notes. Zarafa has largely copied the Microsoft format, but
stores it in its own way in an SQL database. The same is true for virtually
any major groupware solution.
All these solutions then interact through external formats, most importantly
iCalendar and vCard, which are interpreted by each application into their
respective internal formats for processing.
The rationale for this is that internal formats need to be sparse, normative
and canonical. iCalendar & vCard are none of these, so each implementation
gives them a vendor-specific flavour closely related to its internal format,
which is sometimes even derived from iCalendar & vCard.
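To make "canonical" concrete, here is a minimal sketch in Python. iCalendar
lets a VEVENT state its end either as DTEND or as DURATION, so two clients can
write the same event in two different shapes. The function name
canonical_event and the choice of "start plus end" as the canonical shape are
assumptions made for illustration only; they are not taken from the Kolab
format or from KEP #17.

```python
from datetime import datetime, timedelta


def canonical_event(start, end=None, duration=None):
    """Reduce the two permitted input shapes to one stored shape.

    Hypothetical helper: the canonical shape (start + end) is an
    assumption for this sketch, not the Kolab specification.
    """
    # Exactly one of end / duration may be given, mirroring iCalendar,
    # where DTEND and DURATION are mutually exclusive.
    if (end is None) == (duration is None):
        raise ValueError("exactly one of end or duration is required")
    if end is None:
        end = start + duration
    if end < start:
        raise ValueError("event ends before it starts")
    return {"start": start.isoformat(), "end": end.isoformat()}


start = datetime(2011, 12, 12, 14, 0)
with_end = canonical_event(start, end=datetime(2011, 12, 12, 15, 0))
with_duration = canonical_event(start, duration=timedelta(hours=1))
```

Both calls yield the same stored object, so a reader of the storage format
only ever has to handle one shape.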
This was the root cause of the interoperability issues that Kolab had in its
first version, and that are still experienced today by CalDAV servers, which
went down the same route.
So Kolab XML in Kolab Format 2.0 was copying the approach of many other
solutions to have a different internal storage format against which to work.
This, as we have seen, has its own pitfalls. Most importantly, it is a lot of
effort to create a clean specification and implement it well. It is even harder
to be aware of all the conceptual issues and thinking that the various
participating vendors put into the external formats to make sure their
concepts could be expressed in them.
The result was the sometimes outright buggy specification of Kolab Format, with
very non-normative bits, and places where it cannot model the reality of
groupware well enough, as demonstrated in KEP 2, among others.
In addition, an internal storage format then needs to be maintained for all
supported clients, which is expensive, and deviations between those clients
can become painful on a variety of levels.
In order to address this last point, we started discussing a libkolabformat
library in C++ which would provide a SWIG wrapped interface for various
languages, so that a single code base maintains that internal storage
format.
It could also provide consistency checks and, if its code were generated from
an XSD description, would allow all sorts of added benefits, e.g. having an
automatic object validation processor and so on. Plus minor changes in the
format become possible without touching the API for the clients, which means
everyone's job gets a lot easier.
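As a rough sketch of that idea, assuming nothing about the real libkolabformat
API: clients only see an Event object plus write_event / read_event, while
validation and the XML layout live behind those calls. All names and the
element names "event", "uid" and "summary" are hypothetical and are neither
Kolab XML nor xCal.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Event:
    # The only thing client code depends on (hypothetical fields).
    uid: str
    summary: str


def validate(event):
    """Consistency check applied on every read and every write."""
    if not event.uid:
        raise ValueError("uid must not be empty")


def write_event(event):
    """Serialize to the storage format; the layout is private to the library."""
    validate(event)
    root = ET.Element("event")
    ET.SubElement(root, "uid").text = event.uid
    ET.SubElement(root, "summary").text = event.summary
    return ET.tostring(root, encoding="unicode")


def read_event(xml_text):
    """Parse the storage format back into the client-facing object."""
    root = ET.fromstring(xml_text)
    event = Event(
        uid=root.findtext("uid", default=""),
        summary=root.findtext("summary", default=""),
    )
    validate(event)
    return event


stored = write_event(Event(uid="abc-123", summary="KEP #17 discussion"))
loaded = read_event(stored)
```

If the element layout changed, only write_event and read_event would need to
follow; code that works with Event would stay as it is.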
So far, so good, and definitely sensible.
In theory the advantages would exist whether we do this in a completely
arbitrary schema in some obscure African language or re-use Kolab XML 2.0 or
whatever, with one important difference.
If we choose something that we maintain 100%, we have 100% of the effort.
On top of that, when looking at all the issues we would like to address in the
internal storage format, we quickly realized that for virtually anything we
always had to look at how iCalendar and vCard were modelling things in order
to make sure that we can express things with a certain fidelity.
So the question came up: why not build directly upon the recently published
xCalendar / xCard RFCs, which are also XML and are an assembly of all the
various concepts of all the applications out there? So instead of repeating
the exercise 100%, we take the existing 80% and then only provide 20% of
normative effort to make the formats usable for long term storage.
The particular twist in this would be not to repeat what Kolab Version 1 and
CalDAV have done, but rather to work from the existing RFCs and make them
normative and canonical in nature.
For the object types where such RFCs do not yet exist we will still have to do
the 100%, but then Calendar & Contacts are among the more complex objects, so
gaining 80% on them will most likely make a major difference.
In consequence, everything we would write would then be RFC compliant, and we
would know that where we model a particular concept in the Kolab format, we
have done it in a way that is semantically compatible to the most widely used
external format.
So we know we have less likelihood of having to break existing semantics in
the format, that we can add and even extend functionality in compatible ways,
and that the resulting object should be migration friendly. It also allows us
the differentiating claim that even our internal format is Open Standards
based.
In combination with the format library it means that existing clients will
have an easy time becoming full Kolab clients, and that parsing of the
external format will be mappable against our internal format.
They will obviously never replace their iCalendar / vCard libraries with
our library, as the person-years of effort that went into those cannot be
easily duplicated, nor could we hope to ever be that complete in our
implementation due to resource constraints.
But even more importantly, we don't want to do this.
Going for full read compliance means we would move the interoperability issues
from the client side where they occasionally affect one client at a time, into
the database backend, where they affect all clients all the time.
So whatever we do, we definitely want to stay with a normative and canonical
storage format. This could be based upon Kolab XML version 1.1, or it could be
partially re-based upon xCalendar / xCard with some additional normalization.
The resulting format then has a realistic chance of actually being adopted by
other projects and solutions as their internal storage format, I believe, as
the combination of RFC & additional normative effort actually gives Kolab
added value over other approaches.
Also I think this will be less effort for us to maintain, freeing resources for
more interesting and fun stuff."
Based on this I'd like to get a discussion started sooner rather than later.
You may start firing questions, criticism, suggestions or any other comments =)
Cheers,
Christian
[1] http://kolab.org/pipermail/kolab-format/2011-November/001559.html