martin: doc/kolab-formats kolab-file-format.lyx,NONE,1.1
cvs at intevation.de
Tue Mar 30 13:38:42 CEST 2004
Author: martin
Update of /kolabrepository/doc/kolab-formats
In directory doto:/tmp/cvs-serv13520/kolab-formats
Added Files:
kolab-file-format.lyx
Log Message:
Martin K.: Proposal for new fileformat allowing for concurrent use of Outlook and KDE Kolab clients on the same resource (e.g. a shared folder)
--- NEW FILE: kolab-file-format.lyx ---
#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass docbook
\language english
\inputencoding auto
\fontscheme default
\graphics default
\paperfontsize default
\papersize Default
\paperpackage a4
\use_geometry 0
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
\added_space_top vfill \added_space_bottom vfill
Kolab Storage Format
\layout Date
28th March 2004
\layout Author
\begin_inset ERT
status Collapsed
\layout Standard
<firstname>
\end_inset
Martin
\begin_inset ERT
status Collapsed
\layout Standard
</firstname><surname>
\end_inset
Konold
\begin_inset ERT
status Collapsed
\layout Standard
</surname>
\end_inset
\layout Abstract
A cross-platform groupware solution in which shared data (e.g.
a shared calendar or shared contacts) is used simultaneously requires a common
file format that is independent of the operating system and of the Kolab client
used to access the data.
Unfortunately the internal data structures of the popular MAPI-based Microsoft
Outlook are very different from the internet standards RFC 2445 (iCalendar) and RFC
2426 (vCard).
Different approaches to solve the issue have been proposed.
So far no final solution has been found.
This document provides an alternative approach to the problem using a redundant
data technique.
\layout Section
Kolab Storage Format dilemma
\layout Standard
In Kolab 1.0 we basically have two incompatible data formats for the KDE
Kolab Client (KMail) and the Outlook Plugins (Bynari Connector 1.x or Toltec
Connector 1.x).
While the KDE Kolab client uses an icalendar based clear text representation
of groupware data, the Outlook Plugins use serialized TNEF encoded MAPI
objects.
\layout Standard
\begin_inset Float table
wide false
collapsed false
\layout Caption
Outlook and KDE Kolab Client Dataformat
\begin_inset LatexCommand \ref{Dataformat}
\end_inset
\layout Standard
\begin_inset Tabular
<lyxtabular version="3" rows="8" columns="2">
<features>
<column alignment="center" valignment="top" leftline="true" width="0">
<column alignment="center" valignment="top" leftline="true" rightline="true" width="0">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
Outlook
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
KDE Client
\end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
tnef
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
ical
\end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $\vec{{M}}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $\vec{{N}}$
\end_inset
\end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $M_{1}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $N_{1}$
\end_inset
\end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $M_{2}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $N_{2}$
\end_inset
\end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $M_{3}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $N_{3}$
\end_inset
\end_inset
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $M_{4}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\layout Standard
\begin_inset Formula $M_{5}$
\end_inset
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\layout Standard
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_inset
\layout Standard
As shown in Table XXX the Outlook Plugin and the KDE Kolab Client use
very different representations of data, with different encodings and a different
number of attributes or properties.
This is because Outlook internally uses a totally different data model, while the KDE
Kolab Client simply uses icalendar internally.
MS Outlook is MAPI based and uses a MAPI Storage Container.
All currently available Outlook Plugins for Kolab use a PST file.
The PST file can be considered similar to a hierarchical database
that uses an access path called the MAPI entry ID to reference an
object.
\layout Standard
Within a MAPI object there are numerous properties similar to icalendar
attributes.
The number of MAPI properties is typically much larger than the number
of icalendar attributes describing the same object, e.g.
a calendar entry or a contact (vcard).
The latter fact is shown in table XXX.
For every icalendar attribute there is a corresponding MAPI property, but
not every MAPI property has a corresponding icalendar attribute.
\layout Standard
This leads to the fact that a mapping from MAPI to icalendar is
\emph on
not
\emph default
reversible.
\layout Standard
\begin_inset Formula \[
\vec{M}\rightarrow\vec{N}\rightarrow\vec{M'}\]
\end_inset
\layout Standard
\begin_inset Formula \[
\vec{M}\neq\vec{M'}\]
\end_inset
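\layout Standard
The loss of information can be illustrated with a minimal sketch.
The property names M1 to M5 and N1 to N3 are placeholders taken from the table above, not real MAPI or icalendar identifiers, and the one-to-one mapping between them is an assumption made only for this illustration.

```python
# Hypothetical mapping: only three of the five MAPI properties have an
# icalendar counterpart (M4 and M5 do not), mirroring the table above.
MAPI_TO_ICAL = {"M1": "N1", "M2": "N2", "M3": "N3"}


def mapi_to_ical(m):
    """Map a MAPI property dict to icalendar attributes, dropping the rest."""
    return {MAPI_TO_ICAL[k]: v for k, v in m.items() if k in MAPI_TO_ICAL}


def ical_to_mapi(n):
    """Map icalendar attributes back to the MAPI properties they came from."""
    inverse = {ical: mapi for mapi, ical in MAPI_TO_ICAL.items()}
    return {inverse[k]: v for k, v in n.items()}


m = {"M1": "a", "M2": "b", "M3": "c", "M4": "d", "M5": "e"}
n = mapi_to_ical(m)        # M -> N
m_prime = ical_to_mapi(n)  # N -> M'
lost = sorted(set(m) - set(m_prime))
```

After the round trip m_prime lacks M4 and M5, so it differs from m.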
\layout Section
Kolab Storage Format 2.0
\layout Standard
The proposed solution to the dilemma described in the previous section
is based on redundant storage of attributes and properties
while maintaining the coherency of the data at the application level.
We hereby assume that the KDE Kolab Client is able to parse a TNEF encoded
MAPI object using libktnef in order to retrieve any relevant data and that
the Outlook plugin is able to parse icalendar objects.
In fact, parsing a TNEF encoded MAPI object does not require much
understanding of MAPI objects.
I verified that a simple brute-force reverse-engineering effort would be sufficient
to figure out which properties contain the required data.
The biggest effort is finding the most suitable property among those containing the
often redundant data.
E.g.
a MAPI object representing a calendar entry contains the names of the attendees
in
\emph on
multiple
\emph default
properties simultaneously.
A lot of this work has already been done for the libktnef implementation.
\layout Standard
I assume that the redundancy within MAPI is due to the long history (
\begin_inset Formula $>10$
\end_inset
years) of Windows MAPI compatibility cruft.
Using the data obtained by parsing the TNEF encoded MAPI object, the KDE Kolab
client is then able to create a valid and complete icalendar object.
The KDE Kolab Client can now easily store the icalendar object on the Kolab
server.
On the other hand the KDE Kolab Client is able neither to update nor to
create a TNEF encoded MAPI object.
If the KDE Kolab Client changes any attribute, as is very common with
shared folders, the data stored in the icalendar object is out of
sync with the original MAPI object.
\layout Standard
When a KDE Kolab Client accesses the same object again, it has to detect
that the MAPI object has already been synchronized to the icalendar
object; the KDE Kolab Client therefore has to consider the icalendar
object to be authoritative.
\layout Standard
On the other hand, if the Outlook Kolab Client using an appropriate plugin
sees the modified object, then this plugin first has to read the outdated
TNEF encoded MAPI object and then figure out that the corresponding icalendar
object is newer than the MAPI object.
The plugin then parses the icalendar object and uses this newer data to
update the outdated data of the MAPI object.
This method allows for preservation of MAPI specific properties not available
in the icalendar object while the KDE Kolab client is not required to keep
the TNEF encoded MAPI object up to date.
\layout Standard
\series bold
Short summary:
\series default
The basic idea is that the KDE Kolab client is able to parse both the icalendar
objects and the TNEF encoded MAPI objects but is only able to write icalendar
objects to the Kolab server.
On the other hand the Outlook Kolab client is expected to be able to parse
its own tnef encoded MAPI object and in addition icalendar objects as written
by the KDE Kolab client.
This means that we only require both clients to be able to
\emph on
parse both formats
\emph default
for
\emph on
reading
\emph default
while they only need to be able to
\emph on
write
\emph default
their
\emph on
own native
\emph default
formats.
\layout Subsection
Implementation
\layout Standard
The Kolab Storage format is implemented using multi-part mime messages as
defined in RFC 2045 and RFC 2046.
The Content-Type of the TNEF encoded MAPI object is
\series bold
application/ms-tnef
\series default
.
The Content-Type of the icalendar object is
\series bold
text/calendar
\series default
.
Each part of the multi-part mime message has a
\series bold
Content-Description
\series default
header field carrying a serial number.
This serial number is required in order to determine which
part is up to date and therefore authoritative.
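\layout Standard
A minimal sketch of such a message, built with the Python standard library email package, could look as follows.
The function names, the multipart/mixed container and the encoding of the serial number as a plain decimal string in Content-Description are assumptions for illustration; the proposal does not fix these details.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_kolab_message(tnef_bytes, tnef_serial, ical_text=None, ical_serial=None):
    """Build a multi-part message with a TNEF part and an optional icalendar part."""
    msg = MIMEMultipart()  # multipart/mixed is an assumption
    tnef = MIMEApplication(tnef_bytes, _subtype="ms-tnef")
    tnef["Content-Description"] = str(tnef_serial)  # assumed serial encoding
    msg.attach(tnef)
    if ical_text is not None:
        ical = MIMEText(ical_text, _subtype="calendar", _charset="utf-8")
        ical["Content-Description"] = str(ical_serial)
        msg.attach(ical)
    return msg


def authoritative_part(msg):
    """Return the leaf part with the highest serial number."""
    parts = [p for p in msg.walk() if not p.is_multipart()]
    return max(parts, key=lambda p: int(p["Content-Description"]))


# Placeholder payloads; real TNEF and icalendar data would go here.
msg = build_kolab_message(bytes([1, 2, 3]), 1, "BEGIN:VCALENDAR", 2)
newest = authoritative_part(msg).get_content_type()
```

Here the icalendar part carries the higher serial number, so it is the one a client would treat as authoritative.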
\layout Standard
\begin_inset Formula $\vec{M_{0}}\rightarrow^{OL}\vec{M_{1}}\rightarrow^{KDE}\frac{\vec{M_{1}}}{\vec{N_{1}}}$
\end_inset
\begin_inset Formula $\rightarrow^{KDE}\frac{\vec{M_{1}}}{\vec{N_{2}}}\rightarrow^{OL}\frac{\vec{M_{3}}}{\vec{N_{2}}}$
\end_inset
\begin_inset Formula $\rightarrow^{KDE}\frac{\vec{M_{3}}}{\vec{N_{4}}}$
\end_inset
\layout Standard
In the above example the Outlook Kolab client creates a TNEF encoded multi-part
mime message
\begin_inset Formula $\vec{M_{0}}$
\end_inset
containing a single part with the serial number 0.
This TNEF encoded MAPI object is then subsequently read and modified by an Outlook
Kolab client.
The resulting modified object
\begin_inset Formula $\vec{{M_{1}}}$
\end_inset
is then given the updated serial number 1.
In the next step a KDE Kolab client reads the multi-part mime message containing
only a single part with the Content-Type TNEF and notices that there is
no icalendar part currently present.
Consequently the KDE Kolab client parses the TNEF encoded part
\begin_inset Formula $\vec{{M_{1}}}$
\end_inset
for all required information and creates
\begin_inset Formula $\vec{{N_{1}}}$
\end_inset
.
In case the KDE Kolab client then changes the object
\begin_inset Formula $\vec{{N_{1}}}$
\end_inset
it has to increase the serial number to
\begin_inset Formula $\vec{{N_{2}}}$
\end_inset
.
The serial number of the TNEF encoded MAPI object is never modified by
the KDE Kolab client and vice versa.
In this example the subsequent read/modify is done by the Outlook client,
which therefore first has to read the outdated TNEF part and
then update this MAPI object using the up-to-date data from the icalendar
part.
Because this step is a read followed by
a modify, the Outlook Kolab client writes the
\begin_inset Formula $\vec{{M_{3}}}$
\end_inset
part while preserving the
\begin_inset Formula $\vec{{N_{2}}}$
\end_inset
part from the KDE Kolab client.
Lastly a KDE Kolab client accesses the message, reads the icalendar part
and notices that the TNEF part is more recent than the icalendar part.
Again the TNEF part is parsed and the icalendar part is updated.
If the KDE Kolab client did not change the data, it would write
\begin_inset Formula $\vec{{N_{3}}}$
\end_inset
while preserving
\begin_inset Formula $\vec{{M_{3}}}$
\end_inset
but in this example the KDE Kolab client modifies the data and consequently
writes
\begin_inset Formula $\vec{{N_{4}}}$
\end_inset
while preserving
\begin_inset Formula $\vec{{M_{3}}}$
\end_inset
.
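\layout Standard
The serial number bookkeeping of the example can be replayed with a small simulation.
The rule used here (a modifying client writes its own part with a serial one higher than the highest serial in the message, and a KDE client that only resynchronizes copies the serial of the TNEF part) is inferred from the example above and is an assumption; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    m: Optional[int] = None  # serial of the TNEF part, None if absent
    n: Optional[int] = None  # serial of the icalendar part, None if absent


def next_serial(msg):
    # Assumed rule: one more than the highest serial currently in the message.
    return max(s for s in (msg.m, msg.n) if s is not None) + 1


def outlook_create():
    return Message(m=0)


def outlook_modify(msg):
    # Outlook merges newer icalendar data (if any) and rewrites only its own part.
    return Message(m=next_serial(msg), n=msg.n)


def kde_sync(msg):
    # KDE regenerates the icalendar part when it is missing or older than TNEF.
    if msg.n is None or msg.n < msg.m:
        return Message(m=msg.m, n=msg.m)
    return msg


def kde_modify(msg):
    # KDE first brings the icalendar part up to date, then writes its change.
    synced = kde_sync(msg)
    return Message(m=synced.m, n=next_serial(synced))


steps = [outlook_modify, kde_sync, kde_modify, outlook_modify, kde_modify]
msg = outlook_create()
history = [(msg.m, msg.n)]
for step in steps:
    msg = step(msg)
    history.append((msg.m, msg.n))
```

Each tuple in history holds the serial numbers of the TNEF part and the icalendar part after one step, in the order of the chain of states shown before the example.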
\layout Section
Kolab Storage Format 3.0
\layout Standard
With Kolab 3.0 I want to propose a new data format for Kolab based on an
extension of RFC 2445.
This new format will avoid the need for two parts per mime message and avoid
unnecessary redundancy.
\the_end