martin: doc/kolab-formats kolab-file-format.lyx,NONE,1.1

cvs at intevation.de cvs at intevation.de
Tue Mar 30 13:38:42 CEST 2004


Author: martin

Update of /kolabrepository/doc/kolab-formats
In directory doto:/tmp/cvs-serv13520/kolab-formats

Added Files:
	kolab-file-format.lyx 
Log Message:
Martin K.: Proposal for new fileformat allowing for concurrent use of Outlook and KDE Kolab clients on the same resource (e.g. a shared folder)


--- NEW FILE: kolab-file-format.lyx ---
#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass docbook
\language english
\inputencoding auto
\fontscheme default
\graphics default
\paperfontsize default
\papersize Default
\paperpackage a4
\use_geometry 0
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default

\layout Title
\added_space_top vfill \added_space_bottom vfill 
Kolab Storage Format
\layout Date

28th March 2004 
\layout Author


\begin_inset ERT
status Collapsed

\layout Standard
<firstname>
\end_inset 

Martin
\begin_inset ERT
status Collapsed

\layout Standard
</firstname><surname>
\end_inset 

Konold
\begin_inset ERT
status Collapsed

\layout Standard
</surname>
\end_inset 


\layout Abstract

A cross-platform groupware solution in which shared data (e.g.
 a shared calendar or shared contacts) is used simultaneously requires a
 common file format that is independent of the operating system and of the
 Kolab client used to access the data.
 Unfortunately, the internal data structures of the popular MAPI-based Microsoft
 Outlook are very different from the Internet standards RFC 2445 (iCalendar)
 and RFC 2426 (vCard).
 Different approaches to solving the issue have been proposed.
 So far no final solution has been found.
 This document provides an alternative approach to the problem using a redundant
 data technique.
\layout Section

Kolab Storage Format dilemma
\layout Standard

In Kolab 1.0 we basically have two incompatible data formats: one for the KDE
 Kolab Client (KMail) and one for the Outlook Plugins (Bynari Connector 1.x
 or Toltec Connector 1.x).
 While the KDE Kolab client uses an icalendar based clear text representation
 of groupware data, the Outlook Plugins use serialized TNEF encoded MAPI
 objects.
 
\layout Standard


\begin_inset Float table
wide false
collapsed false

\layout Caption

Outlook and KDE Kolab Client Data Formats
\begin_inset LatexCommand \ref{Dataformat}

\end_inset 


\layout Standard


\begin_inset  Tabular
<lyxtabular version="3" rows="8" columns="2">
<features>
<column alignment="center" valignment="top" leftline="true" width="0">
<column alignment="center" valignment="top" leftline="true" rightline="true" width="0">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard

Outlook
\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard

KDE Client
\end_inset 
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard

tnef
\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard

ical
\end_inset 
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $\vec{{M}}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $\vec{{N}}$
\end_inset 


\end_inset 
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $M_{1}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $N_{1}$
\end_inset 


\end_inset 
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $M_{2}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $N_{2}$
\end_inset 


\end_inset 
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $M_{3}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $N_{3}$
\end_inset 


\end_inset 
</cell>
</row>
<row topline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $M_{4}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard

\end_inset 
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\layout Standard


\begin_inset Formula $M_{5}$
\end_inset 


\end_inset 
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\layout Standard

\end_inset 
</cell>
</row>
</lyxtabular>

\end_inset 


\end_inset 


\layout Standard

As shown in Table XXX the Outlook Plugin and the KDE Kolab Client use
 very different representations of data, with different encodings and a different
 number of attributes or properties.
 This is because
 Outlook internally uses a totally different data model, while the KDE
 Kolab Client simply uses icalendar internally.
 MS Outlook is MAPI based and uses a MAPI Storage Container.
 All currently available Outlook Plugins for Kolab use a PST file.
 The PST file can be considered similar to a hierarchical database
 that uses an access path called the MAPI entry ID to reference an
 object.
 
\layout Standard

Within a MAPI object there are numerous properties similar to icalendar
 attributes.
 The number of MAPI properties is typically much larger than the number
 of icalendar attributes describing the same object, e.g.
 a calendar entry or a contact (vcard).
 The latter fact is shown in table XXX.
 For every icalendar attribute there is a corresponding MAPI property, but
 not every MAPI property has a corresponding icalendar attribute.
\layout Standard

As a consequence, a mapping from MAPI to icalendar is 
\emph on 
not
\emph default 
 reversible.
\layout Standard


\begin_inset Formula \[
\vec{M}\rightarrow\vec{N}\rightarrow\vec{M'}\]

\end_inset 


\layout Standard


\begin_inset Formula \[
\vec{M}\neq\vec{M'}\]

\end_inset 
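
\layout Standard

The loss of information can be illustrated with a minimal sketch.
 It assumes the placeholder names M1 to M5 and N1 to N3 from the table above
 and a hypothetical one-to-one mapping between M1..M3 and N1..N3; real MAPI
 property names and icalendar attribute names differ.

```python
# Hypothetical property names taken from the table: M1..M5 on the MAPI side,
# N1..N3 on the icalendar side. M4 and M5 have no icalendar counterpart.
MAPI_TO_ICAL = {"M1": "N1", "M2": "N2", "M3": "N3"}
ICAL_TO_MAPI = {n: m for m, n in MAPI_TO_ICAL.items()}


def mapi_to_ical(mapi):
    """Map a MAPI object to icalendar, dropping properties without a counterpart."""
    return {MAPI_TO_ICAL[k]: v for k, v in mapi.items() if k in MAPI_TO_ICAL}


def ical_to_mapi(ical):
    """Map an icalendar object back to MAPI; only mapped properties come back."""
    return {ICAL_TO_MAPI[k]: v for k, v in ical.items()}


m = {"M1": "a", "M2": "b", "M3": "c", "M4": "d", "M5": "e"}
n = mapi_to_ical(m)
m_prime = ical_to_mapi(n)

# Properties that did not survive the round trip M -> N -> M'.
lost = sorted(set(m) - set(m_prime))
```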


\layout Section

Kolab Storage Format 2.0
\layout Standard

The proposed solution to the dilemma described in the previous section
 is the following method, based on redundant storage of attributes and properties
 while maintaining the coherency of the data at the application level.
 We hereby assume that the KDE Kolab Client is able to parse a TNEF encoded
 MAPI object using libktnef in order to retrieve any relevant data and that
 the Outlook plugin is able to parse icalendar objects.
 In fact, in order to parse a TNEF encoded MAPI object not much
 about MAPI objects needs to be understood.
 I verified that a simple brute force reverse engineering effort would be
 sufficient to figure out which properties contain the required data.
 The biggest effort is to find the most suitable property containing the
 often redundant data.
 E.g.
 a MAPI object representing a calendar entry contains the names of the attendees
 in 
\emph on 
multiple
\emph default 
 properties simultaneously.
 A lot of this work has already been done for the libktnef implementation.
 
\layout Standard

I assume that the redundancy within MAPI is due to the long history (
\begin_inset Formula $>10$
\end_inset 

 years) of the Windows MAPI compatibility cruft.
 Using the data obtained from parsing the TNEF encoded MAPI object, the KDE
 Kolab client is then able to create a valid and complete icalendar object.
 The KDE Kolab Client can now easily store the icalendar object on the Kolab
 server.
 On the other hand, the KDE Kolab Client is able neither to update nor to
 create a TNEF encoded MAPI object.
 If the KDE Kolab Client changes any attribute, as is very common with
 shared folders, the data stored in the icalendar object is out of
 sync with the original MAPI object.
\layout Standard

When a KDE Kolab Client accesses the same object again, it has to detect
 that the MAPI object has already been synchronized to the icalendar
 object, and therefore the KDE Kolab Client has to consider the icalendar
 object to be authoritative.
\layout Standard

On the other hand, if the Outlook Kolab Client using an appropriate plugin
 sees the modified object, then this plugin first has to read the outdated
 TNEF encoded MAPI object and then figure out that the corresponding icalendar
 object is newer than the MAPI object.
 The plugin then parses the icalendar object and uses this newer data
 to update the outdated data of the MAPI object.
 This method allows for preservation of MAPI specific properties not available
 in the icalendar object, while the KDE Kolab client is not required to keep
 the TNEF encoded MAPI object up to date.
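\layout Standard

The update step performed by the Outlook plugin might look like the following
 sketch.
 The property names M1..M5 and N1..N3 and the mapping between them are
 hypothetical placeholders, and objects are modelled as plain dictionaries;
 this is an illustration under those assumptions, not the behaviour of any
 existing plugin.

```python
# Hypothetical mapping from icalendar attribute names to MAPI property names.
ICAL_TO_MAPI = {"N1": "M1", "N2": "M2", "N3": "M3"}


def refresh_mapi(outdated_mapi, newer_ical):
    """Return an updated MAPI object.

    Values from the newer icalendar object win for every mapped property;
    MAPI-only properties (here M4 and M5) are kept unchanged.
    """
    updated = dict(outdated_mapi)  # copy, so the outdated object stays intact
    for ical_name, value in newer_ical.items():
        updated[ICAL_TO_MAPI[ical_name]] = value
    return updated


old_mapi = {"M1": "Meeting", "M2": "10:00", "M3": "Room 1", "M4": "x", "M5": "y"}
new_ical = {"N1": "Meeting", "N2": "11:00", "N3": "Room 2"}
merged = refresh_mapi(old_mapi, new_ical)
```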
\layout Standard


\series bold 
Short summary:
\series default 
 The basic idea is that the KDE Kolab client is able to parse both the icalendar
 objects and the TNEF encoded MAPI objects but is only able to write icalendar
 objects to the Kolab server.
 On the other hand, the Outlook Kolab client is expected to be able to parse
 its own TNEF encoded MAPI object and, in addition, icalendar objects as written
 by the KDE Kolab client.
 This means that we only require both clients to be able to 
\emph on 
parse both formats
\emph default 
 for 
\emph on 
reading
\emph default 
 while they only need to be able to 
\emph on 
write
\emph default 
 their 
\emph on 
own native
\emph default 
 formats.
\layout Subsection

Implementation
\layout Standard

The Kolab Storage format is implemented using multi-part mime messages as
 defined in RFC 2045 and RFC 2046.
 The Content-Type of a TNEF encoded MAPI object is 
\series bold 
application/ms-tnef
\series default 
.
 The Content-Type of an icalendar object is 
\series bold 
text/calendar
\series default 
.
 Each part of the multi-part mime message has a 
\series bold 
Content-Description
\series default 
 header field carrying a serial number.
 This serial number is required in order to figure out which
 part is up to date and therefore authoritative.
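\layout Standard

A minimal sketch of such a message, using the Python standard library email
 package, is given below.
 The multipart subtype "mixed", the use of the bare decimal serial number
 as the complete Content-Description value, and the helper names build_message
 and authoritative_part are assumptions made for illustration; the payloads
 are placeholders, not valid TNEF or icalendar data.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(tnef_bytes, tnef_serial, ical_text, ical_serial):
    """Build a two-part message; each part carries its serial number."""
    msg = MIMEMultipart("mixed")  # subtype is an assumption

    tnef = MIMEApplication(tnef_bytes, "ms-tnef")  # application/ms-tnef
    tnef["Content-Description"] = str(tnef_serial)
    msg.attach(tnef)

    ical = MIMEText(ical_text, "calendar", "utf-8")  # text/calendar
    ical["Content-Description"] = str(ical_serial)
    msg.attach(ical)
    return msg


def authoritative_part(msg):
    """Return the leaf part with the highest serial number."""
    parts = [p for p in msg.walk() if not p.is_multipart()]
    return max(parts, key=lambda p: int(p["Content-Description"]))


# Placeholder payloads; a KDE client has written serial 2 after TNEF serial 1.
demo = build_message(b"placeholder", 1, "BEGIN:VCALENDAR", 2)
newest = authoritative_part(message_from_string(demo.as_string()))
```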
 
\layout Standard


\begin_inset Formula $\vec{M_{0}}\rightarrow^{OL}\vec{M_{1}}\rightarrow^{KDE}\frac{\vec{M_{1}}}{\vec{N_{1}}}$
\end_inset 


\begin_inset Formula $\rightarrow^{KDE}\frac{\vec{M_{1}}}{\vec{N_{2}}}\rightarrow^{OL}\frac{\vec{M_{3}}}{\vec{N_{2}}}$
\end_inset 


\begin_inset Formula $\rightarrow^{KDE}\frac{\vec{M_{3}}}{\vec{N_{4}}}$
\end_inset 


\layout Standard

In the above example the Outlook Kolab client creates a TNEF encoded multi-part
 mime message 
\begin_inset Formula $\vec{M_{0}}$
\end_inset 

 containing a single part with the serial number 0.
 This TNEF encoded MAPI object is subsequently read and modified by an Outlook
 Kolab client.
 The resulting modified object 
\begin_inset Formula $\vec{{M_{1}}}$
\end_inset 

 is then given the updated serial number 1.
 In the next step a KDE Kolab client reads the multi-part mime message containing
 only a single part with the Content-Type TNEF and notices that there is
 no icalendar part currently present.
 Consequently the KDE Kolab client parses the TNEF encoded part 
\begin_inset Formula $\vec{{M_{1}}}$
\end_inset 

 for all required information and creates 
\begin_inset Formula $\vec{N_{1}}$
\end_inset 

.
 If the KDE Kolab client then changes the object 
\begin_inset Formula $\vec{{N_{1}}}$
\end_inset 

 it has to increase the serial number to 
\begin_inset Formula $\vec{{N_{2}}}$
\end_inset 

.
 The serial number of the TNEF encoded MAPI object is never modified by
 the KDE Kolab client and vice versa.
 In this example the subsequent read/modify is done by the Outlook client,
 which therefore first has to read the outdated TNEF part and
 then update this MAPI object using the up-to-date data from the icalendar
 part.
 Because this step is actually a read followed by
 a modify, the Outlook Kolab client writes the 
\begin_inset Formula $\vec{{M_{3}}}$
\end_inset 

 part while preserving the 
\begin_inset Formula $\vec{{N_{2}}}$
\end_inset 

 part from the KDE Kolab client.
 Lastly, a KDE Kolab client accesses the message, reads the icalendar part
 and notices that the TNEF part is more recent than the icalendar part.
 Again the TNEF part is parsed and the icalendar part is updated.
 If the KDE Kolab client did not change the data, it would write 
\begin_inset Formula $\vec{{N_{3}}}$
\end_inset 

 while preserving 
\begin_inset Formula $\vec{{M_{3}}}$
\end_inset 

, but in this example the KDE Kolab client modified the data and consequently
 wrote 
\begin_inset Formula $\vec{{N_{4}}}$
\end_inset 

 while preserving 
\begin_inset Formula $\vec{M_{3}}$
\end_inset 

.
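
\layout Standard

The serial numbers of the example can be reproduced with the following sketch.
 The two rules it encodes are one reading of the walkthrough above and are
 not spelled out in this form by the proposal: a pure synchronization copies
 the serial number of the newer foreign part, and a modification assigns
 the highest serial number plus one.
 The function names and the dictionary keys "tnef" and "ical" are hypothetical.

```python
def sync(parts, own, other):
    """Refresh the client's own part from the other part if that one is newer."""
    updated = dict(parts)
    if other in parts and parts[other] > parts.get(own, -1):
        updated[own] = parts[other]  # converted copy inherits the serial
    return updated


def modify(parts, own, other):
    """Synchronize first, then write the own part with a fresh serial number."""
    updated = sync(parts, own, other)
    updated[own] = max(updated.values(), default=-1) + 1
    return updated


# Replay the example: each step names the operation and the acting client.
steps = [
    (modify, "tnef", "ical"),  # Outlook creates M0
    (modify, "tnef", "ical"),  # Outlook modifies, M1
    (sync, "ical", "tnef"),    # KDE converts M1 into N1
    (modify, "ical", "tnef"),  # KDE modifies, N2
    (modify, "tnef", "ical"),  # Outlook reads N2 and writes M3
    (modify, "ical", "tnef"),  # KDE reads M3 and writes N4
]

state = {}
history = []
for operation, own, other in steps:
    state = operation(state, own, other)
    history.append(dict(state))
```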


\layout Section

Kolab Storage Format 3.0
\layout Standard

With Kolab 3.0 I want to propose a new data format for Kolab based on an
 extension of RFC 2445.
 This new format will avoid the need for two parts per mime message and avoid
 unnecessary redundancy.
\the_end




