About CalDAV

Wed Jan 12 01:17:19 CET 2005

On 11. Jan 2005, at 12:13 Uhr, Martin Konold wrote:
> Please explain. To my understanding the CalDAV heavily relies on 
> coherency.
> How do you intend to keep coherency with etags?

I'm not sure what your actual question is or whether we use the same 
terminology. What is the situation you want to avoid?

Etags (in combination with the HTTP if-match and if-none-match headers) 
ensure that an object is only updated if it wasn't modified in the 
meantime (e.g. by a different client/user) and that an object is only 
created if no other object is living at the same URL (if-none-match).
Instead of locking the URL in question (which is also possible in 
WebDAV), you pass on the etag of the version your variant of the object 
is based on and the server decides whether the update can be allowed 
(otherwise it returns an error status code, 412 Precondition Failed in 
plain HTTP).
For trivial implementations the iCal sequence should be an excellent 
choice for the etag.
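
To make the mechanism concrete, here is a minimal Python sketch of the 
check a server would perform. The Store class, its method names and the 
SHA-1 based etag are assumptions made for illustration only, they are 
not taken from CalDAV or from any real server.

```python
import hashlib


class PreconditionFailed(Exception):
    """Raised when an If-Match / If-None-Match style check fails."""


class Store:
    """Toy in-memory stand-in for a WebDAV collection (hypothetical)."""

    def __init__(self):
        self._objects = {}  # url -> (etag, body)

    @staticmethod
    def _etag(body):
        # Any value that changes whenever the body changes will do.
        return hashlib.sha1(body.encode("utf-8")).hexdigest()

    def put(self, url, body, if_match=None, if_none_match=None):
        current = self._objects.get(url)
        # If-None-Match: * -> create only if nothing lives at this URL.
        if if_none_match == "*" and current is not None:
            raise PreconditionFailed("object already exists")
        # If-Match: <etag> -> update only if the stored version is unchanged.
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed("object was modified in the meantime")
        etag = self._etag(body)
        self._objects[url] = (etag, body)
        return etag


if __name__ == "__main__":
    store = Store()
    v1 = store.put("/cal/1.ics", "SUMMARY:Lunch", if_none_match="*")
    store.put("/cal/1.ics", "SUMMARY:Late lunch", if_match=v1)  # accepted
    try:
        store.put("/cal/1.ics", "SUMMARY:Dinner", if_match=v1)  # stale etag
    except PreconditionFailed as exc:
        print("rejected:", exc)
```

The second client in the usage part still holds the old etag, so its 
update is rejected without any lock having been taken.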

>> Please point out what issues you see with etags in offline
>> synchronisation scenarios.
> The intended CalDAV clients cannot deal with etags to my knowledge.

CalDAV clients MUST be able to deal with etags as per specification. 
CalDAV strictly requires full HTTP, WebDAV and DASL (and is IMHO a 
pretty "fat" specification due to this).

"Intended" and "cannot" contradict each other anyway, since there are 
no CalDAV clients yet ...

>> HTTP/1.1 draft as well as point to the broad support for caching proxy
>> servers used in practice.
> The caching for HTTP is to my knowledge intended for static data.

I don't know what you consider static data. HTTP caches are kept 
consistent by querying and comparing just the etag from the server 
prior to delivery. This allows for excellent scalability and speedup, 
especially when using the cache server in front of a 
processing-intensive backend implementation (like the regular OGo 
server, sigh ;-).

Of course the "ratio" between read/write is different for CalDAV than 
for some HTML pages, but the mechanism applies.
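
A rough Python sketch of that revalidation step, with an in-memory 
"origin" standing in for the expensive backend. The class names, the 
version counter and the etag format are made up for the example; a real 
cache such as Squid does considerably more.

```python
class Origin:
    """Toy backend (hypothetical): expensive to render, cheap to revalidate."""

    def __init__(self):
        self.version = 1
        self.renders = 0  # counts how often the expensive path ran

    def etag(self):
        return '"v%d"' % self.version

    def fetch(self, if_none_match=None):
        # Conditional GET: answer 304 without a body if the etag still matches.
        if if_none_match == self.etag():
            return 304, self.etag(), None
        self.renders += 1
        return 200, self.etag(), "calendar data, version %d" % self.version


class Cache:
    """Revalidates its single entry against the origin on every request."""

    def __init__(self, origin):
        self.origin = origin
        self.entry = None  # (etag, body)

    def get(self):
        known = self.entry[0] if self.entry else None
        status, etag, body = self.origin.fetch(if_none_match=known)
        if status == 200:
            self.entry = (etag, body)  # new or changed: replace the copy
        return self.entry[1]
```

As long as the object is unchanged the backend only compares etags, the 
body is rendered again only after a modification.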

>> As an application consider a public calendar webservice like the Yahoo
>> TV schedule - a single calendar is accessed by millions of people.
>> Something like this will give you big headache with IMAP4
> Why? Only modifications to individual events need to be regularly 
> fetched.

IMAP4 doesn't even allow modifications of events, you should know that 
;-) Anyway, the same is true for CalDAV. The operations are
a) DASL query on the current state of etags, either all or related to 
some tombstone
b) retrieval of all new or changed objects
As mentioned before, IMAP4 has a minor advantage in the size of the 
changeset.
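
The two steps can be sketched in Python as a comparison of etag maps. 
The function name and the url -> etag dictionaries are assumptions for 
illustration; how the server-side map is obtained (e.g. the result of 
the DASL query in step a) is left out.

```python
def plan_sync(local_etags, server_etags):
    """Compare url -> etag maps and return what the client has to do.

    Returns (to_fetch, to_delete): objects that are new or changed on
    the server, and objects the client still has but the server dropped.
    """
    to_fetch = sorted(
        url for url, etag in server_etags.items()
        if local_etags.get(url) != etag
    )
    to_delete = sorted(url for url in local_etags if url not in server_etags)
    return to_fetch, to_delete


if __name__ == "__main__":
    local = {"/cal/1.ics": "a1", "/cal/2.ics": "b1", "/cal/3.ics": "c1"}
    server = {"/cal/1.ics": "a1", "/cal/2.ics": "b2", "/cal/4.ics": "d1"}
    fetch, delete = plan_sync(local, server)
    print("fetch:", fetch)    # changed 2.ics and new 4.ics
    print("delete:", delete)  # 3.ics vanished on the server
```

Only the objects in the first list need to be retrieved in step b).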

> This works perfectly with IMAP4. The nice thing about IMAP4 is that the
> effort is O(1)!

Various issues:
- IMAP4 is not stateless, high traffic sites need to keep the sockets 
  open for much longer (and you can't have much more than a few 
  thousand open sockets on a single system)
- since you can't keep the connection open at this scale, it requires a 
  lot more TCP/IP roundtrips - potentially a login, select, retrieve, 
  etc for each object
- IMAP4 is harder to cluster (yes, you can do that, but it's way more 
  complicated than just putting a Squid in front)

Not that this is a proof at all ;-), but Microsoft is even using HTTP 
for Outlook access to HotMail.

>> obviously this isn't in the focus of Kolab either).
> Kolab suits well to very big public calendars.

Well, that's more an issue of the storage (which is fixed to Cyrus in 
Kolab, but open in CalDAV), not of the protocol. I suppose you can hold 
tens of thousands of items in a single Cyrus which should be enough for 
almost any application you can imagine.

Anyway, I didn't question that. *Big* calendars are not really hard 
since maintaining a collection of 50,000 objects or more is no rocket 
science.
The issue is access scalability.

Yet we don't need to finish that discussion since it has little 
practical relevance at all even for the largest agency or enterprise 
setups. And for enduser portals, Yahoo and Web.de are supposed to make 
their minds up on their own ;-)

>> Unless there is a terminology mismatch HTTP/1.1 supports message
>> pipelining on a single socket since the initial revision (several
>> years old).
>
> Pipelining in the context of IMAP4 means that messages can be fetched 
> like
>
> 1 C FETCH 1
> 2 C FETCH 2
> 3 C FETCH 3
>
> and then the server may send the results in any order.
>
> This helps to reduce latency a lot. Think about slow / high latency 
> links.

OK, so we are in line here. That's what pipelining means in the HTTP 
context and is a standard HTTP/1.1 feature. Also see:

   http://www.mozilla.org/projects/netlib/http/pipelining-faq.html

A minor difference is that the results must be returned in order. I 
would be interested under what conditions ordering might be an issue.
In case there is one, you still have the option of using a DASL query 
without sort ordering to accomplish incremental fetches (fetch 3 
objects at once in arbitrary order).

>> But HTTP itself does intentionally not provide a "back-channel" for
>> (web-scale) scalability reasons.
> This means that the client must poll often in order to stay current
> while with Kolab the server tells the client that the calendar changed.

No, that just means that HTTP doesn't provide the functionality and 
that you need a different protocol to gain that functionality. While 
this isn't included in HTTP, it *is* intended for CalDAV, and that's 
what we are comparing ;-)
Summary: this means a CalDAV client does *not* need to poll the server.

>> you'll notice that this is supposed to get covered using XMPP 
>> (Jabber),
> Jabber is very inefficient and expensive across slow links e.g. GPRS.

Jabber is used in huge setups (much larger than Cyrus deployments) and 
scales extremely well - this is a proven fact you can hardly attack.

But yes, on very slow links it has the usual XML overhead, though 
calling it expensive is exaggerated. A GPRS connection is usually 
per-user and an XMPP notification packet might be around 200 bytes. 
This is large compared to 10 bytes or something for the IMAP4 back 
channel.
GPRS is ~56 kbit/s? Then you can send out about 35 such notifications 
per second. Since notifications are only sent in case something 
actually changes and can be coalesced in non-trivial implementations, 
this is hardly an issue even on 9.6 modem lines.
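
For reference, the back-of-the-envelope calculation as a small Python 
sketch. It assumes that the link speeds are given in bits per second 
and ignores protocol framing and latency entirely; the function name is 
made up for the example.

```python
def notifications_per_second(link_bits_per_second, packet_bytes):
    """Whole notification packets that fit through the link per second."""
    return link_bits_per_second // (packet_bytes * 8)


if __name__ == "__main__":
    # Assumed figures: ~56 kbit/s GPRS, 9.6 kbit/s modem, 200 byte XMPP
    # packet, 10 byte IMAP4 back channel message.
    print(notifications_per_second(56000, 200))  # 35
    print(notifications_per_second(9600, 200))   # 6
    print(notifications_per_second(56000, 10))   # 700
```

Even a handful of notifications per second is far more than a single 
user's calendar is likely to generate.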

Premature optimization isn't any good either ;-)

>> retrieval and upload is more network efficient in HTTP since it
>> supports gzip/deflate transport encoding per default (and this feature
>> is widely supported).
> Doing gzip/deflate in the clients is trivial for Kolab clients and
> scales much better than server side gzip/gunzip.

True. Since the server doesn't understand the content, it doesn't need 
to decompress it. Valid point.

>> Notably with HTTP you can easily support both (XML and iCal encoded
>> iCal entities) using content negotiation without the requirement to
>> change any other part of the infrastructure.
> We can do the very same with plain IMAP folder annotation or message 
> headers.

While I don't understand the folder annotation point, also true (you 
can use the message content-type).
Which makes me wonder why you don't support this in practice and force 
Kolab1 users to upgrade to Kolab2.
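
On the HTTP side, content negotiation boils down to matching the 
client's Accept header against the representations the server can 
produce. A simplified Python sketch follows; it handles q values and 
"*/*" but no type wildcards like "text/*". The function names and the 
use of application/xml for the XML encoding are assumptions for 
illustration.

```python
def parse_accept(header):
    """Return [(media_type, q)] sorted by descending preference."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality if no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((fields[0], q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda p: -p[1])


def negotiate(accept_header, available):
    """Pick the representation to send, or None if nothing is acceptable."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            return available[0]  # server default
        if media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    available = ["text/calendar", "application/xml"]
    print(negotiate("application/xml;q=0.9, text/calendar", available))
    print(negotiate("application/xml", available))
```

The same URL can thus serve both encodings, and adding a third one 
only touches the list of available representations.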

>>> Q: Why not iCalendar?
>>> A: Well iCalendar is fine for doing invitations e.g. via email
>>> transport. This is done in Kolab 1 and Kolab 2. iCalendar is
>>> unfortunately not well defined for doing storage of calendars which
>>> are supposed to be accessed by multiple client implementations.
>> Please elaborate. Whats the issue here?
> The iCalendar standard is not precise enough. It leaves too many valid
> manners to express the same.

Since you claim that the standard is precise enough for doing 
invitations I'm surprised that you claim it isn't precise enough for 
storage. Isn't exactly the same precision required to perform correct 
iMIP?

Also I would be interested in
a) what exactly is duplicated in iCalendar (please give an example of 
   constructs which express the same fact)
b) what isn't precise enough - having multiple manners to express the 
   same thing is not the same as a lack of precision. Precision only 
   requires that all variants are precisely specified. To my best 
   knowledge this is the case.

iCalendar is quite complex, but this isn't because it duplicates stuff 
but because it models an extremely wide range of real world calendaring 
issues.

>> You need to compare "Kolab/XML" with "iCalendar" and if you do, there
>> is no advantage here for Kolab/XML.
> As said multiple times before I don't believe XML is the holy grail.
> If some group comes up with an improved iCal like standard I see no
> reason not to switch to it in the future.
>
> As of today this fixed iCal like standard does not exist.

You didn't answer the point. Of course there are few (no?) clients 
which implement iCalendar completely. But since you only talk about 
Kolab clients, you could have used a specified subset of iCalendar 
which is implemented instead of doing something completely new from 
scratch.

Of course the fixed iCalendar standard does exist. It just isn't 
implemented fully by anyone. Nitpicking, result is the same of course 
;-)

>> I have another Q:
>> Q: Why not the xCal draft?
> It is unfortunately incomplete for our purposes. (Mainly OL stuff 
> missing)

Just out of interest, could you give some examples of Outlook features 
which you support in Kolab/XML but which are missing in xCal?

Notably - being XML - xCal would allow you to use *any* extension just 
by declaring a new XML namespace for it. Reusing base specifications 
and enhancing them with local features is a core XML feature.
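
A small Python sketch of the idea. Both namespace URIs and all element 
names below are placeholders invented for the example; they are neither 
the real xCal namespace nor anything from Kolab/XML.

```python
import xml.etree.ElementTree as ET

BASE_NS = "urn:example:base-calendar"   # hypothetical base vocabulary
EXT_NS = "urn:example:local-extension"  # hypothetical local extension

ET.register_namespace("cal", BASE_NS)
ET.register_namespace("x", EXT_NS)

# An event that mixes base elements with one extension element.
event = ET.Element("{%s}vevent" % BASE_NS)
ET.SubElement(event, "{%s}summary" % BASE_NS).text = "Team meeting"
ET.SubElement(event, "{%s}color-label" % EXT_NS).text = "red"


def base_fields(element):
    """What a client that only knows the base namespace would read."""
    prefix = "{%s}" % BASE_NS
    return {
        child.tag[len(prefix):]: child.text
        for child in element
        if child.tag.startswith(prefix)
    }


if __name__ == "__main__":
    print(ET.tostring(event, encoding="unicode"))
    print(base_fields(event))
```

A client that doesn't know the extension namespace simply skips those 
elements and still gets the complete base data.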

Greets,
   Helge
-- 
http://docs.opengroupware.org/Members/helge/
OpenGroupware.org