About CalDAV

Martin Konold martin.konold at erfrakon.de
Sat Jan 29 18:34:33 CET 2005


On Wednesday, 12 January 2005 01:17, Helge Hess wrote:

Hi Helge,


may I suggest moving the discussion about storage formats to 

https://kolab.org/mailman/listinfo/kolab-format

> > How do you intend to keep coherency with etags?

> same terminology. What is the situation you want to avoid?
>
> Etags (in combination with HTTP if-match and if-none-match headers)
> ensure that an object is only updated in case it wasn't modified in the
> meantime (eg by a different client/user)

How do you know that the object has not been modified in the meantime without 
locks/being online?
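
For readers who have not met the mechanism Helge describes, here is a minimal
in-memory sketch of an If-Match style conditional update. The `Store` class,
its methods and the content-hash etag are made up for illustration and are not
taken from any CalDAV or Kolab implementation. The sketch only shows that a
client which cached an etag while offline learns about a conflict at the moment
it reconnects and tries to write.

```python
import hashlib
from typing import Optional


class PreconditionFailed(Exception):
    """Raised when the caller's etag is stale (analogue of HTTP 412)."""


class Store:
    """Hypothetical in-memory object store keyed by name."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def _etag(body: str) -> str:
        # Assumption: the etag is a hash of the body. Any value that
        # changes whenever the body changes would do.
        return hashlib.sha1(body.encode("utf-8")).hexdigest()

    def get(self, name: str):
        body = self._objects[name]
        return body, self._etag(body)

    def put(self, name: str, body: str, if_match: Optional[str] = None) -> str:
        # With if_match set, refuse the write unless the stored object
        # still carries the etag the client last saw.
        if if_match is not None:
            current = self._objects.get(name)
            if current is None or self._etag(current) != if_match:
                raise PreconditionFailed(name)
        self._objects[name] = body
        return self._etag(body)


store = Store()
store.put("event-1", "Lunch 12:00")
_, seen = store.get("event-1")        # client caches this etag, goes offline
store.put("event-1", "Lunch 13:00")   # meanwhile somebody else edits the event

try:
    store.put("event-1", "Lunch 12:30", if_match=seen)
    conflict = False
except PreconditionFailed:
    conflict = True                   # noticed only when the client reconnects
```

The offline client cannot know about the other edit earlier than this; what the
etag buys is that the stale write is rejected instead of silently overwriting.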

> "Intended" and "cannot" bite anyway, since there are no CalDAV clients
> yet ...

Which leads to the question why you claim that the working Kolab specification 
is an NIH phenomenon... Actually I claim that the CalDAV people should look 
closely at how Kolab solved the issue and maybe adopt some of the semantics, 
formats etc.

> > The caching for HTTP is to my knowledge intended for static data.
>
> I don't know what you consider static data. HTTP caches are kept
> consistent by querying and comparing just the etag from the server
> prior delivery.

A typical calendar event is a _very_ small amount of data. So the overhead to 
query the original server before fetching the actual data from a proxy can 
easily lead to a lower speed.

> This allows for excellent scalability and speedup, 
> especially when using the cache server in front of a processing intense
> backend implementation (like the regular OGo server, sigh ;-).

Actually this is what I mostly don't like about OGo. It is very fat and needs 
enormous processing power compared to Kolab. 

The http cache is supposed to hide this problem while imho it cannot really 
fix it because you want to keep coherency.

> Of course the "ratio" between read/write is different for CalDAV than
> for some HTML pages, but the mechanism applies.

With calendaring and caching/disconnected Kolab clients the typical R/W ratio 
on the server is 1:1.

> > Why? Only modifications to individual events need to be regularly
> > fetched.
>
> IMAP4 doesn't even allow modifications of events, you should know that

This is a _big_ advantage of IMAP4. "Modification" of events is modelled by 
appending a new object followed by a delete of the old object in the store. 

This model allows for _extreme_ lock free performance. 

Even with relational databases plain appends are much faster than 
modifications of rows.
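
A minimal sketch of this append-then-delete model, assuming a hypothetical
in-memory `Mailbox` class (this is not Cyrus or Kolab code, just an
illustration of the semantics):

```python
import itertools


class Mailbox:
    """Hypothetical append-only folder: objects are never edited in place."""

    def __init__(self):
        self._next_uid = itertools.count(1)
        self.messages = {}     # uid -> body, written exactly once
        self.deleted = set()   # uids flagged as deleted

    def append(self, body: str) -> int:
        uid = next(self._next_uid)
        self.messages[uid] = body
        return uid

    def delete(self, uid: int) -> None:
        self.deleted.add(uid)

    def modify(self, old_uid: int, new_body: str) -> int:
        # "Modification" is an append of the new version followed by a
        # delete of the old one; the stored body of old_uid is never touched.
        new_uid = self.append(new_body)
        self.delete(old_uid)
        return new_uid

    def live(self) -> dict:
        return {u: b for u, b in self.messages.items() if u not in self.deleted}


box = Mailbox()
first = box.append("Meeting 10:00")
second = box.modify(first, "Meeting 11:00")
```

Because an existing object is never rewritten, a reader holding the old UID
still sees a consistent (if outdated) object until the delete is applied.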

> a) DASL query on the current state of etags, either all or related to
> some tombstone
> b) retrieval of all new or changed objects

In order to retrieve all new messages I can blindly ask the server to send me 
all new stuff with

1 C FETCH <lastUID>:*

So the new messages get immediately delivered to the client. The server can 
decide in which order to send messages e.g. allowing the server to first send 
small messages and later send the bigger messages.

While the new messages arrive the IMAP4 client asks in between for an update 
of the IMAP FLAGS.
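
The client-side bookkeeping for this can be sketched as follows. The server is
simulated by a plain dict of UID to body, and `fetch_new` and `Client` are
hypothetical names; a real client would send a UID FETCH command instead, for
instance through `imaplib.IMAP4.uid` in Python.

```python
def fetch_new(server_messages: dict, last_uid: int) -> dict:
    """Return every message whose UID is greater than last_uid."""
    return {uid: body
            for uid, body in sorted(server_messages.items())
            if uid > last_uid}


class Client:
    """Hypothetical disconnected client that remembers the highest UID seen."""

    def __init__(self):
        self.last_uid = 0
        self.cache = {}

    def sync(self, server_messages: dict) -> list:
        new = fetch_new(server_messages, self.last_uid)
        self.cache.update(new)
        if new:
            self.last_uid = max(new)
        return sorted(new)   # UIDs fetched in this round


server = {1: "event A", 2: "event B"}
client = Client()
first_round = client.sync(server)    # everything is new
server[3] = "event C"
second_round = client.sync(server)   # only the new message is transferred
```

The only state the client has to keep between sessions is the highest UID it
has already seen.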

> - IMAP4 is not stateless, high traffic sides need to have the sockets
> open for much
>    longer (and you can't have much more than a few thousands open
> sockets on a single
>    system)

If the number of sockets is too high the server may simply close some idle 
connections. (I think http servers do the same)

> - since you can't keep the connection open at this scale, it requires a
> lot more
>    TCP/IP roundtrips - potentially a login, select, retrieve, etc for
> each object

This is the very same with HTTP. For a scalable server doing this for each 
object does not solve the issue but is counterproductive.

> - IMAP4 is harder to cluster (yes, you can do that, but its way more
> complicated than
>    with just putting a Squid in front)

Squid does not help with clustering at all. In the end the load remains 
on the backend server.

Actually with Kolab 2 we already have full support for multi-location setups. 
Nothing prevents using the multi-location feature to simply scale the 
performance in a single location.

Actually the _big_ advantage of the Kolab multi-location support is that you 
don't need that single big backend server.

Last but not least the Kolab architecture actually also allows for IMAP 
caches. Technically the disconnected IMAP we use with Kolab is nothing else 
than a persistent cache on the client side.

If the need really arises (I doubt that this happens anytime soon) we can 
easily implement an IMAP cache which is then the equivalent of Squid.

> Not that this is a proof at all ;-), but Microsoft is even using HTTP
> for Outlook access to HotMail.

IMHO this is no proof at all.

> Well, thats more an issue of the storage, not of the protocol (which is
> fixed to Cyrus in Kolab, but open in CalDAV). I suppose you can hold
> tens of thousands of items in a single Cyrus which should be enough for
> almost any application you can imagine.

My test server has about 1,000,000 objects in a single partition. I cannot see any degradation so far.

> The issue is access scalability.

IMAP is scalable. With the Kolab approach (using disconnected IMAP) it is trivial to even add IMAP acceleration caches to the game. (Though I think this will only be useful with much more than 10,000 concurrent users.)

> Yet we don't need to finish that discussion since it has little
> practical relevance at all even for the largest agency or enterprise
> setups. And for enduser portals, Yahoo and Web.de are supposed to make
> their minds up on their own ;-)

Well, Kolab could in a future version (maybe Kolab 3) offer many features which are interesting for businesses doing ASP.

> OK, so we are in line here. Thats what pipelining means in the HTTP
> context and is a standard HTTP/1.1 feature.

No. HTTP/1.1 requires replies to be sent in order. IMAP on the other hand easily allows for reordering, which can improve the perceived speed (latency) a lot, e.g. when retrieving mails over a low-bandwidth link. The server may reorder the sending of messages so that it sends the smaller messages first.
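
A small worked example of the effect. The message sizes and the link speed
below are invented for illustration, and round trips and protocol overhead are
ignored; the sketch only compares when each message has fully arrived if the
server answers in request order versus smallest first.

```python
def completion_times(sizes_bytes, bytes_per_second):
    """Time at which each message has fully arrived, sent back to back."""
    elapsed = 0.0
    times = []
    for size in sizes_bytes:
        elapsed += size / bytes_per_second
        times.append(elapsed)
    return times


def mean(values):
    return sum(values) / len(values)


requested = [500_000, 2_000, 1_000]   # hypothetical sizes in request order
link = 5_000                          # hypothetical link speed in bytes/s

in_order = completion_times(requested, link)
smallest_first = completion_times(sorted(requested), link)

print("mean wait, in order:       %.1f s" % mean(in_order))
print("mean wait, smallest first: %.1f s" % mean(smallest_first))
```

With these numbers the total transfer time is the same in both cases, but the
mean wait per message drops from roughly 100 s to about 34 s because the two
small messages no longer queue behind the large one.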

> > Kolab the server tells the client that the calendar changed.
>
> No, that just means that HTTP doesn't provide the functionality and
> that you need a different protocol to gain that functionality. While
> this isn't included in HTTP, it *is* intended for CalDAV, and thats
> what we are comparing ;-)
> Summary: this means a CalDAV client must *not* poll the server.

Sorry, this is not about facts but about intentions...

> > Jabber is very inefficient and expensive across slow links e.g. GPRS.
>
> Jabber is used in huge setups (much larger than Cyrus deployments) and
> scales extremely well - this is a proven fact you can hardly attack.

You are not replying to my claim. I say that Jabber is very inefficient across slow links. I was not saying anything about scalability.

> GPRS is ~56K? So you can send out 165 notifications in a second.

GPRS is mainly charged by traffic, not by time (typically about 9¢ per 10KB). So a factor of 10 does indeed matter in monetary terms!
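
To put rough numbers on that, here is a small calculation using the tariff
quoted above. The notification sizes and the count are invented purely for
illustration, 10KB is taken as 10,000 bytes, and any billing granularity of a
real tariff is ignored.

```python
CENTS_PER_10_KB = 9        # tariff quoted above
BYTES_PER_10_KB = 10_000   # assumption: 10KB means 10,000 bytes


def cost_in_cents(total_bytes: int) -> float:
    """Traffic cost under a purely volume-based tariff."""
    return total_bytes / BYTES_PER_10_KB * CENTS_PER_10_KB


notifications = 1_000      # hypothetical number of notifications
compact_size = 50          # hypothetical bytes per compact notification
verbose_size = 500         # hypothetical: ten times larger

compact_cost = cost_in_cents(notifications * compact_size)
verbose_cost = cost_in_cents(notifications * verbose_size)

print("compact: %.0f cents, verbose: %.0f cents" % (compact_cost, verbose_cost))
```

Under a volume-based tariff the cost scales linearly with the bytes sent, so a
factor of 10 in message size is a factor of 10 on the bill.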

> Which makes me wonder why you don't support this in practice and force
> Kolab1 users to upgrade to Kolab2.

Technically the Proko2 client is able to handle both formats. But for practical reasons we only support the more recent XML Format with Kolab2.

> >> You need to compare "Kolab/XML" with "iCalendar" and if you do, there
> >> is no advantage here for Kolab/XML.

Look, there are for example many things you can express with valid iCalendar which cannot be reasonably converted to the internal Outlook format. This leads to massive real-world interoperability issues.

> > It is unfortunately incomplete for our purposes. (Mainly OL stuff
> > missing)
>
> Just out of interest, could you give some examples on Outlook features
> which you support in Kolab/XML are missing in xCal?

By definition xCal is an XML representation of iCalendar. To my knowledge xCal has not gained much support and the draft is already expired.

Regards,
-- martin

-- 
"I am committed to helping Ohio deliver its electoral votes to the
President next year."  -- 2004, Wally O'Dell - CEO of Diebold, Inc. 
e r f r a k o n - Stuttgart, Germany
Erlewein, Frank, Konold & Partner - Beratende Ingenieure und Physiker
