[Kolab-devel] 10.000 events in a Resource Calendar
Gunnar Wrobel
wrobel at pardus.de
Tue May 22 18:42:33 CEST 2012
On Tuesday, 22.05.2012, at 17:29 +0100, Jeroen van Meeuwen
(Kolab Systems) wrote:
> On 2012-05-22 15:08, Martin Konold wrote:
> > On Tuesday, 22 May 2012, at 13:26:23, Jeroen van Meeuwen wrote:
> >
> > Hi Jeroen,
> >
> >> > No, this is not a flaw in any way. A delete operation is handled
> >> > exactly like, and together with, write operations. (E.g. EVERY
> >> > modify is actually a delete+write operation by nature of how
> >> > Kolab storage works.)
> >>
> >> Let's take a step back, because we're confusing the issue in OP.
> >
> > What does 'OP' mean?
> >
>
> Original Post(er), which was about possibly finding ways to increase
> the efficiency when operating raw IMAP.
>
> >> The following *actually* happens when an event is deleted (whether
> >> "the
> >> idea behind 0(1)" design or not);
> >>
> >> - Adding or editing an event to a calendar obviously adds a new
> >> object
> >> to IMAP.
> >
> > Correct.
> >
> >> - To remove an event from a calendar, the message could be flagged
> >> \Deleted in IMAP, and (possibly) the folder is expunged (doesn't
> >> matter),
> >
> > Yes. A remove is mapped to a \Deleted flag. Why do you consider the
> > obvious worthwhile to mention?
>
> I'm merely adding the background of a line of thought, or my part of it
> anyway, that brought us to where we are in this part of the
> conversation.
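[Editorial sketch] Martin's point earlier in the thread, that every modify is really a delete+write against append-only IMAP storage with ever-growing UIDs, can be illustrated with a toy in-memory model. The class and names below are hypothetical illustrations, not Kolab code:

```python
class KolabFolder:
    """Toy model of a Kolab calendar folder with IMAP-like semantics."""

    def __init__(self):
        self.next_uid = 1    # UIDs are strictly increasing
        self.messages = {}   # uid -> (event_id, payload)
        self.deleted = set() # uids flagged \Deleted

    def append(self, event_id, payload):
        """A write always creates a new message with a new, higher UID."""
        uid = self.next_uid
        self.next_uid += 1
        self.messages[uid] = (event_id, payload)
        return uid

    def modify(self, old_uid, payload):
        """A modify is delete+write: flag the old message \Deleted,
        append the new revision under a fresh UID."""
        event_id, _ = self.messages[old_uid]
        self.deleted.add(old_uid)
        return self.append(event_id, payload)


folder = KolabFolder()
uid1 = folder.append("evt-1", "v1")
uid2 = folder.modify(uid1, "v2")
assert uid2 > uid1          # the new revision always has a higher UID
assert uid1 in folder.deleted
```

This is why a client only ever needs to look at UIDs above the highest one it has seen: every change, including an edit, surfaces as a new message.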
>
> >> It does bump UIDVALIDITY
> >
> > No, this is plain wrong.
> >
>
> You're right, that is plain wrong. I'm sorry, I meant HIGHESTMODSEQ,
> not UIDVALIDITY.
>
> >> , but... see below.
> >>
> >> - The *client* is to trigger the Free/Busy update,
> >
> > Yes, this is implemented this way in order to keep the patchset small
> > and make the Kolab solution work with any unmodified standards
> > compliant IMAP4 server.
> >
> > (An alternative would be to extend either IMAP4 syntax or IMAP4
> > semantics.)
> >
> >> - CONDSTORE (required for UIDVALIDITY) is not enabled on Kolab 2.3
> >> (Cyrus IMAP 2.3) mailboxes by default,
> >
> > Sorry, this is technically plain wrong.
> >
>
> Yes, indeed, you're right again. A daisy chain of errors, I spat onto
> this list. My apologies, again.
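[Editorial sketch] For context on the mechanism being corrected here: with the CONDSTORE extension (RFC 4551, later folded into RFC 7162), a client tracks the HIGHESTMODSEQ it last saw and asks the server only for messages modified after that point. A minimal sketch of building such a command (values and function name are illustrative only):

```python
def changedsince_fetch(last_modseq: int) -> str:
    # CONDSTORE (RFC 4551): the server returns only messages whose
    # MODSEQ is greater than the client's last known HIGHESTMODSEQ.
    return f"UID FETCH 1:* (FLAGS) (CHANGEDSINCE {last_modseq})"


cmd = changedsince_fetch(710)
# e.g. "UID FETCH 1:* (FLAGS) (CHANGEDSINCE 710)"
```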
>
> >> - The Free/Busy mechanism has little to hold on to, to see what
> >> has changed, unless it maintains a local cache of at least the
> >> UIDs of the messages it used when it last generated the (partial)
> >> Free/Busy,
> >
> > Keeping such a cache for optimisation purposes is trivial and common
> > practice. Actually it is not required for a scalable solution, but
> > this fact is a minor detail which could be discussed separately. The
> > cache is a negligibly small, simple list of 32-bit integers, e.g.
> > 40K in the case of 10,000 events.
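[Editorial sketch] Martin's 40K figure checks out: 10,000 UIDs at 32 bits (4 bytes) each is exactly 40,000 bytes. A quick verification using only the standard library:

```python
import struct

uids = list(range(1, 10_001))                   # 10,000 event UIDs
cache = struct.pack(f"<{len(uids)}I", *uids)    # packed 32-bit unsigned ints
assert len(cache) == 40_000                     # ~40K on disk or on the wire
```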
> >
> >> - Retrieval of events relevant to the period in time could be
> >> made faster using sorting and retrieving the newest objects
> >> first,
> >
> > This is common practice and trivial, but sorting is plain wrong
> > and slow.
> >
> > A sorting approach is a typical relational database approach. There
> > is NO need to do any sorting if you leverage upon the IMAP protocol.
> >
>
> Server-side sorting obviously does leverage the IMAP protocol.
>
> > IMAP guarantees strictly monotonically increasing UID values. Since
> > IMAP does NOT know a modify, every modified or new event results in
> > a new IMAP message which happens to have a UID > LASTSEENUID. (For
> > brevity I will not get into the details of removal.)
> >
> > Therefore the simple rule applies: a "FETCH LASTSEEN+1:*" is
> > sufficient.
> >
>
> Naturally.
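[Editorial sketch] The "FETCH LASTSEEN+1:*" rule above can be expressed as a pure function (function names hypothetical): because UIDs only grow, everything new or modified since the last sync sits above the highest UID seen, so no sorting is required.

```python
def fetch_range(last_seen_uid: int) -> str:
    # IMAP UIDs only grow, so everything new or modified since the
    # last sync has UID > last_seen_uid.
    return f"UID FETCH {last_seen_uid + 1}:* (BODY.PEEK[])"


def new_uids(last_seen_uid: int, mailbox_uids: list[int]) -> list[int]:
    # Client-side equivalent: filter a UID listing without sorting.
    return [u for u in mailbox_uids if u > last_seen_uid]


assert new_uids(7, [3, 5, 7, 9, 12]) == [9, 12]
```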
>
> >> - The client triggering Free/Busy does not simply HEAD a URL and
> >> disconnect
> >
> > No, this claim is wrong, of course this is the case up to today.
> >
>
> I'm not sure this parses. The claim is wrong but it is the case
> to-date?
>
> >> , as this would impede the slice of time any web server code
> >> has available to do what it needs to do. Therefore, a client keeps
> >> open
> >> the connection (and uses GET/POST) until the web server performing
> >> the
> >> Free/Busy updating is done. This is considered a blocking operation
> >> for
> >> clients that cannot do this in the background.
> >
> > This is wrong. Please look at the code.
> >
>
> The code of... the fbview Horde fork server-side,
The Horde fbview fork is something entirely different. It is meant to be
used for "viewing free/busy information", hence fbview. I don't think it
has been used very much and it can't be considered to be much more than
a weird hack.
> or the client (and if
> the latter, which client?).
From Martin's background I would assume Kontact. But I'm not 100%
certain. For Horde I have to admit that this is one of the things that
was not handled correctly - at least in some versions.
>
> >> >> Euh, as far as I know, it is the client software that triggers an
> >> >> update of the free/busy, and not the Kolab server itself, and
> >> unless
> >> >> the
> >> >> client is multi-threaded like Kontact it is also a blocking
> >> >> operation.
> >> >
> >> > Sorry, this is nonsense.
> >>
> >> Thank you for your balanced and well-formulated opinion.
> >
> > I am sorry, but what else should I call it? This is not an opinion
> > but a trivially provable fact that for every Kolab client the
> > trigger by its very nature is non-blocking. After all, it is a
> > trigger.
> >
> >> As I've illustrated before, it's not like Kolab uses FPM or any
> >> other
> >> FastCGI-like implementation,
> >
> > Don't think in terms of a web developer. Kolab does not require any
> > of those implementations in order to have non-blocking free/busy
> > generation. (The current implementation uses a daemon approach in
> > order to avoid extra patching of upstream resources. Though this is
> > an implementation detail.)
> >
>
> Could you please explain this statement? As far as I know, there is
> no daemon whatsoever. Perhaps you could point out the package that is
> responsible for deploying such a freebusy daemon in the sources of
> Kolab 2.3.4?
>
> http://files.kolab.org/server/release/kolab-server-2.3.4/sources/
>
> >> and it's not like the client can simply
> >> HEAD a URI and be done with it (close the connection).
> >
> > But this is exactly what happens. Therefore I call your assumptions
> > and claims nonsense.
> >
>
> Right.
>
> >> > Sorry, but you really got things wrong. The basic idea behind
> >> Kolab
> >> > is NOT to think in terms of a relational database including terms
> >> of
> >> > doing queries all
> >> > the time.
> >> >
> >> > This is the essential point behind Kolab: that it is so
> >> > extremely scalable.
> >> >
> >> > Introducing all these "query" concepts will lead to losing this
> >> > unique property.
> >>
> >> Well, unique != good and most certainly unique != best. At most,
> >> unique
> >> <> common.
> >
> > In this case unique == good, and I consider it insulting that you
> > claim that the existing scalable solution is inferior to your
> > "query" approach while denying all evidence as seen in source and
> > existing binaries.
> >
>
> Insulting you certainly wasn't my intention, so I'm sorry if I did.
>
> > IMHO query is slow, has scalability issues and should be avoided
> > when possible.
> >
> > Leveraging upon guaranteed protocol semantics is good practice and
> > upwards compatible. On the other hand, mapping everything onto a
> > relational database even though the underlying problem does not
> > have relational properties is abuse and leads at least to
> > scalability issues.
> >
>
> Well, I'd appreciate some elaboration on the insights that:
>
> - "queries are slow",
>
> - databases are not scalable,
>
> >> To be honest, the "extremely scalable" argument is starting to get
> >> to
> >> be completely wasted on me.
> >
> > I accept that you do not care about scalability, but then please
> > don't ask for answers to scalability questions like having 10,000
> > events in a single calendar.
> >
>
> Oh, but don't get me wrong, please. I *do* care about scalability. I'm
> just getting tired of hearing "No, not scalable" without the background
> of why it (the suggested idea or development) is not scalable, or not
> sufficiently scalable.
>
> > From my experience both scalability and security MUST be designed
> > into a solution right from the beginning. Adding both later is
> > extremely cumbersome and most often not really solvable in a
> > satisfactory manner.
> >
>
> I couldn't agree more.
>
> >> Every time it is used, it is used as the ultimate argument against
> >> something
> >
> > Most of the time it is used as a well-founded argument against
> > abusing traditional web technology. (E.g. large scalable web
> > solutions like Facebook, Google or Twitter moved away from
> > traditional relational databases long ago.)
> >
>
> Nobody's setting in stone it MUST be SQLite, nor arguing *SQL is the
> only option, it may very well be Cassandra, or whatever substitute does
> the job, or nothing at all. Nobody's threatening the holy model of
> "no-SQL storage". You're completely right when you say I haven't got
> the faintest idea what all the fuss is about.
>
> >> , but it misses merit in that the scalability parameter to a
> >> Kolab deployment is never removed nor reduced by any of the
> >> developments
> >> or ideas to move forward. While you may disagree with that, I have
> >> to
> >> conclude "no-SQL storage" is being confused and arbitrarily
> >> substituted
> >> with "caches, possibly in SQL".
> >
> > This is plain wrong. There seems to be a fundamental
> > misunderstanding.
> >
>
> There being a misunderstanding was exactly my point, I'm glad you
> agree.
>
> > I hope that I could provide some insight anyway. As I lack both
> > time and funding for actually working on Kolab 3, I hereby stop
> > contributing to this thread.
> >
>
> Well, I'm sorry about that.
>
> > Maybe sometime we can meet at some conference and have a beer
> > together, after meeting for about an hour in front of a blackboard.
> > I am confident that you would then understand better what this fuss
> > is all about.
> >
>
> Likewise.
>
> Kind regards,
>
> Jeroen van Meeuwen
>