Event UID in the Subject?

Wed Jul 14 17:01:29 CEST 2004

On Wednesday 14 July 2004 13:16, Stuart K. Bingë wrote:
> On Wednesday, 14 July 2004 12:41, Bernhard Reiter wrote:
> > I think I have described this in detail in my email from yesterday 18:37
> > Message-Id: <200407131837.47538.bernhard at intevation.de>.
> > Did you see that message?
> >
> > > Yes that is essentially the main algorithm. Please, I'd love to hear
> > > your counter-arguments - you seem very averse to having the UID in the
> > > Subject, so I assume you have some valid arguments against the scheme.
> >
> > In short: Trying the imap uid first will save the search in most cases
> > (my estimate: 98%). You only need to save one additional value per
> > event to use that improvement.
>
> A couple of points:
>
> Firstly, please could you provide some benchmarks or other quantifiable
> data that identifies SEARCH operations on Cyrus to be considered "slow". I
> am under the impression that Cyrus caches the headers of messages, meaning
> searching would be quite quick.

Martin told me that Cyrus really does not have an optimised implementation here.
(Because they believe they are not a database.)
Maybe he can contribute some numbers.

Any operation on the server done by many clients is significant.
Our design should consider 100,000 or more users.

> Secondly, while this whole UID caching mechanism that you and Martin have
> put forward does seem like a good idea, I really do not consider it
> practical.

That is why I asked the second question about the required effort;
I do not know much about Horde's design.

> You see, it's not just "one additional value" that I would have to store -
> I can't add arbitrary values to the event hashes (well I can, but I can't
> retrieve them later), meaning I would have to implement an EUID -> IUID
> mapping (EUID = Event UID, IUID = Imap UID) within the PHP session. As I've
> said previously - all I get is a "retrieve the object with UID X" call.

This is about Horde's design.
In other circumstances I would have said: just add this as a variable to the links
you create for each event, and store it in the page on the webclient that way.

> The UIDs that are used in Horde are at a minimum 32 characters long
> (MD5sums), however it is often the case that a UID is 64 or possibly more
> characters long, as we also use the UID to resolve which share (message
> folder) the object (mail) is actually stored in (*).

I guess I also do not know how the EUIDs get calculated initially.
My understanding was that other clients might also come up with them
and Horde needs to cope with them if it reads them in an email.
Do you also need a mapping for those?

> And the problem is, this mapping would have to cater for every message I
> read, within every message folder I touch, for every user of the webclient,
> for a reasonable time period as I do not know when the next page load
> (message request) will come through, as again, I've said before. 

There is a limit to how long information on screen can be valid.
I'd say 30 minutes should be enough. People usually understand this.
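
A minimal sketch of such a time-limited mapping, in Python for brevity rather
than Horde's PHP. The class name EuidCache and its methods are hypothetical,
chosen for illustration only; the 30 minute default is the value suggested
above.

```python
import time


class EuidCache:
    """Hypothetical EUID -> IUID map whose entries expire after ttl seconds."""

    def __init__(self, ttl_seconds=30 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be exercised in tests
        self._entries = {}      # euid -> (iuid, time the entry was stored)

    def put(self, euid, iuid):
        self._entries[euid] = (iuid, self.clock())

    def get(self, euid):
        entry = self._entries.get(euid)
        if entry is None:
            return None
        iuid, stored_at = entry
        if self.clock() - stored_at > self.ttl:
            # The page this value was shown on is too old to trust.
            del self._entries[euid]
            return None
        return iuid
```

A miss (None) simply means the caller has to locate the message some other
way, so an expired entry costs one slower lookup and nothing more.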

> Oh yes,
> and then there's the UIDValidity problem as well.

When the IUID is outdated, then you need to check for new messages
anyway.
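
The "try the IMAP UID first, search only on a miss" idea, including the
outdated-IUID case, could be sketched roughly as follows. This is Python
against an in-memory stand-in, not a real IMAP client: FakeFolder,
find_event and the (uidvalidity, iuid) hint are all assumptions made for
illustration, and the event UID is assumed to be the whole Subject.

```python
class FakeFolder:
    """In-memory stand-in for one IMAP folder (not a real IMAP client)."""

    def __init__(self, uidvalidity, messages):
        self.uidvalidity = uidvalidity
        self.messages = dict(messages)  # iuid -> Subject (assumed to be the EUID)
        self.searches = 0               # counts how often the slow path ran

    def fetch_subject(self, iuid):
        return self.messages.get(iuid)

    def search_subject(self, euid):
        self.searches += 1
        for iuid, subject in self.messages.items():
            if subject == euid:
                return iuid
        return None


def find_event(folder, euid, hint):
    """Return (iuid, new_hint); hint is (uidvalidity, iuid) or None."""
    if hint is not None:
        validity, iuid = hint
        # The hint is usable only if UIDVALIDITY is unchanged and the
        # message at that IUID still carries the expected event UID.
        if validity == folder.uidvalidity and folder.fetch_subject(iuid) == euid:
            return iuid, hint
    iuid = folder.search_subject(euid)  # slow path: search the folder
    if iuid is None:
        return None, None
    return iuid, (folder.uidvalidity, iuid)
```

With a valid hint no search is issued at all; a changed UIDVALIDITY or a
mismatching message falls back to the search and yields a fresh hint.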

> Now *this* "puts a lot of load on the server and limits scalability", as
> this would be done through PHP scripting in the session cache. Not a very
> practical/desirable solution.

Could be another misunderstanding on my part.
I thought that it would be possible to let PHP for the webclient run
on a different machine and only point to the IMAP server.
Best would be if it could just use disconnected IMAP like
any other client, but online IMAP also works.
This makes the webclient a client process running on a server machine,
but not on _the_ Kolab server machine.

> I'm not sure if you've used the webclient yet, but without a PHP
> accelerator it's really not that great an experience to try. Horde is
> already a massive system without this gargantuan caching mechanism that has
> been proposed. That is why I would like to offload a lot of this probable
> "caching" performance hit to the IMAP server (which translates to a little
> hit to Cyrus, à la the SEARCH), and let Horde concentrate on providing
> web-based groupware functionality, as opposed to mirroring the data that
> Cyrus already holds.

Mirroring with on-demand syncs is a lot more scalable if a different
machine can be used.

> (*) This is done to cater for iTip requests  - when we send out an
> event/task request, etc, and receive a reply, we are given the UID of the
> corresponding object and a major optimisation for this is to know exactly
> where to find the message that contains the object, purely based on the UID
> (i.e. the UID also identifies the share that contains the object). If this
> were not the case we would have to search through *all* messages in *all*
> the users' (and possibly shared) folders to try and find the corresponding
> groupware object to update it as necessary, as we would have no idea where
> it originated from.

Hmmm, so you cannot deal with a reply for a UID that you did not create yourself?
Also, if the event were moved from one calendar folder to another,
would you lose this connection?

Both problems sound like the Horde code needs to set up caches on disk
to get faster, which would also solve many of those problems.
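
As a rough illustration of such an on-disk cache, here is a sketch in Python
using SQLite. The table layout and the function names open_index, record and
locate are hypothetical and not part of Horde or Kolab. The index maps any
event UID, including one created by another client, to the folder and IMAP
UID where it was last seen.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index; pass a file path to keep it on disk."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS event_index ("
        " euid TEXT PRIMARY KEY,"
        " folder TEXT NOT NULL,"
        " iuid INTEGER NOT NULL)"
    )
    return db


def record(db, euid, folder, iuid):
    # INSERT OR REPLACE: when an event moves to another folder, the new
    # location overwrites the old one instead of leaving a stale row.
    db.execute(
        "INSERT OR REPLACE INTO event_index VALUES (?, ?, ?)",
        (euid, folder, iuid),
    )
    db.commit()


def locate(db, euid):
    """Return (folder, iuid) for a known event UID, or None."""
    return db.execute(
        "SELECT folder, iuid FROM event_index WHERE euid = ?", (euid,)
    ).fetchone()
```

Because the folder is stored next to the UID rather than encoded in it, a
foreign UID or a moved event only needs its row updated when the message is
next seen.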

> Hopefully this little diatribe reveals a bit more of the problem I am faced
> with, and why I think that a UID header is A Good Thing.

Yes it does and I really appreciate that you took the time to write it up.
Like I said before, I am seeking to further understanding here.
We can only improve Kolab together if we understand each other.

> I really don't mind if it's not implemented in the standard though - all
> that that means is I will revert to my old mechanism of manually putting in
> the UID headers for messages that don't already have them present. The only
> reason I asked to put this in the standard is to enable me to optimise my
> code a little, and thereby increasing Horde's performance, by taking out
> the "if (!present(UIDheader)) { insertUIDheader(); rewriteMessage(); }"
> code.
>
> Either way, I will be using the UID within the headers. I will also
> guarantee to preserve the headers, for any other clients that wish to add
> any additional headers themselves.

I agree with Martin that using those headers leans towards a design
that looks simple and might lead to many requests to Cyrus which
could be avoided with a better design. We certainly should fix the design
problems, but we cannot do this quickly.

So for this reason, and because it cannot hurt,
I think we can have that UID in the subject.
However, the problem of handling non-Horde-created EUIDs
must be solved somehow.