Request for Input: Storing Searches
Jeroen van Meeuwen (Kolab Systems)
vanmeeuwen at kolabsys.com
Wed Aug 31 14:13:24 CEST 2011
Christian Mollekopf wrote:
> > > - The action would then pick the results from the search which are in
> > > this resource, and tag them via ANNOTATE and create an xml object with
> > > the search info.
> > Where would this XML object be put?
> I'd imagine that we put those objects in the rootfolder or in a "Saved
> Searches" folder. In the "Saved Searches" folder we could then also create
> the optional server-side populated search directories.
There's no such thing as a 'root folder'.
If a 'Saved Searches' folder were to be used, all 'saved search' Kolab XML
objects would go into that one folder.
Sharing any particular saved search now becomes a problem,
Clients not compatible with KEP #9 are now helpless, since a top-level folder
of unknown type is encountered, but they have to descend in order to get to
The saved searches folder(s) cannot be pre-populated unless the Kolab XML
object for saved searches also states where any pre-populating should go out
to, which naturally is subject to too much change,
Keeping 'reference' objects in a 'saved search' folder creates the same
'subject to change' problem... if a reference object were to say,
user/john.doe/Contacts at example.org?uid=blabla, renaming the Contacts folder to
Kontacten would create a reference issue; any reference should be completely
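To make the rename problem concrete, here is a minimal sketch. It assumes a hypothetical in-memory model of folders and two hypothetical helper functions; none of this is Kolab code or part of any KEP. It compares a reference that embeds the folder path with one that carries only the UID:

```python
# Hypothetical in-memory model: folder path -> {object uid -> object}.
folders = {
    "user/john.doe/Contacts": {"blabla": {"name": "Jane Roe"}},
}


def resolve_by_path(folders, path, uid):
    """Resolve a reference that embeds the folder path; breaks on rename."""
    return folders.get(path, {}).get(uid)


def resolve_by_uid(folders, uid):
    """Resolve a reference that carries only the UID; survives renames."""
    for objects in folders.values():
        if uid in objects:
            return objects[uid]
    return None


def rename_folder(folders, old, new):
    """Move all objects from the old folder path to the new one."""
    folders[new] = folders.pop(old)


before = resolve_by_path(folders, "user/john.doe/Contacts", "blabla")
rename_folder(folders, "user/john.doe/Contacts", "user/john.doe/Kontacten")
after_path = resolve_by_path(folders, "user/john.doe/Contacts", "blabla")
after_uid = resolve_by_uid(folders, "blabla")
```

After the rename, the path-based lookup finds nothing while the UID-only lookup still resolves. Note that the UID-only lookup has to scan every folder, which is its own cost.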
> > > - The resource populates the virtual folder with virtual items based on
> > > the tags.
> > >
> > > => - No data duplication
> > This has always been "optional"; a saved search folder *could* be pre-
> > populated.
> > Imagine a saved search across 5 contact folders with 10.000 contacts on
> > average.
> > When NOT pre-populating the saved search folder with the search results,
> > you pay the cost every time the folder is opened. Maybe this cost is not
> > so great for a fat client with a local cache to query
> > (Kontact/akonadi/nepomuk/Disconnected IMAP), but for a web-interface...
> > well...
> Yes, since akonadi can create virtual folders it only has to populate them
> once AFAIK, and then result is then cached.
> For a webclient I guess you're right. But if we have the dataduplication
Again, the data duplication is *at one's option*. For web-interface, I say one
should pre-populate the search folder. For clients like Kontact (with client-
side, local caches), perhaps it's feasible to allow them to ignore the cached
results and go with a real-time search.
> and it should also be writable it looks somewhat error prone to me.
A clause in the KEP for these types of folders can be, that the content of the
folder SHOULD NOT be made writeable for any event other than 'update saved
> Especially if we have to implement that for every client.
We have to implement everything and anything for every client, in case you
> I reckon using akonadi as a cache for the webinterface would solve that
No, the caching layer is moot unless you also assume that all clients use
akonadi. Having akonadi as the one cache (in one location) to rule them all
is not necessarily the best way to go with this. This, however, is a
different topic,
and we should talk about caching separately from the saved searches topic.
> > When pre-populating the saved search folder with the search results, you
> > pay the cost in "duplicate" storage (as explained, not on the server
> > side, perhaps on the client side if it's not intelligent enough to
> > de-duplicate).
> Well, the argument for doing it server side would be that it is available
> on any client (i.e. smartphone), but then read-only is the only option i
> see. If it is on the client side, this ends up to be essentially the same
> as the akonadi virtual folders.
Note that the "problem" or "difficulty" is not the editing of an object from
within the saved search, not for a client and not for a user.
It is the occurrence of said object two or more times in or across all
readable folders that is the first problem.
> > Another penalty in pre-populating the saved search folder could,
> > arguably, be that perhaps there's results in said saved search folder
> > that the person using the folder would otherwise not have access to.
> > However, this can also be considered a feature; "Share all contacts from
> > Vendors folder tagged with 'ict' with helpdesk personnel"
> If it is being populated on the client side I don't see how you could get
> access to items you shouldn't have access to.
You want to keep your 350.000 clients from each having to iterate and
re-iterate against most of your infrastructure components themselves, just to
pre-populate and update their saved (contact) search cache(s), if you can do
so periodically on the server side, under your own control.
Please note that the folders of type 'contact' that the user has access to are
not the only things that are subject to change. Please also note that these
folders and other resources can be *huge*.
Saved searches are often not the smallest set of search results. They often
include a type of query that is a little less specific than
(mail=john.doe at example.org), and inherently can include a lot of attributes
that are not (cannot be?) indexed anywhere.
As such, saved searches are *hugely* expensive to execute.
When they happen on the client side, and on one particular client only,
configurable per client, then we're all fine. Kontact for instance can do
saved searches in a reasonable fashion because of Akonadi. A web interface
such as Horde or Roundcube however cannot. Please note a saved search for
these interfaces will have to:
1) execute within the PHP execution timeout (30 seconds),
2) stay within the memory_limit (64/128/192 MB),
3) not use the user's credentials / privileges to query resources other
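The first constraint can be illustrated with a minimal sketch of a scan that gives up once a time budget is spent. It is written in Python for illustration only (the web interfaces in question are PHP), and the function name, its parameters and the injectable clock are assumptions, not an existing API. The memory limit and the credentials question are not covered here.

```python
import time


def budgeted_search(items, predicate, budget_seconds, clock=time.monotonic):
    """Scan items until done or until the time budget is spent.

    Returns (matches, complete); complete is False if the scan was cut short.
    """
    deadline = clock() + budget_seconds
    matches = []
    for item in items:
        # Check the budget before touching each item.
        if clock() >= deadline:
            return matches, False
        if predicate(item):
            matches.append(item)
    return matches, True
```

A caller that gets `complete == False` back has only partial results, which is the situation a server-side, periodically refreshed result set would avoid.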
Furthermore, interesting saved searches discussions can be held over
particular types of Kolab clients such as Z-Push (and its clients). Not all
Kolab-consuming applications have a need for, an interface for, or a place for
saved searches or server-side caching of said saved searches.
Similarly, not all clients have a need for, or place for, a server-side
cache; Disconnected IMAP in Kontact is one such client.
That said, a folder clearly configured as a 'saved_search' folder can be
ignored (Z-Push?) or the contents thereof can be ignored (Kontact?), while the
contents may be the periodically updated search results, or the folder be
empty. Both cases serve all clients well, KEP #9 compatible or not.
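A minimal sketch of how different clients might treat such a folder follows. The policy names, the client keys and the `plan` function are hypothetical, and the choices for Z-Push and Kontact are only the guesses marked with "?" above, not confirmed behaviour.

```python
from enum import Enum


class Policy(Enum):
    SKIP_FOLDER = 1       # do not list the folder at all
    IGNORE_CONTENTS = 2   # list it, but search a local cache instead
    USE_CONTENTS = 3      # list it and show whatever the server put there


# Hypothetical per-client choices, for illustration only.
CLIENT_POLICY = {
    "z-push": Policy.SKIP_FOLDER,
    "kontact": Policy.IGNORE_CONTENTS,
    "webmail": Policy.USE_CONTENTS,
}


def plan(client, folder_type, server_items):
    """Return (show_folder, items_to_show, search_locally) for one folder."""
    if folder_type != "saved_search":
        return True, list(server_items), False
    # Unknown clients simply show what is there, populated or empty.
    policy = CLIENT_POLICY.get(client, Policy.USE_CONTENTS)
    if policy is Policy.SKIP_FOLDER:
        return False, [], False
    if policy is Policy.IGNORE_CONTENTS:
        return True, [], True
    return True, list(server_items), False
```

Whether the folder is pre-populated or empty, each client has a defined way to handle it, including a client that knows nothing about the folder type.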
Jeroen van Meeuwen
Senior Engineer, Kolab Systems AG
e: vanmeeuwen at kolabsys.com
t: +44 144 340 9500
m: +44 74 2516 3817
pgp: 9342 BF08