Getting rid of (most) annotations? (Is: Question: Individual annotations vs One large annotation)

Fri Sep 30 22:38:36 CEST 2011

Hi Georg,

On Wednesday 28 September 2011 18:23:32 Georg C. F. Greve wrote:
> There are use cases where XML objects are clearly the better approach, there
> are also cases where they are vastly inferior.

I have yet to encounter such a situation. Let me split out this discussion, 
because I believe getting rid of (almost) all usage of annotations in the 
Kolab Concept is the future. I'll answer the question about the one large 
annotation, which assumes that we keep annotations, in a separate email.

> Typical examples are:
> 
>  * Anything for email folders:
> 
>  This will result in a mix of object types within email folders that will
>  raise the complexity for clients displaying the folders as they can no
>  longer rely upon the folder type, and non-Kolab clients will more easily
>  break the configuration.

Yes, I would not save the configuration for an email folder within that email 
folder. So it has to be saved in a configuration folder: probably in a
mandatory default configuration folder if there is no other, or in a subfolder.
The Kolab client knows it is there and where to find it.
Email clients would not touch those folders, so there is no risk of breakage.

> The alternative results in the next point, and a good chance for data
> rot as now the meta data for one particular folder is no longer stored
> in that folder.

I consider this a low risk, as email clients have long been saving
configuration outside of the folder. As long as the default place is there
and the client knows where to look, there is no added risk of data rot.

>  * More complexity
> 
> Where a per folder setting can easily be stored in a shared/private
> namespace, and conflict resolution between different settings is easy,
> doing the same through XML is more complex.

There is a hierarchy defined in KEP9, so a setting specifically for one
folder does not have a conflict. Conflict resolution for concurrent writes
can be handled like all other conflict resolution: the next interactive client
preferably asks the user, or otherwise decides itself.
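
To illustrate why a folder-specific setting does not conflict with a more
general one, here is a minimal sketch in Python. It assumes only two levels
(folder-specific and account default); the actual hierarchy is the one
defined in KEP9, and every name below is hypothetical.

```python
# Hypothetical two-level lookup; the real levels are defined in KEP9.
def resolve_setting(key, folder, folder_settings, default_settings):
    """Return the most specific value for `key`.

    folder_settings:  dict mapping folder name -> dict of settings
    default_settings: dict of account-wide settings
    """
    specific = folder_settings.get(folder, {})
    if key in specific:
        # A folder-specific value always wins, so there is nothing to merge.
        return specific[key]
    # Fall back to the account-wide default (None if it is not set at all).
    return default_settings.get(key)


defaults = {"colour": "grey", "sort": "date"}
per_folder = {"INBOX/Projects": {"colour": "green"}}

projects_colour = resolve_setting("colour", "INBOX/Projects", per_folder, defaults)
projects_sort = resolve_setting("sort", "INBOX/Projects", per_folder, defaults)
```

Here the colour comes from the folder-specific entry and the sort order from
the default, without any conflict resolution being needed.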

>  * Vastly inferior performance
> 
> Clients always need to parse everything to retrieve a single setting,
> likely across all folders, parsing each object, including filtering
>  through hundreds of thousands of email messages to make sure
> no configuration object is stored among them.

A subfolder or the default configuration folder would be easy to parse.
Quite likely a client would do this close to startup for this account.
Because the preferred and recommended mode of operation is a fully cached
client, it just needs a local table of all relevant configuration settings
for a folder and then only needs to check for changes in this area. I believe
this will be fast enough for configuration settings.
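
As a rough sketch of such a local table, again in Python: the class and
field names are hypothetical, and it assumes that each configuration object
carries some identifier plus a revision marker that changes whenever the
object is rewritten.

```python
class ConfigCache:
    """Local table of configuration settings, refreshed incrementally."""

    def __init__(self):
        self._table = {}  # (folder, key) -> value
        self._seen = {}   # object id -> revision last parsed

    def sync(self, objects):
        """Update the table from (obj_id, revision, folder, key, value) tuples.

        Objects whose revision is unchanged are skipped.
        Returns the number of objects that had to be re-read.
        """
        reread = 0
        for obj_id, revision, folder, key, value in objects:
            if self._seen.get(obj_id) == revision:
                continue  # unchanged since the last sync
            self._seen[obj_id] = revision
            self._table[(folder, key)] = value
            reread += 1
        return reread

    def get(self, folder, key, default=None):
        """Look up a setting without touching the server."""
        return self._table.get((folder, key), default)


cache = ConfigCache()
first = cache.sync([("obj-1", 1, "Calendar", "colour", "blue")])
second = cache.sync([("obj-1", 1, "Calendar", "colour", "blue")])
third = cache.sync([("obj-1", 2, "Calendar", "colour", "red")])
```

Only the first and third sync have to parse anything; the second finds the
object unchanged. Lookups through get() are served from the local table.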

> This is in the area of infinitely worse performance in comparison to
> annotations, which are a well understood and established mechanism
> by now and supported by virtually all the IMAP servers we care about.

The main point is that in a setting where a user accesses multiple email 
servers at once from one client, some of them will not be Kolab Servers.
The chance is very high of hitting IMAP servers that do not support
annotations the way Kolab Server does. Getting rid of annotations makes all
these servers accessible as additional Kolab storage and would thus 
make Kolab much more attractive. Still, most users would want Kolab Servers,
as they offer additional services like freebusy list generation or booking of
resources.

> That said: There are clearly cases where XML objects are preferable.
> 
> In my view, the dividing line runs between transient and permanent
> information. Permanent information is everything that is potentially stored
> forever, so email, groupware objects, and user specific data, such as a
> personal dictionary, even colour sets for commonly used categories across
> all object types and the entire data tree.
> 
> Transient information is pure configuration data that loses its value with
> folders or objects disappearing, so often folder or object specific settings
> and meta data, e.g. folder colours, object specific caching or searching
> information, even the folder type.

In my view all parameters which are object specific should be within the Kolab 
object itself, e.g. in the XML. I don't think the difference between permanent 
information and transient information for one folder, as you describe it, is
that big.

> Another deciding bit can be the need to search for information quickly on
> the IMAP server without a full cache of everything, but this would have to
> be a very strong case to break the logic above, IMHO.

Fully caching all configuration folders and objects seems to be doable, as this
is not a lot of information and memory is comparatively cheap. As the default
locations are known, the places to look in are limited. As mentioned above,
the client could even cache the values in its own data structure and only look 
for changes since the last sync.

Best Regards,
Bernhard

-- 
Sent with http://userbase.kde.org/Kontact_Touch from an ideapad running MeeGo