Question: Individual annotations vs One large annotation (conceptual riddle for the interested)
Jeroen van Meeuwen (Kolab Systems)
vanmeeuwen at kolabsys.com
Thu Sep 15 16:52:33 CEST 2011
On 15.09.2011 13:05, Georg C. F. Greve wrote:
> On Thursday 15 September 2011 11.47:49 Jeroen van Meeuwen wrote:
>> The "cost" is not when /etc/imapd.annotations.conf needs to be
>> altered,
>> *if* the consumer has not edited said file. The *cost* is implied
>> when
>> the consumer has a copy of that file that is modified outside of
>> package
>> management - in which case proper packaging methods will not want to
>> alter the file's contents.
>
> True, although I guess we will never be at the point where we won't
> define
> *any* additional annotations, even the folder-config annotation would
> have to
> be defined.
>
> So there will always be SOME edits of the annotation file.
>
Yes, but pushing them out regularly just because there will always be
some edits to the annotations file anyway would be a flawed
justification to just ignore the cost as a downside of this option,
while we have other options - so I had to point it out.
>> - Documenting the ability to opt-out of features by removing the
>> annotation, and documenting opting in, including all combinations of
>> annotation keys and values, troubleshooting for and resolving issues
>> with clients that may or may not assume a certain set of annotations
>> to
>> (not) be available,
>
> True.
>
> Although the ability to opt-out or block a certain feature
> installation wide
> in a reliable way would be an argument for the annotation per use
> case,
> because an annotation that is not defined is simply not usable,
> whereas a
> key/value pair for folder-config can be set regardless of whether it
> is meant to be set or not.
>
Opting in and out of features such as saved searches or color though
has not been made a requirement or feature request as of yet. Should it
become a requirement or feature request, we can run in different circles
about its feasibility to implement either server-side or client-side and
the value of the -then to be explored- use-cases.
>> For the overwriting part, it is a relatively simple clause to, on a
>> single annotation, preserve the existing contents;
>
> This is not the case that I was concerned about.
>
> Think of the following:
>
> Client A reads folder-config
>
> Client B reads folder-config
>
> Client A sets 'search' to new value
>
> Client B sets 'color' to new value
>
> The modification of 'search' by Client A is now lost in a way
First of all, the same argument would apply to both clients operating
either the 'search' or the 'color' separate annotation -if in the same
namespace.
Second, a more relevant concern would apply to both clients operating
what will end up being mutually exclusive, separate annotations or
mutually exclusive options in the values of separate annotations. The
risk here is greater, because it takes longer to obtain all annotations
and parse them (2*n(n-1)/2 comparisons to detect conflicts), increasing
the interval in which another client could supposedly change the
original value of the annotation or any other annotation.
Third, the same would apply to client A 'deleting' a message client B
is 'flagging' or any other combination of such.
Fourth, while the one client is writing (to the mailbox path in the
annotations database), the other client will get a big fat NO response
-if it attempts to write at the very same moment- as the mailbox
annotation database would be locked for the submission of entry by the
one client. If the other client is stupid enough to use yesterday's
annotations, or to write without polling for updates after a "NO"
response, that is a problem with the client, notwithstanding
unsolicited METADATA responses as defined by the RFC in
section after a client issues an ENABLE command with the METADATA
capability keyword.
Fifth, 'search' is more likely to be a shared annotation (value)
whereas 'color' is most likely set once in the shared annotation (value)
for the default, but edited in the private annotation (value).
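A minimal sketch of the shared/private split described in the fifth point. It assumes the private value, when set, takes precedence over the shared default; that precedence rule, the function name and the sample values are illustrative assumptions, not taken from any specification.

```python
def effective_value(shared, private, key):
    """Return the private value when one is set, otherwise the shared default.

    Assumption: private overrides shared. The thread does not spell this out.
    """
    if key in private:
        return private[key]
    return shared.get(key)


# Hypothetical values: 'search' lives in the shared annotation only,
# 'color' has a shared default that one user overrides privately.
shared = {"search": "from:boss", "color": "blue"}
private = {"color": "red"}

color = effective_value(shared, private, "color")
search = effective_value(shared, private, "search")
```

With these sample values the user sees their own 'color' while still picking up the shared 'search'.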
> that would be
> fairly hard for the support department to track and resolve,
That depends on logging (verbosity) capabilities more so than on using
separate annotations or one annotation.
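To make the quoted read / read / write / write sequence concrete, here is a minimal Python sketch of the lost update. It uses an in-memory dictionary as a stand-in for the server, and assumes the single annotation is called '/vendor/kolab/folder-config' and holds a JSON object; both the path and the encoding are assumptions for illustration only.

```python
import json

# Hypothetical in-memory stand-in for the server-side annotation store.
PATH = "/vendor/kolab/folder-config"  # assumed annotation name
store = {PATH: json.dumps({"search": "old", "color": "blue"})}


def read_config():
    """Fetch and decode the whole annotation value."""
    return json.loads(store[PATH])


def write_config(config):
    """Encode and overwrite the whole annotation value."""
    store[PATH] = json.dumps(config)


# Both clients read before either one writes.
client_a = read_config()
client_b = read_config()

client_a["search"] = "new-query"
write_config(client_a)

client_b["color"] = "red"
write_config(client_b)  # writes B's stale copy, dropping A's change

final = read_config()
```

After both writes, 'color' carries Client B's value while 'search' is back at its old value, which is the loss described in the quoted scenario. Nothing in the sketch is specific to a large annotation: two clients writing the same separate annotation from stale reads would lose data the same way.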
>> - For each annotation, the shared as well as the private values
>
> ...if the annotation is defined private and shared for this use case.
>
No, for each annotation, the shared as well as the private - if only
the shared or private has been defined in the specification, it needs to
be cleaned up. You're right if your point was that only *conflicts* need
to be searched for in those shared and/or private annotations defined in
the specification. That makes it an x*y(y-1)/2 mesh then, where x is one
or two and y is the number of annotations defined.
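As a quick worked example of the x*y(y-1)/2 figure, the following Python sketch counts the pairwise checks for a few values of y. The function name is made up for illustration, and reading the formula as "every annotation compared against every other, once per scope" is an interpretation of the mesh described above.

```python
from itertools import combinations


def conflict_checks(y, x=2):
    """Pairwise comparisons among y annotations, repeated for x scopes.

    x is 1 (only shared or only private) or 2 (both shared and private).
    """
    if x not in (1, 2):
        raise ValueError("x must be 1 or 2")
    return x * y * (y - 1) // 2


# Cross-check the closed form against an explicit enumeration of pairs.
pairs_for_five = len(list(combinations(range(5), 2)))

for y in (2, 5, 10):
    print(y, conflict_checks(y, x=1), conflict_checks(y, x=2))
```

For y = 10 annotations this already gives 45 comparisons per scope, or 90 when both shared and private values are in play.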
>> - With one annotation any potential conflict can be detected both
>> when merely 'visiting' the folder as well as when attempting to
>> 'alter'
>> the folder, whereas with multiple annotations the retrieving of all
>> annotations and values and resolving said conflicts is mandatory,
>
> Actually you only need to retrieve the annotations that pertain to
> whatever it
> is you're planning to do, e.g. a change of color can ignore search.
>
Wrong, as there may be annotations a client is unaware of, that may be
conflicting with annotations the client is planning on setting. Such can
be easily circumvented by having the client poll for configuration of
the folder in one location, where 'type_*' style keywords with '*'
representing a certain capability can simply be found.
> Only with the large annotations you always must read everything.
>
Wrong, the client *retrieves* the full annotation value but only needs
to read;
- the top-level keys (iterate those) to find;
  - keys it understands,
  - keys it doesn't understand ('type_*' style keys?), for which it
    can derive whether or not such keys may be conflicting (naming
    convention).
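The key iteration outlined above could look roughly like this in Python. It is only a sketch: it assumes the annotation value is a JSON object, that the client knows the keys 'search' and 'color', and that a 'type_' prefix is the naming convention for capabilities that may conflict. None of those details have been specified.

```python
import json

# Keys this hypothetical client understands.
KNOWN_KEYS = {"search", "color"}


def classify_keys(raw_value, known=KNOWN_KEYS):
    """Split top-level keys into understood, unknown, and possibly conflicting."""
    config = json.loads(raw_value)
    understood = sorted(k for k in config if k in known)
    unknown = sorted(k for k in config if k not in known)
    # Assumed naming convention: unknown 'type_*' keys announce a
    # capability the client cannot handle and should treat as a conflict.
    flagged = [k for k in unknown if k.startswith("type_")]
    return understood, unknown, flagged


raw = json.dumps({"color": "red", "type_search": {"query": "example"}, "favourite": True})
understood, unknown, flagged = classify_keys(raw)
```

The client never has to interpret the values behind keys it does not understand; the key names alone tell it whether to proceed or back off.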
>> If a client not compatible with 'search' specifically were to
>> be
>> able to detect (potential for) conflict, it;
>>
>> 1) would not know to retrieve the '/vendor/kolab/search'
>> annotation, but
>> - it would also not know what /vendor/kolab/folder-type
>> 'search' was for, and
>> - any potentially pre-populated search data is completely
>> wasted on said client.
>
> Yes, although this is equally true for the large annotation, as this
> is about
> the new folder-type idea in KEP 15, and not the question whether or
> not to go
> with one annotation per use case or one large annotation.
>
Wrong, this is not equally true for the "large" annotation, as with the
large annotation (the client now aware of where to find folder specific
configuration) the folder-type is freed up and *can* be used for the
original object type, with pre-population allowing said client to still
use the saved search - note, *can* be used for the original object type,
not *must* be used, perhaps it does have a different value, such as
'mixed' for example.
> If a client does not know about KEP 15, the 'search' annotation and
> the
> 'search' value in folder-config are both equally lost on the client,
> there are
> no obvious advantages or disadvantages to either.
>
> As to the new folder-type, if the client does not know about KEP 15,
> it also
> does not know that it *MUST NEVER* change objects in a prepopulated
> folder, so
> it would happily allow the user to do this, enabling diverging
> datasets,
> inconsistencies, and lost data.
>
Changing (copies of) objects (only applicable if the folder is
pre-populated) in saved search folders by clients not compatible with
KEP #15 is a mixture of an implementation detail and a saved search
folder permission problem. It is resolved by not allowing the user to
edit the contents of said folder at all, *if* and *only if* such a
folder is to be pre-populated at all. It is being worked around in a
fashion that makes any folder implementing any of these new features be
ignored by the client - which it never does completely?
>> There is a locking mechanism in place for folder annotations,
>> similar
>> to the locking mechanism on IMAP folders, contents and metadata such
>> as
>> flags.
>
> How exactly does it work?
>
http://git.cyrusimap.org/cyrus-imapd/tree/imap/imapd.c#n8019
http://git.cyrusimap.org/cyrus-imapd/tree/imap/imapd.c#n8214
> I guess we then would need to specify that a client would always have
> to do
>
> #1: Lock
> #2: Re-Read
> #3: Modify
> #4: Write
> #5: Unlock
>
> to safely modify a folder-config annotation.
come on...; "to safely modify *any* annotation", or better yet, "to
safely execute *any* IMAP operation".
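The five quoted steps can be sketched in Python as follows. This is not how Cyrus IMAP implements its locking (see the links above for that); it is an in-memory illustration in which a hypothetical AnnotationStore class holds the lock, refuses a concurrent writer with "NO", and re-reads the value inside the lock before modifying it. The JSON encoding is likewise an assumption.

```python
import json
import threading


class AnnotationStore:
    """Hypothetical in-memory stand-in for a mailbox annotation database."""

    def __init__(self, initial):
        self._value = json.dumps(initial)
        self._lock = threading.Lock()

    def read(self):
        return json.loads(self._value)

    def update(self, key, new_value):
        # 1: lock; a concurrent writer is refused instead of blocked
        if not self._lock.acquire(blocking=False):
            return "NO"
        try:
            config = json.loads(self._value)   # 2: re-read current value
            config[key] = new_value            # 3: modify one key only
            self._value = json.dumps(config)   # 4: write
            return "OK"
        finally:
            self._lock.release()               # 5: unlock


store = AnnotationStore({"search": "old", "color": "blue"})
first = store.update("search", "new-query")   # Client A
second = store.update("color", "red")         # Client B
result = store.read()
```

Because each update re-reads inside the lock, Client B's change to 'color' no longer discards Client A's change to 'search'. A client that receives "NO" would need to fetch the current value again before retrying.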
> The additional read is a bit of
> network overhead & delay, but probably not prohibitive in most
> scenarios.
>
If this were so much of a concern, I suspect we would have seen a lot
more issues logged describing exactly this problem, both with Kolab as
well as Cyrus IMAP upstream. In fact, one could consider it an IMAP
design flaw.
>> How large an annotation is exactly depends on a variety of factors
>> including but not limited to the complexity and brevity of a query
>> language for search, which is yet to be explored / defined.
>
> It depends on that and on what else will then go into this annotation
> in the
> future provided we define this as the canonical way. So I see
> potential for
> this annotation to grow beyond 10k easily.
>
Well, off the top of my head;
- Identity configuration (reply/respond with 'sales at kolabsys.com'
identity as opposed to 'greve at kolabsys.com', ...)
- Favourite folder (boolean),
- local subscription (per application, do we dare do this?),
- alarm / reminder configuration,
- z-push / active-sync,
- horde,
- ... (other clients)
and whatever else we can come up with that has a purpose / use-case
valuable enough to pursue. Adding annotations for each of these 1) on
the server, 2) in every client and 3) in the documentation is a more
difficult process than if we were to outline a key-value pair in an
existing annotation.
>> That said, however, all annotations need to be retrieved regardless,
>> for both private and shared.
>
> True.
>
> But only those you actually need at the time, not all of them all the
> time.
>
Again, firstly, with many annotations retrieving any annotations is
subject to the client's understanding of which annotations are available
and the server's understanding of what are valid annotations.
Secondly, with many annotations, in order to be able to determine
whether or not there is a potential conflict, and in order to be able to
determine which takes precedence when content is retrieved for display,
all annotations will need to be retrieved.
Now, so far I've only heard the following arguments against a single
folder-config annotation;
- simultaneous editing - exists for every single operation in IMAP,
- no ability to opt-out server-side - not a requirement / feature
request,
And one against *potentially* preserving the original folder-type
value;
- search results potentially having mixed object types as results - but
'folder-type' MAY still have another annotation value such as 'mixed' or
'random', so that clients that do not know how to display the contents
of folders with multiple object types in it MUST / SHOULD / MAY ignore
the folder entirely.
Am I missing something?
--
Kind regards,
Jeroen van Meeuwen
--
Senior Engineer, Kolab Systems AG
e: vanmeeuwen at kolabsys.com
t: +44 144 340 9500
m: +44 74 2516 3817
w: http://www.kolabsys.com
pgp: 9342 BF08