A summary of sorts (was: Re: Why and when storing local time? (Re: Basic rationale of the KEP #2 design))
Florian v. Samson
florian.samson at bsi.bund.de
Mon Apr 4 10:50:22 CEST 2011
Georg,
On Thursday, 31 March 2011 at 18:16:19, Georg C. F. Greve wrote:
>
> On Thursday 31 March 2011 15.49:08 Florian v. Samson wrote:
> > Please provide a pointer to the DST switching dates of UTC:
>
> I never said that UTC switches.
1. Semantically UTCWND (="UTC with no DST") implies that regular UTC has
DST, which it does not: <http://en.wikipedia.org/wiki/UTC#Daylight_saving>
2. In your example below the opposite is true: UTC has no DST (correct), but
your "UTCWND" has a (supposedly arbitrary, in your example the German) DST
offset added to UTC (in "summer"), so it is actually rather "UTCWSD" (="UTC
with some DST").
> What I said is that local time is expressed in variable UTC offsets.
O.K., this is true for sure (DST comes and goes, TZs become redefined etc.),
but you never wrote it that way before, IIRC. Having read this email
completely, I now understand that this was just confusion in terms (no
matter on whose side).
> > Until then I stick to believe UTC == "UTCWND"
>
> For UTC == UTCWND to hold true, the conversion to either one from local
> time and back would have to be identical.
3. Hence it is not to be called "UTC*" anymore, as it loses the property of
being a *universal* time everywhere around the globe; it seems to have a
specific, local DST-delta added.
4. I still fail to see what the definition of "UTCWND"/"UTCWSD" shows,
proves or adds, except confusion.
First you define your "UTCWND" (or rather "UTCWSD") to have certain
properties, then you show that your "UTCWND" really has these specific
properties: IMHO this is a circular conclusion, leading to nothing.
Or am I missing something?
> But UTCWND was specifically
> defined to not behave identically, as it is supposed to behave as if DST
> did not exist.
O.K., *you* defined your "UTCWND" the way *you* want to.
But I still fail to understand what that strange mishmash of UTC with a
DST-dependent local offset helps to explain: IMHO nothing.
> That is what I tried to explain in
> http://kolab.org/pipermail/kolab-format/2011-March/001287.html
I sure read it (multiple times, actually), but copying the same stuff again,
without providing any new explanation what this is about or intended to
show, cannot provide new insights.
> Copying from there: Considering that 10:00 in Europe/Berlin
>
> * in the winter translates to
> 09:00 UTC
> 09:00 UTCWND
>
> * in the summer translates to
> 08:00 UTC
> 09:00 UTCWND
>
> So any local time with DST regime maps to different times for UTC and
> UTCWND, which means they are *not* identical, nor were they defined to
> be.
But it does not show / prove / explain anything, IMO. It is just some
arbitrary definition of an artificial "Georg's mishmash time".
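
The mapping quoted above can be reproduced with a minimal sketch. It assumes
the EU rule for Europe/Berlin (summer time from the last Sunday of March to
the last Sunday of October) and hard-codes it instead of using a
TZ-database; the function names are illustrative only, and "UTCWND" is
modelled as "local time minus the standard offset", as Georg defined it.

```python
from datetime import datetime, timedelta

STANDARD_OFFSET = timedelta(hours=1)  # CET: assumed base offset of Europe/Berlin
DST_DELTA = timedelta(hours=1)        # CEST adds one more hour in summer


def last_sunday(year, month):
    """Return midnight of the last Sunday of the given month."""
    # Last day of the month, then walk back to the preceding Sunday.
    last_day = datetime(year + (month == 12), month % 12 + 1, 1) - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def berlin_dst_active(local):
    """True if summer time applies to this naive Berlin wall-clock time."""
    start = last_sunday(local.year, 3).replace(hour=2)   # clocks jump 02:00 -> 03:00
    end = last_sunday(local.year, 10).replace(hour=3)    # clocks fall 03:00 -> 02:00
    return start <= local < end


def to_utc(local):
    """Real UTC: subtract the offset that is actually in force."""
    dst = DST_DELTA if berlin_dst_active(local) else timedelta(0)
    return local - (STANDARD_OFFSET + dst)


def to_utcwnd(local):
    """'UTC with no DST': always subtract the standard offset only."""
    return local - STANDARD_OFFSET


for local in (datetime(2011, 1, 15, 10, 0), datetime(2011, 7, 15, 10, 0)):
    print(local, "UTC:", to_utc(local).time(), "UTCWND:", to_utcwnd(local).time())
```

In winter both functions return 09:00; in summer to_utc() returns 08:00
while to_utcwnd() still returns 09:00, which is exactly the difference the
quoted example describes.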
> > As discussed here numerous times: everybody usually thinks in local
> > time, so that is what is meant, no matter if currently DST is active
> > or not.
>
> Exactly. So we need storage that preserves or at the very least allows us
> to restore that user intent from the stored data.
Yes, this is what I always proposed: store the TZ-data for the TZ identified
by the TZ-ID, which we already agreed (hopefully) to store there. This
will allow "to restore that user intent from the stored data.", AFAICS.
> > > but it adds one step to get to local time for all fields past,
> > > present and future.
> >
> > Yes, but it can retain compatibility with older versions of the
> > Kolab-format: a big pro, IMHO.
>
> No, actually, because older clients stored data in UTC.
Sure "older clients stored data in UTC", so there is no change in their
behaviour: This is what backward compatibility is about, hence I do not
understand your "No". Can you please explain?
> As shown above, UTC != UTCWND, so clients will interpret data wrongly.
This is no change from the situation today, thus no drawback, rather "as
good as it can get" for those older clients.
> We agree that compatibility is a major issue.
Good (and the best is still to come, IMO).
> Because the old client behaviour was broken due to format
> insufficiencies, one-way compatibility is the best we can ultimately
> hope for. Older clients will never correctly interpret a full set of
> newer data regardless of which solution we choose.
Ack, +1.
> So the best we can do is a solution where newer clients will read and
> interpret older data correctly. Because the old format is a very limited
> subset of RFC3339, local time storage in RFC3339 fits that bill.
In order to retain compatibility with older clients, we would have to stick
to that "very limited subset of RFC3339" which implies stating date-time in
Zulu (UTC), AFAICS, and add new ("top level", as Bernhard would interject
here) XML-tags.
> UTCWND on the other hand has no compatibility in either direction.
Yes, this is why I never understood your digression about "UTCWND". Maybe I
just missed somebody's statement which triggered that, but I had the
impression "UTCWND" was solely your invention.
> > > It can also not be stored as RFC3339, as that explicitly specifies
> > > UTC.
> >
> > So let it be Zulu-time (UTC).
>
> As explained before, that requires an additional point of metadata,
> namely whether or not the client assumed that DST would be in effect for
> this appointment.
>
> So it's UTC + TZ-ID + DST Assumption.
Bernhard and I (I hope correctly) had the impression that "DST Assumption"
just means "the TZ-data for the TZ identified by the TZ-ID", so this
translates to:
UTC + TZ-ID + TZ-data
This is exactly what I proposed, and what Joon proposed as well (the way I
understood him) quite a while ago. Hurrah, I would have never thought
that such a large part of this lengthy discussion thread was just happening
due to misunderstandings. :-))
> Clients would then have to adjust the stored UTC value accordingly if the
> DST assumption turns out to be wrong. Some more work would have to go
> into this to ensure that possible things like a changed base offset gets
> calculated correctly.
Ack, +1, even though I do not really understand what is addressed
with "ensure that possible things like a changed base offset gets
calculated correctly".
(Your first sentence in above paragraph covers what I tried to express a
couple of times before, and obviously failed to accomplish.)
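
A minimal sketch of that adjustment, assuming the Kolab-object carries the
UTC value, the TZ-ID and the UTC offset the writing client assumed (standing
in for the "DST assumption" / TZ-data). The class and field names are
hypothetical and are not taken from KEP 2.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredEvent:
    utc: datetime              # Zulu time as written by the creating client
    tzid: str                  # e.g. "Europe/Berlin"
    assumed_offset: timedelta  # UTC offset the writer assumed for this instant


def intended_local(event):
    """Recover the wall-clock time the user meant."""
    return event.utc + event.assumed_offset


def corrected_utc(event, current_offset):
    """Re-derive UTC if the rules for event.tzid have changed since writing.

    current_offset is whatever the reader's TZ-data says applies to the
    intended local time (hypothetical input, looked up by the caller).
    """
    return intended_local(event) - current_offset


# Written for 10:00 local under the assumption that summer time (+02:00) applies.
event = StoredEvent(datetime(2011, 7, 15, 8, 0), "Europe/Berlin", timedelta(hours=2))

# If summer time were abolished later, the reader would see +01:00 instead.
print(corrected_utc(event, timedelta(hours=1)))
```

If the assumption still holds, corrected_utc() returns the stored value
unchanged; only when the offsets differ does the UTC value move, while the
intended local time stays at 10:00.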
> That is clearly a possibility, but much more complex than local time +
> TZ-ID.
"Yes" *and* "No": I think we both could extract a lengthy list of pros and
cons WRT the properties and the technical implications of either
solution from the lively discussion throughout the last couple of months on
this mailing list.
Maybe we should put the two proposals, that list and the reasoning for the
final decision in the KEP2. This would also dispel my criticism WRT
transparency of the process.
BTW, I disagree with your "much" above, and how is backward compatibility
ensured, then? ;-)
> > No, this is only correct for clients in the same time zone as the event
> > itself;
>
> You misunderstood what I tried to say: Finding out what the user intended
> to store requires zero steps when storing the intended local time.
>
> That from there on you have multiple calculations to get to the end
> result is correct, but also true for all other approaches.
Yes, so what? As this is no difference, it is neither a "pro" nor a "con"
for any solution.
> > > Nobody seems willing to advocate for it strongly,
> >
> > I did and still do, in case you forgot. And I still believe it should
> > be the primary source of TZ-data for the Clients accessing that
> > Kolab-object, and the rules to dynamically update that in-format
> > TZ-data should be well defined.
>
> Alright. As you may have seen, using static information as the primary
> source of information is not shared by most people.
a. I refrain from jumping at the "static" (again), as I probably have
triggered that by emphasising that in-format TZ-data is updateable and IMO
should be updated as well.
Can we both agree, that a local zoneinfo-database and in-format TZ-data both
are possible sources of TZ-data for a Kolab-client, but with different
properties?
Maybe we can even agree that the crucial properties are:
- Local TZ-database:
  Pros: 1. Supposed to be automatically updated with / by the OS, hence
           Kolab-clients do not have to take care of the freshness of
           TZ-data.
        2. No need to store any TZ-data in Kolab-objects, thus saving
           disk space.
           (A very rough estimate: ca. 100 Bytes of TZ-data in a couple
           of thousand events per user = a couple of hundred KBytes.)
  Cons: 1. Kolab Format-parsers may not have a TZ-database at hand
           (= easily in reach), hence a PIM-Client using such a parser
           is precluded from conforming to and properly utilising KEP2,
           under both technical *and* behavioural aspects.
        2. As every Kolab-client uses its own copy of TZ-data, displayed
           event times may deviate due to the varying freshness of those
           TZ-databases, depending on the update frequency of each single
           client PC and the OS / OS-distribution used (or rather the
           freshness and correctness of their distribution of a
           TZ-database).
  I am sure I forgot some points: everybody feel free to add some.
- In-format TZ-data:
  Pros: 1. As it is a single, central source of TZ-data for all
           Kolab-clients accessing a Kolab-object, all those clients
           can display identical date-times. Consequently switching to
           more recent TZ-data for an event (date, task) is an atomic
           operation / event for all those clients.
        2. Not all clients must have access to a TZ-database.
  Cons: 1. Updating in-format TZ-data is a process which must be properly
           specified, in order to avoid "update wars" in each Kolab-object
           (e.g. via a time-stamp or some kind of a version string) and in
           IMAP-directories (thrashing the IMAP-server by concurrently
           trying to update Kolab-objects, because the local TZ-database
           was updated simultaneously). The latter can be alleviated if
           every client waits a random number of minutes and then, for
           each writable IMAP-directory containing Kolab-objects, checks
           immediately before each update that the Kolab-object's TZ-data
           is still stale.
        2. Even though the additional disk space occupied by TZ-data in
           thousands of Kolab-objects is of minor size, for most usages
           the TZ-data fields will mostly contain identical, hence
           duplicate information.
  Again, I surely missed a couple of points.
And the different points have different weights.
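
The update procedure outlined under the cons of in-format TZ-data (version
string, random wait, re-check immediately before writing) might look roughly
like the sketch below. It assumes a hypothetical object layout with
"tz_version" and "tz_data" fields and version strings that sort in release
order (like the tz database release names "2011d" < "2011e"); reading and
writing the Kolab-object via IMAP is left to the two callables.

```python
import random
import time


def refresh_tzdata(read_object, write_object, local_version, local_tzdata,
                   max_wait_s=600, sleep=time.sleep, rng=random.random):
    """Update in-format TZ-data only if it is still stale after a random wait.

    Returns True if this client wrote the update, False otherwise.
    """
    if read_object()["tz_version"] >= local_version:
        return False                  # already fresh, nothing to do

    sleep(rng() * max_wait_s)         # random back-off, so clients do not all write at once

    obj = read_object()               # re-check immediately before writing
    if obj["tz_version"] >= local_version:
        return False                  # another client updated it in the meantime

    obj["tz_version"] = local_version
    obj["tz_data"] = local_tzdata
    write_object(obj)
    return True
```

This only narrows the window for concurrent writes; it does not close it.
Whether that is good enough, or whether the specification needs something
stricter, is one of the points that would have to be decided.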
b. Yes, I read you and Bernhard clearly arguing against it, but IIRC even
Jeroen has not made a concise statement WRT the priority of in-format vs.
local sources of TZ-data, yet (and we seem to have left everybody else
behind, see below).
I have the impression many of the other subscribers of this mailing-list are
tired of the whole ongoing KEP2 discussion, even though at least you,
Bernhard, Jeroen and I seem to agree, that properties and implications of
each proposed solution must be very well understood, in order to avoid
mishaps like the very reason for KEP2: a significant flaw in the
Kolab-format specification.
This is one of various reasons, why I believe a transparent compilation of
technical and behavioural pros and cons in KEP2 for each proposed solution
is crucial, so people not following the discussion closely (anymore) are
still able to form a substantiated opinion without retracing the whole
discussion (to be honest, that was my original plan for KEP2, after I read
KEP1 for the first time; I ended up retracing via the web-frontend more
than once, but for a good part this is due to my leaky memory).
> Would you be willing to live with a way to encode this format in KEP 2 as
> a cache that is subordinate to the dynamic TZ-ID based lookup, but may be
> used where such lookup is too hard and can be updated under certain
> conditions?
Actually I think that using in-format TZ-data as a cache ("subordinate" to a
local TZ-ID based look-up of TZ-data) is a reasonable compromise, which
dispels my concerns WRT incapable Kolab-format parsers. This would also
allow for switching the priority of the in-format vs. the local sources of
TZ-data just by changing the specification (and consequently the client
behaviour), but without changing the format of Kolab-objects, should the
weight of the pros and cons be assessed differently at any later time:
nice.
(Even then some of the cons of both in-format and local TZ-data sources
take effect, but I think the combined pros outweigh these by far.)
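
A minimal sketch of such a look-up, assuming the local TZ-database is
available as a mapping from TZ-ID to TZ-data (or not at all) and the
in-format TZ-data has already been parsed from the Kolab-object. All names
are hypothetical; the prefer_local flag only illustrates that the priority
is a matter of specification, not of format.

```python
def resolve_tzdata(tzid, local_db, cached_tzdata, prefer_local=True):
    """Pick the TZ-data source for tzid; return (source_name, tz_data).

    local_db:      mapping TZ-ID -> TZ-data, or None if the client has no
                   TZ-database at hand
    cached_tzdata: in-format TZ-data from the Kolab-object, or None
    """
    local = local_db.get(tzid) if local_db is not None else None
    sources = [("local", local), ("cache", cached_tzdata)]
    if not prefer_local:
        sources.reverse()             # same format, different priority
    for name, data in sources:
        if data is not None:
            return name, data
    raise LookupError(f"no TZ-data available for {tzid}")


# A parser without a TZ-database falls back to the in-format cache.
print(resolve_tzdata("Europe/Berlin", None, {"offset": "+01:00"}))
```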
But as discussed above, then we also have all ingredients ("DST assumption")
needed in every Kolab-object to retain backward compatibility by sticking
to the date-time format specified in the existing Kolab-format
specification (Zulu only, no milliseconds etc.).
So as this is such a low hanging fruit, why not grab it?
Cheers
Florian
P.S.: For the definition of in-format TZ-data, Bernhard suggested
(<http://kolab.org/pipermail/kolab-format/2010-December/001161.html>)
looking at <http://tools.ietf.org/html/draft-douglass-timezone-xml>, which
is also what somebody else suggested later on this list ("do not reinvent
the wheel", using the experience and efforts condensed in recent
iCal-RFCs).
One technical design detail in the timezone-xml specification worries me a
little: the use of XML-namespaces, which are a quite modern XML-feature and
might not be well supported in all XML-libraries used, yet. If that
worry should turn out to be substantiated, an algorithmic transformation of
the XML-namespaces defined in the timezone-xml specification to XML-subtags
(e.g. per a provided RegEx) may be a solution.
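
One possible reading of such a transformation, as a rough sketch: rewrite
every prefixed element name "prefix:name" to the plain name "prefix-name"
and drop the namespace declarations. The element names and the namespace URI
below are made up for illustration and are not taken from the timezone-xml
draft; a regular expression like this also does not handle prefixed
attributes, CDATA sections or comments.

```python
import re
import xml.etree.ElementTree as ET

# "<prefix:name" or "</prefix:name" at the start of a tag
PREFIXED_TAG = re.compile(r"(</?)([A-Za-z_][\w.-]*):([A-Za-z_][\w.-]*)")
# xmlns="..." and xmlns:prefix="..." declarations
XMLNS_DECL = re.compile(r'\s+xmlns(:[\w.-]+)?="[^"]*"')


def strip_namespaces(xml_text):
    """Turn namespace-prefixed tags into plain tags without namespaces."""
    without_prefixes = PREFIXED_TAG.sub(r"\1\2-\3", xml_text)
    return XMLNS_DECL.sub("", without_prefixes)


# Hypothetical input, not the real timezone-xml vocabulary.
sample = '<tz:zone xmlns:tz="urn:example:tz"><tz:id>Europe/Berlin</tz:id></tz:zone>'
plain = strip_namespaces(sample)
print(plain)

# The result parses with a namespace-unaware approach as well.
print(ET.fromstring(plain).find("tz-id").text)
```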