KEP 2: Modification of datetime type, introduction of 'tz' sub-tag

Wed Dec 1 13:00:27 CET 2010

Hi Georg,

On Tuesday, 30 November 2010 19:34:23 Georg C. F. Greve wrote:
> Hi Hendrik,
>
> On Tuesday 30 November 2010 10.52:45 Hendrik Helwich wrote:
> > For backward compatibility I agree that clients should be able to read
> > RFC3339 for the first Kolab format version. But for writing I think
> > clients MUST write a clear, normalized datetime format like the Zulu
> > format, and also should not need to read RFC3339.
>
> If clients can read it for ONE Kolab version, they can read it for all of
> them. I don't think anyone would really want to have multiple parsers for
> two formats where one format is a limited subset of the other.
>
> > This is because I see no real benefit for Kolab in the complex RFC3339
> > datetime format. You can have partial time zone information in that
> > datetime format. And there is no need for this information. Why not omit
> > this unused information to make the format clearer and remove
> > redundancy?
>
> As another discussion demonstrated, it is not actually time zone
> information, just an offset. While - as long as all times are always stored
> without DST in effect - it allows to extrapolate some idea about the
> meridian, it does NOT allow to extrapolate to the correct DST regime.
>
> So it is a pure offset, and *not* a redundancy.

I am aware that a local offset from UTC is not full time zone information,
but in my opinion it is still partial time zone information.
So I would say it *is* a redundancy.

> But as an offset it is fairly harmless, and parsers can be expected to have
> no issues dealing with it. So it does not introduce any risk or instability.

I still think it is good not to allow writing unnecessary, redundant
information to the Kolab format. Parsing RFC3339 is much more complex than
parsing the strict Zulu format, and more complexity is always a potential risk.

> But you are right it is also not strictly necessary.
>
> If anyone has an idea why it ended up in the format, I'd be interested in
> the story. My guess is that it somehow came from ISO8601, from which
> RFC3339 was then derived as a schema, IIRC.

I am also interested in this. In the past I used a library which implements
RFC3339, and the output of its parser was always a UTC datetime struct.
Currently I don't really understand why the offset, and the complexity it
brings, is useful.
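
As a rough illustration of that behaviour (a sketch only, not taken from any
Kolab client or from the library mentioned above): a parser can accept an
RFC3339 timestamp with an arbitrary offset and hand back a UTC value, after
which the offset is gone. The function names are hypothetical, and Python's
datetime.fromisoformat() only covers a common subset of RFC3339.

```python
from datetime import datetime, timezone


def parse_rfc3339_to_utc(text):
    """Parse a common subset of RFC3339 and normalize the result to UTC."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat(),
    # so map it to the equivalent numeric offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("RFC3339 requires an offset or 'Z'")
    return parsed.astimezone(timezone.utc)


def format_zulu(moment):
    """Write a timezone-aware datetime in the strict Zulu form (whole seconds)."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(format_zulu(parse_rfc3339_to_utc("2010-12-01T13:00:27+01:00")))
```

Reading the full format and writing only the Zulu form loses the original
offset, which is exactly the information under discussion here.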

> > If we have the possibility to change things in version 1.1 i would like
> > to pick up the idea of Andrew (Mail from 12.11.2010) to specify times
> > directly in local time e.g. for the element "last-modification-date" in
> > the Zulu format and for the element "start-date" optionally in a local
> > time like this: [...]
> > So the Suffix 'Z' could indicate that it is a UTC time. For a local time,
> > the timezone which is specified in the kolab format xml must be used.
>
> Is this RFC3339 compliant?

I think it is not RFC3339 compliant. But as far as I know it is not possible
to store local times (without time zone information) in the RFC3339
datetime format. And I would agree with Andrew that local times are the
natural model for e.g. event start times. I borrowed the idea from the
iCalendar format.
But this local time would only result in better human readability of the
Kolab format. A UTC datetime would also be an adequate solution.
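
To make the 'Z' suffix idea concrete, here is a minimal sketch under my own
assumptions: the function name is hypothetical, only whole seconds are
handled, and the lookup of the time zone stored elsewhere in the Kolab XML
is left out entirely.

```python
from datetime import datetime, timezone


def parse_kolab_datetime(text):
    """Return (datetime, is_utc) for a value with or without a 'Z' suffix.

    With 'Z' the value is UTC. Without it the value is a local time and the
    returned datetime is naive; the caller still has to apply the time zone
    that is specified in the Kolab XML.
    """
    if text.endswith("Z"):
        moment = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
        return moment.replace(tzinfo=timezone.utc), True
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S"), False


print(parse_kolab_datetime("2010-12-01T12:00:27Z"))
print(parse_kolab_datetime("2010-12-01T13:00:27"))
```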

> It seems like it would establish a third, RFC3339 incompatible format,
> which would be very easy to mistake for an RFC3339 compatible format,
> because it probably defeats the expectation of many client implementors
> that someone would go to create yet another time format when RFC3339 is
> widely used for this purpose.

I agree that using established standards is always better than inventing
things again. I just think that the RFC3339 datetime format is much too
complex and does not fit the needs of the Kolab datetime format very well.

> That seems like a rather risky proposition to me.
>
> > You suggested that the Zulu format could be not sufficient. I think you
> > are referring to the milliseconds. Do you see a use case where this could
> > be needed?
>
> We are planning to extend Kolab integration into the area of real time
> collaboration technologies, among other things, including collaborative
> editing and such.
>
> It is entirely foreseeable that such applications would be based upon
> RFC3339, so if RFC3339 is not supported it would necessitate translation
> which would introduce yet another potential source for error with no gain,
> and we might find that some of these applications actually make use of the
> milliseconds.
>
> So yes, I would like to not close that door.
>
> As a compromise proposal - because strict Zulu UTC with the time zone
> information is sufficient for the purposes of all existing Kolab objects -
> we could say that
>
>  * Clients *MUST* be parsing datetime as RFC3339
>
>  * As a general rule, Clients *SHOULD* always write datetime in the
> simplest possible format
>
>  * For all existing objects individually we specify UTC Zulu *MUST* be
> used.
>
> This way we'd keep backward compatibility, and older clients will be able
> to continue reading the timestamps unless they have implemented the
> implicit "do not read versions above your level" rule.
>
> At the same time we ensure that we have a smooth path to integrate other
> technologies, including those for which we'd encounter the current approach
> to fall short.
>
> Because there is nothing yet that uses it, clients would gain a grace
> period to switch to full RFC3339 parsing where they aren't already using
> it.
>
> What do you think?

This could be a good compromise, I would say.
This problem reminds me of the quirks/standards modes in HTML. For the
current Kolab format version we need something similar to the quirks mode
in HTML to handle it.
But from Kolab format version 1.1 on it may be possible to be more strict and
do something similar to the standards mode in HTML.
An XML Schema could be created to specify the Kolab 1.1 format. XML Schema
also provides a dateTime type which is closely related to RFC3339 (both are
based on ISO 8601). It could additionally be restricted by a regexp pattern
to enforce e.g. the Zulu format. All clients must write XML that is valid
with respect to this XML Schema.
This would have two main advantages:
This would have two main advantages:

(a) Existing tools can be used to automatically validate the Kolab format. If
a Kolab client writes invalid Kolab XML data, that client would need to be
fixed (and not all the other clients).

(b) There are existing XML Schema binding tools which could be used to parse
and create the Kolab XML format and which create suitable data structures.
This would simplify the implementation of a Kolab client a lot, because you
do not need to work with the XML directly.
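
The kind of regexp restriction meant above could look roughly like the
following sketch. The pattern is my own assumption of what "strict Zulu"
means (whole seconds, 'Z' suffix, no offset), and it only checks the shape
of the string, not whether the date itself is valid. It is shown in Python
rather than as an XML Schema facet.

```python
import re

# Assumed shape of the strict Zulu form: YYYY-MM-DDThh:mm:ssZ
ZULU_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def is_strict_zulu(text):
    """Return True if text has the shape of a strict Zulu timestamp."""
    return ZULU_PATTERN.fullmatch(text) is not None


print(is_strict_zulu("2010-12-01T12:00:27Z"))
print(is_strict_zulu("2010-12-01T13:00:27+01:00"))
```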

> > So in fact i think we have to do a trade-off here and decide what is more
> > important:
> > (a) to use actual time zone data
> > (b) to assure that all people come to the same time to a meeting
>
> You are right there is a trade off here. But this is not it.
>
> With static time zone specifications, some people will still miss their
> 11:00 meeting because they know it's at 11:00, it has been there for years.
> So when the computer tells them it is now at 10:00, they will probably
> ignore the computer, and still go at 11:00, just like they'll ignore their
> car navigator that tells them to turn right when they know they need to go
> straight.

The meeting will only be at the same local time for people who are in the
same time zone as the event (or who have the same DST rules).
This would still hold if the time zone data is outdated.
So for the people who are used to the meeting being at 11:00, it will still
be at 11:00 even if the time zone data is outdated.
It could be shifted wrongly for people who are in different time zones, and
these people should already be used to the fact that the meeting time changes
over the year.

> If that same guy's time zone file is out of date, but everyone else's is
> still up to date, he'll just as gladly go to the meeting at 11:00, and meet
> everyone as planned, and all is well.
>
> What I am not trying to say is that ALL people would behave like this, some
> people will behave differently. What I am trying to explain is that *NONE*
> of these options can *GUARANTEE* that everyone will be at the right time at
> the right place.

OK, that can never be guaranteed, you are right.
But the Kolab clients must in any case show the same time to all people, I
would say. By this argument we would not need to adapt the Kolab format and
add a time zone at all.

> The questions are:
>
>  (a) Which method has the better chance of providing correct information?
>  (b) Which method is more robust against future developments?
>
> and, of course,
>
>  (c) Who will tech support (and consequently the user) blame for the
> failure?
>
> The answers to these questions are fairly straightforward.
>
> Answer to (a): As long as DST rules do not change, both perform equally
> well.
>
> Once DST rules change, the static encoding *will* break, the database *may*
> break, but only if the user has not updated their system in quite a while,
> because such changes are prepared politically, then communicated,
> incorporated into the database and made available quickly.

But can we rely on everyone always updating their system as soon as updates
are available? There are many different systems out there. And even if
people have the possibility to update their system, can we trust that they
do it?

> So in most cases there is a substantial update window which would only be
> realistically missed in an unmaintained, unserviced and essentially
> orphaned installation of Kolab. I'd expect those users to have much bigger
> problems than a recurring meeting that switched one week early or late.
>
> Answer to (b): The database.
>
> Because there is an RFC in the works that will do for DST information what
> RFC 1305 did for time synchronization. Just like there are very few people
> today who use atomic clock receivers to set their system time and instead
> rely on NTP, only very few people will use database updates, and instead
> rely upon the network service.

It would be great if the time zone database were always automatically
synchronized with a central authority.
But when will this be available? And how long must we wait until we can rely
on the vast majority of systems being automatically up to date?

> Answer to (c):
>
> For the static approach: Kolab.

Yes. But at least they cannot blame Kolab for the time of a meeting not being
the same for all attendees.

> For the database: The platform provider.

If the times in a Kolab client are shown differently to different people,
some users will still blame Kolab or the client for that, I would say.

> So the database scores better on every single issue. That is why, even
> though it is not (yet) perfect, I see it as the better of two imperfect
> choices.
>
> > Additionally it could be allowed for clients to update the timezone data
> > in the kolab item if they notice that it is outdated.
>
> That only seems to combine the weaknesses of both approaches, and none of
> the strengths, while increasing the burden on client implementors and
> complexity.
>
> It adds many questions to which we'll have to discuss answers, such as:
>
> When does one client determine that the static information is out of touch
> with reality? How does it ensure its update is the better one, and won't be

It was just an idea for how to make sure that (a) the time is the same for
all people and (b) the included time zone is up to date.

A client could e.g. check the last modification time of the Kolab item and
check whether the included time zone was updated in the local time zone
database after that time, and if so replace the time zone in the Kolab item.
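
The check described above boils down to a single comparison. The sketch
below is only an illustration: the function and parameter names are
hypothetical, and how a client finds out when its local database last
changed the rules for a zone is left open.

```python
from datetime import datetime, timezone


def embedded_timezone_is_stale(item_last_modified, zone_rules_changed):
    """Return True if the local rules for the zone changed after the item
    was last written, i.e. the time zone embedded in the item may be old.

    Both arguments must be timezone-aware datetimes.
    """
    if item_last_modified.tzinfo is None or zone_rules_changed.tzinfo is None:
        raise ValueError("both timestamps must be timezone-aware")
    return zone_rules_changed > item_last_modified


written = datetime(2010, 11, 12, 9, 0, 0, tzinfo=timezone.utc)
rules_changed = datetime(2010, 12, 1, 0, 0, 0, tzinfo=timezone.utc)
print(embedded_timezone_is_stale(written, rules_changed))
```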

> overwritten by another client which may or may not correctly see that the
> first client was wrong in its assumption? How do we ensure that all clients
> have received the latest updates before displaying the event to their user?

The Kolab clients must synchronize with the IMAP server. This is already done
by the different Kolab clients.

> > > So it will likely be supported by an NTP-like service to update DST
> > > information in the future, which will give us maximum reliability and
> > > assurance of correct display, with no change to the storage format.

This sounds great. But when will it be used by most people and operating
systems?

Best regards,

Hendrik

> > To be honest i have doubts that this will work correctly in practice.
>
> This is a much simpler problem than NTP.
>
> Considering that NTP seems to work pretty well in practice, I have little
> doubt this can be made to work, to be honest.
>
> > All people on all the different systems need to have always a database
> > which is fully mappable to Olson database and which also needs to be in
> > the same state. How can it be assured that all people always update their
> > database at the same time?
>
> Firstly, they don't need the SAME database. They only need a version of the
> database that correctly gives DST rules for the current instance.
>
> If rules have not changed in the region in question, but only in another
> which is irrelevant to the calculation at hand, different versions will
> deliver the same result.
>
> Secondly, when limiting it to "versions of the database that are correct
> with regards all the zones relevant to the calculation", not everyone will
> need to have them, only those providing that service.
>
> A cron job of importing the latest version once a day ought to do just fine
> for that. Remember: The Olson database changes very infrequently, and for
> events that are months in the future. So as long as there is at least one
> update per half-year, you're typically on the safe side.
>
> Best regards,
> Georg