2 commits - KEP-0002.txt

Georg Greve greve at kolabsys.com
Wed May 18 17:48:39 CEST 2011


 KEP-0002.txt |  251 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 165 insertions(+), 86 deletions(-)

New commits:
commit 7a56997f8852aa32ed77394b63d797f3771892e0
Merge: 8056192... 669a019...
Author: Georg Greve <greve at kolabsys.com>
Date:   Wed May 18 17:49:46 2011 +0200

    Merge branch 'master' of ssh://git.kolabsys.com/git/keps



commit 805619200c8cec12790207222ca2999347c3ab66
Author: Georg Greve <greve at kolabsys.com>
Date:   Wed May 18 17:49:17 2011 +0200

    New revision, based on last discussions w/Florian v. Samson & Bernhard Reiter

diff --git a/KEP-0002.txt b/KEP-0002.txt
index f2ca5bd..0ab8592 100644
--- a/KEP-0002.txt
+++ b/KEP-0002.txt
@@ -10,6 +10,7 @@
  |obsoleted_by=
  |related=
 }}
+
 __TOC__
 == Abstract ==
 
@@ -17,7 +18,7 @@ Kolab used to store all times in UTC and did not allow for time zone information
 
 It is primarily two issues that are being addressed by this KEP, one essential and technical, the other related to usability. The usability related issue is that users sometimes specifically set time zones for datetime fields and expect this explicit selection to be preserved across sessions and clients. Without storage of this information clients cannot meet that user expectation.
 
-The functional issue is the more important of the two: For non-recurring events there can be errors in the display of an event's time if DST rules have changed since the event was made. For recurring events in parts of the world with DST regimes it was impossible to define a recurrence that takes place at the same time all year because DST is defined in a '''dynamic''' offset towards UTC. Both issues are accentuated by the fact that DST rules are subject to political decisions taken in the future, and consequently unknown today.
+The functional issue is the more important of the two: For non-recurring events there can be errors in the display of an event's time if DST rules have changed since the event was made. For recurring events in parts of the world with DST regimes it was impossible to define a recurrence that takes place at the same local time all year and is correctly displayed by clients in other time zones. Both issues are accentuated by the fact that DST rules are subject to political decisions taken in the future, and consequently unknown today.
 
 In order to achieve a recurring event that retains its local time across DST transitions, a client must know which time zone to use. The implicit assumption of older clients to always use local time zone is problematic, as explained in subsection "Description of current client behaviour" below. So enabling time zone information for datetime fields is essential.
 
@@ -25,119 +26,192 @@ Some reference for background was provided on the Kolab format list <ref name="k
 
 == Update to the XML Format ==
 
-All objects hold datetime in the form of creation and modification times. Consistent time handling across all object types and occurrences of time objects is highly desirable. The following change therefore affects all Kolab object types. It is part of the changeset for version 1.1 of all objects.
+All Kolab object types hold datetime in the form of creation and modification times. Several other object types also hold other datetime fields. This KEP describes the canonical format for all datetime fields across all Kolab object types. This will ensure consistency and is part of the changeset for version 1.1 of all object types.
 
 Change of type: '''datetime'''
 
-The type for datetime storage in Kolab XML is modified as follows:
+=== Definitions ===
 
-* Clients '''MUST''' store all datetime fields in their ''authoritative time zone'' as selected by the user when entering the date/time.
-* The autoritative time zone '''MUST''' be one of 'UTC' (all caps) '''-- OR --''' geographical time zone identifiers in the uniform naming convention designed by Paul Eggert, specifying time zones from the Olson database, a.k.a. tz database, a.k.a. zoneinfo database <ref>[[wikipedia:Zoneinfo | Wikipedia: Zoneinfo]]</ref>.
-* Where no authoritative time zone is provided, clients '''MUST''' consider 'UTC' authoritative.
-* All datetime fields '''MUST''' be formatted according to {{rfc|3339}} <ref name="rfc3339">{{rfc|3339|title=Date and Time on the Internet: Timestamps}}</ref>.
-* Where UTC is authoritative, clients '''MUST''' use the UTC based Zulu notation for datetime fields, so YYYY-MM-DDTHH:MM:SSZ with 'T' and 'Z' as the literal characters. 
-* All datetime storage fields '''MAY''' carry up to one 'tz' attribute describing the time zone in the uniform naming convention designed by Paul Eggert, specifying time zones from the Olson database, a.k.a. tz database, a.k.a. zoneinfo database <ref>[[wikipedia:Zoneinfo | Wikipedia: Zoneinfo]]</ref>.
-* All fields that store future dates at the time of writing an entry, e.g. the 'start-date' and 'end-date' fields, '''MUST''' define the 'tz' attribute. Where the event should be calculated strictly against UTC, emulating previous behaviour, the value of the 'tz' field must be set to 'UTC' (all caps) explicitly.
-* All clients '''MUST''' be capable of parsing datetime fields according to {{rfc|3339}} <ref name="rfc3339">{{rfc|3339|title=Date and Time on the Internet: Timestamps}}</ref> format and '''SHOULD''' support loose parsing according to the superset provided by ISO 8601 <ref name="iso8601">ISO 8601: https://secure.wikimedia.org/wikipedia/en/wiki/ISO_8601</ref>.
-* When parsing a datetime field with 'tz' attribute, clients '''MUST''' treat this attribute as the most authoritative field and the stored time written as local time in this time zone, regardless of which offset the stored time specifies. Where this offset contradicts the 'tz' attribute, the 'tz' attribute '''MUST''' prevail. It is therefore recommended to ignore the offset where a 'tz' attribute exists and only fall back to it where it does not.
+==== Kolab ISO8601 Profile ====
 
-It is recommended that all clients '''SHOULD''' check the Olson database at least once every three months against the locally cached version, '''OR''' suggest update policies for their respective operating systems that ensure the Olson database gets updated regularly. As far as is known, all commonly used and supported GNU/Linux distributions do this already.
+Based on {{rfc|3339}} <ref name="rfc3339">{{rfc|3339|title=Date and Time on the Internet: Timestamps}}</ref>, the Kolab Groupware Solution specific profile of the ISO 8601 <ref name="iso8601">ISO 8601: https://secure.wikimedia.org/wikipedia/en/wiki/ISO_8601</ref> standard for representation of dates and times using the Gregorian calendar, expressed in Augmented Backus-Naur Form (ABNF), is as follows:
 
-=== Canonical client behaviour ===
+ date-fullyear   = 4DIGIT
+ date-month      = 2DIGIT  ; 01-12
+ date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
+                           ; month/year
+ time-hour       = 2DIGIT  ; 00-23
+ time-minute     = 2DIGIT  ; 00-59
+ time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second
+                           ; rules
+ time-secfrac    = "." 1*DIGIT
+ time-in-utc     = "Z"
+
+ partial-time    = time-hour ":" time-minute ":" time-second 
+                   [time-secfrac]
+ full-date       = date-fullyear "-" date-month "-" date-mday
+ full-time       = partial-time [time-in-utc]
+
+ date-time       = full-date "T" full-time
+
+The "Z" to specify a time in UTC '''MUST''' only be used for times in UTC and '''MUST NOT''' be added to values in local time.
+
+Per ABNF, ISO8601 and RFC3339, the "T" and "Z" characters in this syntax are explicitly defined as the upper-case letters; usage of the lower-case letters is explicitly forbidden.
+
+{{note|Usage of time-secfrac|time-secfrac usage is regulated on a per object/per field basis, and '''SHALL''' be explicitly forbidden unless required.}}
+
+===== Examples =====
+
+Valid date and date-time values according to the above specification are
+
+  2010-01-31T11:27:21Z
+  2005-12-19T02:55:23.437689098765
+  2001-06-19T11:01:23
+  2005-12-19T02:55:23.43Z
+  2011-05-01
+
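As a rough illustration of the profile above, the ABNF can be transcribed into regular expressions. This is a minimal sketch, not part of the KEP: the function name `classify` and the `allow_secfrac` switch are assumptions made for the example, and the patterns only check value ranges (not days per month or leap-second rules).

```python
import re

# Regex transcription of the ABNF above. Matching is case-sensitive,
# so lower-case "t" and "z" are rejected, and UTC offsets are not accepted.
_DATE = r"[0-9]{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01])"
_TIME = (r"(?:[01][0-9]|2[0-3]):[0-5][0-9]:(?:[0-5][0-9]|60)"
         r"(?P<secfrac>\.[0-9]+)?(?P<utc>Z)?")

FULL_DATE = re.compile(_DATE)
DATE_TIME = re.compile(_DATE + "T" + _TIME)


def classify(value, allow_secfrac=False):
    """Return 'full-date', 'date-time', or None if the value is not valid.

    allow_secfrac is a hypothetical per-field switch: fractions of
    seconds are rejected unless the field definition permits them.
    """
    if FULL_DATE.fullmatch(value):
        return "full-date"
    match = DATE_TIME.fullmatch(value)
    if match is None:
        return None
    if match.group("secfrac") and not allow_secfrac:
        return None
    return "date-time"
```

With these assumptions, `classify("2010-01-31T11:27:21Z")` yields `'date-time'`, `classify("2011-05-01")` yields `'full-date'`, and a value carrying a UTC offset such as `2005-12-19T02:55:23-08:00` is rejected.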
+==== Kolab XML Datetime Type ====
+
+A field of "datetime" type '''MUST''' be compliant with the Kolab ISO8601 profile.
 
-When time zone information is provided, a client '''MUST''' consider the event local to this time zone. Recurrence '''MUST''' then be calculated to keep the event at the same local time within that time zone, adjusting the time for the event accordingly for the client's local time zone. For more detail see [[#Notes_for_client_implementors | Notes for client implementors]] below).
+It '''MAY''' have one additional "tz" attribute. The value of the "tz" attribute is a string, which '''MUST''' be one of 'UTC' (all caps) '''-- OR --''' geographical time zone identifiers in the uniform naming convention designed by Paul Eggert, specifying time zones from the Olson database, a.k.a. tz database, a.k.a. zoneinfo database <ref>[[wikipedia:Zoneinfo | Wikipedia: Zoneinfo]]</ref>.
 
-When tz is specified as 'UTC' or missing, a client '''MUST''' calculate recurrences strictly according to UTC.
+Where the definition of a field explicitly specifies storage in UTC only, the "tz" attribute '''MUST NOT''' be used.
 
-When modifying existing objects, clients '''MUST''' preserve the original time zone used for storage unless changed by user interaction. For instance the 'start-date' and 'end-date' time zone defaults presented to the user by the client '''MUST''' match those stored in the 'tz' attribute.
+Where the definition of a field explicitly allows storage in local time, the "tz" attribute '''MUST''' be used in all cases, including for storage of UTC.
 
-When adding new objects, clients '''SHOULD''' default to the local time zone of the user, but '''SHOULD''' allow the user to select the time zone for storage and consequently recurrence calculation.
+{{note|Storage of local time|Storage of local time and consequently usage of the tz attribute are regulated on a per-field basis, and '''SHOULD''' be explicitly required or forbidden.}}
 
-When clients encounter deviations from the schema, e.g. parsing datetime objects that do not match the writing conventions, or a missing 'tz' attribute for start-date or end-date in an event using version 1.1 of the XML schema, clients '''SHOULD''' inform the user of a potential issue, using the 'product-id' to help the user identify clients that might be broken also in ways that could corrupt other data.
+==== Kolab XML Date Type ====
 
-There will likely be an explicit KEP on this issue at a later point in time.
+The Kolab Date Type is defined as full-date.
+
+Where a field explicitly refers to a full-date '''ONLY''', the "tz" attribute '''MUST''' be used in all cases, including for storage of UTC.
+
+===== Examples =====
+
+Valid date and datetime fields according to the above specification are
+
+  <field>2010-01-31T11:27:21Z</field>
+  <field tz="Europe/Berlin">2005-12-19T02:55:23.437689098765</field>
+  <field tz="America/Sao_Paulo">2001-06-19T11:01:23</field>
+  <field tz="UTC">2005-12-19T02:55:23.43Z</field>
+  <field tz="America/Los_Angeles">2010-05-01</field>
+
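The interplay of the "Z" suffix and the "tz" attribute can be sketched as follows. This is only an illustration under assumptions: the helper name `read_field` is hypothetical, and returning `None` for a missing time zone is one possible way to signal that the caller should warn and fall back to UTC.

```python
import xml.etree.ElementTree as ET


def read_field(xml_text):
    """Return (value, zone) for a single date or datetime element.

    zone is the 'tz' attribute when present, 'UTC' when the value
    carries the 'Z' suffix, and None when neither is given.
    """
    elem = ET.fromstring(xml_text)
    value = (elem.text or "").strip()
    tz = elem.get("tz")
    is_utc_value = value.endswith("Z")
    # "Z" must only be used for times in UTC, never for local time.
    if is_utc_value and tz not in (None, "UTC"):
        raise ValueError("'Z' suffix combined with non-UTC tz attribute")
    if tz is None:
        return value, ("UTC" if is_utc_value else None)
    return value, tz
```

For example, `read_field('<field tz="America/Sao_Paulo">2001-06-19T11:01:23</field>')` returns the value together with `'America/Sao_Paulo'`, while a field combining `tz="Europe/Berlin"` with a trailing "Z" raises an error.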
+=== Kolab Date and Datetime Usage ===
+
+* Clients '''MUST''' store all date and/or datetime fields not based on user interaction/fields that are automatically generated '''-- AND --''' carry values in the past or present in '''UTC only'''. This explicitly includes the following fields: 'creation-date', 'last-modification-date' of all Kolab object types.
+* Clients '''MUST''' store all date and/or datetime fields based on direct user interaction '''-- OR --''' fields that may carry values which are '''NOT''' limited to UTC only in local time using the "tz" attribute. This explicitly includes the following fields: 'start-date', 'end-date' of all Kolab object types.
+* Clients '''MUST NOT''' use fractions of seconds (time-secfrac) for any datetime field in any Kolab object type unless the definition of that field and object specifically permit or require time-secfrac, which '''SHOULD''' always be done in a way to specify the maximum number of digits. Fractions of seconds '''MUST NOT''' be used for any field in object types: 'note', 'contact', 'distribution-list', 'journal', 'event', 'task'.
+* Clients '''MUST''' be capable of reading date and datetime fields that comply with the writing rules of this KEP and subsequent definitions of the Kolab object types they process.
+* Clients '''MUST''' preserve user preference and selection in the "tz" attribute to the maximum extent possible.
+* Clients '''SHOULD''' check if a new update of the Olson database or the authoritative database used by the system is available and get that update at least once every three months, '''OR''' suggest update policies for their respective operating systems that ensure the time zone database gets updated regularly. As far as is currently known, all commonly used and supported GNU/Linux distributions do this already.
+* Clients '''MAY''' support loose parsing according to the superset provided by ISO 8601 <ref name="iso8601">ISO 8601: https://secure.wikimedia.org/wikipedia/en/wiki/ISO_8601</ref>.
+
+=== Canonical client behaviour ===
+
+* When creating a new object with time zone sensitive fields, clients '''SHOULD''' default to the local time zone of the user, but '''SHOULD''' allow the user to select the time zone for storage and consequently recurrence calculation;
+* When modifying existing objects, clients '''MUST''' use the value of the 'tz' attribute of the respective fields to set the default/preselected value for the editing of the fields, where applicable. For instance the 'start-date' and 'end-date' time zone defaults if presented to the user by the client '''MUST''' match those stored in the 'tz' attribute. The time zone stored in the 'tz' attribute '''SHOULD''' only be changed based upon user interaction;
+* When calculating recurrences, a client '''MUST''' calculate in a way that keeps the event at the same local time in the time zone stored in the 'tz' attribute. Clients '''MUST''' then use the result as the time from which to calculate the time of the event at the client's time zone. For more detail see [[#Notes_for_client_implementors | Notes for client implementors]] below;
+* When receiving iTip invitations, a client '''MUST''' treat the time zone id in the VTIMEZONE object as authoritative and, if it is not a valid Olson database time zone identifier, translate it using the translation matrix provided by Kolab Systems in cooperation with the Kolab ecosystem. If the time zone id in the VTIMEZONE element does not exist in the matrix, clients '''MAY''' attempt to map the time zone based on its rules to a currently used time zone -- '''AND/OR''' -- allow the user to select an appropriate time zone for event storage;
+* For recurrence calculation: When tz is specified as 'UTC', a client '''MUST''' calculate recurrences strictly according to UTC;
+* For recurrence calculation: Where tz is missing although the specification required it, a client '''SHOULD''' calculate recurrences strictly according to UTC;
+* When clients encounter deviations from the schema, e.g. parsing datetime objects that do not match the writing conventions, or a missing 'tz' attribute for start-date or end-date in an event using version 1.1 of the XML schema, clients '''SHOULD''' inform the user of a potential issue, using the 'product-id' to help the user identify clients that might be broken also in ways that could corrupt other data. There will likely be an explicit KEP on this issue at a later point in time. This mechanism '''MAY''' also be used for the update strategy, see below.
 
 === Examples ===
 
 Examples of valid 'start-date' fields using datetime structures according to the above specification are
 
- <start-date>2010-01-31T11:27:21Z</start-date>
+ <start-date tz="Europe/Rome">2011-05-01</start-date>
  <start-date tz="UTC">2010-01-31T11:27:21Z</start-date>
- <start-date tz="America/Los_Angeles">2005-12-19T02:55:23-08:00</start-date>
- <start-date tz="America/Sao_Paulo">2001-06-19T16:39:57.122-03:00</start-date>
- <start-date tz="Europe/Brussels">2001-06-19T11:01:23+02:00</start-date>
+ <start-date tz="America/Los_Angeles">2005-12-19T02:55:23</start-date>
+ <start-date tz="Europe/Brussels">2001-06-19T11:01:23</start-date>
+ <last-modified>2011-04-01T01:02:33Z</last-modified>
+ <future-high-precision-timestamp tz="America/Sao_Paulo">2001-06-19T16:39:57.1229853</future-high-precision-timestamp>
 
-{{note|Strict writing - loose parsing|There has been at least one case of a primary Kolab client writing incompliant datetime fields in the past. In such a case a more liberal parsing than writing rule makes for a more robust user experience even when users connect experimental third-party clients. Measures to detect and address such cases will likely be addressed in another KEP.}}
+{{note|Strict writing - loose parsing|The rules above have been carefully designed to address all the issues while tightening the writing rules as much as sensible to keep time stamps as consistent as possible. This is not to say that clients should not also understand time stamps that are slightly incorrect, but still consistent and readable. In such cases, clients should do their best to parse the time stamp correctly, but warn the user or administrator of a client that is not compliant with this KEP so the incorrect client can be fixed. Clients '''MUST NEVER''' rely on other clients' lax parsing.}}
 
 === Upgrade Path ===
 
-The previous datetime format used by Kolab XML formats up to and including version 1.0 based on strict UTC Zulu notation will continue to be understood and interpreted in the same way as was canonical behaviour before, although it is seems that at least some older clients did not implement canonical behaviour correctly (see [[#Current_client_behaviour | Current client behaviour]], below). So newer clients confronted with an older data set should safely operate according to specification.
+The previous date time format that was used by Kolab XML formats up to and including version 1.0 based on strict UTC Zulu notation continues to be the authoritative form of writing all the automatically/software generated fields which carry information that is in the past or present. So all old data can be parsed by new parsers, and old clients will continue to understand at least some of the fields written in the new format. 
 
-All clients must already preserve all tags and attributes they do not understand, but from discussion on the mailing list, not all clients properly guarantee this at the current point in time. So the newly introduced 'tz' attribute would be in peril of being stripped out by older clients.
+There will '''NOT''' be backwards compatibility for all types of newer objects, however. While in principle all clients should already preserve tags and attributes they do not understand, not all older clients properly guarantee this at the current point in time. So the newly introduced 'tz' attribute would be in peril of being stripped out by older clients. Older clients are also likely to falsely interpret data written by newer clients, potentially corrupting it upon write.
 
-Furthermore older clients will likely reject or falsely interpret the new format, making it necessary to update all clients as soon as possible. It is therefore recommended that newer clients warn users or administrators when discovering objects written according to the old specification, offer to update them to the new specification, and provide information that an update to newer clients is recommended for all users.
-
-Unavoidably, installations with older clients will continue to display recurrence times incorrectly. This is neither an improvement nor a deterioration of the current situation. 
+And finally it is unavoidable that older clients will continue to behave as they did thus far, continuing to display recurrence times incorrectly.
 
 === Smart Upgrade Option ===
 
-Clients '''MAY''' choose to use the version of the XML object to identify which event was created by an older client, and silently update it to have recurrence behave in the way the user expects. When doing so, it is recommended to assume the client's local time zone was the authoritative time zone, as that was what clients were assuming and showing to the user thus far. Alternatively, clients '''MAY''' request users to make an explicit choice for events the client detected as old recurrences suffering from the issue.
+When encountering data written to the old specification prior to this KEP, clients '''MAY''' choose to
+# continue to display the old data sets as the old clients did, leaving the old data unchanged;
+# bring up a dialogue informing the user of the change in data format, and suggest an update which should usually be done by the proper editing dialogue (especially for events) so users can provide the data that was absent in the old format;
+# silently update the data to the new data format, based on the assumptions that older clients made when interpreting this data, thus maximally preserving client behavior.
+
+The first is the '''recommended''' approach for non-recurring events, the second is the '''recommended''' approach for recurring events. The third should only be used with extreme caution and ideally some explicit user interaction for the entire process, i.e. an "update wizard" after which only new clients will be connected to this server.
 
-As both provides additional work for clients, these are '''NOT''' a requirement, and the specific path to choose is left to client implementors.
+As all of these approaches include additional work for client implementers, none of these are required.
+
+In any case it is '''NOT''' recommended to ever have older and newer clients coexist on a shared set of data, and client implementers should seek to implement advice to this extent for their users.
 
 === Notes for client implementors ===
 
 There may be future use cases for time zone storage and DST calculations, such as other Kolab object types.
 
-For events, which are the primary use case of this particular use case, there are two existing uses:
+For events and tasks, which are the primary use case of the particular issues this KEP addresses, there are two existing uses:
 
 ; '''Store user preference'''
 : A user typically has selected a time zone to enter a date/time. When storing without timezone information, that information is lost. So while the user might realistically expect the event to preserve the time zone they entered initially when editing it again, Kolab was unable to provide this functionality thus far.
-:* Unless already required because the date/time field is in the future (see above), clients '''SHOULD''' therefore always store a user-selected time zone, e.g. when different from the default choice offered.
-:* Clients '''MUST''' default to the information in the stored time zone when opening an event for editing, as otherwise the recurrence calculation based upon it might be inadvertently altered by the user.
 
 ; '''Event time calculation'''
 : Other than for the storage of user preference, recurrences are the most important but not the only use case for this Kolab Enhancement Proposal.
-:* Where the 'tz' attribute is explicitly set to 'UTC' or missing, clients '''MUST''' implement strict mapping to UTC. Otherwise the time zone stored in the 'tz' attribute '''MUST''' be considered authoritative, and the value of the time stored in the datetime field '''MUST''' be considered local to this time zone.
-{{note|Important: The meaning of UTC offset for local time zones|Where the authoritative time zone is '''NOT''' UTC, the offset to UTC stored in the datetime field is based on the DST and standard time zone assumption (see below) of the storing client at the time this event was created. This means it is unreliable and can be wrong due to a change of DST regime, as well as a change of standard time for this location. So clients should '''NEVER''' rely upon this offset for any calculation, and take it only for indicative purposes.}}
-:* Where the authoritative time zone is a local time zone, clients '''MUST''' arrive at the corrected authoritative time by ignoring the offset to UTC stored in the date/time field and calculate offsets to UTC freshly using the Olson database based on treatment of the stored time as local time in this time zone.
-{{note|Example|An event set for time zone 'Europe/Brussels' with the value '2010-12-24T18:00:00+02:00' should be interpreted as '18:00 in Brussels on 24 December 2010', so calculated with the correct UTC offset of +01:00 for standard time which is in effect, ignoring the erroneus +02:00 UTC offset, which may be owed to a client storing the event during DST, or a change in time zone information between the time this event was defined, and its actual occurence.}}
-:* The same methodology '''MUST''' be used for each instance of a recurring event.
+:* Where the 'tz' attribute is explicitly set to 'UTC' or missing (in which case clients '''SHOULD''' also issue a warning due to an incorrect data format), clients '''MUST''' implement strict mapping to UTC. Otherwise the time zone stored in the 'tz' attribute '''MUST''' be considered authoritative, and the value of the time stored in the datetime field '''MUST''' be considered local to this time zone;
+:* Where the authoritative time zone is a local time zone, clients '''MUST''' arrive at the corrected authoritative time by using their client- or system-wide time zone database (e.g. Olson, a.k.a. tzdata) to calculate the event, its possible recurrences, and its offset to UTC based on the database information;
+{{note|Example|An event set for time zone 'Europe/Brussels' with the value '2010-12-24T18:00:00' should be interpreted as '18:00 in Brussels on 24 December 2010', so calculated with the correct UTC offset of +01:00 for standard time which is in effect.}}
+:* For recurring events, the same methodology '''MUST''' be used for each instance of a recurring event;
 :* This corrected authoritative time can then be displayed (if the same as local time zone), be translated into the local time zone for display, or be translated to UTC for UTC based use cases.
-: All the aforementioned functions should typically be available by making use of existing system calls/libraries.
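
The event time calculation described above can be sketched with a system time zone database. This is a minimal illustration, assuming Python's `zoneinfo` module with tz data installed; the helper names `to_utc` and `weekly_occurrences` are hypothetical and not defined by this KEP.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

STORED_FORMAT = "%Y-%m-%dT%H:%M:%S"  # date-time without secfrac or "Z"


def to_utc(local_value, tz_name):
    """Treat the stored value as local time in tz_name and convert to UTC."""
    local = datetime.strptime(local_value, STORED_FORMAT)
    return local.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def weekly_occurrences(local_value, tz_name, count):
    """Return `count` weekly instances that keep the same local time.

    Adding a timedelta to an aware datetime moves the wall clock only;
    the UTC offset of each instance is looked up freshly in the time
    zone database when the instance is converted or displayed.
    """
    first = datetime.strptime(local_value, STORED_FORMAT)
    first = first.replace(tzinfo=ZoneInfo(tz_name))
    return [first + timedelta(weeks=i) for i in range(count)]
```

Applied to the Brussels example, `to_utc("2010-12-24T18:00:00", "Europe/Brussels")` gives 17:00 UTC. A weekly event starting on 2011-03-23 at 18:00 in 'Europe/Brussels' stays at 18:00 local time on 2011-03-30, although the switch to DST on 2011-03-27 moves its UTC time from 17:00 to 16:00.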
 
 ; '''Olson mapping database'''
-: There should be mapping between time zone locations and time zones on other systems from and to the Olson database locations. In order to achieve that, Kolab Systems will work with the various client implementors to provide a canonical database that all clients without Olson database can use for such mapping on their systems against the Olson database locations. This database will be provided in a location that is freely and publicly available.
+: There should be a mapping between time zone identifiers from and to the Olson database locations for systems not based on the Olson time zone identifiers, e.g. Microsoft Windows. In order to achieve that, Kolab Systems will work with the various client implementers to provide a canonical database that all clients without Olson database can use for such mapping on their systems against the Olson database locations. This database will be provided in a location that is freely and publicly available.
+
+; '''Invitation handling'''
+: The mapping database should also be used for handling iTip invitations, which carry an arbitrary number of VTIMEZONE objects. When the id of the VTIMEZONE object is unknown, clients can fall back on automatic detection against the local database and/or user choice when storing the event. Clients '''SHOULD''' offer users the possibility to send an email with the VTIMEZONE object to kolab-format at kolab.org so the ID can be included into the next revision of the mapping database.
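
The lookup order for a VTIMEZONE id might be sketched as follows. This is an illustration only: the function name `resolve_tzid` and the two-entry `TZID_MATRIX` are hypothetical stand-ins for the mapping database that is yet to be published, and Python's `zoneinfo` module with tz data installed is assumed.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical excerpt standing in for the canonical mapping database.
TZID_MATRIX = {
    "W. Europe Standard Time": "Europe/Berlin",
    "Pacific Standard Time": "America/Los_Angeles",
}


def resolve_tzid(tzid):
    """Return an Olson identifier for tzid, or None if it is unknown.

    A None result means the client falls back to rule-based detection
    and/or asks the user to choose a time zone for storage.
    """
    try:
        ZoneInfo(tzid)  # already a valid Olson identifier?
        return tzid
    except (ZoneInfoNotFoundError, ValueError, OSError):
        pass
    return TZID_MATRIX.get(tzid)
```

An id that is neither a valid Olson identifier nor present in the matrix yields `None`, which is the point at which the client would offer the user a choice.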
 
 == Description of the issue ==
 
-=== Ambiguities in relationship between UTC and local times ===
+=== Relationship between UTC and local times ===
+
+The functions to convert between UTC and local times are more complex than one might naively suspect.
 
-It is counterintuitive that UTC storage would involve ambiguity, but in regions with DST regimes it does.
+The reason for this lies in DST necessarily being implemented as a dynamic offset on UTC that is different from the offset on UTC that is standard time. This creates the well known effect that in one direction the hour 2am to 3am exists twice, in the other direction it is missing. But the effects of this ambiguity are not limited to that one hour which is typically placed between Saturday and Sunday. As a result, the relationship between UTC, which computers use, and local time, which people experience, is ambiguous if the DST rules are not known.
 
-The reason for this lies in the way in which DST is necessarily implemented as a dynamic offset on UTC that is different from the offset on UTC that is standard time. This creates the well known effect that in one direction the hour 2am to 3am exists twice, in the other direction it is missing. But the effects of this ambiguity are not limited to that one hour which is typically placed between Saturday and Sunday. As a result, the relationship between UTC, which computers use, and local time, which people experience, is '''ambiguous'''.
+{{note|Example: How UTC and local time are correlated|11:00 in Europe/Berlin translates to 09:00 UTC in the summer (DST is UTC+2) and 10:00 UTC in the winter (standard time is UTC+1). When person A stores an event for 11:00 during the summer, while person B stores an event for 11:00 during the winter, one event will store 09:00 UTC, the other will store 10:00 UTC. So two events for 11:00 in Europe/Berlin will have two '''different''' UTC times stored, based on when they were made.}}
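
The numbers in this example can be reproduced with a short script. It is a sketch assuming Python's `zoneinfo` module with tz data installed; the specific dates in 2011 are chosen here purely for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

# The same local wall-clock time, entered in summer and in winter.
summer = datetime(2011, 7, 1, 11, 0, tzinfo=berlin).astimezone(timezone.utc)
winter = datetime(2011, 1, 1, 11, 0, tzinfo=berlin).astimezone(timezone.utc)
print(summer.strftime("%H:%M"), winter.strftime("%H:%M"))  # 09:00 10:00

# A recurrence stored as 09:00 UTC in summer and repeated strictly in
# UTC lands at a different local time once standard time is in effect.
replayed = datetime(2011, 12, 1, 9, 0, tzinfo=timezone.utc).astimezone(berlin)
print(replayed.strftime("%H:%M"))  # 10:00
```

The second part shows why a UTC-only recurrence cannot keep an event at 11:00 local time all year.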
 
-{{note|Example: How UTC and local time are ambiguously correlated|11:00 in Europe/Berlin translates to 09:00 UTC in the summer (DST is UTC+2) and 10:00 UTC in the winter (standard time is UTC+1). When person A stores an event for 11:00 during the summer, while person B stores an event for 11:00 during the winter, one event will store 09:00 UTC, the other will store 10:00 UTC. So two events for 11:00 in Europe/Berlin will have two '''different''' UTC times stored, based on when they were made.}}
+The same mapping difficulty also exists in the other direction. An offset on UTC loosely correlates to a longitude, but on each longitude there are typically several countries with different DST regimes, switching at different times, in different directions (depending on the hemisphere), or not at all. So an offset to UTC does '''not''' correlate to time zones, it correlates to a group of time zones, with or without DST.
 
-The same ambiguity also exists in the other direction. An offset on UTC loosely correlated to a longitude, but on each longitude there is typically several countries with different DST regimes, switching at different times, in different directions (depending on the hemisphere), or not at all. So an offset to UTC does '''not''' correlate to time zones, it correlates to a group of time zones, with or without DST.
+The function to translate between UTC and local time therefore has more input parameters than just the time in UTC or local time.
 
-These ambiguities cannot be resolved without additional information.
+It is furthermore subject to change due to a variety of effects:
+
+* '''Time of switching between standard time / DST''': The dates of when a region switches from standard time to DST and when it switches back are set by a political process, and occasionally even changed on short notice, e.g. [http://www.timeanddate.com/news/time/turkey-starts-dst-2011.html Turkey] in 2011. 
+* '''The amount of switching''': Most regions switch by an hour between standard time and DST, but this is not a given. Currently only "[http://www.timeanddate.com/time/australia/lord-howe-island.html Australia/Lord_Howe]" switches by 30 minutes. There is no guarantee it will stay the only one.
+* '''The existence of DST switching''': Many regions routinely discuss getting rid of DST, and some countries might in fact do so, as demonstrated in the next point.
+* '''The UTC offset of standard time''': The offset to UTC for standard time is also not guaranteed to remain stable, e.g. [http://www.timeanddate.com/news/time/russia-may-end-dst.html Russia] has plans to abolish DST entirely, and switch standard time over to what was previously DST.
+* '''What's stable?''' The geography of the planet and some of its geographical markers are significantly more stable over time than time zone rules. Major cities are one example: they may change their names, but far less often their position.
 
 === Resolving the ambiguities between UTC and local time(s) ===
 
-Resolving the ambiguity always requires knowledge about the time zone, for which geographical identifiers are the most reliable and secure approach (also see below). But because DST rules are subject to change, a client can only have certainty about the translation between UTC and local time with some security for its current date and the past, and only if the client has been kept sufficiently up to date.
+Resolving the ambiguity always requires knowledge about the time zone, and geographical identifiers are the most reliable and robust way of identifying time zones (also see below). But because DST rules are subject to change, a client can translate between UTC and local time with confidence only for its current date and the past, and only if the client has been kept sufficiently up to date.
 
-In consequence, the probability of a correct translation reaches certainty only for some time in the past, is fairly high for the present, and then decreases quickly for the future.
+'''In consequence, the probability of a correct translation reaches certainty only for some time in the past, is very high for the present, and then decreases quickly for the future.'''
 
 === Avoiding implicit assumptions ===
 
-This lack of certainty translates into necessary assumptions when trying to store datetime fields in UTC. The assumption typically made is that DST rules in the future are not going to change from what they are today, and it is typically made implicitly by the client when using the system functions to convert the time entered by the user into UTC for storage, using the system's database, which for GNU/Linux is based on the Olson database.
+This lack of certainty translates into necessary assumptions when trying to store date time fields in UTC when local time is intended. The assumption typically made is that future DST rules will not change from what they are today, since the client cannot know about future changes anyway. It is typically made implicitly, when the client uses the system functions to convert the time entered by the user into UTC for storage. These functions rely on the system's time zone database, which for GNU/Linux is based on the Olson database.
 
-But because a client cannot know whether DST rules are going to change between the time an event is scheduled and the date for which it is scheduled, this conversion is based on an assumption, which can be proven wrong later by socio-political changes.
+Because a client cannot know whether DST rules are going to change between the time an event is scheduled and the date for which it is scheduled, this conversion is based on an assumption, which can be proven wrong later by socio-political changes (see above).
 
-Furthermore, when this event is displayed later, there is no way for the client to know whether that initial assumption has held true unless of course the writing client also stored time zone information and the assumption on DST that was used to produce the UTC value. But even then it may be impossible for the client to correctly display what the user requested, because standard time zones are also subject to change.
+Furthermore, when this event is displayed later, there is no way for the client to know whether that initial assumption held true.
 
-Because the original selection of the user is lost in the conversion to UTC, there is ultimately no secure way to unambiguously retrieve the initial user's intent unless it is stored directly, as the local time entered by the user.
+Because the original selection of the user is otherwise lost in the conversion to UTC, or can only be preserved with a substantial amount of metadata, the easiest way to preserve the user's original intent is to store it directly, as the local time entered by the user.
 
 == Current client behaviour ==
 
@@ -154,9 +228,7 @@ Then the following can occur:
 :# One year later, DST rules get changed, and propagated through the typical channels to all platforms.
 :# Two years later, the client looks up the event, knows that DST is in effect, and correctly translates 10:00 UTC to 12:00 local time in Europe/Berlin.
 
-So in effect, the event which was set for 11:00 Europe/Berlin is now incorrectly displayed at 12:00 Europe/Berlin due to the time zone change. So correct behaviour on all sides can lead to incorrect results due to this ambiguity between UTC and local time. The other way to invoke this scenario would be to store an event with an older client on a platform with outdated DST rules and read the same event with a current client with current DST rules.
-
-Only by adding one more bit of information could a client correctly resolve the ambiguity. This bit of information would have to be one of 'Did the client that wrote this believe that DST is in effect at this point in time? And was the client correct to believe so?' -- OR -- '''all''' UTC values need to be stored unambiguously, so against one of standard time or DST, and all calculations occur from the stable base line.
+In effect, the event which was set for 11:00 Europe/Berlin is now incorrectly displayed at 12:00 Europe/Berlin due to the time zone change. Correct behaviour on all sides can thus lead to incorrect results due to this ambiguity between UTC and local time.
 
 === Recurring Events ===
 
@@ -164,51 +236,41 @@ This problem also exists in recurring events, and affects them more often as the
 
 Existing clients currently make the implicit assumption that the time was specified in and should be calculated against the local time zone of the client itself. This will lead to issues when a user is changing time zones, or when participants in multiple time zones are concerned. This behaviour could be confirmed with both Kontact and the Kolab Web Client Horde.
 
-A weekly meeting is set for 11:00 every Wednesday in Zurich, Switzerland, starting on 23 June 2010. This gets translated stored in UTC as 2010-06-23T09:00:00Z. On Wednesday 17 November 2010 Switzerland has switched out of DST, the local timezone is therefore UTC+1. If correctly interpreting the stored information, the meeting should now start at 10:00. At 09:50 the KDE Reminder Daemon correctly informs the user that the conference call is about to start in 10 minutes.
+A weekly meeting is set for 11:00 every Wednesday in Zurich, Switzerland, starting on 23 June 2010. This gets translated and stored in UTC as 2010-02-17T09:00:00Z. On Wednesday 17 February 2010 Switzerland is using standard time; the local time zone is therefore UTC+1. If strictly interpreting the stored information, the meeting should now start at 10:00.
 
-KDE Kontact however incorrectly displays the meeting as scheduled for 11:00. The same is true for the Kolab web client based on Horde for all versions of Kolab <= 2.2.4. This however is equivalent to 10:00 UTC. When adding another user in Sao Paulo, Brazil to the equation, the event is shown as taking place at 06:00 local time, or 08:00 UTC, due to the Brazilian summer time with an offset of UTC-3 that went into the assumption for the calculation of the recurrence. The result is that two users, while being presented with a data set that looks consistent, will miss each other by two hours.
+Versions of KDE Kontact <=4.6.3 however display the meeting as scheduled for 11:00. The same is true for the Kolab web client based on Horde for versions of Kolab <= 2.2.4. This however is equivalent to 10:00 UTC. When adding another user in Sao Paulo, Brazil to the equation, the event is shown as taking place at 06:00 local time, or 08:00 UTC, due to the Brazilian summer time with an offset of UTC-3 that went into the assumption for the calculation of the recurrence. The result is that two users, while being presented with a data set that looks consistent, will miss each other.
 
-Which other clients exhibit the same behaviour is unclear, but it seems there is no reasonable assumption that current behaviour correctly models any rational use case.
+Which other clients exhibit the same behavior is unclear, but it seems there is no reasonable assumption that current behavior correctly models any rational use case.
 
 == Background notes on design decisions and backwards compatibility ==
 
-This section provides summaries of discussions around this KEP and the rationale that went into the design decisions above. It is primarily intended for the purpose of documenting at least part of the thought process and can be safely ignored by client implementors.
+This section provides summaries of discussions around this KEP and the rationale that went into the design decisions above. It is primarily intended for the purpose of documenting at least part of the thought process and can be safely ignored by client implementers. As a general note, this is a complex issue with no silver bullet and no single solution so clearly better than all the others that everyone would have to agree. The problem can be resolved in several ways, each with its own advantages and disadvantages, and ultimately a judgment call had to be made.
 
 === UTC vs local time ===
 
-As demonstrated, storage in UTC without additional information is an oversimplification and does not solve the complex issues provided by documented existing use cases. For UTC storage to be able to address the situation, the bidirectional ambiguities between UTC and local time(s) need to be resolved. To achieve this, clients would have to store and interpret additional information besides datetime and time zone to resolve this ambiguity and re-calculate the original user's intent.
-
-This information would have to include in particular whether the stored time was calculated against standard time or DST. When displaying events, clients could then check (a) whether the writing client was correct in its assumption of DST for this date, and adjust accordingly if necessary, and (b) adjust the timing for recurring events to the differences in UTC offset for standard time and UTC. This calculation would have to be undertaken on each calendar item for most operations before times could be relied upon and used for further calculation, such as free/busy listing and so on. So while possible, it would provide substantial additional logic, which due to the complexity of the issue is prone to errors.
+As demonstrated, storage in UTC without additional information is an oversimplification and does not solve the complex issues posed by documented existing use cases. For UTC storage to be able to address the situation, the bidirectional ambiguities between UTC and local time(s) need to be resolved. To achieve this, clients would have to store and interpret additional information besides date time and time zone to resolve this ambiguity and re-calculate the original user's intent.
 
-Alternatively, clients could directly store the user's intent by storing local time when local time should be authoritative, or UTC, when UTC should be authoritative for the timing of the event.
+To address all the known issues, this information would have to include the entire time zone information. To address the most common issues, it would at least have to include the assumption made regarding whether or not DST would be in effect. Checking that stored data against the known rules at the time of interpretation would have to be undertaken on many objects and virtually all calendar items for most operations before times could be relied upon and used for further calculation, such as free/busy listing and so on. So while possible, it would require additional logic, which due to the complexity of the issue is prone to errors.
 
-This principle of '''store what you mean''' directly provides the information that otherwise would have to be carefully reconstructed from UTC storage through resolving the various ambiguities before being able to make use of stored information.
+Ultimately this approach would lead towards storing time zone data in the objects themselves in order to achieve consistency and security to preserve the intent of the user. This brings several pitfalls because time zone data is subject to change, in particular:
 
-The downside is that in this case there will always have to be calculation from local time to UTC for operations that translate between time zones or compare datetime in different time zones. But this would be performed on the most up to date version of the database available to the client, thus having a higher reliability than if this calculation had been done on storing the event, and is a standard operation that applications and operating systems routinely implement. So clients could largely fall back on re-using existing and well-tested solutions.
+* Clients may not be allowed to update, e.g. for shared calendars with only read permission;
+* Clients that "know" data to be outdated but cannot update the object may in fact prefer to present the user with a correct event rather than consistently displaying a wrong one. This behavior has been observed in iCalendar clients, which store both a TZID and in-format data;
+* There are no good indicators when in-format data should be updated. Without these a client with the old timezone data would feel the urge to update the object again, in effect rolling the update back;
+* Periodic checks for updates would have to happen often if clients are required to do the updates. This will complicate objects and require those clients to have access to other, authoritative sources;
+* Substantial size would be added to the event; VTIMEZONE definitions can often be larger than the data for the event itself. If more than one time zone is used, each time zone has to be within the object. As many appointments will be in the same time zone, the data within a folder or a Kolab account and server will be highly redundant;
+* Once a client notices the need for an update, many objects would be affected, which would require a lot of data transfer and backup space to update the redundant data. Backup space is expensive;
+* A client could only notify the user if it encounters an object in need of a timezone update when the folder is read-only. Again this could be many objects.
 
-There is also the additional advantage that in many cases it will be sufficient for clients to work with local time. While for UTC storage this would have to be reconstructed as described above, storage in local time immediately allows to interpret and work with the data, and simpler clients have a better chance of displaying to the user what the user actually expects to see.
+But the most substantial argument is the problem of finding a good algorithm that will determine when an update of the time zone data stored in the objects should be done. As there is currently no serial number of authoritative data attached to this information, such a mechanism would have to be invented and would further complicate implementation of this KEP.
 
-Thus storage of local time was understood to be the least complex solution to a complex issue.
+Therefore the principle of storing local time (understanding UTC as one possible time zone) along the lines of '''store what you mean''' has come out as the preferred option for most participants in the discussions, and all participants could accept this approach. 
 
 === Usage of the Olson database ===
 
-The fundamental problem of DST handling is one where technical systems need to conform to human policy making updates. So all solutions are likely to bring their own imperfections. There were three suggestions under discussion.
-
-One was to store DST data at the time the event was created in the events themselves. This gives all clients the same information to work with, which is good. This is also the path taken by {{rfc|2445}} <ref name="rfc2445">{{rfc|2445|title=Internet Calendaring and Scheduling Core Object Specification (iCalendar)}}</ref> with the VTIMEZONE component, which allows to statically encode DST rules. Such static encoding does however ensure that when DST rules change, all clients will display the event wrong.
-
-This is why reportedly various clients using iCalendar ignore the encoded static DST rules and instead rely upon the mandatory tzid field, often using the Olson database for lookup of the current DST rules, as that is also specifically mentioned in the VTIMEZONE section of {{rfc|2445}} <ref name="rfc2445">{{rfc|2445|title=Internet Calendaring and Scheduling Core Object Specification (iCalendar)}}</ref> as a good reference base for globally canonical time zone IDs.
-
-Usage of a system database brings the advantage that such updates in policy can be dynamically addressed and factored into the calculation without any need for user interaction. The downside of this would be that a client with an outdated database might display the time incorrectly if one of the relevant time zones was affected by an update that is not yet locally available, e.g. the client operating system has not been maintained for six months or more.
-
-And finally there was the possibility to combine static encoding with database, somewhat similar to what VTIMEZONE allows, use static encoding for calculation, but allow clients to update under certain conditions and in certain ways. After some discussion, it was agreed this was likely not easy to implement in a robust way that avoids editing wars between clients, takes into account the issues of ACLs and permissions and some other issues. Most importantly however, such an update could never be guaranteed to reach all clients because they might be located on different Kolab instances, or different groupware solutions altogether. For reasons of complexity and lack of advantage over the pure database solution, this approach was dropped.
-
-Between the static and the database approach, the decision ultimately became one of cost/benefit analysis with an eye on user expectation: While in static encoding there is a guarantee that all participants using Kolab will have one consistent time, that time is also ''wrong''. This might be considered a minor issue by some who prefer consistency over correctness, but that consistency gets lost as soon as the assumption of 'everyone is using Kolab' is taken out of the equation. What we then are left with is a situation where all users of Kolab consistently miss the meeting, whereas most users of other groupware solutions make it to the meeting at the correct point in time.
+Because of the disadvantages for in-format storage of time zone data, the preference of the majority was to store only a reference as geographical identifier, which was understood to be the most robust form of storing time zone references that can adapt to virtually all possible changes in time zones. When going down this route, agreeing on a limited set of time zone identifiers that is nonetheless complete makes it substantially easier on client implementers.
 
-The database approach on the other hand provides the chance that at least ''some'' users will obtain the correct result, whereas the users with outdated databases are going to find themselves with the same result that static encoding would have provided. The determining factor for the quality of results involving the database thus becomes its correctness.
-
-For this we expect major progress from RFC draft Timezone Service Protocol<ref name="douglass">Douglass, http://tools.ietf.org/html/draft-douglass-timezone-service-00</ref>. Once it has matured further and sees implementation, it is likely to provide accurate data with mapping of aliases to DST data without local databases, thus resolving the primary weakness of the database approach, and for its spread today will likely support the Olson timezone database aliases.
-
-Usage of a database is also inevitable to keep the allowed time zone selections synchronized between clients, ensuring a consistent user experience. While there is no officially recognized database that all applications refer to, the Olson database comes closest to being such a database and is reportedly (see <ref>[[wikipedia:Zoneinfo | Wikipedia: Zoneinfo]]</ref>) the existing default for
+The Olson database is specifically recommended in the VTIMEZONE section of {{rfc|2445}} <ref name="rfc2445">{{rfc|2445|title=Internet Calendaring and Scheduling Core Object Specification (iCalendar)}}</ref> as a good reference base for globally canonical time zone IDs and to our knowledge comes closest to an actual global standard for such time zone IDs. The Olson time zone data is also reportedly the most widely used data source (see <ref>[[wikipedia:Zoneinfo | Wikipedia: Zoneinfo]]</ref>), in particular by:
 
 * BSD-derived systems, including FreeBSD, NetBSD, OpenBSD, DragonFly BSD, and Mac OS X;
 * the GNU C Library and systems that use it, including GNU, most Linux distributions, BeOS, Haiku, Nexenta OS, and Cygwin;
@@ -223,10 +285,28 @@ Usage of a database is also inevitable to keep the allowed time zone selections
 * several other Unix systems, including Tru64, and UNICOS/mp (also IRIX, still maintained but no longer shipped).
 * Python via pytz module
 
-Because the Olson timezone IDs are also used by the Unicode Common Locale Data Repository (CLDR) as well as the International Components for Unicode (ICU). The CLDR also provides mapping for Microsoft Windows time zone IDs to the standard Olson names. So using references to the Olson timezone database is likely the best out of an imperfect set of choices.
+Choosing the Olson time zone identifiers will simplify matters for all clients on the above platforms or using the above languages.
+
+The issue of most concern is that Microsoft Windows has its own time zone identification system. A bridge is however provided by the Unicode Common Locale Data Repository (CLDR) as well as the International Components for Unicode (ICU). The CLDR also provides mapping for Microsoft Windows time zone IDs to the standard Olson names. So using references to the Olson timezone database is likely the best out of an imperfect set of choices.
 
 In order to support clients on non-Olson platforms, as well as all clients in their iTIP <ref name="rfc2446">{{rfc|2446|title=iCalendar Transport-Independent Interoperability Protocol (iTIP)}}</ref> handling, Kolab Systems shall work with all client implementors to maintain and continue to make freely and publicly available a database to match various time zone ids to Olson database locations.
 
+=== iTIP / VTIMEZONE completeness ===
+
+By choosing the approach of reference to Olson time zone IDs, Kolab clients will not be able to easily implement a rarely used aspect of iTIP invitations.
+
+This is because iTIP requires a client to parse multiple arbitrary time zone definitions and their VTIMEZONE data as part of an iTIP invitation, and handle them correctly. If a Kolab client receives an invitation with an unknown time zone identifier that cannot be mapped to an Olson time zone ID, the client may not be capable of handling the invitation at all. We expect to handle all Olson time zone IDs, the recommended source of IDs for VTIMEZONE (see above), and all common Windows time zone identifiers correctly in the first implementation of this KEP. Despite our research into the matter, we could not find a use case for fictional time zones made up as part of an iTIP invitation.
+
+We also discovered that other clients do not seem to implement iTIP completely, e.g. Microsoft Exchange/Outlook only supports one VTIMEZONE object.<ref name="outlook"> "[MS-OXCICAL]: iCalendar to Appointment Object Conversion Protocol Specification" V 5.0 of 18 March 2011, http://msdn.microsoft.com/en-us/library/cc463911(EXCHG.80).aspx, Footnote 44 in Section 2.1.3.1.1.19 & Footnote 46 in Section 2.1.3.1.1.19</ref>
+
+So full iTIP compliance seems rare, if it exists at all, and iCalendar & iTIP both suffer from the weaknesses of in-format storage, as well as a lack of update rules and possibilities, which leads to inconsistencies in the way the data is handled between clients when the client knows the in-format data to be outdated. By relying strictly on time zone identifiers and references only, translating where necessary, we expect Kolab to provide a more consistent and robust user experience that comes closer to a user's expectations than other iTIP implementations which stick to VTIMEZONE storage.
+
+In any case the trade-off seemed an acceptable design decision to make.
+
+=== Future perspectives ===
+
+The design decisions outlined above are also considered a good match for the future because we expect major progress from the draft RFC Timezone Service Protocol<ref name="douglass">Douglass, http://tools.ietf.org/html/draft-douglass-timezone-service-00</ref>. Once it has matured further and sees implementation, it is likely to provide accurate data with mapping of aliases to DST data without local databases, thus resolving the primary weakness of the database approach. Given how widespread they are today, it will likely support the Olson time zone database aliases.
+
 == References ==
 
 {{Reflist}}
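
The '''store what you mean''' principle from the "UTC vs local time" section can be illustrated with a minimal sketch. It is not part of the commit above and is not prescribed by the KEP. It assumes Python 3.9+ with the standard `zoneinfo` module and an installed Olson (tz) database. The helper names `local_to_utc` and `utc_to_local` are hypothetical. The sketch uses the Zurich dates from the recurring-event example: a wall-clock time stored together with an Olson ID stays at 11:00 all year, whereas a fixed UTC time of day drifts to 10:00 local time in winter.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; reads the system tz (Olson) database


def local_to_utc(wall_time: datetime, tzid: str) -> datetime:
    """Interpret a naive wall-clock time in the zone named by tzid; return UTC."""
    return wall_time.replace(tzinfo=ZoneInfo(tzid)).astimezone(timezone.utc)


def utc_to_local(utc_time: datetime, tzid: str) -> datetime:
    """Render an aware UTC time as wall-clock time in the zone named by tzid."""
    return utc_time.astimezone(ZoneInfo(tzid))


# Stored as "local time + Olson ID": the meeting is at 11:00 local time all year,
# and the UTC instant is derived at read time from the current rules.
summer_utc = local_to_utc(datetime(2010, 6, 23, 11, 0), "Europe/Zurich")
winter_utc = local_to_utc(datetime(2010, 2, 17, 11, 0), "Europe/Zurich")

# Stored as a fixed UTC time of day (09:00Z): the local time drifts in winter.
drifted = utc_to_local(
    datetime(2010, 2, 17, 9, 0, tzinfo=timezone.utc), "Europe/Zurich"
)

print(summer_utc.isoformat())     # 2010-06-23T09:00:00+00:00
print(winter_utc.isoformat())     # 2010-02-17T10:00:00+00:00
print(drifted.strftime("%H:%M"))  # 10:00
```

Because the conversion happens when the event is read rather than when it is written, it uses whatever rules the reading client's database holds at that time, which is the trade-off the text describes.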




