Storing UTC? There *is* a good case for local time!

Georg C. F. Greve greve at kolabsys.com
Mon Dec 20 20:05:21 CET 2010


Hi all,

I am sure most of you know the following quote:

 	"For every complex problem, there is a solution that is 
		simple, neat, and wrong."
									-- H. L. Mencken

I am afraid this may apply, to some extent, to the idea of reducing complexity 
by storing everything in UTC. Allow me to try to explain the situation in a new 
way to help everyone get up to speed.


Point #1: There is *ambiguity* in UTC

This is highly counterintuitive, and it does not apply to zones without 
Daylight Saving Time (DST). But for zones where a DST regime is in effect, the 
mapping between local time and UTC is ambiguous: the same local time translates 
to two different UTC times depending on whether DST or standard time is in 
effect. This is a direct and inevitable result of DST being implemented as a 
dynamic offset to standard time.

So 11:00 in Berlin is either 10:00 UTC (winter) or 09:00 UTC (summer).

As a result, there is no way to restore the intended local time - which is 
typically what the user cares about - from a date stored in UTC unless we know 
whether this was calculated from standard time or DST.
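To make the ambiguity concrete, here is a minimal Python sketch (using the 
standard zoneinfo module, available from Python 3.9); the two dates are just 
arbitrary examples of a winter day and a summer day:

  from datetime import datetime, timezone
  from zoneinfo import ZoneInfo

  berlin = ZoneInfo("Europe/Berlin")

  # The same wall-clock time, 11:00 in Berlin, on a winter and a summer day.
  winter = datetime(2010, 12, 20, 11, 0, tzinfo=berlin)   # standard time (CET)
  summer = datetime(2010, 6, 20, 11, 0, tzinfo=berlin)    # DST (CEST)

  print(winter.astimezone(timezone.utc))   # 2010-12-20 10:00:00+00:00
  print(summer.astimezone(timezone.utc))   # 2010-06-20 09:00:00+00:00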


Point #2: Clients cannot foretell the future

Our certainty about whether standard time or DST is in effect on any given 
date is almost 100% for the current date (today). It extends into the past, 
limited only by how up to date the database is that we use for making that 
call.

Into the future, that certainty dissolves rapidly, to the point where we are 
fairly unsure about the answer, not least because DST decisions are political 
and thus depend on factors we cannot predict.


Point #3: Assumptions for pure UTC storage to work well

Taking #1 and #2 into account, UTC storage works well when our certainty 
about the rules for converting local time to UTC approaches 1, and when those 
rules are unlikely to change between the time the value was stored and the 
time it will be used.

This is true for all fields that store times on the current day or in the past, 
so the assumption holds true for creation date, modification date and such.

The assumption is guaranteed to be wrong for recurring events that recur long 
enough to span at least one shift from standard time to DST or vice versa.

This is what initially started the KEP 2 process.
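To illustrate the recurring case, here is another minimal Python sketch; the 
weekly 11:00 meeting in Berlin is a made-up example, and the recurrence is 
expanded naively by adding whole weeks to the stored UTC instant, as a pure 
UTC store would:

  from datetime import datetime, timedelta, timezone
  from zoneinfo import ZoneInfo

  berlin = ZoneInfo("Europe/Berlin")

  # First occurrence: 11:00 local in Berlin, stored as a fixed UTC instant.
  first_utc = datetime(2011, 3, 17, 11, 0, tzinfo=berlin).astimezone(timezone.utc)

  # Recur weekly by adding seven days in UTC.
  for week in range(3):
      occurrence = first_utc + timedelta(weeks=week)
      print(occurrence.astimezone(berlin).strftime("%Y-%m-%d %H:%M %Z"))

  # 2011-03-17 11:00 CET
  # 2011-03-24 11:00 CET
  # 2011-03-31 12:00 CEST   <- after the DST switch the meeting drifts to 12:00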


Point #4: Most calendaring is about the future

The assumption under #3 also becomes increasingly likely to be wrong as events 
are scheduled further into the future.

Maybe an example helps to illustrate this.

This is what clients do today:

A user stores an event for 11:00 in Berlin on March 20th 2014.

According to what we know today, standard time will be in effect, so the 
client looks this up in the system's timezone database and consequently 
translates the time to 10:00 UTC.

In 2012, due to some strange occurrence that none of us sees coming, the 
German parliament changes the DST boundaries for 2014 and later, so that the 
switch to DST now happens in February.

When the scheduled date arrives, the client faithfully translates 10:00 UTC to 
12:00 local time in Berlin, because DST is now in effect, and the user misses 
their important, long-scheduled meeting.
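Since no real timezone database contains this hypothetical rule change, the 
following Python sketch simulates it with hard-coded offsets; both offsets and 
the 2012 change itself are purely illustrative:

  from datetime import datetime, timedelta

  # 2010: the client stores "11:00 Berlin, 2014-03-20". Under the rules known
  # today, standard time (UTC+1) applies, so it writes 10:00 UTC.
  assumed_offset = timedelta(hours=1)             # CET, per the rules of 2010
  stored_utc = datetime(2014, 3, 20, 11, 0) - assumed_offset

  # 2014: the (hypothetical) new rules say DST (UTC+2) already applies in March.
  actual_offset = timedelta(hours=2)              # CEST under the changed rules
  displayed_local = stored_utc + actual_offset

  print(stored_utc)        # 2014-03-20 10:00:00  (UTC, as stored)
  print(displayed_local)   # 2014-03-20 12:00:00  (local time shown to the user)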

The assumption I'm talking about is the storing client's initial assumption 
that standard time would be in effect on March 20th 2014. It just has no way 
of knowing that, or more importantly: it has no way of knowing what future 
events might change that.

So the further an event is stored into the future, the higher the chance the 
assumption is wrong. Naturally the same would also hold true for systems that 
use outdated databases today for events scheduled next year.

The reason we haven't seen complaints about this yet is likely that people 
rarely schedule things far enough into the future for this to become an issue, 
and that most systems already have sufficiently up-to-date DST databases.

It does, however, show the fundamental flaw of UTC-based storage: the 
ambiguity between local time and UTC, which requires several assumptions or 
additional information to resolve reliably.


POSSIBLE SOLUTIONS:

One way to address this would be to allow storing local time.

This way a client could store the user's *intent* and leave it to future 
clients to sort out the exact timing once they can in fact be reasonably sure 
of the relationship between local time and UTC.
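As a rough sketch of what that could look like (the field names and the 
resolve step are purely illustrative, not a proposed format):

  from datetime import datetime, timezone
  from zoneinfo import ZoneInfo

  event = {
      "local_time": "2014-03-20T11:00:00",  # the user's intent, as wall-clock time
      "tz": "Europe/Berlin",                # the context for interpreting it
  }

  def resolve_to_utc(event):
      """Derive the UTC instant only when needed, using the rules current then."""
      naive = datetime.fromisoformat(event["local_time"])
      return naive.replace(tzinfo=ZoneInfo(event["tz"])).astimezone(timezone.utc)

  print(resolve_to_utc(event))   # 2014-03-20 10:00:00+00:00 under today's rules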

The other way would be to store UTC, but also store the assumptions made.

So in addition to the tz attribute, we'd have a DST attribute that describes 
whether or not the storing client thought DST would be in effect at that 
particular time, so that later clients can test the assumption and adjust the 
stored time where they can now see that the initial client was wrong.
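A rough sketch of how a reading client might use such an attribute (again, the 
field names and the one-hour DST delta are illustrative assumptions, not a 
proposed format):

  from datetime import datetime, timedelta, timezone
  from zoneinfo import ZoneInfo

  event = {
      "utc": datetime(2014, 3, 20, 10, 0, tzinfo=timezone.utc),  # as stored
      "tz": "Europe/Berlin",
      "dst": False,                          # the writer assumed standard time
  }

  def corrected_utc(event):
      local = event["utc"].astimezone(ZoneInfo(event["tz"]))
      dst_now = local.dst() != timedelta(0)  # DST in effect per current rules?
      if dst_now == event["dst"]:
          return event["utc"]                # the assumption still holds
      # The assumption was wrong: shift by the DST delta (one hour assumed
      # when the zone is currently on standard time at that instant).
      delta = local.dst() if dst_now else timedelta(hours=1)
      return event["utc"] - delta if dst_now else event["utc"] + delta

  print(corrected_utc(event))   # unchanged today; shifted if the rules diverge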

Both would allow us to model all the issues I've seen so far.

UTC storage alone, and even UTC storage plus timezone, are oversimplifications 
of a fairly complex problem, so we'll need to introduce one more degree of 
complexity.

The question is: Which one?

Thoughts, opinions and proposals very much welcome.

Best regards,
Georg


-- 
Georg C. F. Greve
Chief Executive Officer

Kolab Systems AG
Zürich, Switzerland

e: greve at kolabsys.com
t: +41 78 904 43 33
w: http://kolabsys.com

pgp: 86574ACA Georg C. F. Greve