Time Zones in Python

Python datetimes are naïve by default, in that they do not include time zone (or time offset) information. E.g. one might be surprised to find that (datetime.now() - datetime.utcnow()).total_seconds() is basically the local time offset (28800 in my case for UTC+08:00). I personally kind of expected a value near zero. This said, datetime is able to handle time zones, but the definitions of time zones are not included in the Python standard library. A third-party library is necessary for handling time zones. In our project, a developer introduced pytz in the beginning. It all looked well, until I found the following:

>>> from datetime import datetime
>>> from pytz import timezone
>>> timezone('Asia/Shanghai')
<DstTzInfo 'Asia/Shanghai' LMT+8:06:00 STD>
>>> (datetime(2017, 6, 1, tzinfo=timezone('Asia/Shanghai'))
...  - datetime(2017, 6, 1, tzinfo=timezone('UTC'))
... ).total_seconds()

Sh*t! Was pytz a joke? The time zone of Shanghai (or China) should be UTC+08:00, and I did not care a bit about its local mean time (I was, of course. expecting -28800 on the last line). What was the author thinking about? Besides, it did not provide a local time zone function, and we had to hardcode our time zone to 'Asia/Shanghai', which was ugly.—Disappointed, I searched for an alternative, and I found dateutil.tz. From then on, I routinely use code like the following:

from datetime import datetime
from dateutil.tz import tzlocal, tzutc
datetime.now(tzlocal())  # for local time
datetime.now(tzutc())    # for UTC time

When answering a StackOverflow question, I realized I misunderstood pytz. I still thought it had some bad design decisions; however, it would have been able to achieve everything I needed, if I had read its manual carefully (I cannot help remembering the famous acronym ‘RTFM’). It was explicitly mentioned in the manual that passing a pytz time zone to the datetime constructor (as I did above) ‘“does not work” with pytz for many timezones’. One has to use the pytz localize method or the standard astimezone method of datetime.

As tzlocal and tzutc from dateutil.tz fulfilled all my needs and were easy to use, I continued to use them. The fact that I got a few downvotes on StackOverflow certainly did not make me like pytz better.

When introducing apscheduler to our project, we noticed that it required that the time zone be provided by pytz—it ruled out the use of dateutil.tz. I wondered what was special about it. I also became aware of a Python package called tzlocal, which was able to provide a pytz time zone conforming to the local system settings. More searches and reading revealed facts that I had missed so far:

  • The Python datetime object does not store or handle daylight-saving status. Adding a timedelta to it does not alter its time zone information, and can result in an invalid local time (say, adding one day to the last day of daylight-saving time does not result in a datetime in standard time).
  • The time zone provided by dateutil.tz does not handle all corner cases. E.g. it does not know that Russia observed all-year daylight-saving time from 2012 to 2014, and it does not know that China observed daylight-saving time from 1986 to 1991.
  • The pytz localize and normalize methods can handle all these complexities, and this is partly the reason why pytz requires people to use its localize method instead of passing the time zone to datetime.

So pytz can actually do more, and correctly. I can do things like finding out in which years China observed daylight-saving time:

from datetime import datetime
from pytz import timezone
china = timezone('Asia/Shanghai')
utc = timezone('UTC')
expect_diff = timedelta(hours=8)
for year in range(1980, 2000):
    dt = datetime(year, 6, 1)
    if utc.localize(dt) - china.localize(dt) != expect_diff:

It is now clear to me that the pytz-style time zone is necessary when apscheduler handles a past or future local time.

A few benchmarks regarding the related functions in ipython (not that they are very important):

from datetime import datetime
import dateutil.tz
import pytz
import tzlocal
dateutil_utc = dateutil.tz.tzutc()
dateutil_local = dateutil.tz.tzlocal()
pytz_utc = pytz.utc
pytz_local = tzlocal.get_localzone()
%timeit datetime.utcnow()
310 ns ± 0.405 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit datetime.now()
745 ns ± 1.65 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit datetime.now(dateutil_utc)
924 ns ± 0.907 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit datetime.now(pytz_utc)
2.28 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit datetime.now(dateutil_local)
17.4 µs ± 29.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit datetime.now(pytz_local)
5.54 µs ± 11.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

My final recommendations:

  • One should consider using naïve UTC everywhere, as they are easy and fast to work with.
  • The next best is using offset-aware UTC. Both dateutil.tz and pytz can be used in this case without any problems.
  • In all other cases, pytz (as well as tzlocal) is preferred, but one should beware of the peculiar behaviour of pytz time zones.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s