Working with time in Python

How is time measured?

Time has traditionally been measured from the Earth's rotation and astronomical observations, and more recently with atomic clocks. Several standards specify exactly how this is done: GMT, UT, TAI, UTC. In this post we will focus on UTC because of its widespread use in computing.

What time is it now?

The answer to this question will likely be different for people who live in different countries. Geographical areas have their own time zones, and local time is derived from the zone. There are almost as many time zones as there are countries. There is also a global time (or universal time), called UTC, which is the same no matter where you are. Each time zone is expressed as an offset of a certain number of hours and minutes from UTC, called the UTC offset. For example, Eastern Standard Time corresponds to UTC-05:00. If we know the UTC time, getting the local time for a given time zone is easy: we add the zone's offset to the UTC time (for a negative offset this means subtracting). To convert local time in one country to local time in another, we first convert to UTC by removing the offset of the first time zone and then apply the offset of the second time zone.
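The two-step conversion described above can be sketched with the standard library alone. This is a minimal illustration, assuming two fixed offsets (UTC-05:00 and UTC+07:00); the helper names `to_utc` and `from_utc` are made up for this example, and in practice `astimezone` does the same job.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for illustration; real zones can change offset over time.
EST = timezone(timedelta(hours=-5), "EST")  # UTC-05:00
ICT = timezone(timedelta(hours=7), "ICT")   # UTC+07:00


def to_utc(local: datetime) -> datetime:
    """Remove the zone's offset from the wall-clock time to obtain UTC."""
    wall_clock = local.replace(tzinfo=None)
    return (wall_clock - local.utcoffset()).replace(tzinfo=timezone.utc)


def from_utc(utc: datetime, zone: timezone) -> datetime:
    """Apply the target zone's offset to a UTC time."""
    wall_clock = utc.replace(tzinfo=None)
    return (wall_clock + zone.utcoffset(None)).replace(tzinfo=zone)


local_est = datetime(2018, 1, 30, 10, 0, tzinfo=EST)
as_utc = to_utc(local_est)         # 15:00 UTC
local_ict = from_utc(as_utc, ICT)  # 22:00 at UTC+07:00

print(as_utc.isoformat())
print(local_ict.isoformat())
```

The built-in `local_est.astimezone(ICT)` gives the same result; the manual version is spelled out here only to make the arithmetic visible.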

Daylight Saving Time

The correspondence between a time zone and a UTC offset is not permanent. Certain countries change their clocks (usually by one hour) in spring and autumn; this practice is called Daylight Saving Time, or DST. For example, the Pacific Time Zone normally corresponds to UTC-08:00, but while DST is in effect the offset is UTC-07:00.
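To see what the changing offset means in practice, here is a small sketch that models the two Pacific offsets as separate fixed-offset objects. The dates and the labels `PST` and `PDT` are assumptions made for illustration; the standard `timezone` class does not know when the switch happens.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")  # standard time, UTC-08:00
PDT = timezone(timedelta(hours=-7), "PDT")  # DST in effect, UTC-07:00

# The same wall-clock time (noon) in winter and in summer.
winter_noon = datetime(2018, 1, 15, 12, 0, tzinfo=PST)
summer_noon = datetime(2018, 7, 15, 12, 0, tzinfo=PDT)

winter_utc = winter_noon.astimezone(timezone.utc)
summer_utc = summer_noon.astimezone(timezone.utc)

print(winter_utc.time(), summer_utc.time())  # 20:00:00 19:00:00
```

Noon local time lands on a different UTC hour depending on the season, which is why the offset alone does not describe a time zone.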

Local time and UTC translation caveats

How much information do we need to know exactly which point in time we are talking about? To pin down a local time precisely, we need three components: date, time and time zone name. If we omit the time zone name, the date and time become ambiguous as soon as more than one region is involved, because the same date and time then refer to several distinct points in time, one per region.
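A quick sketch of that ambiguity: the same date and time, read in three regions, gives three different instants. The region names and their offsets below are hypothetical and fixed for simplicity.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2018, 1, 30, 15, 30)  # no time zone attached

# Hypothetical regions, each with an assumed fixed UTC offset.
offsets = {
    "region_a": timedelta(hours=-5),
    "region_b": timedelta(hours=2),
    "region_c": timedelta(hours=7),
}

# Interpret the same wall-clock reading in each region, then express it in UTC.
instants = {
    name: naive.replace(tzinfo=timezone(offset)).astimezone(timezone.utc)
    for name, offset in offsets.items()
}

for name, instant in instants.items():
    print(name, instant.isoformat())
```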

Referring to a single point in time is easier when we use UTC, which is the same across the globe. A date and time with a UTC offset is effectively still UTC, because it is trivial to remove the offset and get pure UTC. However, there is one caveat: we lack one critical piece of information, the time zone name. We are still referring to a single point in time, but we can’t reliably translate it back to local time if we don’t know the time zone name. The problem is that a time zone might have one UTC offset in summer and a different one in winter, because of DST or political decisions.

It gets more complicated when we do date and time arithmetic. Let’s say we have this date: 2018-01-30 15:30 UTC+02:00. If we subtract 6 months from it, we get 2017-07-30 15:30 UTC+02:00. At first glance everything seems correct, but the second date falls in a period when DST was active, which means that 6 months earlier the time zone corresponded to UTC+03:00, while our result still carries UTC+02:00. If we convert the result to local time using this offset, we are one hour off. Without the time zone name it is impossible to know about the offset change for that period, and we end up with the wrong local time.
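The one-hour discrepancy can be reproduced with fixed offsets. This is only a sketch: the two offsets are hard-coded assumptions standing in for the winter and summer rules of the zone, and the six-month shift is done by replacing calendar fields directly.

```python
from datetime import datetime, timedelta, timezone

winter_offset = timezone(timedelta(hours=2))  # offset carried by the timestamp
summer_offset = timezone(timedelta(hours=3))  # offset assumed to be in force in July

start = datetime(2018, 1, 30, 15, 30, tzinfo=winter_offset)

# "Six months earlier" by calendar fields, keeping the now stale +02:00 offset.
shifted = start.replace(year=2017, month=7)

# What a wall clock in that zone would have shown at that same instant.
wall_clock = shifted.astimezone(summer_offset)

print(shifted.isoformat())     # 2017-07-30T15:30:00+02:00
print(wall_clock.isoformat())  # 2017-07-30T16:30:00+03:00
```

Both values denote the same instant, yet the local clock reading differs by an hour from the 15:30 we intended.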

Date and time representations in computing

In computing, date and time are usually represented as a string. Many standards specify the format of such a string; one of the most popular is ISO 8601. A timestamp formatted according to ISO 8601 looks like this:

2018-01-30T15:30:00+05:00
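As a small sketch, a string in this format can be parsed with `datetime.fromisoformat` from the standard library (available since Python 3.7). Note that the result carries only the offset, not a time zone name.

```python
from datetime import datetime, timedelta

stamp = "2018-01-30T15:30:00+05:00"

parsed = datetime.fromisoformat(stamp)

print(parsed.utcoffset())           # 5:00:00
print(parsed.isoformat() == stamp)  # True: the string round-trips unchanged
```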

There is also a numerical date and time representation, UNIX time, which is the number of seconds that have passed since a fixed point in time (the epoch): January 1st, 1970, 00:00:00 UTC. For example, April 17th, 2018, 09:00:00 UTC corresponds to the following UNIX timestamp:

1523955600
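A minimal sketch of converting this number back to a readable date with the standard library. Passing `tz=timezone.utc` matters: without it, `fromtimestamp` returns a naive datetime in the machine's local time.

```python
from datetime import datetime, timezone

ts = 1523955600

moment = datetime.fromtimestamp(ts, tz=timezone.utc)
epoch = datetime.fromtimestamp(0, tz=timezone.utc)

print(moment.isoformat())  # 2018-04-17T09:00:00+00:00
print(epoch.isoformat())   # 1970-01-01T00:00:00+00:00
```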

A numerical representation is more convenient for date and time computations. For example, finding out which of two dates is older is a simple integer comparison: a bigger number means a more recent date, because more seconds have passed.
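In code, comparing, sorting and subtracting UNIX timestamps is plain integer arithmetic. The two values below are the timestamps that appear elsewhere in this post.

```python
earlier = 1522566060  # 2018-04-01 07:01:00 UTC
later = 1523955600    # 2018-04-17 09:00:00 UTC

is_more_recent = later > earlier       # simple integer comparison
elapsed_seconds = later - earlier      # seconds between the two instants
days, remainder = divmod(elapsed_seconds, 86400)
chronological = sorted([later, earlier])

print(is_more_recent, days, remainder)  # True 16 7140
```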

Unfortunately, the ISO 8601 standard doesn’t specify a way to include a time zone name in the timestamp; it only includes the UTC offset (+05:00). For the reasons explained earlier, such a string doesn’t contain all the information needed to convert it to local time unambiguously. UNIX time is in UTC, so it carries no time zone information either. If we want to be able to convert these timestamps to local time, we need to pass a time zone name around with them.
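One possible way to pass the zone name around is to keep it next to the timestamp in a small record. The sketch below is only an illustration: the `ZonedInstant` class and its field names are hypothetical, not part of any library.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ZonedInstant:
    """Hypothetical container: a UTC instant plus the zone name needed to localize it."""

    unix_time: int
    zone_name: str  # an IANA-style name such as "Europe/Kiev"

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "ZonedInstant":
        data = json.loads(payload)
        return cls(unix_time=int(data["unix_time"]), zone_name=str(data["zone_name"]))


record = ZonedInstant(unix_time=1522566060, zone_name="Europe/Kiev")
restored = ZonedInstant.from_json(record.to_json())

print(restored)
```

The instant stays unambiguous in UTC, and the name travels with it so the receiver can look up the right offset for that date.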

Because time zone rules keep changing, it is practically impossible to encode universal rules that handle every time zone transition. That’s why a time zone database exists: the IANA database (also known as the Olson database), which is actively maintained to track time zone changes as they happen.

Date and time representation in Python

Python has a datetime module with various tools for working with dates, times and time zones. The module provides a class of the same name, which represents a date, a time and time zone information. This class has a tzinfo attribute which can store a time zone object. A datetime object with tzinfo set to None is called a naive datetime, while one with tzinfo set to a time zone object is called an aware datetime.
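A short sketch of the difference, using only the standard library. The `is_aware` helper is a name made up for this example; its check follows the definition of "aware" given in the Python documentation.

```python
from datetime import datetime, timezone


def is_aware(dt: datetime) -> bool:
    """A datetime is aware if it has a tzinfo that yields a UTC offset."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


naive = datetime(2018, 4, 1, 10, 1)
aware = naive.replace(tzinfo=timezone.utc)

print(is_aware(naive), is_aware(aware))  # False True

# Ordering comparisons between naive and aware datetimes are rejected.
comparison_error = None
try:
    naive < aware
except TypeError as exc:
    comparison_error = str(exc)

print(comparison_error)
```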

There is an abstract datetime.tzinfo class which serves as a blueprint from which specific time zone classes can be built. However, the datetime module itself provides only the timezone class, whose objects represent fixed offsets from UTC. Those objects must be constructed manually and have no knowledge of time zone rules, so the class is not very useful on its own. Fortunately, there are other packages which solve this exact problem. The pytz package implements the IANA database in Python, which makes it practically a must for computations dealing with time zones. pytz has a timezone function which constructs tzinfo objects with all the related information, such as time zone name, UTC offset and DST offset.

>>> from pytz import timezone
>>> tz = timezone('Europe/Kiev')
>>> tz
<DstTzInfo 'Europe/Kiev' LMT+2:02:00 STD>
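For contrast, here is a small sketch of the standard library's fixed-offset class mentioned above (datetime.timezone, not to be confused with the pytz timezone function). The +02:00 offset and the label are assumptions picked for illustration.

```python
import datetime as dt

# A manually constructed fixed offset; it knows nothing about DST rules.
fixed = dt.timezone(dt.timedelta(hours=2), "UTC+2")

january = dt.datetime(2018, 1, 30, 15, 30, tzinfo=fixed)
july = dt.datetime(2018, 7, 30, 15, 30, tzinfo=fixed)

print(january.utcoffset(), july.utcoffset())  # 2:00:00 2:00:00
print(july.dst())                             # None
```

The offset is the same in January and July, which is exactly the limitation that a database-backed tzinfo object avoids.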

To safely construct an aware datetime with pytz you need a naive datetime object and a tzinfo object from pytz:

>>> from datetime import datetime
>>> naive = datetime(2018, 4, 1, 10, 1)
>>> aware = tz.localize(naive)
>>> # ISO 8601
>>> aware.isoformat()
'2018-04-01T10:01:00+03:00'
>>> # UNIX time
>>> aware.timestamp()
1522566060.0

To convert a datetime from one time zone to another, we use the astimezone method:

>>> bkk = timezone('Asia/Bangkok')
>>> bkk
<DstTzInfo 'Asia/Bangkok' LMT+6:42:00 STD>
>>> aware.astimezone(bkk)
datetime.datetime(2018, 4, 1, 14, 1, tzinfo=<DstTzInfo 'Asia/Bangkok' +07+7:00:00 STD>)

Date and time arithmetic

It is possible to do date and time arithmetic with datetime.datetime and datetime.timedelta, but it is inconvenient. dateutil.relativedelta offers a more user-friendly and flexible interface for such arithmetic. For example, to subtract one month from a datetime, we can do this:

>>> from dateutil.relativedelta import relativedelta
>>> aware
datetime.datetime(2018, 4, 1, 10, 1, tzinfo=<DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST>)
>>> month_ago = aware - relativedelta(months=1)
>>> month_ago
datetime.datetime(2018, 3, 1, 10, 1, tzinfo=<DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST>)

You should strive to do date arithmetic in UTC whenever possible, because UTC is not affected by DST and other time zone peculiarities. While it is possible to do the arithmetic in local time, it requires extra care. In fact, our last example returned a point in time which is a month and an hour ago (because there was no DST a month ago), and that is not what we wanted. The correct result should look like this:

>>> tz.normalize(month_ago)
datetime.datetime(2018, 3, 1, 9, 1, tzinfo=<DstTzInfo 'Europe/Kiev' EET+2:00:00 STD>)

The normalize method is very useful and should always be called after you do date arithmetic in local time. It adjusts the result for DST and any other time zone changes according to the IANA database.
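To illustrate the advice about doing arithmetic in UTC, here is a minimal sketch that measures the time elapsed across a DST switch. It assumes fixed offsets of +02:00 before and +03:00 after the switch (the labels `EET` and `EEST` are for readability only) and uses only the standard library.

```python
from datetime import datetime, timedelta, timezone

EET = timezone(timedelta(hours=2), "EET")    # assumed offset before the switch
EEST = timezone(timedelta(hours=3), "EEST")  # assumed offset after the switch

before = datetime(2018, 3, 24, 12, 0, tzinfo=EET)
after = datetime(2018, 3, 26, 12, 0, tzinfo=EEST)

# Subtracting wall-clock readings ignores the skipped hour.
wall_clock_gap = after.replace(tzinfo=None) - before.replace(tzinfo=None)

# Converting to UTC first gives the time that actually elapsed.
elapsed = after.astimezone(timezone.utc) - before.astimezone(timezone.utc)

print(wall_clock_gap)  # 2 days, 0:00:00
print(elapsed)         # 1 day, 23:00:00
```

The wall clocks suggest two full days, while only 47 hours passed, because one hour was skipped when the clocks moved forward.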