An introduction to Temporal Date Types in Neo4j

With Neo4j 3.4 GA now out in the wild, a few people have asked me about the new Temporal data types that have been introduced. In this post I will go over the a few of the new additions and provide a working example.

TL;DR: Jump to…

When I first started using Neo4j back in 2014, I can remember one thing was missing – support for native date datatype. The holy grail. This lead to some pretty complex clever approaches to dates including Time Trees which represented the dates as a tree of nodes with labels :Year, :Month and :Day and even down to second level.

As the performance of index-backed range seeks improved, the need to maintain these trees of nodes reduced but you were still left with a dilemma. Do you store the properties as a human readable date? Maybe seconds since epoch? Do I need to use APOC to convert the dates.

Luckily, from version 3.4, we no longer need to worry about it.

Neo4j now supports five temporal types, all a combination of date, time and timezone.

TypeDescriptionSupports Date?Supports Time?Supports Timezone?
DateA tuple of Year, Month and DayY
TimeAn instance of a point in timeYY
LocalTimeA time that is considered “local” in it’s contextY
DateTimeA combination of Date and TimeYYY
LocalDateTimeA combination of Date and Time that can be considered “local” in it’s contextYY

Clocks

Before I go into the new types, I first need to mention clocks. When creating a new date or time value, you have the option of chosing one of three clocks.

  • The Transaction clock – Uses the date and time at the start of the current transaction – this is the same as cypher’s timestamp() function which does not update during the transaction.
  • The Statement clock – Transactions can contain more than one statement. To use the date and time of the current statement, use the static .statement() function.
  • The Realtime Clock – This returns the real time date regardless of the transaction and statement start dates. This is the equivalent of apoc.date.currentTimestamp()

Each of these functions also accepts a single parameter, allowing you to specify the timezone of the returned instance. For example datetime.statement(‘Europe/London’). If you don’t specify a timezone, the server’s default timezone will be used.

Dates

Getting the Current Date

The date() function can be used to create in instance of a Date. As mentioned earlier, you can call the static method on dates to return a time based on the start of the transaction, statement or realtime.

RETURN date(), date.transaction(), date.statement(), date.realtime()
date date.transaction date.statement date.realtime
"2018-05-16" "2018-05-16" "2018-05-16" "2018-05-16"

The date function will also take two optional arguments, a Map of options and/or a timezone in String format.

Specifying a date

A date can be specified in one of two ways, either by passing through a valid ISO 8601 data type as a String, or by providing a map containing the year, month and day.

UNWIND [
  date('2018-01-01'),
  date({ year: 2018, month: 1, day: 2 })
] AS date
RETURN date
date
"2018-01-01"
"2018-01-02"

The individual units of the date value can be accessed via year, month and day accessors.

WITH date('2018-05-16') AS date
RETURN date.year, date.month, date.day
date.year date.month date.day
2018 5 16

Another nice feature of the ISO 8601 support is the ability to create dates by the week number (2018-W51), quarter (2018-Q2) or ordinal dates (2018-364).

Time

Time values are created using the time() function. Like a Date, a time instant can be created with either an ISO 8601 string or a map containing hour, minute, second, millisecond and/or nanosecond values. As with the Date, there are accessors for each unit of time held in the

UNWIND [
  time('12:34:56.1234'),
  time({ hour: 12, minute: 34, second: 45, millisecond: 123, nanosecond: 400000 })
] AS time
RETURN time.hour, time.minute, time.second,  time.millisecond, time.nanosecond, time.timezone
time.hour time.minute time.second time.millisecond time.nanosecond time.timezone
12 34 56 123 123400000 "Z"
12 34 45 123 123400000 "Z"

DateTime

A DateTime is a combination of both date and time and can be constructed using a combination of the date and time constructs mentioned above.

UNWIND [
  datetime('2018-01-02T07:20:30[Europe/London]'),
  datetime({ year: 2018, month: 1, day: 2, hour:07, minute: 20, second: 30, timezone: 'Europe/London' })
] AS date
RETURN datetime
datetime
"2018-01-02T07:20:30[Europe/London]"
"2018-01-02T07:20:30[Europe/London]"

Local Dates and Times

Local Dates and Times are simply a way of storing data without the need for extraneous data. Instances of LocalTime and LocalDateTime can be considered “local” in their context, meaning a delivery shipped at 16:00 in London would be in in GMT or BST without needing to store the timezone.

It is worth noting that local dates and times are not comparable to date and time data that are stored in different timezones. A package shipped from Berlin with a LocalTime of 16:00 could not be compared to the time of the package shipped from London.

RETURN localtime(), localdatetime()
localtime localdatetime
"18:53:44.727000000" "2018-05-16T18:53:44.727000000"

Durations

Neo4j 3.4 also supports durations. A duration is defined as the difference between two instances in time. To create a duration, we can either pass through a string containing a valid duration string or a map containing one or more components

The String version starts with a P and then contains one or more of the following

  • xY – Number of years
  • xM – Number of months
  • xD – Number of days

And then one or more time units, prefixed with a T

  • xH – Number of hours
  • xM – Number of minutes
  • xS – Number of seconds with milliseconds after a decimal point.

For example, P2012Y4M2DT14H37M21.545S denotes a duration of 2012 years, 4 months, 2 days; 14 hours, 37 minutes, 21 seconds and 545 milliseconds.

Alternatively, the map can contain one or more of: years, quarters, months, weeks, days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds.

Let’s take a look at a couple of examples in action…

UNWIND [
  duration('P30DT4980S'),
  duration({ days: 30, hours: 1, minutes: 23 })
] as duration
RETURN duration
duration
"P0M30DT4980S"
"P0M30DT4980S"

We can sum both a date and a duration together to provide us with a date 90 days in the future.

WITH date() as now, duration('P90D') AS duration
RETURN now, now + duration AS then
now then
"2018-05-16" "2018-08-14"

Three months until my birthday, I now expect a birthday present from you…

Durations between dates

You can calculate the difference between dates by using the duration.between(start, end) function. This function compares two points in time and returns an instance of a Duration.

WITH datetime.transaction() AS start
CALL apoc.util.sleep(2000)
WITH start, datetime.realtime() AS end
RETURN duration.between(start, end) as duration
duration
"P0M0DT2.003000000S"

There are also duration.inSeconds, duration.inDays duration.inMonths functions which discard any smaller components to provide a rounded figure.

Truncating Values

Any temporal value can be truncated using the truncate() function. Truncating an instant allows you to round a value to the nearest unit of time – for example, removing the seconds or milliseconds from a value.

datetime.truncate('seconds', myDate)

Config Changes

There is one additional configuration setting included in neo4j.conf. The db.temporal.timezone setting is used to configure the default timezone for the server. By default this is set to UTC (Z).

db.temporal.timezone=Europe/London

Converting Dates

If you’re currently using the timestamp() function or milliseconds since epoch, you can supply a map with epochMillis set to the timestamp property stored on the node.

MATCH (e:Event)
SET e.startDate = datetime({ epochMillis: e.startsAt })
If you are storing seconds rather than milliseconds, you can define a <code>epochSeconds</code> option instead.

A Working Example: Route Planning

To demonstrate how these temperal data types work, I will revisit the Journey Planning project that featured in my talk at Graph Connect 2017 in London. For brevity, I will use the basic scheduling data rather than the extensive model.

Let’s run a quick cypher statement to set up some test data. First I’ll create a :Service node, each service will have one or more scheduled :Leg nodes. Each leg will have two relationships to signify which :Station a traveller can board from or alight to.

// An array of legs with their origin, destination and departure and arrival times
WITH [
  {origin:'PAD', destination:'REA', departsAt:'08:00', arrivesAt:'08:28'},
  {origin:'REA', destination:'DPW', departsAt:'08:30', arrivesAt:'08:43'},
  {origin:'DPW', destination:'SWI', departsAt:'08:45', arrivesAt:'08:58'},
  {origin:'SWI', destination:'CHI', departsAt:'09:00', arrivesAt:'09:13'},
  {origin:'CHI', destination:'BAT', departsAt:'09:15', arrivesAt:'09:28'},
  {origin:'BAT', destination:'BRI', departsAt:'09:30', arrivesAt:'10:00'}
] as legs
// Create the Service Node
MERGE (service:Service { reference: '0800-GWR-PAD-BRI' })
WITH service, legs
// Create (:Service)-[:HAS_SCHEDULED_LEG]->(:Leg)
UNWIND legs AS leg
MERGE (l:Leg {
    reference: service.reference + '-'+ leg.origin + '-'+ leg.destination,
    departsAt: leg.departsAt,
    arrivesAt: leg.arrivesAt
})
MERGE (service)-[:HAS_SCHEDULED_LEG]->(l)
// Create (:Station)-[:CAN_BOARD]->(:Leg)-[:CAN_ALIGHT]->(:Station)
MERGE (origin:Station { reference: leg.origin })
MERGE (destination:Station { reference: leg.destination })
MERGE (origin)-[:CAN_BOARD]->(l)
MERGE (l)-[:CAN_ALIGHT]->(destination)
// Create (:Leg)-[:NEXT_LEG]-=>(:Leg)
WITH collect(l) as legs
UNWIND range(0, size(legs)-2) as idx
WITH legs[idx] as this, legs[idx+1] as next
MERGE (this)-[:NEXT_LEG]->(next)

This should give us some data to play with.

The Data Model
Looking good…

Creating a Schedule

As standard, we want customers to be able to book a journey 90 days in advance. Rather than creating the processes manually, it would be useful to create a script that can be run daily to create all services in batch.

Each day, we would like to create the schedule for 90 days time. We can do this by adding a 90 day duration to the current date:

WITH date() + duration('P90D') AS scheduleDate

Then, let’s match all of the services and their legs (for the sake of argument, let’s pretend we’ve also checked that the services are valid for the scheduledDate…)

MATCH (service:Service)
WITH scheduleDate, service, [ (service)-[:HAS_SCHEDULED_LEG]->(l) | l ] as legs

Then create a :ServiceDay node. We can use the accessors from scheduleDate to create a unique reference for the Service for that particular day. While we’re at it, we can set the date property to our Date instance.

MERGE (day:ServiceDay {
  reference: service.reference +'-'+ scheduleDate.year +'-'+ scheduleDate.month +'-'+ scheduleDate.day
})
SET day.date = scheduleDate
MERGE (service)-[:HAS_SERVICE_DAY]->(day)

Next, unwind the scheduled legs, get the origin and destination stations and create the legs for the service on that day.

WITH scheduleDate, service, legs, day
UNWIND legs AS leg
MATCH (origin)-[:CAN_BOARD]->(leg)-[:CAN_ALIGHT]->(destination)
MERGE (l:ServiceDayLeg {
  reference: leg.reference + '-'+ scheduleDate.year +'-'+ scheduleDate.month +'-'+ scheduleDate.day
})
SET
    l.departsAt = localtime(leg.departsAt),
    l.arrivesAt = localtime(leg.arrivesAt),
    l.duration = duration.between(l.departsAt, l.arrivesAt)
MERGE (day)-[:HAS_LEG]->(l)
MERGE (origin)-[:CAN_BOARD]->(l)
MERGE (destination)<-[:CAN_ALIGHT]-(l)

Why LocalTime? We’re only dealing with a single timezone so there is no need to store a timezone with the time. If you’re dealing with services in different countries then you can compare instants in different timezones using Time.

Lastly, let’s combine the legs together into a linked list so we can traverse through the journey.

WITH service, l ORDER BY l.departsAt ASC
WITH service, collect(l) AS legs
UNWIND range(0, size(legs)-2) AS idx
WITH legs[idx] AS this, legs[idx+1] as next
MERGE (this)-[:NEXT_LEG]->(next)

Full Cypher Statement

WITH date() + duration('P90D') AS scheduleDate
MATCH (service:Service)
WITH scheduleDate, service, [ (service)-[:HAS_SCHEDULED_LEG]->(l) | l ] as legs
// Create Service Day
MERGE (day:ServiceDay {
    // We can use the .year, .month, .day accessors on a date type
    reference: service.reference +'-'+ scheduleDate.year +'-'+ scheduleDate.month +'-'+ scheduleDate.day
})
// ... and set the date as a property
SET day.date = scheduleDate
MERGE (service)-[:HAS_SERVICE_DAY]->(day)
WITH scheduleDate, service, legs, day
// Unwind the legs
UNWIND legs AS leg
MATCH (origin)-[:CAN_BOARD]->(leg)-[:CAN_ALIGHT]->(destination)
MERGE (l:ServiceDayLeg {
    reference: leg.reference + '-'+ scheduleDate.year +'-'+ scheduleDate.month +'-'+ scheduleDate.day
})
SET
    // The arrival and departure times can be converted to LocalTime
    l.departsAt = localtime(leg.departsAt),
    l.arrivesAt = localtime(leg.arrivesAt),
    l.duration = duration.between(l.departsAt, l.arrivesAt)
MERGE (day)-[:HAS_LEG]->(l)
MERGE (origin)-[:CAN_BOARD]->(l)
MERGE (destination)<-[:CAN_ALIGHT]-(l)
WITH service, l ORDER BY l.departsAt ASC
WITH service, collect(l) AS legs
// Create :NEXT_LEG relationships
UNWIND range(0, size(legs)-2) AS idx
WITH legs[idx] AS this, legs[idx+1] as next
MERGE (this)-[:NEXT_LEG]->(next)

So…how long will it take me to get home this evening?

MATCH (origin:Station {reference: 'PAD'})-[:CAN_BOARD]->(start:ServiceDayLeg),
      (destination:Station {reference: 'SWI'})<-[:CAN_ALIGHT]-(end:ServiceDayLeg)
MATCH path = (start)-[:NEXT_LEG*0..10]->(end)
WITH duration.between(start.departsAt, end.arrivesAt) as journeyDuration
RETURN journeyDuration, journeyDuration.minutes as minutes
journeyDuration minutes
"P0M0DT3480S" 60

Indexing Temporal Types

Equality and range lookups on temporal data types are backed by indexes, making queries extremely fast. Indexes are created in the same manner as before.

CREATE INDEX ON :ServiceDay(date)

The planner shows that a simple range query uses a NodeIndexSeekByRange stage.

explain MATCH (s:ServiceDay)
WHERE date('2018-08-12') &gt;= s.date &gt;= date('2018-08-15')
RETURN s

The Query Plan shows that range queries are indexed backed

Further Reading

Neo4j 3.4 is still in it’s early stages the moment so many of these features are still in development. Still, you can try them out by downloading 3.4.0 from neo4j.com or in Neo4j Desktop. Community posts are scarse, but documentation for all of the new functionality can be found in the preview documentation.