WordPress Recommendations with Neo4j – Part 3: Collaborative Filtering

This post is part of a series on building a recommendation engine with WordPress. If you haven’t already done so, check out the posts below:

  1. Part 1: Hooks & Data Modelling
  2. Part 2: Content Based Recommendations
  3. Part 3: Collaborative Filtering
  4. Part 4: PageRank with APOC Procedures

Collaborative Filtering

In its simplest terms, Collaborative Filtering is a method of making automated predictions for a user based on the behaviour and preferences of other users. By tracking the behaviour of users through the website, we can provide new users with a contextual recommendation.

To provide these recommendations, we first need to start tracking the User’s path through the website. The easiest way to do this is to create a cookie holding a unique identifier for the user. By setting this cookie to expire in 30 days, we can identify the user when they return. Using PHP’s session_id() allows us to track what the User does within a single session and gives us further insights.

First we need to create a function that will start the session and identify the user.

/** @var string User ID */
private static $_user;

/**
 * Make sure a session has been started so we have a unique Session ID
 *
 * @return void
 */
public static function session() {
    // Start Session
    if ( session_status() === PHP_SESSION_NONE ) {
        session_start();
    }

    // Identify User
    static::identify();
}

/**
 * Identify the current User or create a new ID
 *
 * @return void
 */
private static function identify() {
    if ( array_key_exists('neopress', $_COOKIE) ) {
        static::$_user = $_COOKIE['neopress'];
    }
    else {
        static::$_user = uniqid();
    }

    // Refresh the cookie so it expires 30 days from now
    $expires = time()+60*60*24*30;
    $path = '/';
    setcookie('neopress', static::$_user, $expires, $path);
}

Then add this function to the init action so it will run as WordPress loads.

add_action('init', Neopress::class .'::session');

Now that we know who the User is, we need to track their path through the site. Let’s create a Session class to hold our logic. On each page load, we want to make sure the User and Session records exist, and then create a new Pageview node linked to the post that they are visiting. At this stage, we can also create a :NEXT relationship between consecutive Pageviews so we can see the order in which the content of the site is consumed.

namespace Neopress;

class Session {

    /**
     * Log a Pageview for the current User and Session
     *
     * @return void
     */
    public static function log() {
        // Merge Page
        $cypher = 'MERGE (p:Post {ID: {page_id}})';
        $params = ['page_id' => get_the_ID()];

        // Attribute the Pageview to a Session
        if ( $session_id = session_id() ) {
            // Set User's WordPress ID if logged in
            if ($user_id = get_current_user_id()) {
                $cypher .= ' MERGE (u:User {user_id:{user_id}})';
                $cypher .= ' SET u.id = {id}';
                $params['user_id'] = $user_id;
            }
            else {
                $cypher .= ' MERGE (u:User {id: {id}})';
            }

            // Create Session
            $cypher .= ' MERGE (s:Session {session_id: {session_id}})';

            // Attribute Session to User
            $cypher .= ' MERGE (u)-[:HAS_SESSION]->(s)';

            // Create new Pageview
            $cypher .= ' CREATE (s)-[:HAS_PAGEVIEW]->(v:Pageview {created_at:timestamp()})';

            // Relate Pageview to Page
            $cypher .= ' CREATE (v)-[:VISITED]->(p)';
            $params['id'] = Neopress::user();
            $params['session_id'] = $session_id;
        }

        // Create :NEXT relationship from last pageview
        if (array_key_exists('neopress_last_pageview', $_SESSION)) {
            $cypher .= ' WITH v';
            $cypher .= ' MATCH (last:Pageview) WHERE id(last) = {last_pageview}';
            $cypher .= ' CREATE (last)-[:NEXT]->(v)';
            $params['last_pageview'] = $_SESSION['neopress_last_pageview'];
        }

        // Return Pageview ID (leading space keeps it separate from the previous clause)
        $cypher .= ' RETURN id(v) as id';

        // Run Query
        $result = Neopress::client()->run($cypher, $params);

        // Store Last Pageview in Session
        $_SESSION['neopress_last_pageview'] = $result->getRecord()->get('id');
    }

}

Now, we can use the shutdown listener to run our code once a page has finished loading.

class Neopress {

    // ...

    /**
     * Register Shutdown Hook
     *
     * @return void
     */
    public static function shutdown() {
        // Only log pageviews for single posts
        if (is_single()) {
            Session::log();
        }
    }

}

add_action('shutdown', Neopress::class .'::shutdown');

After a few clicks around the site, we can see a rich graph of information developing.
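
To check what has been recorded, a query along the following lines could list the posts viewed in one session in the order they were visited. This is a minimal sketch that assumes the labels, relationships, and properties written by Session::log() above; the session ID is a placeholder value.

```cypher
// Sketch only: relies on the model created by Session::log()
MATCH (s:Session {session_id: '3ch9ng6amor3m9a9rao91ikn51'})-[:HAS_PAGEVIEW]->(v:Pageview)-[:VISITED]->(p:Post)
RETURN p.ID AS post_id, v.created_at AS viewed_at
ORDER BY viewed_at
```

Ordering by created_at gives the same sequence that following the :NEXT relationships would, as long as every Pageview in the session was logged by the same code path.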

Recommend Unread Posts

Now that we have some information in the database, we can start to build more intelligent recommendations. Building on our earlier Cypher query, we can use the session information we have collected to filter out posts that this user has visited during their session or during previous visits to the site.

MATCH (s:Session) WHERE s.session_id = '3ch9ng6amor3m9a9rao91ikn51'
MATCH (p:Post)-[:HAS_TAXONOMY|AUTHORED]-(target)-[:HAS_TAXONOMY|AUTHORED]-(recommended:Post)
WHERE p.ID = 110
AND recommended.status = "publish"
// Exclude posts already viewed during this session
AND NOT (s)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(recommended)
WITH labels(target) as labels, recommended, case when "User" in labels(target) then 10 else 5 end as weight
RETURN id(recommended) as ID, sum(weight) as weighting

We can even take it a step further and find all posts that the current user has not read during previous visits by adding a single line of Cypher.
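
One way this could look is sketched below. It assumes the (:User)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(:Post) model built by Session::log(): the first MATCH also picks up the User who owns the session, and the extra predicate excludes any post reachable from any of that User's sessions. The ORDER BY clause is an addition for readability, not something the rest of the post depends on.

```cypher
// Sketch only: assumes the model written by Session::log()
MATCH (u:User)-[:HAS_SESSION]->(s:Session)
WHERE s.session_id = '3ch9ng6amor3m9a9rao91ikn51'
MATCH (p:Post)-[:HAS_TAXONOMY|AUTHORED]-(target)-[:HAS_TAXONOMY|AUTHORED]-(recommended:Post)
WHERE p.ID = 110
AND recommended.status = "publish"
// The extra line: skip anything read in any of this user's sessions
AND NOT (u)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(recommended)
WITH recommended, CASE WHEN "User" IN labels(target) THEN 10 ELSE 5 END AS weight
RETURN id(recommended) AS ID, sum(weight) AS weighting
ORDER BY weighting DESC
```

Because the current session belongs to the same User, this predicate also covers posts viewed during the current visit.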


Social Recommendations

Social proof is a powerful tool. By creating connections between users, using information collected either from the website or from a third party (for example, Facebook friends), we can provide valuable context about why a post has been recommended. In the following query, we use the connections between people to recommend posts that their connections have read. By using Cypher’s COLLECT function, we can return a list of the friends to display to the user.

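MATCH (u:User) WHERE id(u) = 169
// Assumed traversal: connections via :CONNECTED_TO (either direction), reads via the Session/Pageview path
MATCH (u)-[:CONNECTED_TO]-(friend:User)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(p:Post)
WITH id(p) AS post_id, COLLECT(DISTINCT friend.name) AS friends
RETURN post_id, friends, SIZE(friends) AS count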

post_id  friends           count
110      [Adam, Joe, Jon]  3
108      [Adam, Jon]       2
113      [Joe, Jon]        2
120      [Matt]            1
135      [Adam]            1

Unearthing Hidden Gems

Sometimes, it may be appropriate to provide the user with something completely different. As humans, we first look to belong and then to differentiate ourselves from the group. Nothing brings more value than a recommendation out of left field. Take music, for example: you may like rock music but have shown no interest in Blink 182. That doesn’t necessarily mean they deserve a recommendation. I hate Blink 182. At this point, there is more value in recommending things that your friends aren’t listening to: the hidden gems in the database. The power of Cypher means that with a simple tweak of the query, you can identify a completely different subgraph.

If we take our :CONNECTED_TO relationship, we can keep the recommendations that share taxonomy terms with what the user has read but that have no association with any of the user’s connections. As we want to look at a connection between two users regardless of who initiated the friendship, I have omitted the direction of the relationship in the query.
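
A minimal sketch of such a query follows. It assumes the same Session/Pageview model as above, reuses the taxonomy and author weighting from the earlier recommendation query, and treats a "hidden gem" as a published post that neither the user nor any of their :CONNECTED_TO connections has visited. The node ID 169 is a placeholder.

```cypher
// Sketch only: the names read and gem are illustrative
MATCH (u:User) WHERE id(u) = 169
// Posts this user has already read (DISTINCT avoids counting repeat visits twice)
MATCH (u)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(read:Post)
WITH DISTINCT u, read
// Posts sharing a taxonomy term or an author with something the user has read
MATCH (read)-[:HAS_TAXONOMY|AUTHORED]-(target)-[:HAS_TAXONOMY|AUTHORED]-(gem:Post)
WHERE gem.status = "publish"
// Not already read by this user
AND NOT (u)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(gem)
// Not read by any connection; no direction on :CONNECTED_TO
AND NOT (u)-[:CONNECTED_TO]-(:User)-[:HAS_SESSION]->(:Session)-[:HAS_PAGEVIEW]->(:Pageview)-[:VISITED]->(gem)
WITH gem, CASE WHEN "User" IN labels(target) THEN 10 ELSE 5 END AS weight
RETURN id(gem) AS ID, sum(weight) AS weighting
ORDER BY weighting DESC
```

Dropping the second NOT predicate turns this back into a plain "unread posts" recommendation, which shows how small the change is between the two subgraphs.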



Throughout this series, we’ve learnt how to use Neo4j to provide better recommendations, from creating WordPress hooks that synchronise our data with Neo4j to running Cypher queries that pull out recommendations. These recommendations should provide users with a better experience and allow you to promote your quality content.

Are you trying this? Is there anything you would do differently? Leave a comment below and let me know how you get on.