Neo4j 3.0 with a .Net driver: Neo4jClient
I started dabbling in Neo4j, a NoSQL graph database, around two years ago. I have not used it much in the .Net world, so with the recent release of Neo4j 3.0 and its official .Net driver, I decided to have a go.
I had looked at Neo4jClient (available via Nuget), a .Net driver for Neo4j written by Chris Skardon and Readify, in the past, so decided to have a look at both .Net drivers together. This post will look at Neo4jClient and my next post will look at the new official .Net driver that comes with Neo4j 3.0.
File > New Project
I started with a basic MVC 4 project in Visual Studio, but for dabbling and testing I would now recommend LinqPad. Chris recommended this to me, when I met up with him at Graph Connect Europe 2016 in London. Install LinqPad and you have a playground, for testing your ideas, instead of the often cumbersome Visual Studio.
Setting up Neo4j 3.0
To use Neo4j you need to download and install the Neo4j 3.0 Community Edition for Windows (there are also Mac and Linux versions), run it, and then click the Start button to start your Neo4j server.
When you have your server up and running you will need to set up the authentication. This only takes a minute. Click the link in the Neo4j status box, or go to http://localhost:7474 in your browser to see your local instance in the web dashboard. Enter the following in the prompt at the top, next to $
:server connect
This will give you a login, and tell you what the default credentials are. Type these in and you will be prompted to enter a new password. Do this and you are up and running.
This console is great when you are learning Cypher.
Database Setup with Neo4jClient
Here is the code I used to create my Neo4jClient driver and populate my graph. I will go through each part separately.
Driver and population
This code first creates your driver, a new GraphClient (specifying the default address on your localhost, the username and your password, which you set above) and then connects to your database instance.
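A minimal sketch of that first step, assuming the default local HTTP endpoint and the default `neo4j` username. The `GraphSetup` helper name and the password are placeholders of mine, not the original listing:

```csharp
using System;
using Neo4jClient;

public static class GraphSetup
{
    // Hypothetical helper: builds a GraphClient for a local Neo4j 3.0 instance and connects to it.
    public static GraphClient CreateClient()
    {
        var client = new GraphClient(
            new Uri("http://localhost:7474/db/data"), // default address of a local instance
            "neo4j",                                   // default username
            "your-password");                          // placeholder: the password you set above

        client.Connect(); // must be called before running any queries
        return client;
    }
}
```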
I then clear the database. You would not want to do this in production code, but it is useful when dabbling, as it makes sure you always start with a clean slate (see below for the method code).
I then create each Person that I wish to insert in the database and pass them all to a method to insert them (see below).
Finally I create the relationships between the created Person nodes, again using a nice reusable method (see below).
Clear the database
I then wrote a Cypher query (Cypher is the query language for Neo4j, and is so much easier to write and understand than SQL!) that clears the database. I won’t go into all the details of Cypher here, but will explain what it does. For more information on Cypher see Neo4j’s Introduction to Cypher, and for more on how to use Cypher with the Neo4jClient see Performing Cypher Queries and Cypher Examples in the Neo4jClient wiki.
This code will clear the database, using the special DETACH DELETE, so that relationships are removed as well, without having to match them first. It first does a MATCH on all nodes and then deletes the nodes and any incoming or outgoing relationships.
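As a sketch of what such a method might look like with Neo4jClient's fluent Cypher API (the class and method names here are my own assumptions), the equivalent of `MATCH (n) DETACH DELETE n` is:

```csharp
using Neo4jClient;

public static class GraphMaintenance
{
    // Deletes every node together with its incoming and outgoing relationships.
    // Generated Cypher: MATCH (n) DETACH DELETE n
    public static void ClearDatabase(IGraphClient client)
    {
        client.Cypher
            .Match("(n)")          // match every node in the database
            .DetachDelete("n")     // remove each node and its relationships
            .ExecuteWithoutResults();
    }
}
```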
The concrete class Person
Here is the Person class that I have created for this demo. It is fairly basic.
The great thing about Neo4jClient is that you can create and return concrete classes from your database, so you do not have to do any processing of your data into or out of the database structure. It is all taken care of for you!
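A plain class along these lines would be enough for the demo. The property names are assumptions based on the id and name mentioned later in this post, and the original class may well have more properties:

```csharp
// Hypothetical shape of the Person class; property names and types are assumptions.
public class Person
{
    public int Id { get; set; }       // assumed to be unique for every Person
    public string Name { get; set; }
}
```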
The method for creating a person node in the database
To make this code reusable I have put it in its own method. It could be broken down further, so that this method passes the appropriate data to a generic CreateNode method, which is what I would do when developing this further.
UNWIND is very clever and will take an array and iterate over the contents. This therefore will insert a Person node for each Person passed in to the method.
This Cypher statement uses the MERGE clause instead of the CREATE clause. CREATE would be faster, but it would not check whether a Person node with the specified id already existed. Using MERGE means I will not get duplicates in the database. When merging I only check whether a Person node with the given id exists, rather than checking for a node that matches the id, name, etc. This is faster than checking for an exact match to the Person object passed in (it assumes that the id is unique across all Person objects).
If the Person node does not already exist it is created and anything after OnCreate is then run, so the name, etc, from the passed in Person are used.
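Putting UNWIND, MERGE and OnCreate together, a sketch of such a method could look like this. The `Person` properties and the `PersonWriter` name are assumptions for illustration:

```csharp
using System.Collections.Generic;
using Neo4jClient;

// Hypothetical Person class; property names are assumptions.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PersonWriter
{
    // Inserts one Person node per item, skipping ids that already exist.
    public static void CreatePersons(IGraphClient client, IEnumerable<Person> persons)
    {
        client.Cypher
            .Unwind(persons, "person")              // iterate over the collection passed in
            .Merge("(p:Person { Id: person.Id })")  // match on the id only, create if missing
            .OnCreate()
            .Set("p = person")                      // copy all properties, but only for new nodes
            .ExecuteWithoutResults();
    }
}
```

Because the SET sits behind OnCreate, an existing node with the same id is left untouched rather than overwritten.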
The method for creating a unique relationship
I also created a separate method for creating a unique relationship between two Person nodes. Creating a unique relationship is like using MERGE when creating the Person nodes: the relationship is only created if it does not already exist.
This Cypher query finds the two Person nodes that we wish to relate, using the ids passed in. It then uses CREATE UNIQUE to create the relationship specified, if it does not already exist. The type of relationship is also passed into the method.
In Neo4j every relationship has a direction, and only one direction. For some relationships, e.g. MARRIED_TO in this example, I may want a two-way relationship. I can specify this easily by passing true to this method to say it is two-way. If this bool is set to true, another statement is added to the query to create the same relationship in the opposite direction (note that the only difference is the < rather than the >). This is one of the advantages of building a query before running it.
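A sketch of how such a method might be written follows. The method and parameter names are assumptions, and since Cypher cannot take a relationship type as a parameter, the type is concatenated into the pattern, so it should only ever come from trusted code. Note also that later Neo4j releases deprecated CREATE UNIQUE in favour of MERGE:

```csharp
using Neo4jClient;

// Hypothetical Person class; property names are assumptions.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class RelationshipWriter
{
    // Creates relationshipType from one Person to another, and optionally the reverse as well.
    public static void CreateRelationship(
        IGraphClient client, int fromId, int toId, string relationshipType, bool twoWay = false)
    {
        var query = client.Cypher
            .Match("(p1:Person)", "(p2:Person)")
            .Where((Person p1) => p1.Id == fromId)   // ids are sent as query parameters
            .AndWhere((Person p2) => p2.Id == toId)
            .CreateUnique("(p1)-[:" + relationshipType + "]->(p2)");

        if (twoWay)
        {
            // Same relationship in the opposite direction: only the arrow changes.
            query = query.CreateUnique("(p1)<-[:" + relationshipType + "]-(p2)");
        }

        query.ExecuteWithoutResults();
    }
}
```

With this in place, a call such as `CreateRelationship(client, 1, 2, "MARRIED_TO", true)` would create the relationship in both directions.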
Simple Query
Now I can use the Neo4jClient to query my data and return a Person node in the database as my concrete Person class, ready for me to do what I will with it in my C# code.
This query matches all nodes that have a Person label and returns them as an IEnumerable<Person>. (A node can have zero or more labels; think of them like Gmail labels. This is mind-blowing if you are used to relational databases, as it is equivalent to having something that can be in multiple tables!)
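A minimal sketch of such a query, again assuming a `Person` class with `Id` and `Name` properties and a hypothetical `PersonReader` helper:

```csharp
using System.Collections.Generic;
using Neo4jClient;

// Hypothetical Person class; property names are assumptions.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PersonReader
{
    // Returns every node labelled Person, deserialized into the concrete Person class.
    public static IEnumerable<Person> GetAllPersons(IGraphClient client)
    {
        return client.Cypher
            .Match("(p:Person)")
            .Return(p => p.As<Person>())
            .Results; // runs the query
    }
}
```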
If I wanted to have a View in my application that listed all the Persons in the database then I might want to use
return View(people.ToList());
to pass this data to my view.
More Complex Query
So, moving on to more real world and complex uses. It is likely that I will want to return data that includes these useful relationships that I have created. As this is a family tree, a good use case would be to view information about a Person, along with their relatives and their basic details.
This is where I turned to Chris Skardon, and the community, for a bit of help. I had created a nice PersonData and Relation class (see below), but was not sure how to go about populating them. Below is what Chris and I came up with when we met up at Graph Connect Europe 2016, along with my concrete classes.
PersonData class
Relation class
Family Tree Query
This code first creates the query and then runs it, assigning the results to a result variable. This could also be done by adding .Results to the end of the query, in which case the results would be assigned to the query variable instead. If you place a breakpoint at var result… you will be able to see the Cypher that is created by inspecting the query variable.
This query first matches the Person node, using my parameter id of 2 (in real life this would be passed in by, for example, clicking on a person in your application). WithParams specifies what {id} should be set to, and could contain multiple parameters if need be. Parameters should always be used with Cypher queries, as this lets the database cache the query plan and reuse it, saving time when running similar queries.
Using this Person node we then look for relations with an OPTIONAL MATCH, as this person may not have any relations at all in the database (if we used MATCH and the person had no relationships, the person would not be matched at all and nothing would be returned). We then take the original Person p and create an anonymous type, called relations, containing the relationship Type, e.g. PARENT_OF, and the relative Person node p2. We then return p and relations as a PersonData class, assigning p to Person and relations to Relations. For this to work Relations must be an IEnumerable, and not something like List. We can always convert it to a list later if we want.
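The following is a sketch reconstructed from the description above rather than the exact code Chris and I wrote. The `Relative` property name, the `Id` and `Name` properties and the `FamilyTreeReader` helper are assumptions, and I have not verified here how the relative node nested inside the map is deserialized:

```csharp
using System.Collections.Generic;
using Neo4jClient;

// Hypothetical concrete classes; property names are assumptions.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Relation
{
    public string Type { get; set; }     // relationship type, e.g. PARENT_OF
    public Person Relative { get; set; } // the Person at the other end
}

public class PersonData
{
    public Person Person { get; set; }
    public IEnumerable<Relation> Relations { get; set; } // must be IEnumerable, not List
}

public static class FamilyTreeReader
{
    public static IEnumerable<PersonData> GetPersonWithRelations(IGraphClient client, int personId)
    {
        var query = client.Cypher
            .Match("(p:Person { Id: {id} })")
            .OptionalMatch("(p)-[r]->(p2:Person)") // the person is kept even with no relations
            .With("p, { Type: type(r), Relative: p2 } AS relations")
            .WithParams(new { id = personId })     // supplies the {id} parameter
            .Return((p, relations) => new PersonData
            {
                Person = p.As<Person>(),
                Relations = relations.CollectAs<Relation>()
            });

        var result = query.Results; // the query only runs at this point
        return result;
    }
}
```

One thing to watch for: when the person has no relations, the map is still built with null values, so the collection may contain a single Relation with an empty Type and Relative.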
This is where LinqPad really comes into its own, as you can dump the result of this query and it will be shown to you in a visual way. This was very helpful when Chris and I looked at how to do this.
So this query will return the specified Person, with an IEnumerable of their Relations, which contains a string of the relationship type and the Person they are related to. I could now pass this data to my view to show the person’s information and also their relations.
Here is a simple rendering of the results. In a real application I would be formatting this so that relationships are shown in a more friendly format (maybe by storing a display version as a property on the relationship). I may also put parents, spouses, siblings and children in separate parts of the page.
In my next post I will look at how this same scenario would work with the new official .Net driver for Neo4j 3.0.