Neo4j learning accelerator

Introduction

Neo4j is a graph database that is queried via Cypher. A commercial version, including the Neo4j Browser, can be downloaded here. Neo4j Enterprise under the AGPLv3 open source license, created by the Free Software Foundation, can be found here.

The following examples are described for the Windows platform.

Start “Neo4j Desktop” and just create a new graph. The APOC procedures make life much easier and only have to be downloaded from GitHub here.

Helpful Queries

Playing with e-Mails

Find all nodes, cut the relationships and delete them.

MATCH (n) DETACH DELETE n;

Create a node – relationship – node with properties.

CREATE (m:Instance {name: "Sven"})-[n:SENT {subject: "Likes"}]->(o:Instance {name: "Anna"});

Delete a node where a certain property does not exist.

MATCH (n) WHERE n.name IS NULL DETACH DELETE n;

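Delete a special constellation (sender and receiver of a mail with a certain subject, regardless of the direction of the mail). Note that SENT is a relationship type, so it has to be matched inside square brackets.

MATCH (m)-[n:SENT]-(o) WHERE n.subject = "Hates" DETACH DELETE m, o;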

Import data from a CSV file (to be stored in the import folder) and create some relationships. The following example uses an export of an Outlook inbox. The sender/receiver is created with the label “Instance” and the property “name”. The relationship is created with the type “SENT” and the property “subject”.

LOAD CSV FROM 'file:///Inbox.csv' as line 
CREATE (:Instance {name: line[1]})-[:SENT {subject: line[0]}]->(:Instance {name: line[2]});
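To make the column mapping explicit, here is a minimal Python sketch of what the import does with each row. It assumes, as in the query above, that column 0 holds the subject, column 1 the sender and column 2 the receiver. The sample rows and the function name rows_to_edges are made up for illustration and are not part of Neo4j.

```python
import csv
import io

# Hypothetical sample standing in for Inbox.csv (subject, sender, receiver).
SAMPLE_CSV = "Likes,Sven,Anna\nHello,Anna,Sven;Tom\n"


def rows_to_edges(text):
    """Map each CSV row to a (sender, subject, receiver) triple,
    mirroring line[1], line[0] and line[2] in the LOAD CSV query."""
    edges = []
    for line in csv.reader(io.StringIO(text)):
        edges.append((line[1], line[0], line[2]))
    return edges


edges = rows_to_edges(SAMPLE_CSV)
for sender, subject, receiver in edges:
    print(f"({sender})-[SENT {{subject: {subject!r}}}]->({receiver})")
```

Every row becomes its own sender node, relationship and receiver node, which is why duplicates and combined receivers such as "Sven;Tom" have to be cleaned up afterwards.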

Find all nodes with the label “Instance”, group them by the property “name” and build a node list for every name that occurs more than once. All nodes in such a list are then merged into one, so that only one node per name remains.

MATCH (n:Instance) WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count WHERE count > 1
CALL apoc.refactor.mergeNodes(nodelist) YIELD node RETURN node;
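The grouping step of this query can be sketched in plain Python. This is only an illustration of the WITH ... COLLECT ... WHERE count > 1 part, not of the merge itself; the node ids, the names and the helper duplicate_groups are hypothetical.

```python
from collections import defaultdict

# Hypothetical nodes as (node_id, name) pairs; ids 0 and 2 share a name.
nodes = [(0, "Sven"), (1, "Anna"), (2, "Sven"), (3, "Tom")]


def duplicate_groups(nodes):
    """Group node ids by name and keep only names that occur more than once,
    like WITH n.name AS name, COLLECT(n) AS nodelist ... WHERE count > 1."""
    by_name = defaultdict(list)
    for node_id, name in nodes:
        by_name[name].append(node_id)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


groups = duplicate_groups(nodes)
print(groups)
```

Each remaining list corresponds to one nodelist that is handed to apoc.refactor.mergeNodes; names that occur only once are filtered out and stay untouched.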

Split a node according to the string of a property and create multiple nodes out of it with the same relationship (point out every receiver of a mail individually).

MATCH (l)-[m]->(n:Instance) WHERE n.name CONTAINS ";" WITH l,m,n, SPLIT(n.name, ";") as oneInstance
UNWIND RANGE (0,SIZE(oneInstance)-1) as i MERGE (p:Instance {name: l.name})-[q:SENT {subject: m.subject}]->(o:Instance {name: oneInstance[i]}) RETURN p,q,n,o;
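As a rough illustration of the SPLIT and UNWIND logic, the following Python sketch turns one edge with several receivers into one edge per receiver. The triples and the function split_receivers are assumptions for the example. Unlike the query, which leaves the combined node in place until it is deleted in the next step, the sketch simply returns the cleaned list.

```python
def split_receivers(edges, separator=";"):
    """Replace each edge whose receiver holds several names by one edge per name."""
    result = []
    for sender, subject, receiver in edges:
        for name in receiver.split(separator):
            result.append((sender, subject, name))
    return result


# Hypothetical edges as (sender, subject, receiver) triples.
edges = [("Anna", "Hello", "Sven;Tom"), ("Sven", "Likes", "Anna")]
single = split_receivers(edges)
print(single)
```

Neither the query nor this sketch trims whitespace around the separator, so a name exported as "Sven; Tom" would yield " Tom" with a leading space.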

Delete nodes that contain a certain substring within the property “name”.

MATCH (n) where n.name CONTAINS ";" DETACH DELETE n;

Finally as one executable block:

MATCH (n) DETACH DELETE n;
LOAD CSV FROM 'file:///Inbox.csv' as line 
CREATE (:Instance {name: line[1]})-[:SENT {subject: line[0]}]->(:Instance {name: line[2]});
MATCH (l)-[m]->(n:Instance) WHERE n.name CONTAINS ";" WITH l,m,n, SPLIT(n.name, ";") as oneInstance
UNWIND RANGE (0,SIZE(oneInstance)-1) as i MERGE (p:Instance {name: l.name})-[q:SENT {subject: m.subject}]->(o:Instance {name: oneInstance[i]}) RETURN p,q,n,o;
MATCH (n:Instance) WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count WHERE count > 1
CALL apoc.refactor.mergeNodes(nodelist) YIELD node RETURN node;
MATCH (n) where n.name CONTAINS ";" DETACH DELETE n;
MATCH (n) WHERE n.name IS NULL DETACH DELETE n;

Analyzing Timeseries with Python

A small demo of the elements needed to load, analyze and plot a time series in Python. Format of the data source:

...
2019-08-28,11701.019531,EUR
2019-08-29,11838.879883,EUR
2019-08-30,11939.280273,EUR
2019-09-02,11953.780273,EUR
2019-09-03,11910.860352,EUR
2019-09-04,12025.040039,EUR
2019-09-05,12126.780273,EUR
2019-09-06,12191.730469,EUR
2019-09-09,12226.099609,EUR
2019-09-10,12268.709961,EUR

Based on the complete data, the script generates the annual average performance and volatility (both in percent):

Annual Performance:  8.026639312445006
Annual Vola:  19.100050116208784

… and the plot of the analyzed data:

… based on the following code:

#!/usr/bin/python3 

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime, date

# some variables
now = datetime.now() # date/time of software execution
endDate = date(year = now.year, month = now.month, day = now.day) # date of software execution (end of analytics period)
startDate = date(year = now.year-10, month = now.month, day = now.day) # 10 years earlier (begin of analytics period)
deltaYears = (endDate-startDate).days/365.2425 # difference of startDate and endDate in years

# read csv file & build timeseries
# raw = pd.read_csv("./Reference/stocks_2/JP3942600002.EUR.csv", header=None)
raw = pd.read_csv("./DAX.EUR.csv", header=None)

ts = pd.DataFrame(columns=['datetime','value']) # generate timeframe
ts['datetime'] = pd.to_datetime(raw[0]) # load column-datetime with the raw-timestamps
ts['value'] = raw[1] # load column-value with the values
ts = ts.set_index('datetime') # index based on datetime
# print(ts)

# reduction of timeseries to the chosen period and cleaning for weekdays
ts = ts.resample('D').ffill() # generate sample size one day and forward-fill missing elements
selection = pd.date_range(startDate, endDate, freq='B') # generate selection from startDate to endDate with business days
ts = ts.asof(selection) # get subset of ts according to selection (last known value for each selected date)
# print(ts)

# some calculation
val = np.array(ts['value'])
res = np.log(val[1:]/val[:(len(val)-1)])
# r = (np.power(1+np.mean(res),len(res))-1)*100 # performance
r = (np.power(np.power(1+np.mean(res),len(res)),1/deltaYears)-1)*100 # performance (annual)
v = np.std(res)*np.sqrt(len(res))*100/np.sqrt(deltaYears) # vola (annual)
print('Annual Performance: ',r)
print('Annual Vola: ',v)

# check (print is needed, a bare ts.head() shows nothing when run as a script)
print(ts.head())

# plot 
ts.plot()
plt.show()
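The annualization in the script can be checked by hand on a tiny series. The following sketch uses only the standard library and made-up prices that are assumed to span exactly one year; the variable names are illustrative. It applies the same two formulas as the script and, for comparison, the exact compounding of log returns, since (1 + mean)^n is only an approximation of exp(n * mean).

```python
import math
import statistics

# Made-up prices: one value per period, assumed to cover exactly one year.
values = [100.0, 102.0, 101.0, 105.0, 110.0]
delta_years = 1.0  # assumed length of the observation period in years

# Log returns between consecutive values.
log_returns = [math.log(b / a) for a, b in zip(values, values[1:])]
n = len(log_returns)

# Same formulas as in the script: compound the mean return over all
# periods, then scale to one year.
mean_return = statistics.fmean(log_returns)
performance = (((1 + mean_return) ** n) ** (1 / delta_years) - 1) * 100
# Population standard deviation, matching the default of np.std.
volatility = statistics.pstdev(log_returns) * math.sqrt(n) * 100 / math.sqrt(delta_years)

# Exact alternative for log returns: their sum equals ln(last / first).
exact = (math.exp(sum(log_returns) / delta_years) - 1) * 100

print("Annual Performance:", performance)
print("Annual Vola:", volatility)
print("Exact from log returns:", exact)
```

With only four large returns the two performance figures differ visibly; with daily returns over ten years the gap is far smaller.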

Decision Intelligence

Cassie Kozyrkov (Google) presented a view on machine learning and artificial intelligence that was completely new to me, starting with the idea:

All technology is an echo of the wishes of whoever built it.

Cassie Kozyrkov

She emphasizes reducing the fear of the technology and believing in the human. Machine learning offers the possibility to communicate with machines not through code but through experience and examples, as we have always done among ourselves.

She described AI as a student with certain special capabilities. By the way, a great quote:

Written in Python it is ML – written in PowerPoint it is AI.

Cassie Kozyrkov

So how to work with that student?

Step | Student | AI
Purpose | wise objective to study | wise objective to optimize
Information | relevant textbooks | relevant datasets
Testing | well-crafted exams | statistical tests
General | prudent safety net | prudent reliability engineering

As in real life, be aware of the competence of the student. If you have a student with a remarkable memory, you cannot test him by asking for information that he learned in exactly that form from a textbook. You also cannot ask him the same test questions twice.
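The point about the student with the remarkable memory can be illustrated with a small, admittedly artificial Python sketch. The task (is a number even?), the split into seen and unseen questions and the memorizer function are all made up for this example and are not taken from the talk.

```python
# Made-up task: the correct answer is True if the number is even.
data = [(x, x % 2 == 0) for x in range(100)]
train, test = data[:80], data[80:]  # seen questions vs. held-out questions

# A "student" with a remarkable memory: it stores every training answer
# and falls back to a fixed guess for anything it has not seen.
memory = dict(train)


def memorizer(x):
    return memory.get(x, False)


def accuracy(model, examples):
    """Share of examples the model answers correctly."""
    return sum(model(x) == y for x, y in examples) / len(examples)


train_accuracy = accuracy(memorizer, train)
test_accuracy = accuracy(memorizer, test)
print("Accuracy on seen questions:", train_accuracy)
print("Accuracy on unseen questions:", test_accuracy)
```

On the questions it has already seen, the memorizer looks perfect; only the held-out questions reveal that it has learned nothing about the task itself.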

Electric Cars

Alejandro Agag, founder of Formula E, presented why and how to change the world.

Starting with:

There are two doors. Behind Door Number One is a completely sealed room, with a regular, gasoline-fueled car. Behind Door Number Two is an identical, completely sealed room, with an electric car. Both engines are running full blast. I want you to pick a door to open, and enter the room and shut the door behind you. You have to stay in the room you choose for one hour. You cannot turn off the engine. You do not get a gas mask. I’m guessing you chose the Door Number Two, with the electric car, right? Door number one is a fatal choice – who would ever want to breathe those fumes?

Arnold Schwarzenegger

It is clear that electrification is the future we need. How do you get everybody on board and enthusiastic about it?

Starting with electric cars, or better, electric race cars.
The first generation shows the electric advantages like torque, noise, … but the driver has to change the car during the race. Emotion is triggered, but the range is limited.
The second generation, today's, can already cover the complete race distance. You can see that the cars can handle longer distances. So the electric car is maturing, but what about the disadvantage of slow recharging? How can it be used like a classic fuel-consuming car with fast refills?
The next generation: you will see fast charging, like in Formula One, in 10-15 s. Then the electric car has only advantages compared to vehicles driven by a classic combustion engine.

And on this journey he takes millions of people with him.