Using the xgt package

Introduction

xGT is a tool for loading massive amounts of data into RAM and running fast pattern-search operations over it. This analytic approach works best when the data describes relationships between data objects. The xgt library is a Python module and is the preferred interface between a Python script and xGT.

Installing the Python xgt package

The xGT Python interface can be used in a variety of ways; see the AWS documentation for more information.

If you wish to run Python scripts on a local machine, you will need to install the xgt Python package on that client. Trovares distributes xgt as a pip package, available for download from our developer site. You can install it directly from the command line with pip; the first command removes any previously installed version and the second installs the package:

python -m pip uninstall xgt
python -m pip install http://developer.trovares.com/download/python/awslive/xgt-awslive.tar.gz
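
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and is not part of xgt.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be found by this interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # 'xgt' is the package installed by the pip command above.
    if is_installed("xgt"):
        print("xgt is installed")
    else:
        print("xgt is not installed for this interpreter")
```

If the check fails even though pip reported success, pip and the script are probably running under different Python installations; invoking pip as `python -m pip`, as in the commands above, helps avoid that mismatch.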

Using the xgt package

The following example script uses the xgt library to create a graph, load data into it, run a query, extract the results, and finally remove the graph.

import xgt

#-- Define the graph in python --
ng = xgt.Graph('Company')

v1 = xgt.Vertex(name   = 'Employee',
                schema = [['PersonID', xgt.INT],
                          ['Name', xgt.TEXT],
                          ['PostalCode', xgt.INT]],
                key    = ['PersonID'])

# ReportsTo edges link one Employee (EmpID) to another (BossID) by PersonID.
e1 = xgt.Edge(name   = 'ReportsTo',
              schema = [['EmpID', xgt.INT],
                        ['BossID', xgt.INT],
                        ['StartDate', xgt.DATE],
                        ['EndDate', xgt.DATE]],
              source = [['EmpID', v1.key.PersonID]],
              target = [['BossID', v1.key.PersonID]])
ng.add(v1).add(e1)


#-- Connect to the xGT server --
conn = xgt.connect()


#-- Create the graph (dropping any earlier copy of it first) --
conn.drop_graph('Company')
conn.create(ng)


#-- Load data into the graph --
cg = conn.get_graph('Company')

emp = cg.vertices.Employee
emp.insert([
    [111111101, 'Manny', 98103],
    [111111102, 'Trish', 98108],
    [911111501, 'Frank', 98101],
    [911111502, 'Alice', 98102],
])

rep = cg.edges.ReportsTo
rep.insert([
    [111111101, 911111501, '2015-01-03', '2017-04-14'],
    [111111102, 911111501, '2016-04-02', '2017-04-14'],
    [911111502, 911111501, '2016-07-07', '2017-04-14'],
    [111111101, 911111502, '2017-04-15', '3000-12-31'],
    [111111102, 911111502, '2017-04-15', '3000-12-31'],
    [911111501, 911111502, '2017-04-15', '3000-12-31'],
])


#-- Query data --
conn.drop_table('Result1')
cmd = """
MATCH
  (emp:Employee)-[edge1:ReportsTo]->
  (boss:Employee)-[edge2:ReportsTo]->
  (emp)
WHERE
  edge1.EndDate <= edge2.StartDate
RETURN
  emp.Name AS EmpName,
  emp.PersonID AS Employee1ID,
  boss.PersonID AS Employee2ID,
  edge1.StartDate AS FirstStart,
  edge1.EndDate AS FirstEnd,
  edge2.EndDate AS SecondEnd,
  edge2.StartDate AS SecondStart
INTO
  Result1
"""
conn.run_job(cmd)


#-- Create a table (independent of the graph; not used by the query above) --
conn.drop_table('Table01')
nt = xgt.Table(name   = 'Table01',
               schema = [['col1', xgt.INT],
                         ['col2', xgt.TEXT],
                         ['col3', xgt.DATE]])
conn.create(nt)
r3 = conn.get_table('Table01')


#-- Results extraction --
r1 = conn.get_table('Result1')
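r1dat = r1.get_data(0, 100)  # start at offset 0, fetch at most 100 rows

# Print each row as comma-separated values, wrapping strings in double quotes.
for row in r1dat:
    r = [('"' + c + '"' if isinstance(c, str) else str(c)) for c in row]
    print(', '.join(r))
print('')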

#-- Drop all objects --
conn.drop_graph('Company')
conn.drop_table('Result1')
conn.drop_table('Table01')
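
To make the intent of the MATCH query easier to follow, the sketch below reproduces its logic in plain Python over the same sample rows, without contacting an xGT server. It only illustrates what the pattern means (an employee who reported to a boss, and whose boss later reported to them); it is not how xGT evaluates the query. The names `employees`, `reports_to`, `find_role_reversals`, and `format_row` are hypothetical and not part of xgt. Dates are kept as ISO-8601 strings, which compare in chronological order.

```python
# Sample rows copied from the script above.
employees = {
    111111101: "Manny",
    111111102: "Trish",
    911111501: "Frank",
    911111502: "Alice",
}

# (EmpID, BossID, StartDate, EndDate)
reports_to = [
    (111111101, 911111501, "2015-01-03", "2017-04-14"),
    (111111102, 911111501, "2016-04-02", "2017-04-14"),
    (911111502, 911111501, "2016-07-07", "2017-04-14"),
    (111111101, 911111502, "2017-04-15", "3000-12-31"),
    (111111102, 911111502, "2017-04-15", "3000-12-31"),
    (911111501, 911111502, "2017-04-15", "3000-12-31"),
]


def find_role_reversals(names, edges):
    """Return rows in the column order of the query's RETURN clause.

    A row is produced for every pair of edges where edge1 goes emp -> boss,
    edge2 goes boss -> emp, and edge1 ended no later than edge2 started.
    """
    rows = []
    for emp_id, boss_id, start1, end1 in edges:      # edge1: emp -> boss
        for src, dst, start2, end2 in edges:         # edge2: boss -> emp
            if src == boss_id and dst == emp_id and end1 <= start2:
                rows.append([names[emp_id], emp_id, boss_id,
                             start1, end1, end2, start2])
    return rows


def format_row(row):
    """Quote strings and join columns with ', ', as in the extraction loop."""
    return ", ".join('"' + c + '"' if isinstance(c, str) else str(c)
                     for c in row)


for row in find_role_reversals(employees, reports_to):
    print(format_row(row))
```

With the sample data, this sketch finds a single match: Alice (911111502) reported to Frank (911111501) until 2017-04-14, and Frank has reported to Alice since 2017-04-15. The nested loop is quadratic in the number of edges, which is fine for six rows but is exactly the kind of work a pattern-search engine is meant to take over for large data sets.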