Create these two files and place them on the machine where you are running Jupyter Notebook.
The specified file path is relative to the working directory for Jupyter Notebook.
Reading from the client filesystem, as this example does, is convenient and appropriate for small datasets.
However, it is much slower than reading from the server filesystem.
For more information, see [2.5 Data Movement](data.html).

employees.csv
```
person_id,name
123456789,Manny
123454321,Bob
987654321,Frank
987656789,Alice
```

reports.csv
```
employee_id,boss_id,start_date,end_date
123456789,987654321,2015-01-03,2017-04-14
123454321,987654321,2016-04-02,2017-04-14
987656789,987654321,2016-07-07,2017-04-14
123456789,987656789,2017-04-15,
123454321,987656789,2017-04-15,
987654321,987656789,2017-04-15,
```

Notice that the end_date field is empty in the last three rows.
As a result, the end_date property of those three edges is assigned a `NULL` value.
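
To see what those rows contain, the following sketch parses the last row of reports.csv with Python's standard `csv` module. It only illustrates the file contents: the trailing comma produces an empty final field. The mapping of the empty field to `None` is a plain-Python analogue of `NULL` for illustration, not xGT's loader.

```python
import csv
import io

# Header and last data row of reports.csv, copied from the listing above.
sample = (
    "employee_id,boss_id,start_date,end_date\n"
    "987654321,987656789,2017-04-15,\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
last = rows[0]

# The trailing comma yields an empty string for end_date.
print(repr(last["end_date"]))

# Plain-Python analogue of a NULL: map empty fields to None.
normalized = {k: (v if v != "" else None) for k, v in last.items()}
print(normalized["end_date"])
```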

**Code to create CSV files**

In [None]:
with open('employees.csv', 'w') as emp_file:
    emp_file.write('person_id,name\n')
    emp_file.write('123456789,Manny\n')
    emp_file.write('123454321,Bob\n')
    emp_file.write('987654321,Frank\n')
    emp_file.write('987656789,Alice\n')

with open('reports.csv', 'w') as reports_file:
    reports_file.write('employee_id,boss_id,start_date,end_date\n')
    reports_file.write('123456789,987654321,2015-01-03,2017-04-14\n')
    reports_file.write('123454321,987654321,2016-04-02,2017-04-14\n')
    reports_file.write('987656789,987654321,2016-07-07,2017-04-14\n')
    reports_file.write('123456789,987656789,2017-04-15,\n')
    reports_file.write('123454321,987656789,2017-04-15,\n')
    reports_file.write('987654321,987656789,2017-04-15,\n')
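
Before loading, it can be useful to confirm that every row has the same number of fields as the header, since a missing trailing comma would shift the end_date column. The sketch below is one way to do that with the standard library. The helper name `field_count_errors` and the sample file name are hypothetical and are not part of xGT.

```python
import csv
import os
import tempfile

def field_count_errors(path):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        return [(reader.line_num, len(row))
                for row in reader if len(row) != len(header)]

# Demonstrate on a throwaway file holding two rows taken from reports.csv.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reports_sample.csv")
    with open(path, "w") as f:
        f.write("employee_id,boss_id,start_date,end_date\n")
        f.write("123456789,987654321,2015-01-03,2017-04-14\n")
        f.write("123456789,987656789,2017-04-15,\n")
    errors = field_count_errors(path)

# An empty list means every row matches the header width.
print(errors)
```

In the notebook, the same helper could be called directly on `'employees.csv'` and `'reports.csv'` after the cell above has written them.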

**Begin xGT client script**

In [None]:
import xgt
# Connect to server
server = xgt.Connection()

# Drop the frames left over from any previous run of this example
for frame_name in ['ReportsTo', 'Employees']:
    server.drop_frame(frame_name)

In [None]:

# Populate Employee vertices
try:
    employees = server.get_frame('Employees')
except xgt.XgtNameError:
    employees = server.create_vertex_frame(
        name='Employees',
        schema=[['person_id', xgt.INT],
                ['name', xgt.TEXT]],
        key='person_id')

employees.load("employees.csv", header_mode=xgt.HeaderMode.IGNORE)

In [None]:
# Populate Reports edges
try:
    reports = server.get_frame('ReportsTo')
except xgt.XgtNameError:
    reports = server.create_edge_frame(
        name='ReportsTo',
        schema=[['employee_id', xgt.INT],
                ['boss_id', xgt.INT],
                ['start_date', xgt.DATE],
                ['end_date', xgt.DATE]],
        source=employees,
        target=employees,
        source_key='employee_id',
        target_key='boss_id')

reports.load("reports.csv", header_mode=xgt.HeaderMode.IGNORE)

In [None]:
# Utility to print the data sizes currently in xGT
def print_data_summary(server, frames=None, namespace=None):
    if frames is None:
        frames = server.get_frames()
    for frame in frames:
        frame_name = frame.name
        if namespace is None or frame_name.startswith(namespace):
            if isinstance(frame, xgt.EdgeFrame):
                frame_type = 'edge'
            elif isinstance(frame, xgt.VertexFrame):
                frame_type = 'vertex'
            elif isinstance(frame, xgt.TableFrame):
                frame_type = 'table'
            else:
                frame_type = 'unknown'
            print("{} ({}): {:,}".format(frame_name, frame_type, frame.num_rows))

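# The frames created above are not named with an 'example' prefix,
# so pass them explicitly instead of filtering by namespace.
print_data_summary(server, frames=[employees, reports])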

In [None]:
# Query in section 4.1.5.1
df = server.run_job("""
    MATCH (x:Employees)-[edge1:ReportsTo]->(y:Employees)
    RETURN x.person_id, edge1.start_date, edge1.end_date, y.person_id
""").get_data(format='pandas')
print("Number of answers: {:,}".format(len(df)))
df

In [None]:
# Query in section 4.1.5.2
df = server.run_job("""
    MATCH (x:Employees)-[edge1:ReportsTo]->(y)-[edge2]->(x)
    WHERE edge1.end_date <= edge2.start_date
    RETURN x.person_id, y.person_id,
           edge1.start_date, edge1.end_date,
           edge2.start_date, edge2.end_date
""").get_data(format='pandas')
print("Number of answers: {:,}".format(len(df)))
df
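
To make the pattern in the second query concrete, here is a plain-Python sketch that applies the same two-edge cycle and date condition to the rows of reports.csv, without using xGT. It assumes that a comparison involving `NULL` is not true and therefore filters the row out, as in Cypher. That assumption is labeled in the code, and the sketch is not a statement of what the server returns.

```python
from datetime import date

# ReportsTo edges from reports.csv: (employee_id, boss_id, start_date, end_date).
# None stands in for the NULL end_date of the open-ended rows.
edges = [
    (123456789, 987654321, date(2015, 1, 3), date(2017, 4, 14)),
    (123454321, 987654321, date(2016, 4, 2), date(2017, 4, 14)),
    (987656789, 987654321, date(2016, 7, 7), date(2017, 4, 14)),
    (123456789, 987656789, date(2017, 4, 15), None),
    (123454321, 987656789, date(2017, 4, 15), None),
    (987654321, 987656789, date(2017, 4, 15), None),
]

matches = []
for x, y, start1, end1 in edges:            # edge1: x -> y
    for src, dst, start2, end2 in edges:    # edge2: y -> x
        if src != y or dst != x:
            continue
        # Assumption: a comparison against NULL is not true, so the row is dropped.
        if end1 is None or start2 is None:
            continue
        if end1 <= start2:
            matches.append((x, y, start1, end1, start2, end2))

for row in matches:
    print(row)
```

Under that assumption, the sketch finds a single match: 987656789 (Alice) reported to 987654321 (Frank) until 2017-04-14, and Frank began reporting to Alice on 2017-04-15. The reverse pairing is excluded because its first edge has no end_date.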