2.21. Frame Row References (Row IDs)

In TQL, variables are bound to a single row, allowing a user to access that row during a query. Sometimes, though, a user may want a reference to a row to exist outside a query. xGT provides a row ID that is a handle to a row in a frame. The handle is unique across all the rows in all the frames in the system. Row IDs are available both in TQL and in the Python client. Paths, when returned by a query, are stored as a list of row IDs.

Like a variable, a row ID allows a user to access the data in the row and refer to the row in TQL queries. In a TQL query, variables and row IDs are treated identically and can be used in the same ways. A row ID encodes both the frame and the position (offset) of the row within that frame. Row IDs can be stored in frames, but only a query can create a frame with a column of row ID type. Row IDs can be stored in a different frame than the frame they point to. The column type for a row ID is ROWID or a list of type ROWID.
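As a mental model only, a row ID can be pictured as a (frame, position) pair. The Python sketch below is not how xGT implements row IDs; the class ToyRowID, the frames dictionary, and the resolve() helper are hypothetical names used purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRowID:
    """Hypothetical stand-in for a row ID: a frame name plus a row offset."""
    frame: str
    position: int


# Toy "frames": each frame name maps to a list of rows.
frames = {
    "Vertex": [{"key": "a"}, {"key": "b"}],
    "Edge": [{"src": "a", "trg": "b"}],
}


def resolve(row_id: ToyRowID) -> dict:
    """Look up the row that a toy row ID points to."""
    return frames[row_id.frame][row_id.position]


# Row IDs that point into different frames can be stored side by side,
# much like the columns of a results frame.
results = [(ToyRowID("Vertex", 0), ToyRowID("Edge", 0))]
source_id, edge_id = results[0]
print(resolve(source_id))
print(resolve(edge_id))
```

Because the frame is part of the pair, two toy row IDs with the same position but different frames compare as different, which mirrors the statement that a handle is unique across all frames.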

Result frames that contain row IDs are read-only. Existing data cannot be modified or deleted, and no more data can be added to these result frames, either via TQL or via Python commands. This does not prevent the creation of multiple result frames referring back to the same source frame.

A simple query returning row IDs uses the TQL id() function:

MATCH (v0:Vertex)-[e:Edge]->(v1:Vertex)
RETURN id(v0) AS source_id, id(e) AS edge_id
INTO Results

In this case, the Results frame will have two columns, each holding one row ID per result row: one for the source vertex and one for the edge. The row IDs in the first result column all refer to the frame Vertex, while the row IDs in the second column all refer to the frame Edge. Note that the use of the id() function is required to prevent xGT from expanding the corresponding frame element into all its properties.

Another example of a query generating row IDs is returning a path variable:

MATCH p = (v0:SourceVertex)-[e:Edge]->(v1:TargetVertex)
RETURN p
INTO Results

In this case, the Results frame will have a single column named p that stores lists of row IDs corresponding to the three-element paths that matched the specified pattern. Note that in this example, the row IDs in that column correspond to elements in three different frames: SourceVertex, Edge, and TargetVertex. xGT distinguishes strictly between row IDs originating from different frames.

Row IDs can also be constructed from a frame and either a row position or vertex key using the id() function. See Type Construction Functions for a detailed description.

RETURN id(graph__VertexA, 'key0') AS a_id
INTO Results

In this case, the code constructs a row ID from the name of a vertex frame and a value for its key property using a variant of the id() function. The Results frame will store the row ID corresponding to the vertex that matches that key value.

2.21.1. Row IDs in Python

The class xgt.RowID is the representation of a frame row in the Python client. Instances of this class are automatically constructed when retrieving data from frames that have columns containing references to frame rows, like the examples above.

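# conn is an existing xgt.Connection to the server.
query = """
  MATCH (v0)-[e]->(v1)
  RETURN id(v0) AS source_id, id(e) AS edge_id
  INTO Results
"""
# Run the query so that the Results frame is created.
conn.run_job(query)
data = conn.get_frame('Results').get_data()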

In this case, the variable data will consist of a list of lists of instances of the xgt.RowID class.

The Python RowID class has the get_data() method that retrieves the properties of the corresponding frame row. It also provides properties to access the frame the row comes from and the row’s position (offset) within that frame.
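To make the shape of the returned data concrete, here is a small sketch that runs without a server. StandInRowID is a hypothetical stand-in for xgt.RowID; the attribute names frame_name and position and the list returned by get_data() are assumptions made for this example, so check the client reference for the actual property names and return type.

```python
class StandInRowID:
    """Hypothetical stand-in for xgt.RowID so the sketch runs offline."""

    def __init__(self, frame_name, position, row):
        self.frame_name = frame_name  # assumed attribute name
        self.position = position      # assumed attribute name
        self._row = row

    def get_data(self):
        # Plays the role of the get_data() method described above.
        return self._row


# Assumed shape of `data` for the Results frame: one inner list per
# result row, one row ID per column (source_id, edge_id).
data = [
    [StandInRowID("Vertex", 0, ["a"]), StandInRowID("Edge", 0, ["a", "b"])],
    [StandInRowID("Vertex", 1, ["b"]), StandInRowID("Edge", 1, ["b", "c"])],
]


def describe(rows):
    """Summarize each result row as (frame, position, properties) tuples."""
    return [
        [(rid.frame_name, rid.position, rid.get_data()) for rid in row]
        for row in rows
    ]


summary = describe(data)
for line in summary:
    print(line)
```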

2.21.2. Lifetime and Validity of Row IDs

A row ID is invalidated whenever a deletion occurs in the frame it refers to. The reason is that row positions can change when deletions happen, because xGT optimizes memory usage for the frame.

The validity of a row ID is verified when it is accessed in a TQL query or through the Python client. If deletions have occurred in the source frame since the row ID was created, then access is prevented and an error is reported.

For example, if we had stored the path variables produced by this query:

MATCH p = (v0:SourceVertex)-[e:Edge]->(v1:TargetVertex)
RETURN p
INTO Results

Then a deletion on the SourceVertex frame will invalidate the row IDs for that frame:

MATCH (v0:SourceVertex)
WHERE v0.key = 10
DELETE v0

Subsequent accesses to the Results frame on the path entries corresponding to the modified frame will fail:

MATCH (t0:Results)
RETURN t0.p[0] AS mysrc
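The invalidation rule in this section can be pictured with a small Python model. This is a sketch under stated assumptions, not xGT's implementation: ToyFrame and ToyRowID are hypothetical classes, and the per-frame deletion counter is just one simple way to detect that a deletion has happened since a row ID was created.

```python
class ToyFrame:
    """Hypothetical frame that compacts its rows on deletion."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)
        self.deletions = 0  # incremented on every deletion

    def delete(self, position):
        del self.rows[position]  # later rows shift down by one
        self.deletions += 1


class ToyRowID:
    """Hypothetical row ID that remembers the frame state at creation."""

    def __init__(self, frame, position):
        self.frame = frame
        self.position = position
        self._deletions_seen = frame.deletions  # snapshot at creation time

    def get_data(self):
        # Refuse access if any deletion happened after this ID was created.
        if self.frame.deletions != self._deletions_seen:
            raise RuntimeError(f"row ID into {self.frame.name} is no longer valid")
        return self.frame.rows[self.position]


source = ToyFrame("SourceVertex", [{"key": 10}, {"key": 20}])
rid = ToyRowID(source, 1)
print(rid.get_data())  # valid: no deletions yet

source.delete(0)  # the row with key 20 now sits at position 0
try:
    rid.get_data()
except RuntimeError as err:
    print("error:", err)
```

Without the check, the stale ID would index position 1 of a frame that now has a single row, which is why rejecting the access is safer than trusting the old offset.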