4.2. Graph Data Model: Frames and Namespaces

xGT uses the frame as the fundamental building block of its data storage. A frame is a structured collection of homogeneous rows: every row in a frame has the same number of columns, and each column has one of the fundamental xGT data types (see Data Types and Operators in xGT).

Every frame in xGT has a name and a schema. The schema describes the frame’s columns, giving the name and data type of each. Because all rows in a frame share one schema, each vertex or edge in a single frame has the same set of properties.

Frame names in xGT are represented using a two-level naming scheme. The first part of the name represents the namespace in which the frame is stored, while the second part of the name represents the frame’s name proper. Frame names need to be unique only within their own namespace.

xGT uses the term “fully qualified name” to indicate the fully specified name of a frame, including its namespace. Two underscores “__” are used to separate the namespace of a frame from its name in a fully qualified name: namespace1__FrameA, namespace2__FrameA. If a frame is within the default namespace, the namespace can be omitted. For example, if the default namespace is namespace1, then FrameA can be used in the place of namespace1__FrameA. Throughout most of the documentation, examples will assume that frames are in the default namespace. For more information on the default namespace, see Default Namespace.
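The naming rule can be illustrated with a small pure-Python sketch. The helper split_qualified_name is hypothetical and is not part of the xGT client. It also assumes that the first occurrence of two underscores is the separator, which the text above does not state.

```python
def split_qualified_name(name, default_namespace):
    """Split a frame name into (namespace, frame name).

    Hypothetical helper for illustration only; not part of the xGT API.
    Assumes the first '__' in the name is the namespace separator.
    """
    namespace, separator, frame = name.partition("__")
    if not separator:
        # No separator: the name is unqualified, so the default applies.
        return default_namespace, name
    return namespace, frame


print(split_qualified_name("namespace2__FrameA", default_namespace="namespace1"))
print(split_qualified_name("FrameA", default_namespace="namespace1"))
```

Both namespace1__FrameA and FrameA resolve to the same pair when namespace1 is the default, which is why the namespace can be omitted in that case.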

4.2.1. Types of Frames in xGT

xGT supports three fundamental types of frames to store user data in the system:

  • TableFrame: Tables represent a collection of rows with the only restriction being that they conform to the table’s schema.

  • VertexFrame: Vertex frames represent a collection of rows conforming to a schema, with the additional restriction that the values of the key column must be unique across the entire vertex frame. Each row in a vertex frame corresponds to a vertex in xGT’s graph model.

  • EdgeFrame: Edge frames also represent a collection of rows conforming to a schema, with the additional restriction that the values of two columns in each row must match key values in the source and target vertex frames, respectively. Multiple rows in an edge frame can have the same values for the source and target key columns. Each row in an edge frame corresponds to an edge in xGT’s graph model.

Each frame can be thought of as a grouping of homogeneous rows, vertices, or edges. A VertexFrame may store all vertices representing a single entity type, while an EdgeFrame may store all edges representing a single type of relationship or connection. For example, in a social network graph, there might be one vertex frame representing people and several edge frames, each storing connections of a different type: “friend-of”, “knows”, “family”, “works-with”. A TableFrame is most often used to store results of a query.
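The restrictions on vertex and edge frames can be sketched in plain Python. This is a conceptual model only: the functions and sample data below are hypothetical, do not use the xGT client, and say nothing about how xGT stores or validates data internally.

```python
def check_vertex_rows(rows, key_index):
    """Return True if every key value in the vertex rows is unique."""
    keys = [row[key_index] for row in rows]
    return len(keys) == len(set(keys))


def check_edge_rows(edges, source_keys, target_keys):
    """Return True if every edge's endpoints exist as vertex keys.

    Each edge is (source_key, target_key, ...); duplicate edges are allowed.
    """
    return all(e[0] in source_keys and e[1] in target_keys for e in edges)


# Hypothetical social-network data: people as vertices, "knows" as edges.
people = [(1, "Ana"), (2, "Ben"), (3, "Cy")]
knows = [(1, 2), (1, 2), (2, 3)]  # the repeated (1, 2) is a valid multi-edge

person_keys = {row[0] for row in people}
print(check_vertex_rows(people, key_index=0))
print(check_edge_rows(knows, person_keys, person_keys))
```

A table frame needs no such check, since its rows only have to conform to the schema.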

Creating and dropping frames is discussed in detail in Frame Management.

4.2.2. Namespaces

All frames in xGT are grouped into namespaces and frame names need to be unique only within a namespace. The name of a namespace must be globally unique within xGT. Namespaces can be thought of as living under a global namespace directory that xGT keeps for its users.

A user might connect to an xGT server on which namespaces have already been created, or the user might create a new namespace to store frames. A namespace can be created with create_namespace():

server.create_namespace('mynamespace')

If the namespace indicated by a frame name does not already exist, it will be created along with the frame. In this example, if the namespace named graph does not already exist, it will be created so that it can store graph__VertexFrame.

vertex_frame = server.create_vertex_frame(name = 'graph__VertexFrame',
                                          schema = [['id', xgt.INT]],
                                          key = 'id')

The client API provides get_namespaces() to obtain the list of namespaces and drop_namespace() to delete a namespace. Unless the force_drop parameter is set to True, a namespace will not be dropped if it contains any frames.
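A guarded cleanup might look like the sketch below. It relies only on the calls named above, get_namespaces() and drop_namespace() with its force_drop parameter. The helper name drop_if_present is hypothetical, and the sketch assumes that get_namespaces() returns a collection of namespace names, so check the client reference before relying on it.

```python
def drop_if_present(conn, namespace, force=False):
    """Drop a namespace if the server lists it; return True if a drop was issued.

    `conn` is expected to be a connected xGT client object. This helper is
    hypothetical and assumes get_namespaces() returns namespace names.
    """
    if namespace not in conn.get_namespaces():
        return False
    # Without force_drop=True, a namespace that still holds frames is not dropped.
    conn.drop_namespace(namespace, force_drop=force)
    return True
```

Checking first avoids issuing a drop for a namespace the server does not list.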

4.2.2.1. Default Namespace

To simplify usage, xGT supports the use of a default namespace. When a connection to the server is created, the default namespace is set to default. This namespace may already exist on the xGT server, but if it does not, it will be created at the time of establishing a connection.

Any API call that does not specify a namespace operates in the default namespace. For example, if the default is not changed, any API call that refers to FrameA will be translated to default__FrameA. In a single-user environment, or in any situation where namespaces are not needed, users can ignore the existence of namespaces and perform all actions within the namespace default. Users can then simply refer to frames by their unqualified names. For simplicity, most examples in the documentation will assume that frames are in the default namespace.

The default namespace can be changed with set_default_namespace(). This only changes the default namespace for the current user session. Note that if the specified namespace does not exist, it will be automatically created. The current default namespace can be obtained with get_default_namespace().

A user can change the default namespace for a connection object any number of times. A Python script may use multiple connection objects, each with its own default namespace. The example below uses two connection objects to create a frame career__Employees and to retrieve frames career__Companies, cyber__Devices, and cyber__Netflow.

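# Each connection keeps its own default namespace.
conn1.set_default_namespace('career')
conn2.set_default_namespace('cyber')
# Created as career__Employees because the default of conn1 is career.
conn1.create_vertex_frame(name = 'Employees',
                          schema = [['person_id', xgt.INT],
                                    ['name', xgt.TEXT]],
                          key = 'person_id')
# Fully qualified names work whether or not they match the default.
companies = conn1.get_vertex_frame(name = 'career__Companies')
netflow = conn1.get_edge_frame(name = 'cyber__Netflow')
# Resolved as cyber__Devices through the default of conn2.
devices = conn2.get_vertex_frame(name = 'Devices')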

Note in the example above that the namespace can either be provided or omitted. The default namespace of conn1 is career, so giving the fully qualified name career__Companies is redundant but allowed. Even though its default namespace is career, conn1 is used to retrieve the frame cyber__Netflow in the cyber namespace by providing the fully qualified name.

4.2.2.2. Disabling the Default Namespace

The default namespace can be disabled, which requires all API calls to provide the fully qualified name of any frame used. This is done by passing in either the empty string or None to set_default_namespace():

server.set_default_namespace(None)

Note that this disables the default namespace only for the current user session. Default namespaces were introduced in xGT version 1.8; disabling them means that every frame name must include its namespace, as was required in versions 1.4 through 1.7.
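The per-session effect of disabling the default can be modeled with a short sketch. The class SessionNames is hypothetical and does not use the xGT client. Raising ValueError is an assumption made for illustration, since the error xGT reports for an unqualified name is not specified here.

```python
class SessionNames:
    """Hypothetical model of one session's default-namespace state."""

    def __init__(self):
        # A new connection starts with the namespace named "default".
        self.default_namespace = "default"

    def set_default_namespace(self, namespace):
        # None or the empty string disables the default for this session only.
        self.default_namespace = namespace or None

    def qualify(self, name):
        """Return the fully qualified form of a frame name."""
        if "__" in name:
            return name  # already carries its namespace
        if self.default_namespace is None:
            # Assumed error type; xGT's actual error is not specified here.
            raise ValueError(f"{name!r} needs a namespace: default is disabled")
        return f"{self.default_namespace}__{name}"


strict = SessionNames()
strict.set_default_namespace(None)  # this session must fully qualify names
relaxed = SessionNames()            # unaffected; still uses "default"

print(relaxed.qualify("FrameA"))
print(strict.qualify("graph__VertexFrame"))
```

In this model, calling strict.qualify("FrameA") raises the error, while the relaxed session continues to resolve unqualified names.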

4.2.3. Access Control

xGT supports access control on frames. There are two categories of access control: frame and row. Frame security protects access to the entire frame: a user is either granted a certain type of access to all rows in a frame or not. Row security protects individual rows of a frame: vertices in a vertex frame, edges in an edge frame, and rows in a table frame. A frame may have no access control, only frame access control, only row access control, or both. Currently, row access control is NOT supported for namespaces.

More information about using xGT with access control is found in Access Control, and information about setting up and managing access control is found in Access Control.

4.2.3.1. Frame Access Control

When creating frames or namespaces, security labels can optionally be attached for each of the access types:

  • Create: The create access type allows the insertion of new rows into the frame.

  • Read: The read access type allows reading operations on the data rows held by the frame.

  • Update: The update access type allows the updating of existing rows in the frame. Columns of existing rows can be modified if this access type is granted.

  • Delete: The delete access type allows the deletion of existing rows from a frame. Delete access on a frame is also required to delete the frame itself.

To perform an action on a frame or namespace, the connected user must have the required labels in their label set. For example, to read a frame, the user must have the labels attached to read access for that frame. To load new data into the frame, the user must have the labels attached to read access and create access for that frame. If xGT is being used with access control, the label set of a user will be configured by the administrator as described in Configuring Groups and Labels.
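The label check can be modeled in a few lines of Python. This sketch is hypothetical and is not xGT's implementation. It assumes that a user needs every label attached to an access type, and that an access type with no labels attached is unrestricted.

```python
# Labels attached to each access type of a hypothetical frame.
frame_labels = {
    "create": {"label1", "label2"},
    "read": {"label2"},
    "update": {"label1", "label2", "label3"},
    "delete": {"label1", "label2", "label3"},
}


def is_allowed(user_labels, frame_labels, access_types):
    """Return True if the user holds every label attached to each access type."""
    held = set(user_labels)
    return all(
        set(frame_labels.get(access, ())) <= held
        for access in access_types
    )


reader = {"label2"}
print(is_allowed(reader, frame_labels, ["read"]))            # reading the frame
print(is_allowed(reader, frame_labels, ["read", "create"]))  # loading new data
```

Loading is checked against two access types at once, mirroring the requirement above that both read and create access are needed.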

Security labels are attached to a frame or namespace by passing the dictionary parameter frame_labels to its creation method. The dictionary contains a key for each access type being labeled:

namespace_labels = { 'create' : ['label1'],
                     'delete' : ['label1']
                   }
server.create_namespace(name = 'graph',
                        frame_labels = namespace_labels)
frame_labels = { 'create' : ['label1', 'label2'],
                 'read'   : ['label2'],
                 'update' : ['label1', 'label2', 'label3'],
                 'delete' : ['label1', 'label2', 'label3']
               }
vertex_frame = server.create_vertex_frame(name = 'graph__VertexFrame',
                                          schema = [['id', xgt.INT]],
                                          key = 'id',
                                          frame_labels = frame_labels)

The above example creates a namespace with one set of labels and a frame within it with a different set of labels. If a namespace does not exist and is implicitly created along with a frame, it will be assigned the same labels as the frame. By default, no labels are attached for any of the four access types. Any attempt to perform an action for which the user does not have the required labels will result in an XgtSecurityError() being raised.

4.2.3.2. Row Access Control

While frame access control restricts access to the entire frame, row access control protects individual rows in a frame. The word “row” is used to mean individual vertices in a vertex frame, edges in an edge frame, and rows in a table frame. Security on a row is binary: an authenticated user can either access the row or not. If a user does not have access to a row, they will interact with the frame as though the row did not exist. For instance, it will not be accessed during a TQL query or be viewable during an egest.

Row access control is provided by attaching security labels to individual rows. This occurs when a row is created: either during ingest with load() or insert() or during a TQL query that creates new vertices or edges. Section Setting Row Labels provides more information on attaching security labels to rows during ingest. An authenticated user will only be able to view or access rows whose security labels are a subset of the user’s labels. Currently, row access control is NOT supported for namespaces, only for vertex, edge and table frames.
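The subset rule can be shown with a short pure-Python sketch. The function and sample rows are hypothetical and do not use the xGT client; they only model which rows a user would see.

```python
def visible_rows(rows, user_labels):
    """Keep rows whose security labels are a subset of the user's labels."""
    held = set(user_labels)
    return [data for data, labels in rows if set(labels) <= held]


# Hypothetical rows: (row data, security labels attached at creation).
rows = [
    ((1, "public"), []),
    ((2, "finance"), ["label1"]),
    ((3, "audit"), ["label1", "label2"]),
]

print(visible_rows(rows, ["label1"]))
```

A row with no labels is visible to every user in this model, because the empty set is a subset of any label set. The third row is hidden from a user holding only label1.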

Security labels can only be attached to rows if the frame is created with row access control. To create such a frame, call the frame creation method with the parameter row_label_universe set to a list of all possible security labels that can be attached to any future row in that frame:

row_labels = [ 'label1', 'label2', 'label3', 'label4', 'label9' ]
vertex_frame = server.create_vertex_frame(name = 'graph__VertexFrame',
                                          schema = [['id', xgt.INT]],
                                          key = 'id',
                                          row_label_universe = row_labels)

These labels must exist in the xgt__Label frame, as described in Configuring Groups and Labels. Note that the row label universe, which is the set of possible security labels for a row in this frame, cannot be changed after frame creation. The maximum size of the row label universe for any one frame is 128 labels.

The row label universe attached to a frame can be retrieved with row_label_universe. However, this will return only those labels that are both in that frame’s row label universe and in the authenticated user’s label set. If the frame has additional labels in its row label universe that are not visible to the user, they will not be returned.
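The filtering described above amounts to an intersection, as the following sketch shows. The function is hypothetical and not part of the xGT client. Preserving the order of the universe is an assumption of the sketch, not documented behavior.

```python
MAX_UNIVERSE_SIZE = 128  # per-frame limit stated above


def visible_universe(row_label_universe, user_labels):
    """Return the universe labels that are also in the user's label set."""
    if len(row_label_universe) > MAX_UNIVERSE_SIZE:
        raise ValueError("row label universe exceeds 128 labels")
    held = set(user_labels)
    # Keep the universe's order while filtering (an assumption of this sketch).
    return [label for label in row_label_universe if label in held]


universe = ["label1", "label2", "label3", "label4", "label9"]
print(visible_universe(universe, {"label2", "label9", "label7"}))
```

Labels the user holds that are outside the universe, such as label7 here, do not appear in the result.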