Release Notes¶
1.16.0 (4/4/2024)¶
New Features¶
List types are now supported in CASE clauses.
The outdegree() and indegree() Cypher functions can now be used on expressions that could come from multiple frames.
The Cypher function unique_vertices() can now be used in places other than WHERE clauses.
Added truncate() Cypher function for temporal types.
Added update_columns() method for updating column data.
Added the ability for a user to supply separate SSL key and certificate locations when creating a Connection object.
Added support for topology deletions and property modifications for labels defined in an OPTIONAL MATCH.
Added parameter rows to the frame get_data() method to allow selection of the data for a list of row positions.
Added support for saving both CSV and Parquet files to S3.
Added support for the large string type for reading / writing Parquet files.
Added Cypher support for geospatial point types including the functions point(), distance(), and withinbbox().
Changed¶
Added key columns to all ingest error messages for vertex and edge frames.
Improved some error message descriptions.
Improved performance when using WITH clause.
Improved performance of WGAs when filtering graphs.
Improved performance of some queries by applying compiler optimizations in more places.
Reduced memory usage of frames in some cases.
Improved handling of passing variables of the wrong type to Python functions.
Significantly improved performance of ingest in the presence of errors when the suppress_errors parameter is given to load().
Changed the behavior when inspecting frames in a namespace to only show frames the user has permissions to see.
Changed the behavior when inspecting jobs in the job history to only show jobs belonging to the connected user.
Changed S3 authentication from deprecated AWS signature version 2 to AWS signature version 4.
Fixed¶
Fixed the behavior of null in some cases.
Fixed a bug that occurred when the same label was used multiple times in unique_vertices().
Fixed some bugs in edge type inferencing on variable length edges.
Fixed a bug where the RPM install failed when the “xgtd” group already existed.
Fixed a bug where in a few corner cases get_data() was incorrectly converting data to the Python format.
Fixed a bug where the offset and length parameters to get_data() were interpreted incorrectly for frames with row-level permissions.
Fixed a bug when using client side RowIDs with row-level permissions.
Fixed a bug where the degree Cypher functions could give an incorrect result when the source or target vertex frames had row permissions.
Fixed a crash that occurred when loading Parquet files from a URL if the connection timed out.
Fixed crashes that were caused by unhandled exceptions.
Fixed a bug where AWS credentials weren’t being passed from the client’s low-code ‘create frame’ methods to the xGT server, causing authentication to fail.
Fixed crash when some configuration values were invalid during boot.
Fixed crashes caused by inconsistent container tracking states that occurred during transactional retries.
Fixed crashes resulting from simultaneous insertion into and deletion from a namespace.
Fixed crashes caused by partial commit states.
Fixed Connection object raising an error sometimes when the default namespace already existed.
1.15.0 (11/29/2023)¶
New Features¶
Added support for authenticating using PKI.
Added support for automatically creating a private namespace for each user.
Added functions for adding, deleting, and reordering frame columns.
Added support for DISTINCT on edge entity variables crossing a WITH boundary.
Added filtering to CALL of whole graph algorithms. This is beta.
Added support for using a list of frame identifiers to define a CALL graph.
Added support for accessing list properties of entity variables that can come from multiple frames.
Added parameter row_filter to the frame get_data() and save() methods to allow filtering and modification of frame data along with column mapping.
Added support for aggregating durations using sum() and avg().
Added epochSeconds as a component of datetime.
Added support for using arbitrary expressions in the duration component map constructor.
Added support for the component week to the duration component map constructor.
Added support for passing a component map to date(), time(), and datetime().
Added support for generating the current date, time, and datetime in a query.
Changed¶
Changed behavior of DISTINCT on entity variables to distinct on the entity identifier instead of the union of the entity’s properties.
Renamed total_rows() to num_rows().
Renamed input_filter parameter for the load() and insert() methods to row_filter.
Renamed frame_to_input_column_mapping parameter for the frame insert() and low-code create frame methods to column_mapping.
Improved accuracy of reported memory usage.
Reduced memory usage of frames under some scenarios.
Improved performance of evaluating Cypher expressions.
Fixed¶
Fixed some cases where null was handled inconsistently for some Cypher functions and features.
Fixed some cases where column mapping when reading into row-level frames from files was incorrect.
Fixed issues with pandas API changing in version 2.1.0.
Fixed deadlock in client CSV load.
1.14.1 (7/17/2023)¶
New Features¶
Added the ability to SET an entity variable generated by CREATE in the same WITH section.
Fixed¶
Fixed bug where a CREATE wasn’t executed when combined with a SET in the same WITH section and the SET came last.
Fixed bug where some Cypher functionality used on an entity variable that could come from multiple frames incorrectly raised exceptions.
Fixed bug where a property name that was the same as a temporal component (month, hour, etc.) would incorrectly be identified as a temporal component instead of a property.
1.14.0 (6/19/2023)¶
New Features¶
Added unsigned integer type.
Added support for IPv6 addresses to IPADDRESS type.
Added parameter input_filter to the frame load() and insert() methods to allow filtering and modification of input data along with column mapping.
Added support to accept a pyarrow Table as input data to the frame insert() method.
RowIDs can be passed as Cypher parameters.
RowIDs can be returned from Cypher queries.
RowIDs can be returned to the client using the frame and job get_data() methods and are supported for Python list of lists and pandas DataFrame formats.
Added id() Cypher function used to construct RowIDs and indicate returning a RowID in a RETURN clause instead of expanding to all properties.
Added the ability to get components of all temporal types using variable.component syntax in a Cypher query.
Added the ability to have a CALL clause follow a WITH clause.
Added the ability to set all the properties from one entity variable to another using SET.
Added support for aliasing entity variables in a Cypher query.
Added the frame drop_frames() method to allow deleting multiple frames in a single call.
Added parameter columns to frame and job get_data() methods to allow selecting which columns to download.
Added key_column property to VertexFrame to return the key column position.
Added source_key_column and target_key_column properties to EdgeFrame to return the source and target key column positions.
Added support for licensing based on sockets and cores.
Changed¶
Results stored in jobs now contain all the rows and are associated with a Connection object. The results are deleted when the connection is terminated.
WGA vertex input parameters are now RowIDs instead of frame name, vertex pairs.
WGA vertex output parameters are now RowIDs instead of frame name, vertex pairs. The outputs are now similar to graph elements and can be used to get properties, etc.
Removed getDay(), getHour(), getMinute(), getSecond(), and getMicrosecond() Cypher functions for duration types. Use the new variable.component syntax to access the duration components.
Added format parameter to the frame and job get_data() methods to select the output data type. The methods get_data_pandas() and get_data_arrow() are now deprecated.
Added get_frame() and get_frames() methods that return frame objects regardless of the type. The methods get_table_frame(), get_vertex_frame(), get_edge_frame(), get_table_frames(), get_vertex_frames(), and get_edge_frames() are now deprecated.
Calls to frame load(), save(), insert(), and get_data() can now be cancelled.
Improved metrics cache queries performance.
Fixed¶
Fixed bug where null values in a list were sometimes not returned correctly to the client.
Fixed bug where runtime-generated nulls passed to some Cypher functions incorrectly raised an exception.
Fixed bug where CREATE on a table frame raised an exception if properties were specified.
Fixed bug where in some cases using UNWIND on a non-list wasn’t raising an exception.
Fixed bug where deleting a row in a row-level frame using DELETE without a RETURN incorrectly raised an exception.
Fixed bug where a Cypher query with an AND in the WHERE and both sides of the AND constructed a constant type incorrectly raised an exception.
Fixed bug where creating and deleting frames and namespaces didn’t preempt metrics cache queries.
1.13.1 (2/17/2023)¶
New Features¶
Added parameter delimiter to the frame save() methods to allow selecting the column separator character when writing a CSV file.
Added direction output to breadth_first_search() that returns whether the search traversed the edge in the forward or reverse direction.
Changed¶
Can now read from URLs (including S3 buckets) using the low-code create frame methods.
Can now read from mixed URLs, server files, and client files using the frame load() methods and the low-code create frame methods.
The encoding of strings to time now supports a leading T to adhere to ISO 8601.
The Python decimal and timedelta objects can now be passed as Cypher parameters.
Improved error text for ingest errors from Parquet and CSV files.
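
The leading T accepted for time strings can be illustrated on the client side. The following is only a sketch of what "an optional leading T" means when parsing; parse_time is a hypothetical helper written for this example and is not part of the xgt package or a description of how the server parses times.

```python
from datetime import time


def parse_time(text: str) -> time:
    """Parse an ISO 8601 time string, accepting an optional leading 'T'.

    Hypothetical helper for illustration only; not an xgt API.
    """
    # ISO 8601 allows a time to be prefixed with the designator 'T'.
    # Strip it so the same parser handles both spellings.
    if text[:1] in ("T", "t"):
        text = text[1:]
    return time.fromisoformat(text)


# Both spellings yield the same time value.
print(parse_time("T12:34:56"))
print(parse_time("12:34:56"))
```

Stripping the designator before parsing keeps the sketch independent of the Python version, since older versions of time.fromisoformat() do not accept the prefix themselves.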
Fixed¶
Fixed bug in encoding fractional seconds from a string to a duration.
Fixed bug where inserting a list of dates into a text column yielded the wrong text.
Fixed bug where a header mode of IGNORE didn’t work for the low-code create frame methods.
Fixed bug where the metrics cache was off by default.
1.13.0 (1/27/2023)¶
New Features¶
Added support for accessing properties of vertex and edge elements in paths.
Added support for accessing properties of bound edges from edge steps with multiple frames.
Added support for CREATE for table frames.
Added DURATION type.
Added support for multiple edge frames on a frame step in a WHERE pattern.
Added support for inserting Python ipaddress type into IPADDRESS frame columns.
Added support for inserting Python decimal type into INT, FLOAT, and TEXT frame columns.
Added support for passing Python date, time, datetime, and ipaddress objects as Cypher parameters.
Added support for comparing dates and datetimes.
Added support for reading Parquet files from http:// and ftp:// locations and S3 buckets.
Added the ability to use wildcard characters in filenames when loading data from both the client and server filesystems.
Added support for using wildcards or passing lists of filenames to the low-code create frame methods.
Added support for LIST columns to the low-code create frame methods.
Added on_duplicate_keys parameter to create_vertex_frame_from_data() that allows choosing how to handle duplicate vertices.
Added frame_to_input_column_mapping parameter to the frame insert() methods.
Added all_edges parameter to breadth_first_search() to return all edges traversed by the search.
Added support for using LDAP or Kerberos for Docker xGT authentication.
Changed¶
Changed get_data() and get_data_pandas() to return native types for DATE, TIME, DATETIME, and IPADDRESS columns instead of strings.
Changed get_data_arrow() to return native types for DATE, TIME, and DATETIME columns instead of strings.
Improved Cypher coverage of using path variables and elements of paths.
Improved performance and reduced memory usage for some usages of WGAs.
Improved type inferencing for edge steps with multiple frames and for WHERE patterns.
Improved consistency of inserting scalar data across all input methods and converting types to string.
Improved type constructors (such as date()) to work with any valid string expression, not just constants.
Renamed frame_to_file_column_mapping parameter for the frame insert() and low-code create frame methods to frame_to_input_column_mapping.
Fixed¶
Fixed bug where sometimes Connection objects were not properly cleaned up.
Fixed some corner case bugs for using lists in Cypher queries.
1.12.1 (10/19/2022)¶
Changed¶
Exposed config files to Docker and AWS Marketplace users.
Added mutual SSL mode, and client and chain certificates are now only required for mutual SSL mode.
Fixed¶
Fixed bug where a line of only white space in a query caused a compile error.
Fixed bug where CSV files weren’t able to be loaded from an S3 bucket.
Fixed bug where writing Parquet files with null data sometimes wrote the wrong data.
Fixed bug where a query returning a single row with an empty multi-level list sometimes caused a segfault.
Fixed bug where null values caused the wrong result for some global aggregators.
Fixed bug when executing some variable length edge traversal queries.
Fixed bug where some list comprehension expressions returned the wrong result.
1.12.0 (9/2/2022)¶
New Features¶
Added support for path variables.
Added Cypher functions nodes(), relationships(), and length() for path variables.
Added weakly connected components algorithm to the whole graph algorithm (WGA) catalog.
Added strongly connected components algorithm to the WGA catalog.
Added support for multiple MATCH statements within a query.
Added support for CREATE with multiple edges and vertices.
Added support for list properties when using MERGE, CREATE, and SET.
Added support for null entity returns from an OPTIONAL MATCH.
Added Cypher support for undirected edges where source and target vertex frames are the same.
Added support for list comprehension.
Added Cypher support for ^ (exponentiation) and XOR operators.
Added Cypher support for % (modulus) operator for float values.
Added parameter on_duplicate_keys to VertexFrame’s insert() and load() to allow skipping vertices if the key or all the properties are the same instead of raising an exception.
Added support for single-level lists as query parameters.
Changed¶
Improved breadth-first search and PageRank performance in the WGA catalog.
Improved type inference for pandas and pyarrow.
Further improved type support for Arrow and Parquet.
Improved access control for Arrow.
Added default_namespace parameter to Connection() that initializes the connection with a default namespace.
Improved cached metrics data for frames.
Fixed¶
Fixed bug where leading comments for a query would raise an exception.
Fixed bug where header line processing ignored delimiter.
Fixed bug introduced in 1.11 where transferring data between client and server no longer worked for all type conversions.
1.11.0 (6/17/2022)¶
New Features¶
A whole graph algorithm (WGA) catalog callable from TQL. Initial included algorithms are PageRank and Breadth-First Search (BFS).
Automatic xGT frame creation using an inferred schema derived from Pandas frames, Arrow tables, or CSV files.
Single sign-on Kerberos-based authentication via GSSAPI libraries.
Changed¶
Improved type support for Parquet and Arrow.
Improved support for very large strings in Arrow.
Support for row-level frames with Parquet files.
Ingest transactions now will stop on the first encountered error instead of trying to ingest as many rows as possible.
Target release changed to RedHat Enterprise Linux (RHEL) v8.
Moved to a new version of Google’s protobuf client package to resolve compatibility problems.
Improved type support for jobs with immediate results to cover all xGT types.
Improved the performance of data transfers between the xGT client and server.
Allow for CSV headers with quotes in them.
Fixed¶
Fixed an issue where queries with multiple mandatory MATCH patterns were not reported as erroneous.
Fixed issues with supporting frame and namespace names with symbols.
1.10.1 (3/7/2022)¶
New Features¶
Added the ability for load() to map schema columns in the frame to columns in a Parquet file using the frame_to_file_column_mapping parameter.
Changed¶
The xgtd Arrow Flight server now uses the same port as the xgtd server, 4367.
Improved the performance of some queries, notably count(*) queries.
Can now read in times that have more than 6 digits of fractional second precision. The extra digits are rounded to 6. This makes inserting time data from pandas dataframes easier.
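
The rounding of extra fractional digits can be pictured with a small client-side sketch. It assumes round-half-up behavior and an input layout of YYYY-MM-DDTHH:MM:SS.fraction; both are assumptions made for illustration, since these notes do not state the rounding mode the server uses, and round_to_microseconds is a hypothetical helper rather than an xgt function.

```python
from datetime import datetime, timedelta


def round_to_microseconds(text: str) -> datetime:
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]', rounding the fraction to 6 digits.

    Hypothetical helper; the half-up rounding mode is an assumption.
    """
    whole, _, frac = text.partition(".")
    base = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    if not frac:
        return base
    # Scale the fraction to microseconds using integer arithmetic so that
    # no floating point error creeps in, rounding half up.
    scale = 10 ** len(frac)
    micros = (int(frac) * 10**6 + scale // 2) // scale
    # Adding a timedelta carries into the seconds when micros == 1_000_000.
    return base + timedelta(microseconds=micros)


# A nanosecond-precision timestamp, as pandas commonly produces.
print(round_to_microseconds("2022-03-07T10:15:30.123456789"))
```

Using a timedelta for the final addition means a fraction such as .9999999 rolls over into the next second instead of producing an invalid microsecond value.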
Fixed¶
Fixed some corner case bugs with transferring lists to / from the client and when using lists in Arrow flights.
Fixed handling of null columns in Arrow flights.
Fixed a bug where an edge frame could be created where the type of the source and target columns didn’t match the corresponding vertex key column type.
Fixed some issues where symbols in names weren’t always handled correctly.
1.10.0 (2/11/2022)¶
New Features¶
Added support for writing to Parquet files on the server filesystem (xgtd://).
Added support for xgt IP addresses as strings in Parquet files.
Egesting to multiple files from a single frame is now supported.
Added the ability for load() to map schema columns in the frame to columns in a CSV file using the frame_to_file_column_mapping parameter.
The xGT server is now an Arrow Flight endpoint. Arrow Flight clients can connect to it and insert Flight data into frames and get data from frames as flights.
Cypher lists can now be inserted into and retrieved from frames in the client.
Changed¶
The logger now rotates the log files and ages off old ones to avoid filling up disk space.
Improved performance and reduced memory usage for LIMIT when used with group by or DISTINCT.
The configuration setting license.server was renamed to license.location.
Fixed¶
Fixed a bug where vertex labels carried across a WITH statement weren’t allowed in unique_vertices().
Fixed a bug where save() incorrectly handled the offset and length parameters.
Fixed a bug where, for some queries that use both UNWIND and WITH, an incorrect result was returned.
1.9.1 (1/17/2022)¶
Fixed¶
Fixed a bug where, for some queries that allow multiple types for an edge in a MATCH pattern ([:REL1 | :REL2]), an incorrect result was returned.
Fixed a bug where an exception was incorrectly raised in some cases when combining UNION and WITH.
1.9.0 (11/1/2021)¶
New Features¶
Added support for UNION and UNION ALL.
Added support for reading from Parquet files on the server filesystem (xgtd://).
Added support for ingesting integers in hexadecimal notation.
Added support for the list concatenation operator: +.
Added support for list subscripting and slicing.
Added support for the list functions keys(), reverse(), size(), and tail().
Added support for the string function size().
Changed¶
Immediate results can now be returned with schedule_job(), and they are now stored in the job history.
Improved performance of insert() from both Python lists and Pandas frames.
Improved performance of ORDER BY when used in conjunction with SKIP or LIMIT.
Improved performance of SKIP or LIMIT with large values.
Improved performance of group by queries.
Improved performance of some queries using a WITH clause. Vertex index.
Limited number of files returned in load() error message to prevent extremely large messages.
Some improvements to query planning and expression optimization.
Fixed¶
Fixed a bug when using OPTIONAL MATCH.
Fixed a bug where missing integer values in Pandas frames raised an exception during insert() instead of creating null values.
1.8.0 (8/30/2021)¶
New Features¶
Added support for a default namespace.
Added support for OPTIONAL MATCH with WHERE <pattern>.
Added support for the group by collect aggregator.
Added support for collect on lists for both global and group by aggregation.
Added support for lists with count, min, and max aggregators, both global and group by.
Added support for lists as group by keys.
Added support for lists as ORDER BY keys.
Added support for lists as DISTINCT keys.
Added support for nested lists everywhere lists are currently allowed.
Changed¶
Improved query compile time.
Improved query planning.
Improved performance of outdegree() and indegree().
Improved performance of ORDER BY, especially when used in conjunction with LIMIT.
Improved performance of global aggregators.
Fixed¶
Fixed bugs where some aggregator corner case usages, both global and group by, returned the wrong answer and some caused a segfault.
Fixed corner case bug in range().
1.7.1 (7/7/2021)¶
Fixed¶
Fixed bug where some corner case usages of UNWIND caused a segfault.
Fixed bug where space around column entries in CSV file caused an error. Now, the surrounding space is ignored.
Fixed bug where the combination of LIMIT and global aggregators sometimes caused the wrong number of results.
1.7.0 (6/21/2021)¶
New Features¶
Added support for non-nested literal lists.
Added support for the collect global aggregator to generate non-nested lists.
Added support for the UNWIND keyword for non-nested lists.
Added support for the range() function.
Added support for queries with WHERE <pattern> and WHERE NOT <pattern>.
Added support for OPTIONAL MATCH.
Added support for multi-edge frame traversals.
Added support for Cypher functions rand() and sign(expression).
Changed¶
Improved query planning.
Improved error messages for malformed queries.
Improved error messages for invalid UTF-8 data.
Added line numbers for error messages in ingest.
Changed license management to log expiration information.
The max threads configuration setting now takes hyperthreading into account.
Changed the RPM installation to install the latest xGT Python client.
Changed the installation RPM to include a tarball of xGT documentation.
Fixed¶
Fixed bug that could occur with long lines in ingest.
Unresolved¶
The query planner improvements in a few cases increased compile times.
There were also a few cases where the query planner chose a plan that reduced query performance compared to 1.6.
1.6.0 (4/2/2021)¶
New Features¶
Added new mechanism and license management system for core-based licensing through X-Formation’s LM-X license manager suite.
Added support for the Cypher functions sum(DISTINCT expression), avg(DISTINCT expression), max(DISTINCT expression), min(DISTINCT expression), toBoolean(), and toFloat().
Added support for queries with bounded variable length paths.
Added support for Cypher parameters designated in queries by $varName.
Added support for Cypher CASE expressions.
Added support for ingesting DATE, TIME, and DATETIME, with optional time zone, which is then converted to UTC.
Improved support for query patterns without a direct syntactic connection.
Added query planner support for WITH clause multi-section queries.
Changed¶
Ended Python 2 support.
Changed RPM installation to no longer overwrite xgtd pam.d file if one already exists.
Changed xGT server to exit gracefully for unhandled exceptions and print exceptions when possible.
Changed ingest to ignore empty lines in CSV files.
Improved ingest error messages.
Improved logging.
Improved performance of toInteger() for date and datetime.
Fixed¶
Fixed potential segfault that could occur in the query scheduler.
Fixed potential segfault that could occur with a key to non-key comparison.
1.5.1 (2/19/2021)¶
Changed¶
Improved performance of group by when using DISTINCT.
Fixed¶
Fixed a performance bug on some queries introduced in 1.5.0.
1.5.0 (2/8/2021)¶
New Features¶
Added support for access control on a per row basis. Rows may have attached security labels which will cause the row to be accessible or inaccessible by the user depending on the permission labels held by the user.
Frames have a row-label universe, which is the set of available labels that can be applied at the row level. This is applied at frame creation and is currently limited to 128 unique labels.
Added the sandbox concept for ingest of files into xGT. This restricts valid file paths to be within the sandbox and every ingest or egest must be within the sandbox boundary. This location is configurable by the server admin.
Added support for the Cypher keyword WITH.
Added support for the Cypher functions count(expression), count(DISTINCT expression), substring(), toInteger(), abs(), ceil(), float(), and round().
Added the preserve_order parameter to save() that allows writing rows in the exact order they occur in the frame.
Added the delimiter parameter to load() that allows a user to select the column delimiter character.
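
The sandbox restriction described above amounts to a containment check on file paths. The sketch below shows one way such a check can be written; it is not the server's implementation. The function name is_within_sandbox and the directory /data/sandbox are hypothetical, the sandbox root is assumed to be an absolute POSIX path, and symbolic links are not resolved.

```python
import posixpath


def is_within_sandbox(sandbox: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location inside sandbox.

    Hypothetical helper for illustration; assumes an absolute POSIX
    sandbox root and does not follow symbolic links.
    """
    root = posixpath.normpath(sandbox)
    # Relative paths are interpreted against the sandbox root, and any
    # ".." segments are collapsed before the comparison.
    target = posixpath.normpath(posixpath.join(root, candidate))
    # The path is inside the sandbox only if the sandbox root is the
    # longest common ancestor of the two paths.
    return posixpath.commonpath([root, target]) == root


print(is_within_sandbox("/data/sandbox", "graphs/edges.csv"))
print(is_within_sandbox("/data/sandbox", "../secret.csv"))
```

Comparing whole path components with commonpath() avoids the prefix mistake where /data/sandbox2 would be accepted because its string begins with /data/sandbox.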
Changed¶
Changed logging to always log license info.
Added audit message on password match failure.
Added audit message for permission denied on a protected frame.
Changed load and insert errors to be returned as part of the raised exception on the local Python client instead of being in a separate error frame.
Improved Python client-side error messages for streaming ingest problems.
Eliminated the timeout parameter from schedule_job().
Time type constants can now give fractions of a second with less than 6 digits.
Queries without MATCH statements are now allowed.
Improved performance and reduced memory usage of computing cached metrics data for frames.
Fixed¶
Fixed a bug in the Query Planner causing potential memory corruption.
Fixed a potential segfault that could occur during a transaction rollback when acquiring configuration values.
Fixed a bug where a metrics data calculation failure due to an out-of-memory error would infinitely rerun and continue failing.
Fixed a bug where non-property expressions in an ORDER BY clause caused an error if the expression wasn’t in the RETURN statement.
Unresolved¶
Versions of grpcio after 1.30.0 cause connection issues when forking or using multiprocessing with the Python client. If forking or multiprocessing is required, use version 1.30.0 of grpcio.
1.4.2 (9/28/2020)¶
Fixed¶
Fixed a bug where PAM authentication could fail when a large number of users were in a group on a system.
1.4.1 (9/18/2020)¶
New Features¶
Added more parameters to some of the Python API functions to allow customization of behavior. See the Python API documentation for individual functions for more details.
Changed¶
Changed the exception type raised if a duplicate name is given when creating a namespace or frame from XgtValueError to XgtNameError.
The default admin group for the “xgtdadmin” user is now “xgtd” to align with the group used for the RPM install.
The behavior of the timeout parameter to wait_for_metrics() has changed to unify the behavior with the timeout parameter for other functions. A value of 0 now indicates no timeout and waiting until the metrics computation is completed.
Improved performance of computing cached metrics data for frames.
Improved performance of deleting rows from a frame.
Fixed¶
Items can be removed from the job history using a TQL query.
The DiskID program that is used in generating a hardware-locked license now excludes USB devices and is no longer tied to a single file system drive.
1.4.0 (8/24/2020)¶
New Features¶
Added support for authenticating users via Linux PAM (Pluggable Authentication Modules). Authentication sources can be specified via the PAM configuration for xgtd. Enterprise authentication through LDAP is supported via a PAM module.
Added support for access control on a per frame basis. Separate controls are provided for Create, Read, Update and Delete operations on each frame.
Access control is based on the concept of security labels which both users and frames possess. The matching of labels from a user to a frame enables or disables access for that user.
The combination of authentication and access control enables multi-user support in xGT.
Added the notion of a session to represent the duration of an authenticated connection to the xgtd server. All user-initiated operations are rooted in an authenticated session.
Added a namespace concept to enable grouping together related data frames. Namespaces are also frames and subject to access control.
Added support for security logging and auditing via the Apache log4cxx library. Security events are logged to a configurable location for eventual auditing.
Added support for runtime configurable logging to xgtd via the g3log library. Logging levels can be changed while the server is running.
Changed¶
Added Python methods to create and drop namespaces to the Connection object.
Added support throughout xGT for specifying fully qualified names for frames. Fully qualified names include the namespace that contains the frame. The separator __ (double underscore) is used to separate the name of the namespace from the name of the frame: e.g. graph__VertexFrame.
Modified transactional support inside xGT to unify it with access control. All user-initiated operations are fully transactional and access controlled.
Modified configuration support in xGT to make it transactional and access controlled.
Improved performance of different TQL aggregator operations (DISTINCT, ORDER BY, group by, etc.).
Improved performance of ingest.
Added support for querying more internal xGT data structures via TQL: configuration, namespaces, job history.
Reduced client-server round trips over the network by moving more functionality server-side (job wait).
Added another Jupyter Notebook example for computing Jaccard scores.
Made the RETURN and INTO TQL clauses optional by supporting query execution for just its side effects (optional RETURN) and supporting returning small results embedded in the Python Job object (optional INTO).
Added support for performance monitoring by counting the number of traversed edges in a TQL query.
Improved CSV parsing support by using a third party library.
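
The fully qualified name convention from this release, with __ separating the namespace from the frame name, can be sketched in a few lines of client-side Python. split_qualified_name is a hypothetical helper, not an xgt function, and splitting at the first __ is an assumption, since these notes do not say how names containing several double underscores are treated.

```python
from typing import Optional, Tuple


def split_qualified_name(name: str) -> Tuple[Optional[str], str]:
    """Split 'namespace__frame' into (namespace, frame).

    Hypothetical helper; returns (None, name) when no separator is present.
    Splitting at the first '__' is an assumption made for this sketch.
    """
    namespace, separator, frame = name.partition("__")
    if not separator:
        # An unqualified name carries no namespace information.
        return None, name
    return namespace, frame


print(split_qualified_name("graph__VertexFrame"))
print(split_qualified_name("VertexFrame"))
```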
Fixed¶
Corrected a problem with query rewriting after the query planning algorithm has modified the execution order. Anonymous bound variables were not getting correctly generated for edge steps.
Unresolved¶
There is no way to remove items from the job history. Over time, a growing job history may consume a significant amount of memory. 1.3 suffered from this as well, but 1.4 adds more information to the job history, so it grows faster.
1.3.0 (11/5/2019)¶
New Features¶
Added the ability to compute metrics about frame data and build a cache of these metrics.
Added a query planner that reorders queries to improve query performance. The planner explores many query plans and computes a cost-metric for each plan using the cached metrics. The least-cost plan is chosen for the actual running of the query.
Turned on the metrics cache collection and query planning by default. Metrics cache collection may be turned off via an xgtd configuration variable. Query planning can be turned off by calling set_optimization_level().
Added the Connection object methods wait_for_metrics() and get_metrics_status() to know synchronously or asynchronously when queries to fill the metrics cache have finished running.
Added Cypher support for the string concatenation operator, +, and the toString() function.
A space is now allowed as a separator of the date and time in a DATETIME in addition to T.
Added the ability to collect the number of visited edges for a job. The collection is turned on and off via an xgtd configuration variable.
Added DRM to the xgtd server to support on-premises installations.
Changed¶
Updated the names and format of the xgtd configuration variables given in xgtd.conf. The new format uses multiple levels of JSON objects to group categories of variables.
Changed error_frame_name to be a property instead of a method for all frame types.
Changed the behavior when a transaction fails to now raise an exception to the Python client from run_job() and wait_for_job(). The previous behavior was for the functions to complete successfully, and the user had to check the job status to know if the transaction failed.
Added the job status rollback to indicate when a job failed because of a transactional conflict with another job.
Improved transactional consistency when querying for frame existence or metadata information about frames.
Improved the performance of queries over frames that have had data deleted.
Improved the performance and memory usage of some of the aggregators and solution modifiers used in queries.
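Because a failed transaction now raises from run_job() and wait_for_job() instead of completing quietly, client code can move from checking the job status to handling an exception. A minimal sketch of that pattern follows. FailingConnection is a hypothetical stand-in so the example runs without a server, and the broad except clause is an assumption, since these notes do not name the exception class that is raised.

```python
def run_query(conn, query):
    """Run a query and return (job, error_message).

    conn is assumed to be any object with a run_job() method,
    such as an xgt Connection.
    """
    try:
        job = conn.run_job(query)
    except Exception as err:  # assumption: the specific class is not named here
        return None, str(err)
    return job, None


class FailingConnection:
    """Hypothetical stand-in for a connection whose transaction fails."""

    def run_job(self, query):
        raise RuntimeError("transaction failed")


job, error = run_query(FailingConnection(), "MATCH (a) RETURN a")
```

In real code it is usually better to catch the narrowest exception type the client library documents rather than Exception.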
Fixed¶
Improved the parsing of all input types to better enforce constraints on the types.
Added a check to prevent null values for key columns in vertex and edge frames when inserting or loading data.
Fixed a bug where two queries writing to the same results table could cause a segfault.
Fixed a bug where creating an edge frame with vertex frames that already contained data wasn’t handled properly.
Fixed a bug that severely slowed downloads from URLs with libcurl versions earlier than 7.38, which affected installations on CentOS 7.
Fixed numerous bugs relating to scheduling and managing jobs.
Fixed a segfault that occurred in the xgtd server in some cases when memory was exhausted. The server now gracefully stops the current operation and raises an exception to the user.
Fixed a bug where query results could be added to a pre-existing results table even if the transaction failed.
Fixed a bug where canceling a job didn’t always terminate the job execution on the server.
Fixed a bug where sometimes frames weren’t immediately removed from the server when dropped.
Fixed a bug where the server wasn’t always releasing file descriptors.
Fixed a bug where using the same column for the source and target keys of an edge wasn’t allowed.
Fixed a bug where a floating point value of nan was encoded improperly when being transferred between client and server.
Fixed a bug where the header line was being written twice when saving a CSV file of frame data local to the client.
Fixed a bug where the unsupported Cypher fragment RETURN * caused a segfault. It now raises an exception to the Python client.
1.2.0 (6/4/2019)¶
New Features¶
Released the xGT Python interface into open source. The xgt package can now be found on the Python Package Index and installed using pip install xgt.
Added support for the Cypher keyword SET for updating non-key properties in frames.
Added support for the Cypher keyword CREATE for adding new vertices and edges to a frame.
Added support for the Cypher keyword MERGE for adding new vertices to a frame if the vertex key doesn’t already exist.
Added support for the Cypher keyword DETACH DELETE for removing vertices and all their incident edges from a frame.
Added support for the Cypher keyword DELETE for removing edges from a frame.
Added a transactional model. All operations are now transactional and safe to be run concurrently.
Added the methods max_user_memory_size() and free_user_memory_size() to Connection so a user can get the maximum and free memory available to them on the server.
Added the method error_frame_name() to TableFrame, VertexFrame, and EdgeFrame that returns the name used for the error table frame generated when there are errors on inserting data into a frame.
Added the .xgtd.conf configuration variable io_threads that limits the number of threads used to read a single input file to improve performance on systems with large thread counts.
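As a rough illustration of the new Cypher data-modification keywords, the sketch below pairs each keyword with one statement written in standard Cypher. The frame names (People, Knows) and properties (id, name) are hypothetical, and the exact subset of syntax that xGT accepts for each keyword may differ from what is shown.

```python
# Hypothetical frames: a vertex frame People keyed on id, and an edge frame
# Knows. The statements use standard Cypher syntax.
QUERIES = {
    "SET": 'MATCH (p:People) WHERE p.id = 1 SET p.name = "Ada"',
    "CREATE": 'CREATE (:People {id: 2, name: "Grace"})',
    "MERGE": "MERGE (:People {id: 3})",
    "DETACH DELETE": "MATCH (p:People) WHERE p.id = 2 DETACH DELETE p",
    "DELETE": "MATCH (:People)-[k:Knows]->(:People) DELETE k",
}


def submit_all(conn):
    """Run each statement as its own job.

    conn is assumed to expose run_job(), as the xgt Connection does.
    """
    for query in QUERIES.values():
        conn.run_job(query)
```

Since all operations are transactional as of this release, each statement either applies fully or not at all.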
Changed¶
Improved handling of out-of-memory situations so that in many cases a transaction that causes an out-of-memory error performs a rollback and leaves the system in a usable state.
Improved exception types and messages returned to the user.
Improved error detection for vertex keys to catch duplicates earlier.
Added a check for unsupported Cypher keywords that raises an exception when they are used.
Added a check that the columns in a RETURN statement match the schema of a pre-existing results table and raise an exception when they don’t.
Added a check that the Cypher functions outdegree() and indegree() aren’t being called on an edge and raise an exception when they are.
Removed unneeded files from the xgt library distribution package.
Improved performance of downloading frame data from the server to the client.
Improved performance of inserting data into frames.
Added user documentation for all new features.
Fixed¶
Fixed a bug where the server wasn’t responsive to commands when a query was running.
Fixed a bug where using the Cypher keyword STARTS WITH on a null property incorrectly raised an exception.
Fixed a bug where a group by on multiple Cypher outdegree() function calls yielded incorrect results.
Fixed a bug that occasionally occurred when using ORDER BY in a query.
Fixed a bug where downloading frame data was limited to 4MB.
Fixed a bug on reconnection that occurred when a server unexpectedly exits during a script.
The minimum version of the grpcio Python package required by xgt is increased to 1.20.
Python2 versions below 2.7.10 are now deprecated. Added a deprecation warning for a too low version of Python at xgt module import time.
1.1.0 (2/21/2019)¶
New Features¶
Added support for an SSL-encrypted gRPC connection between client and server.
Changed¶
Removed the REST API from the Python library, server, and documentation.
Removed the option to run in ‘local’ (unixsocket) mode as that security need has been replaced by the SSL-encrypted gRPC connection.
Added the ability for the client to detect when the server has restarted and notify the user they need to reestablish a connection.
Improved the hierarchical organization of exception messages.
Improved the text of many errors returned to the Python user.
Improved load and insert error reporting to return a usable frame with the valid data and an error table with all the invalid data and errors instead of raising on the first error detected in the data.
Improved logging of out-of-memory errors.
Added error reporting when a Cypher statement compares a vertex with an edge.
Added documentation for xGT data types.
Added compiler optimizations to improve performance of some queries.
Improved the client-server connection to reduce latency.
Improved ingestion performance for large edge frames when implicitly filling the vertex frame.
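The change to load and insert error reporting (keep the valid rows, collect the invalid ones with their errors, and do not stop at the first bad row) can be sketched in plain Python. This is only an illustration of the idea, not xGT's implementation. The function name, the schema representation, and the error messages are all assumptions.

```python
def load_rows(rows, schema):
    """Split rows into valid data and an error table instead of raising.

    schema is a list of (column_name, python_type) pairs. Returns
    (valid_rows, errors) where each error is (original_row, message).
    """
    valid, errors = [], []
    for row in rows:
        if len(row) < len(schema):
            errors.append(
                (row, f"expected at least {len(schema)} columns, got {len(row)}")
            )
            continue
        try:
            # Convert each value to the type its column declares.
            converted = tuple(
                col_type(value) for (_, col_type), value in zip(schema, row)
            )
        except (TypeError, ValueError) as err:
            errors.append((row, str(err)))
            continue
        valid.append(converted)
    return valid, errors


schema = [("id", int), ("name", str)]
rows = [("1", "Ada"), ("x", "Grace"), ("3",)]
valid, errors = load_rows(rows, schema)
```

Here the first row is kept, while the second (a non-numeric id) and the third (a missing column) end up in the error list together with the reason each was rejected.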
Fixed¶
Fixed numerous bugs where Null values weren’t handled correctly or were handled differently from Cypher’s behavior.
Fixed numerous bugs where datatype conversion wasn’t being handled correctly.
Fixed a bug where xGT ran out of memory due to fragmentation when ingesting large edge frames, even though plenty of memory was available.
Fixed a deadlock that could occur during load or insert operations.
Fixed some client-server connection bugs when a Jupyter Notebook is used as the client.
1.0.0 (12/1/2018)¶
New Features¶
Added ability to query a TableFrame using syntax for querying a VertexFrame.
Names can now contain Unicode characters.
Changed¶
Improved stability and performance of connection between Python client and xgtd server by switching to using gRPC instead of REST as the default communication protocol. The REST protocol is now deprecated.
Added check when creating VertexFrame that the key parameter is given and is a valid column in the schema.
Added check when creating EdgeFrame that the source_key and target_key parameters are given and are valid columns in the schema.
Modified behavior when dropping frames to prevent a user from dropping a frame on which other frames depend.
Modified behavior when creating frames to require a name for the frame.
Disallowed the creation of frames with names containing periods.
Renamed xgtd configuration file from .gemsconfig to .xgtd.conf, and now look for it in /etc if the file doesn’t exist in the user’s home directory.
In xgtd.conf, renamed the property s3_key_id to aws_access_key_id and s3_key to aws_secret_access_key to be consistent with the naming in the AWS credentials file.
Errors are now reported when .xgtd.conf contains invalid property names or values of the wrong type.
Modified exceptions to use new naming scheme.
Improved handling of filename paths passed to I/O commands to make all filename paths unambiguous as to the location of the file.
Added support for loading CSV files where some of the lines have more columns than the required number.
Improved documentation and tutorials.
Fixed¶
Fixed a rare bug where simultaneously creating and dropping frames caused an error.
Fixed a bug where frames with numeric names couldn’t be dropped.
Fixed a bug where using a list of tuples as the schema to create a frame had stopped working.
Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.
Fixed numerous bugs that occurred when loading data.
0.18.0 (10/24/2018)¶
Changed¶
Modified the API for creating and destroying graphs. See the Python API documentation for details.
Simplified the specification of source and target keys by supporting only a single key column.
Names now have to be unique across all objects vs. unique only for a particular type of object.
Changed working directory of AWS marketplace daemon to /srv/xgtd/data from /home/ec2-user/.
Fixed¶
Improved type checking in the query compiler to provide better error messages to users for invalid queries.
Improved error messages in some cases to make them more understandable.
Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.
Fixed a bug where exceptions sometimes had malformed JSON.
Fixed a segfault that occasionally occurred in the job manager.
0.17.0 (10/1/2018)¶
Changed¶
Added additional error messages.
Added query annotations to more types of invalid Cypher query error messages.
Can now read AWS credentials directly from .aws/credentials.
Fixed¶
Fixed some Unicode bugs.
Fixed a segfault occurring when an exception is raised while ingesting.
0.16.0 (9/1/2018)¶
Changed¶
The query compiler can now automatically infer the vertex or edge frame when there is a single type.
Standardized error message handling between run_job() and wait_for_jobs().
Fixed¶
Improved compatibility between Python 2 and 3.
Fixed spurious warnings about numpy datatype incompatibilities.
Improved error reporting to give more meaningful error messages.
Fixed many cases where errors were not being reported to the user.
Fixed numerous bugs.