8. xGT Release Notes¶
8.1. 1.11.0 (06/17/2022)¶
8.1.1. New Features¶
A whole graph algorithm (WGA) catalog callable from TQL. Initial included algorithms are PageRank and Breadth-First Search (BFS).
Automatic xGT frame creation using an inferred schema derived from Pandas frames, Arrow tables, or CSV files.
Single sign-on Kerberos-based authentication via GSSAPI libraries.
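To illustrate what inferring a schema from a CSV file can involve, here is a minimal sketch that uses only the Python standard library. It is not xGT's inference logic: the function names, the placeholder type names ("int", "float", "text"), and the rule of trying the narrowest type first are all assumptions made for illustration. It also assumes the first CSV row is a header.

```python
import csv
import io


def infer_column_type(values):
    """Return a placeholder type name for a column of string values."""
    def all_parse(convert):
        # True when every value in the column converts without error.
        try:
            for value in values:
                convert(value)
        except ValueError:
            return False
        return True

    # Try the narrowest type first, then fall back to wider ones.
    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "text"


def infer_schema(csv_text):
    """Infer (column name, type) pairs from CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))  # transpose rows into columns
    return [(name, infer_column_type(col)) for name, col in zip(header, columns)]
```

For example, calling infer_schema on a file with columns id, score, and name would pair each header with the narrowest placeholder type that fits every value in that column.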
8.1.2. Changed¶
Improved type support for Parquet and Arrow.
Improved support for very large strings in Arrow.
Support for row-level frames with Parquet files.
Ingest transactions now stop on the first encountered error instead of trying to ingest as many rows as possible.
Target release changed to Red Hat Enterprise Linux (RHEL) v8.
Moved to a new version of Google’s protobuf client package to resolve compatibility problems.
Improved type support for jobs with immediate results to cover all xGT types.
Improved the performance of data transfers between the xGT client and server.
Allow for CSV headers with quotes in them.
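The stop-on-first-error ingest behavior listed above can be pictured with a small, self-contained sketch. This is not xGT code: ingest_fail_fast and parse_row are hypothetical helpers, and discarding the rows staged before the error is an assumption of the sketch rather than something the notes state.

```python
def parse_row(row):
    """Hypothetical row parser: expects (name, age) with an integer age."""
    name, age = row
    return name, int(age)


def ingest_fail_fast(rows, parse):
    """Parse rows in order and stop at the first row that fails.

    Rows are staged in a local list, so nothing is returned when an
    error occurs (assumed all-or-nothing behavior for this sketch).
    """
    staged = []
    for line_number, row in enumerate(rows, start=1):
        try:
            staged.append(parse(row))
        except ValueError as err:
            # Report the failing row number and stop immediately.
            raise ValueError(f"ingest stopped at row {line_number}: {err}") from err
    return staged
```

A caller sees either the full list of parsed rows or a single exception that names the first bad row.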
8.1.3. Fixed¶
Fixed an issue where queries with multiple mandatory MATCHes were not reported as erroneous.
Fixed issues with supporting frame and namespace names with symbols.
8.2. 1.10.1 (3/7/2022)¶
8.2.1. New Features¶
Added the ability for load() to map schema columns in the frame to columns in a Parquet file using the frame_to_file_column_mapping parameter.
8.2.2. Changed¶
The xgtd Arrow Flight server now uses the same port as the xgtd server, 4367.
Improved the performance of some queries, notably count(*) queries.
Can now read in times that have more than 6 digits of fractional second precision. The extra digits are rounded to 6. This makes inserting time data from pandas dataframes easier.
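As an illustration of rounding extra fractional-second digits to 6, the following sketch parses a time string and rounds its fraction to microseconds. It is a standalone example, not the xGT implementation: parse_time_rounded is a hypothetical helper, and round-half-up is an assumed rounding mode, since the notes do not say which mode is used.

```python
from datetime import datetime, timedelta


def parse_time_rounded(text):
    """Parse HH:MM:SS[.fraction] and round the fraction to 6 digits."""
    base, _, frac = text.partition(".")
    moment = datetime.strptime(base, "%H:%M:%S")
    if frac:
        digits = len(frac)
        value = int(frac)
        if digits > 6:
            # Drop the extra digits with integer round-half-up (assumed mode).
            divisor = 10 ** (digits - 6)
            micros = (value + divisor // 2) // divisor
        else:
            # Pad short fractions out to microseconds.
            micros = value * 10 ** (6 - digits)
        # Adding a timedelta carries into the seconds when rounding overflows.
        moment += timedelta(microseconds=micros)
    return moment.time()
```

A nanosecond-precision value such as 12:34:56.123456789, the kind pandas timestamps can carry, comes out as 12:34:56.123457.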
8.2.3. Fixed¶
Fixed some corner case bugs with transferring lists to / from the client and when using lists in Arrow flights.
Fixed handling of null columns in Arrow flights.
Fixed a bug where an edge frame could be created where the type of the source and target columns didn’t match the corresponding vertex key column type.
Fixed some issues where symbols in names weren’t always handled correctly.
8.3. 1.10.0 (2/11/2022)¶
8.3.1. New Features¶
Added support for writing to Parquet files on the server filesystem (xgtd://).
Added support for xgt IP addresses as strings in Parquet files.
Egesting to multiple files from a single frame is now supported.
Added the ability for load() to map schema columns in the frame to columns in a CSV file using the frame_to_file_column_mapping parameter.
The xGT server is now an Arrow Flight endpoint. Arrow Flight clients can connect to it and insert Flight data into frames and get data from frames as flights.
Cypher lists can now be inserted into and retrieved from frames in the client.
8.3.2. Changed¶
The logger now rotates the log files and ages off old ones to avoid filling up disk space.
Improved performance and reduced memory usage for LIMIT when used with group by or DISTINCT.
The configuration setting license.server was renamed to license.location.
8.3.3. Fixed¶
Fixed a bug where vertex labels carried across a WITH statement weren’t allowed in unique_vertices().
Fixed a bug where save() incorrectly handled the offset and length parameters.
Fixed a bug where, for some queries that use both UNWIND and WITH, an incorrect result was returned.
8.4. 1.9.1 (1/17/2022)¶
8.4.1. Fixed¶
Fixed a bug where, for some queries that allow multiple types for an edge in a MATCH pattern ([:REL1 | :REL2]), an incorrect result was returned.
Fixed a bug where an error was incorrectly thrown in some cases when combining UNION and WITH.
8.5. 1.9.0 (11/1/2021)¶
8.5.1. New Features¶
Added support for UNION and UNION ALL.
Added support for reading from Parquet files on the server filesystem (xgtd://).
Added support for ingesting integers in hexadecimal notation.
Added support for the list concatenation operator: +.
Added support for list subscripting and slicing.
Added support for the list functions keys(), reverse(), size(), and tail().
Added support for the string function size().
8.5.2. Changed¶
Immediate results can now be returned with schedule_job(), and they are now stored in the job history.
Improved performance of insert() from both Python lists and Pandas frames.
Improved performance of ORDER BY when used in conjunction with SKIP or LIMIT.
Improved performance of SKIP or LIMIT with large values.
Improved performance of group by queries.
Improved performance of some queries using a WITH clause. Vertex index.
Limited number of files returned in load() error message to prevent extremely large messages.
Some improvements to query planning and expression optimization.
8.5.3. Fixed¶
Fixed a bug when using OPTIONAL MATCH.
Fixed a bug where missing integer values in Pandas frames caused an error during insert() instead of creating null values.
8.6. 1.8.0 (8/30/2021)¶
8.6.1. New Features¶
Added support for a default namespace.
Added support for OPTIONAL MATCH with WHERE <pattern>.
Added support for the group by collect aggregator.
Added support for collect on lists for both global and group by aggregation.
Added support for lists with count, min, and max aggregators, both global and group by.
Added support for lists as group by keys.
Added support for lists as ORDER BY keys.
Added support for lists as DISTINCT keys.
Added support for nested lists everywhere lists are currently allowed.
8.6.2. Changed¶
Improved query compile time.
Improved query planning.
Improved performance of outdegree() and indegree().
Improved performance of ORDER BY, especially when used in conjunction with LIMIT.
Improved performance of global aggregators.
8.6.3. Fixed¶
Fixed bugs where some aggregator corner case usages, both global and group by, returned the wrong answer and some caused a segfault.
Fixed corner case bug in range().
8.7. 1.7.1 (7/7/2021)¶
8.7.1. Fixed¶
Fixed bug where some corner case usages of UNWIND caused a segfault.
Fixed bug where space around column entries in CSV file caused an error. Now, the surrounding space is ignored.
Fixed bug where the combination of LIMIT and global aggregators sometimes caused the wrong number of results.
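The CSV fix above, where space around column entries is now ignored, can be mimicked on the client side with the standard csv module. This is only a sketch of the described behavior, not xGT's parser: read_trimmed is a hypothetical helper, and it also trims whitespace inside quoted fields, which may differ from what the server does.

```python
import csv
import io


def read_trimmed(text):
    """Read CSV text and strip surrounding whitespace from every field."""
    # skipinitialspace lets a quoted field follow a delimiter plus spaces.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [[field.strip() for field in record] for record in reader]
```

With this helper, the rows "1 , alice ,3" and "1,alice,3" produce the same three fields.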
8.8. 1.7.0 (6/21/2021)¶
8.8.1. New Features¶
Added support for non-nested literal lists.
Added support for the collect global aggregator to generate non-nested lists.
Added support for the UNWIND keyword for non-nested lists.
Added support for the range() function.
Added support for queries with WHERE <pattern> and WHERE NOT <pattern>.
Added support for OPTIONAL MATCH.
Added support for multi-edge frame traversals.
Added support for Cypher functions rand() and sign(expression).
8.8.2. Changed¶
Improved query planning.
Improved error messages for malformed queries.
Improved error messages for invalid UTF-8 data.
Added line numbers for error messages in ingest.
Changed license management to log expiration information.
The max threads configuration setting now takes hyperthreading into account.
Changed the RPM installation to install the latest xGT Python client.
Changed the installation RPM to include a tarball of xGT documentation.
8.8.3. Fixed¶
Fixed bug that could occur with long lines in ingest.
8.8.4. Unresolved¶
The query planner improvements in a few cases increased compile times.
There were also a few cases where the query planner chose a plan that reduced query performance compared to 1.6.
8.9. 1.6.0 (4/2/2021)¶
8.9.1. New Features¶
Added new mechanism and license management system for core-based licensing through X-Formation’s LM-X license manager suite.
Added support for the Cypher functions sum(DISTINCT expression), avg(DISTINCT expression), max(DISTINCT expression), min(DISTINCT expression), toBoolean(), and toFloat().
Added support for queries with bounded variable length paths.
Added support for Cypher parameters designated in queries by $varName.
Added support for Cypher CASE expressions.
Added support for ingesting DATE, TIME, and DATETIME, with optional time zone, which is then converted to UTC.
Improved support for query patterns without a direct syntactic connection.
Added query planner support for WITH clause multi-section queries.
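The conversion of time-zone-qualified DATETIME values to UTC, mentioned in the list above, can be illustrated with Python's standard datetime module. This is a sketch of the idea rather than xGT's ingest code: to_utc is a hypothetical helper, and treating values without a zone as already being in UTC is an assumption.

```python
from datetime import datetime, timezone


def to_utc(text):
    """Parse an ISO 8601 datetime string and normalize it to UTC."""
    moment = datetime.fromisoformat(text)
    if moment.tzinfo is None:
        # No zone given: assume the value is already UTC (sketch assumption).
        return moment.replace(tzinfo=timezone.utc)
    # Zone given: shift the wall-clock time by the offset.
    return moment.astimezone(timezone.utc)
```

For instance, 10:30 at offset +02:00 is stored as 08:30 UTC, while a value with no offset keeps its wall-clock time.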
8.9.2. Changed¶
Ended Python 2 support.
Changed RPM installation to no longer overwrite xgtd pam.d file if one already exists.
Changed xGT server to exit gracefully for unhandled exceptions and print exceptions when possible.
Changed ingest to ignore empty lines in CSV files.
Improved ingest error messages.
Improved logging.
Improved performance of toInteger() for date and datetime.
8.9.3. Fixed¶
Fixed potential segfault that could occur in the query scheduler.
Fixed potential segfault that could occur with a key to non-key comparison.
8.10. 1.5.1 (2/19/2021)¶
8.10.1. Changed¶
Improved performance of group by when using DISTINCT.
8.10.2. Fixed¶
Fixed a performance bug on some queries introduced in 1.5.0.
8.11. 1.5.0 (2/8/2021)¶
8.11.1. New Features¶
Added support for access control on a per row basis. Rows may have attached security labels which will cause the row to be accessible or inaccessible by the user depending on the permission labels held by the user.
Frames have a row-label universe, which is the set of available labels that can be applied at the row level. This is applied at frame creation and is currently limited to 128 unique labels.
Added the sandbox concept for ingest of files into xGT. This restricts valid file paths to be within the sandbox and every ingest or egest must be within the sandbox boundary. This location is configurable by the server admin.
Added support for the Cypher keyword WITH.
Added support for the Cypher functions count(expression), count(DISTINCT expression), substring(), toInteger(), abs(), ceil(), float(), and round().
Added the preserve_order parameter to save() that allows writing rows in the exact order they occur in the frame.
Added the delimiter parameter to load() that allows a user to select the column delimiter character.
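The row-level access control described at the top of this list can be sketched with bitmasks, which is one natural way to represent a label universe capped at 128 entries. Everything here is illustrative: LabelUniverse and can_access are hypothetical names, and the rule that a row is visible only when the user holds every label attached to it is an assumption, not a statement of how xGT evaluates labels.

```python
MAX_ROW_LABELS = 128  # limit on the row-label universe stated in the notes


class LabelUniverse:
    """Fixed set of row labels chosen at frame creation (hypothetical)."""

    def __init__(self, labels):
        labels = list(labels)
        if len(set(labels)) != len(labels):
            raise ValueError("labels must be unique")
        if len(labels) > MAX_ROW_LABELS:
            raise ValueError("too many labels for one frame")
        # Assign one bit position per label.
        self._bit = {label: 1 << i for i, label in enumerate(labels)}

    def mask(self, labels):
        """Combine labels into a bitmask; unknown labels raise KeyError."""
        result = 0
        for label in labels:
            result |= self._bit[label]
        return result


def can_access(row_mask, user_mask):
    """Assumed rule: the user must hold every label attached to the row."""
    return row_mask & ~user_mask == 0
```

Under this assumed rule, a row with no labels is visible to every user, and a row labeled with two labels is hidden from a user who holds only one of them.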
8.11.2. Changed¶
Changed logging to always log license info.
Added audit message on password match failure.
Added audit message for permission denied on a protected frame.
Changed load and insert errors to be returned as part of the raised exception on the local Python client instead of being in a separate error frame.
Improved Python client-side error messages for streaming ingest problems.
Eliminated the timeout parameter from schedule_job().
Time type constants can now give fractions of a second with fewer than 6 digits.
Queries without MATCH statements are now allowed.
Improved performance and reduced memory usage of computing cached metrics data for frames.
8.11.3. Fixed¶
Fixed a bug in the Query Planner causing potential memory corruption.
Fixed a potential segfault that could occur during a transaction rollback when acquiring configuration values.
Fixed a bug where a metrics data calculation failure due to an out-of-memory error would infinitely rerun and continue failing.
Fixed a bug where non-property expressions in an ORDER BY clause caused an error if the expression wasn’t in the RETURN statement.
8.11.4. Unresolved¶
Versions of grpcio after 1.30.0 cause connection issues when forking or using multiprocessing with the Python client. If forking or multiprocessing is required, use version 1.30.0 of grpcio.
8.12. 1.4.2 (9/28/2020)¶
8.12.1. Fixed¶
Fixed a bug where PAM authentication could fail when a large number of users were in a group on a system.
8.13. 1.4.1 (9/18/2020)¶
8.13.1. New Features¶
Added more parameters to some of the Python API functions to allow customization of behavior. See the Python API documentation for individual functions for more details.
8.13.2. Changed¶
Changed the error type thrown if a duplicate name is given when creating a namespace or frame from XgtValueError to XgtNameError.
The default admin group for the “xgtdadmin” user is now “xgtd” to align with the group used for the RPM install.
The behavior of the timeout parameter to wait_for_metrics() has changed to unify the behavior with the timeout parameter for other functions. A value of 0 now indicates no timeout and waiting until the metrics computation is completed.
Improved performance of computing cached metrics data for frames.
Improved performance of deleting rows from a frame.
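The timeout convention described above, where 0 means wait with no limit, can be sketched as a generic polling loop. This is not the implementation of wait_for_metrics(): wait_until, its poll_interval parameter, and its boolean return value are hypothetical and chosen only to show the convention.

```python
import time


def wait_until(is_done, timeout=0, poll_interval=0.01):
    """Poll is_done() until it returns True.

    timeout is in seconds; 0 means no timeout (wait until done).
    Returns True when done, or False if a positive timeout elapsed.
    """
    deadline = None if timeout == 0 else time.monotonic() + timeout
    while True:
        if is_done():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

With timeout=0 the loop exits only when the condition becomes true, so callers that need a bound should pass a positive number of seconds.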
8.13.3. Fixed¶
Items can be removed from the job history using a TQL query.
The DiskID program that is used in generating a hardware-locked license now excludes USB devices and is no longer tied to a single file system drive.
8.14. 1.4.0 (8/24/2020)¶
8.14.1. New Features¶
Added support for authenticating users via Linux PAM (Pluggable Authentication Modules). Authentication sources can be specified via the PAM configuration for xgtd. Enterprise authentication through LDAP is supported via a PAM module.
Added support for access control on a per frame basis. Separate controls are provided for Create, Read, Update and Delete operations on each frame.
Access control is based on the concept of security labels which both users and frames possess. The matching of labels from a user to a frame enables or disables access for that user.
The combination of authentication and access control enables multi-user support in xGT.
Added the notion of a session to represent the duration of an authenticated connection to the xgtd server. All user-initiated operations are rooted in an authenticated session.
Added a namespace concept to enable grouping together related data frames. Namespaces are also frames and subject to access control.
Added support for security logging and auditing via the Apache log4cxx library. Security events are logged to a configurable location for eventual auditing.
Added support for runtime configurable logging to xgtd via the g3log library. Logging levels can be changed while the server is running.
8.14.2. Changed¶
Added Python methods to create and drop namespaces to the Connection object.
Added support throughout xGT for specifying fully qualified names for frames. Fully qualified names include the namespace that contains the frame. The separator __ (double underscore) is used to separate the name of the namespace from the name of the frame: e.g. graph__VertexFrame.
Modified transactional support inside xGT to unify it with access control. All user-initiated operations are fully transactional and access controlled.
Modified configuration support in xGT to make it transactional and access controlled.
Improved performance of different TQL aggregator operations (DISTINCT, ORDER BY, group by, etc.).
Improved performance of ingest.
Added support for querying more internal xGT data structures via TQL: configuration, namespaces, job history.
Reduced client-server round trips over the network by moving more functionality server-side (job wait).
Added another Jupyter Notebook example for computing Jaccard scores.
Made the RETURN and INTO TQL clauses optional by supporting query execution for just its side effects (optional RETURN) and supporting returning small results embedded in the Python Job object (optional INTO).
Added support for performance monitoring by counting the number of traversed edges in a TQL query.
Improved CSV parsing support by using a third party library.
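The fully qualified naming convention described in this list, a namespace and a frame name joined by a double underscore, can be handled with two small helpers. These are illustrative only and not part of the xgt package: the function names and the default_namespace parameter are hypothetical, and splitting on the first occurrence of the separator is an assumption.

```python
SEPARATOR = "__"  # double underscore, as described in the notes


def split_qualified_name(name, default_namespace=None):
    """Split 'namespace__frame' into (namespace, frame).

    Names without a separator are returned with default_namespace.
    """
    namespace, sep, frame = name.partition(SEPARATOR)
    if not sep:
        return default_namespace, name
    return namespace, frame


def join_qualified_name(namespace, frame):
    """Build a fully qualified frame name from its two parts."""
    return f"{namespace}{SEPARATOR}{frame}"
```

For example, graph__VertexFrame splits into the namespace graph and the frame VertexFrame, and joining those two parts reproduces the original name.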
8.14.3. Fixed¶
Corrected a problem with query rewriting after the query planning algorithm has modified the execution order. Anonymous bound variables were not getting correctly generated for edge steps.
8.14.4. Unresolved¶
There is no way to remove items from the job history. Over time, a growing job history may consume a significant amount of memory. 1.3 suffered from this as well, but 1.4 adds more information to the job history, so it grows faster.
8.15. 1.3.0 (11/5/2019)¶
8.15.1. New Features¶
Added the ability to compute metrics about frame data and build a cache of these metrics.
Added a query planner that reorders queries to improve query performance. The planner explores many query plans and computes a cost-metric for each plan using the cached metrics. The least-cost plan is chosen for the actual running of the query.
Turned on the metrics cache collection and query planning by default. Metrics cache collection may be turned off via an xgtd configuration variable. Query planning can be turned off by calling set_optimization_level().
Added the Connection object functions wait_for_metrics() and get_metrics_status() to know synchronously or asynchronously when queries to fill the metrics cache have finished running.
Added Cypher support for the string concatenation operator, +, and the toString() function.
A space is now allowed as a separator of the date and time in a DATETIME in addition to T.
Added the ability to collect the number of visited edges for a job. The collection is turned on and off via an xgtd configuration variable.
Added DRM to the xgtd server to support on-premises installations.
8.15.2. Changed¶
Updated the names and format of the xgtd configuration variables given in xgtd.conf. The new format uses multiple levels of JSON objects to group categories of variables.
Changed error_frame_name to be a property instead of a method for all frame types.
Changed the behavior when a transaction fails to now throw an error to the Python client from run_job() and wait_for_job(). The previous behavior was for the functions to complete successfully, and the user had to check the job status to know if the transaction failed.
Added the job status rollback to indicate when a job failed because of a transactional conflict with another job.
Improved transactional consistency when querying for frame existence or metadata information about frames.
Improved the performance of queries over frames that have had data deleted.
Improved the performance and memory usage of some of the aggregators and solution modifiers used in queries.
8.15.3. Fixed¶
Improved the parsing of all input types to better enforce constraints on the types.
Added a check to prevent null values for key columns in vertex and edge frames when inserting or loading data.
Fixed a bug where two queries writing to the same results table could cause a segfault.
Fixed a bug where creating an edge frame with vertex frames that already contained data wasn’t handled properly.
Fixed a bug that severely slowed downloads from URLs with libcurl versions < 7.38 which includes installations on CentOS 7.
Fixed numerous bugs relating to scheduling and managing jobs.
Fixed a segfault that occurred in the xgtd server in some cases when memory is exhausted instead of gracefully stopping the current operation and throwing an error to the user.
Fixed a bug where query results could be added to a pre-existing results table even if the transaction failed.
Fixed a bug where canceling a job didn’t always terminate the job execution on the server.
Fixed a bug where sometimes frames weren’t immediately removed from the server when dropped.
Fixed a bug where the server wasn’t always releasing file descriptors.
Fixed a bug where using the same column for the source and target keys of an edge wasn’t allowed.
Fixed a bug where a floating point value of nan was encoded improperly when being transferred between client and server.
Fixed a bug where the header line was being written twice when saving a CSV file of frame data local to the client.
Fixed a bug where the unsupported Cypher fragment RETURN * caused a segfault. It now throws an error to the Python client.
8.16. 1.2.0 (6/4/2019)¶
8.16.1. New Features¶
Released the xGT Python interface into open source. The xgt package can now be found on the Python Package Index and installed using pip install xgt.
Added support for the Cypher keyword SET for updating non-key properties in frames.
Added support for the Cypher keyword CREATE for adding new vertices and edges to a frame.
Added support for the Cypher keyword MERGE for adding new vertices to a frame if the vertex key doesn’t already exist.
Added support for the Cypher keyword DETACH DELETE for removing vertices and all their incident edges from a frame.
Added support for the Cypher keyword DELETE for removing edges from a frame.
Added a transactional model. All operations are now transactional and safe to be run concurrently.
Added the functions max_user_memory_size() and free_user_memory_size() to Connection so a user can get the maximum and free memory available to them on the server.
Added the function error_frame_name() to TableFrame, VertexFrame, and EdgeFrame that returns the name used for the error table frame generated when there are errors on inserting data into a frame.
Added the .xgtd.conf configuration variable io_threads that limits the number of threads used to read a single input file to improve performance on systems with large thread counts.
8.16.2. Changed¶
Improved handling of out-of-memory situations so that in many cases a transaction that causes an out-of-memory error performs a rollback and leaves the system in a usable state.
Improved error types and messages returned to the user.
Improved error detection for vertex keys to catch duplicates earlier.
Added a check for unsupported Cypher keywords and throw errors when they are used.
Added a check that the columns in a RETURN statement match the schema of a pre-existing results table and throw an error when they don’t.
Added a check that the Cypher functions outdegree() and indegree() aren’t being called on an edge and throw an error when they are.
Removed unneeded files from the xgt library distribution package.
Improved performance of downloading frame data from the server to the client.
Improved performance of inserting data into frames.
Added user documentation for all new features.
8.16.3. Fixed¶
Fixed a bug where the server wasn’t responsive to commands when a query was running.
Fixed a bug where using the Cypher keyword STARTS WITH on a null property incorrectly threw an exception.
Fixed a bug where a group by on multiple Cypher outdegree() function calls yielded incorrect results.
Fixed a bug that occasionally occurred when using ORDER BY in a query.
Fixed a bug where downloading frame data was limited to 4MB.
Fixed a bug on reconnection that occurred when a server unexpectedly exits during a script.
The minimum version of the grpcio Python package required by xgt is increased to 1.20.
Python2 versions below 2.7.10 are now deprecated. Added a deprecation warning for a too low version of Python at xgt module import time.
8.17. 1.1.0 (2/21/2019)¶
8.17.1. New Features¶
Added support for an SSL-encrypted gRPC connection between client and server.
8.17.2. Changed¶
Removed the REST API from the Python library, server, and documentation.
Removed the option to run in ‘local’ (unixsocket) mode as that security need has been replaced by the SSL-encrypted gRPC connection.
Added the ability for the client to detect when the server has restarted and notify the user they need to reestablish a connection.
Improved the hierarchical organization of exception messages.
Improved the text of many errors returned to the Python user.
Improved load and insert error reporting to return a usable frame with the valid data and an error table with all the invalid data and errors instead of throwing on the first error detected in the data.
Improved logging of out-of-memory errors.
Added error reporting when a Cypher statement compares a vertex with an edge.
Added documentation for xGT data types.
Added compiler optimizations to improve performance of some queries.
Improved the client-server connection to reduce latency.
Improved ingestion performance for large edge frames when implicitly filling the vertex frame.
8.17.3. Fixed¶
Fixed numerous bugs where Null values weren’t handled correctly or were handled differently from Cypher’s behavior.
Fixed numerous bugs where datatype conversion wasn’t being handled correctly.
Fixed a bug where xGT ran out of memory when ingesting large edge frames due to fragmentation when plenty of memory was available.
Fixed a deadlock that could occur during load or insert operations.
Fixed some client-server connection bugs when a Jupyter Notebook is used as the client.
8.18. 1.0.0 (12/1/2018)¶
8.18.1. New Features¶
Added ability to query a TableFrame using syntax for querying a VertexFrame.
Names can now contain Unicode characters.
8.18.2. Changed¶
Improved stability and performance of connection between Python client and xgtd server by switching to using gRPC instead of REST as the default communication protocol. The REST protocol is now deprecated.
Added check when creating VertexFrame that the key parameter is given and is a valid column in the schema.
Added check when creating EdgeFrame that the source_key and target_key parameters are given and are valid columns in the schema.
Modified behavior when dropping frames to prevent a user from dropping a frame on which other frames depend.
Modified behavior when creating frames to require a name for the frame.
Disallowed the creation of frames with names containing periods.
Renamed xgtd configuration file from .gemsconfig to .xgtd.conf, and now look for it in /etc if the file doesn’t exist in the user’s home directory.
In xgtd.conf, renamed the property s3_key_id to aws_access_key_id and s3_key to aws_secret_access_key to be consistent with the naming in the AWS credentials file.
Errors are now reported when .xgtd.conf contains invalid property names or values of the wrong type.
Modified exceptions to use new naming scheme.
Improved handling of filename paths passed to I/O commands to make all filename paths unambiguous as to the location of the file.
Added support for loading CSV files where some of the lines have more columns than the required number.
Improved documentation and tutorials.
8.18.3. Fixed¶
Fixed a rare bug where simultaneously creating and dropping frames caused an error.
Fixed a bug where frames with numeric names couldn’t be dropped.
Fixed a bug where using a list of tuples as the schema to create a frame had stopped working.
Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.
Fixed numerous bugs that occurred when loading data.
8.19. 0.18.0 (10/24/2018)¶
8.19.1. Changed¶
Modified the API for creating and destroying graphs. See the Python API documentation for details.
Simplified the specification of source and target keys by supporting only a single key column.
Names now have to be unique across all objects vs. unique only for a particular type of object.
Changed working directory of AWS marketplace daemon to /srv/xgtd/data from /home/ec2-user/.
8.19.2. Fixed¶
Improved type checking in the query compiler to provide better error messages to users for invalid queries.
Improved error messages in some cases to make them more understandable.
Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.
Fixed a bug where exceptions sometimes had malformed JSON.
Fixed a segfault that occasionally occurred in the job manager.
8.20. 0.17.0 (10/1/2018)¶
8.20.1. Changed¶
Added additional error messages.
Added query annotations to more types of invalid Cypher query error messages.
Can now read AWS credentials directly from .aws/credentials.
8.20.2. Fixed¶
Fixed some Unicode bugs.
Fixed a segfault occurring when an exception is thrown while ingesting.
8.21. 0.16.0 (9/1/2018)¶
8.21.1. Changed¶
The query compiler can now automatically infer the vertex or edge frame when there is a single type.
Standardized error message handling between run_job() and wait_for_jobs().
8.21.2. Fixed¶
Improved compatibility between Python 2 and 3.
Fixed spurious warnings about numpy datatype incompatibilities.
Improved error reporting to give more meaningful error messages.
Fixed many cases where errors were not being reported to the user.
Fixed numerous bugs.