Release Notes

1.16.0 (4/4/2024)

New Features

  • List types are now supported in CASE clauses.

  • The outdegree() and indegree() Cypher functions can now be used on expressions that could come from multiple frames.

  • The Cypher function unique_vertices() can now be used in places other than WHERE clauses.

  • Added truncate() Cypher function for temporal types.

  • Added update_columns() method for updating column data.

  • Added the ability for a user to supply separate SSL key and certificate locations when creating a Connection object.

  • Added support for topology deletions and property modifications for labels defined in an OPTIONAL MATCH.

  • Added parameter rows to the frame get_data() method to allow selection of the data for a list of row positions.

  • Added support for saving both CSV and Parquet files to S3.

  • Added support for the large string type for reading / writing Parquet files.

  • Added Cypher support for geospatial point types including the functions point(), distance(), and withinbbox().
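
As a rough illustration of what a distance calculation and a bounding-box test compute, the following Python sketch works on plain 2D Cartesian points. The names Point, distance, and within_bbox are hypothetical stand-ins written for this example, not the xGT Cypher functions, and both the coordinate system and the inclusive box edges are assumptions.

```python
import math
from typing import NamedTuple


class Point(NamedTuple):
    """A 2D Cartesian point (assumed coordinate system for this sketch)."""
    x: float
    y: float


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two Cartesian points."""
    return math.hypot(a.x - b.x, a.y - b.y)


def within_bbox(p: Point, lower_left: Point, upper_right: Point) -> bool:
    """True when p lies inside the axis-aligned box, edges included."""
    return (lower_left.x <= p.x <= upper_right.x
            and lower_left.y <= p.y <= upper_right.y)


origin = Point(0.0, 0.0)
target = Point(3.0, 4.0)
print(distance(origin, target))
print(within_bbox(target, origin, Point(5.0, 5.0)))
```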

Changed

  • Added key columns to all ingest error messages for vertex and edge frames.

  • Improved some error message descriptions.

  • Improved performance when using WITH clause.

  • Improved performance of WGAs when filtering graphs.

  • Improved performance of some queries by applying compiler optimizations in more places.

  • Reduced memory usage of frames in some cases.

  • Improved handling of passing variables of the wrong type to Python functions.

  • Significantly improved performance of ingest in the presence of errors when the suppress_errors parameter is given to load().

  • Changed the behavior when inspecting frames in a namespace to only show frames the user has permissions to see.

  • Changed the behavior when inspecting jobs in the job history to only show jobs belonging to the connected user.

  • Changed S3 authentication from deprecated AWS signature version 2 to AWS signature version 4.
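
For background on the signature change, AWS Signature Version 4 derives a signing key scoped to a date, region, and service through a chain of HMAC-SHA256 operations. The sketch below shows only that key derivation using the Python standard library. The credential, date, and region values are placeholders, and the code is not taken from the xGT implementation.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step of the derivation chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the Signature Version 4 signing key (date_stamp is YYYYMMDD)."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder values, for illustration only.
key = derive_signing_key("EXAMPLE-SECRET", "20240404", "us-east-1", "s3")
print(key.hex())
```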

Fixed

  • Fixed the behavior of null in some cases.

  • Fixed a bug that occurred when the same label was used multiple times in unique_vertices().

  • Fixed some bugs in edge type inferencing on variable length edges.

  • Fixed a bug where the RPM install failed when the “xgtd” group already existed.

  • Fixed a bug where, in a few corner cases, get_data() incorrectly converted data to the Python format.

  • Fixed a bug where the offset and length parameters to get_data() were interpreted incorrectly for frames with row-level permissions.

  • Fixed a bug when using client side RowIDs with row-level permissions.

  • Fixed a bug where the degree Cypher functions could give an incorrect result when the source or target vertex frames had row permissions.

  • Fixed a crash that occurred when loading Parquet files from a URL if the connection timed out.

  • Fixed crashes that were caused by unhandled exceptions.

  • Fixed a bug where AWS credentials weren’t being passed from the client’s low-code ‘create frame’ methods to the xGT server, causing authentication to fail.

  • Fixed crash when some configuration values were invalid during boot.

  • Fixed crashes caused by inconsistent container tracking states that occurred during transactional retries.

  • Fixed crashes resulting from simultaneous insertion into and deletion from a namespace.

  • Fixed crashes caused by partial commit states.

  • Fixed Connection object sometimes raising an error when the default namespace already existed.

1.15.0 (11/29/2023)

New Features

  • Added support for authenticating using PKI.

  • Added support for automatically creating a private namespace for each user.

  • Added functions for adding, deleting, and reordering frame columns.

  • Added support for DISTINCT on edge entity variables crossing a WITH boundary.

  • Added filtering to CALL of whole graph algorithms. This feature is in beta.

  • Added support for using a list of frame identifiers to define a CALL graph.

  • Added support for accessing list properties of entity variables that can come from multiple frames.

  • Added parameter row_filter to the frame get_data() and save() methods to allow filtering and modification of frame data along with column mapping.

  • Added support for aggregating durations using sum() and avg().

  • Added epochSeconds as a component of datetime.

  • Added support for using arbitrary expressions in the duration component map constructor.

  • Added support for the component week to the duration component map constructor.

  • Added support for passing a component map to date(), time(), and datetime().

  • Added support for generating the current date, time, and datetime in a query.
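
The duration and epochSeconds items above have close analogues in Python's datetime module, which may help when reasoning about expected results on the client side. The sketch below is plain Python, not Cypher, and the sample values are made up: it builds a duration from a week component, sums and averages a list of durations, and reads the epoch seconds of a datetime.

```python
from datetime import datetime, timedelta, timezone

# Durations built from components, including a week component.
durations = [
    timedelta(weeks=1),
    timedelta(days=2, hours=12),
    timedelta(hours=36),
]

# Sum and average, analogous to sum() and avg() over durations.
total = sum(durations, timedelta())
average = total / len(durations)

# Whole seconds since the Unix epoch for a UTC datetime.
moment = datetime(2023, 11, 29, 12, 0, 0, tzinfo=timezone.utc)
epoch_seconds = int(moment.timestamp())

print(total, average, epoch_seconds)
```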

Changed

  • Changed behavior of DISTINCT on entity variables to apply to the entity identifier instead of the union of the entity’s properties.

  • Renamed total_rows() to num_rows().

  • Renamed input_filter parameter for the load() and insert() methods to row_filter.

  • Renamed frame_to_input_column_mapping parameter for the frame insert() and low-code create frame methods to column_mapping.

  • Improved accuracy of reported memory usage.

  • Reduced memory usage of frames under some scenarios.

  • Improved performance of evaluating Cypher expressions.
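
The DISTINCT change listed above can alter result counts. The following Python sketch models entities as dictionaries with a hypothetical id field and contrasts the two notions of distinctness; it is an illustration of the idea only, with made-up data, and not xGT code.

```python
rows = [
    {"id": 1, "name": "alice", "city": "Oslo"},
    {"id": 2, "name": "alice", "city": "Oslo"},  # different entity, same properties
    {"id": 1, "name": "alice", "city": "Oslo"},  # same entity matched twice
]


def distinct_by_properties(rows):
    """Old behavior (as described): deduplicate on the property values."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted((k, v) for k, v in row.items() if k != "id"))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def distinct_by_identifier(rows):
    """New behavior (as described): deduplicate on the entity identifier."""
    seen, out = set(), []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            out.append(row)
    return out


print(len(distinct_by_properties(rows)), len(distinct_by_identifier(rows)))
```

In this sample, two different entities that happen to share property values collapse into one row under the property-based rule but stay separate under the identifier-based rule.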

Fixed

  • Fixed some cases where null was handled inconsistently for some Cypher functions and features.

  • Fixed some cases where column mapping when reading into row-level frames from files was incorrect.

  • Fixed issues with pandas API changing in version 2.1.0.

  • Fixed deadlock in client CSV load.

1.14.1 (7/17/2023)

New Features

  • Added the ability to SET an entity variable generated by CREATE in the same WITH section.

Fixed

  • Fixed bug where a CREATE wasn’t executed when combined with a SET in the same WITH section and the SET came last.

  • Fixed bug where some Cypher functionality used on an entity variable that could come from multiple frames incorrectly raised exceptions.

  • Fixed bug where a property name that was the same as a temporal component (month, hour, etc.) would incorrectly be identified as a temporal component instead of a property.

1.14.0 (6/19/2023)

New Features

  • Added unsigned integer type.

  • Added support for IPv6 addresses to IPADDRESS type.

  • Added parameter input_filter to the frame load() and insert() methods to allow filtering and modification of input data along with column mapping.

  • Added support to accept a pyarrow Table as input data to the frame insert() method.

  • RowIDs can be passed as Cypher parameters.

  • RowIDs can be returned from Cypher queries.

  • RowIDs can be returned to the client using the frame and job get_data() methods and are supported for Python list of lists and pandas DataFrame formats.

  • Added id() Cypher function used to construct RowIDs and indicate returning a RowID in a RETURN clause instead of expanding to all properties.

  • Added the ability to get components of all temporal types using variable.component syntax in a Cypher query.

  • Added the ability to have a CALL clause follow a WITH clause.

  • Added the ability to set all the properties from one entity variable to another using SET.

  • Added support for aliasing entity variables in a Cypher query.

  • Added the frame drop_frames() method to allow deleting multiple frames in a single call.

  • Added parameter columns to frame and job get_data() methods to allow selecting which columns to download.

  • Added key_column property to VertexFrame to return the key column position.

  • Added source_key_column and target_key_column properties to EdgeFrame to return the source and target key column positions.

  • Added support for licensing based on sockets and cores.

Changed

  • Results stored in jobs now contain all the rows and are associated with a Connection object. The results are deleted when the connection is terminated.

  • WGA vertex input parameters are now RowIDs instead of (frame name, vertex) pairs.

  • WGA vertex output parameters are now RowIDs instead of (frame name, vertex) pairs. The outputs are now similar to graph elements and can be used to get properties, etc.

  • Removed getDay(), getHour(), getMinute(), getSecond(), and getMicrosecond() Cypher functions for duration types. Use the new variable.component syntax to access the duration components.

  • Added format parameter to the frame and job get_data() methods to select the output data type. The methods get_data_pandas() and get_data_arrow() are now deprecated.

  • Added get_frame() and get_frames() methods that return frame objects regardless of the type. The methods get_table_frame(), get_vertex_frame(), get_edge_frame(), get_table_frames(), get_vertex_frames(), and get_edge_frames() are now deprecated.

  • Calls to frame load(), save(), insert(), and get_data() can now be cancelled.

  • Improved metrics cache queries performance.

Fixed

  • Fixed bug where null values in a list were sometimes not returned correctly to the client.

  • Fixed bug where runtime-generated nulls passed to some Cypher functions incorrectly raised an exception.

  • Fixed bug where CREATE on a table frame raised an exception if properties were specified.

  • Fixed bug where in some cases using UNWIND on a non-list wasn’t raising an exception.

  • Fixed bug where deleting a row in a row-level frame using DELETE without a RETURN incorrectly raised an exception.

  • Fixed bug where a Cypher query incorrectly raised an exception when both sides of an AND in the WHERE clause constructed a constant type.

  • Fixed bug where creating and deleting frames and namespaces didn’t preempt metrics cache queries.

1.13.1 (2/17/2023)

New Features

  • Added parameter delimiter to the frame save() methods to allow selecting the column separator character when writing a CSV file.

  • Added direction output to breadth_first_search() that returns whether the search traversed the edge in the forward or reverse direction.

Changed

  • Can now read from URLs (including S3 buckets) using the low-code create frame methods.

  • Can now read from mixed URLs, server files, and client files using the frame load() methods and the low-code create frame methods.

  • The encoding of strings to time now supports a leading T to adhere to ISO 8601.

  • The Python decimal and timedelta objects can now be passed as Cypher parameters.

  • Improved error text for ingest errors from Parquet and CSV files.
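
The leading T mentioned above is the ISO 8601 time designator. A minimal Python sketch of accepting a time string with or without it is shown below; parse_time is a hypothetical helper written for this example and does not reflect how the server parses its input.

```python
from datetime import time


def parse_time(text: str) -> time:
    """Parse an ISO 8601 time string, tolerating an optional leading 'T'."""
    if text.startswith("T"):
        text = text[1:]
    return time.fromisoformat(text)


print(parse_time("T12:30:45"))
print(parse_time("12:30:45"))
```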

Fixed

  • Fixed bug in encoding fractional seconds from a string to a duration.

  • Fixed bug where inserting a list of dates into a text column yielded the wrong text.

  • Fixed bug where a header mode of IGNORE didn’t work for the low-code create frame methods.

  • Fixed bug where the metrics cache was off by default.

1.13.0 (1/27/2023)

New Features

  • Added support for accessing properties of vertex and edge elements in paths.

  • Added support for accessing properties of bound edges from edge steps with multiple frames.

  • Added support for CREATE for table frames.

  • Added DURATION type.

  • Added support for multiple edge frames on a frame step in a WHERE pattern.

  • Added support for inserting Python ipaddress type into IPADDRESS frame columns.

  • Added support for inserting Python decimal type into INT, FLOAT, and TEXT frame columns.

  • Added support for passing Python date, time, datetime, and ipaddress objects as Cypher parameters.

  • Added support for comparing dates and datetimes.

  • Added support for reading Parquet files from http:// and ftp:// locations and S3 buckets.

  • Added the ability to use wildcard characters in filenames when loading data from both the client and server filesystems.

  • Added support for using wildcards or passing lists of filenames to the low-code create frame methods.

  • Added support for LIST columns to the low-code create frame methods.

  • Added on_duplicate_keys parameter to create_vertex_frame_from_data() that allows choosing how to handle duplicate vertices.

  • Added frame_to_input_column_mapping parameter to the frame insert() methods.

  • Added all_edges parameter to breadth_first_search() to return all edges traversed by the search.

  • Added support for using LDAP or Kerberos for Docker xGT authentication.
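
To illustrate how a mix of wildcard patterns and literal filenames can expand into a concrete file list, here is a small Python sketch using the standard fnmatch module. The helper expand_paths and the file names are made up for this example; the exact wildcard rules xGT applies are not described in these notes and may differ.

```python
import fnmatch


def expand_paths(patterns, available):
    """Expand shell-style patterns against known names, keeping first-seen order."""
    matched = []
    for pattern in patterns:
        for name in fnmatch.filter(available, pattern):
            if name not in matched:
                matched.append(name)
    return matched


available = ["edges_2022.csv", "edges_2023.csv", "vertices.csv", "notes.txt"]
print(expand_paths(["edges_*.csv", "vertices.csv"], available))
```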

Changed

  • Changed get_data() and get_data_pandas() to return native types for DATE, TIME, DATETIME, and IPADDRESS columns instead of strings.

  • Changed get_data_arrow() to return native types for DATE, TIME, and DATETIME columns instead of strings.

  • Improved Cypher coverage of using path variables and elements of paths.

  • Improved performance and reduced memory usage for some usages of WGAs.

  • Improved type inferencing for edge steps with multiple frames and for WHERE patterns.

  • Improved consistency of inserting scalar data across all input methods and converting types to string.

  • Improved type constructors (such as date()) to work with any valid string expression, not just constants.

  • Renamed frame_to_file_column_mapping parameter for the frame insert() and low-code create frame methods to frame_to_input_column_mapping.

Fixed

  • Fixed bug where sometimes Connection objects were not properly cleaned up.

  • Fixed some corner case bugs for using lists in Cypher queries.

1.12.1 (10/19/2022)

Changed

  • Exposed config files to Docker and AWS Marketplace users.

  • Added mutual SSL mode, and client and chain certificates are now only required for mutual SSL mode.

Fixed

  • Fixed bug where a line of only white space in a query caused a compile error.

  • Fixed bug where CSV files weren’t able to be loaded from an S3 bucket.

  • Fixed bug where writing Parquet files with null data sometimes wrote the wrong data.

  • Fixed bug where a query returning a single row with an empty multi-level list sometimes caused a segfault.

  • Fixed bug where null values caused the wrong result for some global aggregators.

  • Fixed bug when executing some variable length edge traversal queries.

  • Fixed bug where some list comprehension expressions returned the wrong result.

1.12.0 (9/2/2022)

New Features

  • Added support for path variables.

  • Added Cypher functions nodes(), relationships(), and length() for path variables.

  • Added weakly connected components algorithm to the whole graph algorithm (WGA) catalog.

  • Added strongly connected components algorithm to the WGA catalog.

  • Added support for multiple MATCH statements within a query.

  • Added support for CREATE with multiple edges and vertices.

  • Added support for list properties when using MERGE, CREATE, and SET.

  • Added support for null entity returns from an OPTIONAL MATCH.

  • Added Cypher support for undirected edges where source and target vertex frames are the same.

  • Added support for list comprehension.

  • Added Cypher support for ^ (exponentiation) and XOR operators.

  • Added Cypher support for % (modulus) operator for float values.

  • Added parameter on_duplicate_keys to VertexFrame’s insert() and load() to allow skipping vertices if the key or all the properties are the same instead of raising an exception.

  • Added support for single-level lists as query parameters.
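
The duplicate-key handling described for on_duplicate_keys above can be pictured with the following Python sketch. It models a vertex frame as a dictionary keyed by vertex key. The function insert_vertices and the mode names "error", "skip", and "skip_same" are assumptions chosen for illustration; consult the Python API documentation for the actual accepted values.

```python
def insert_vertices(frame, rows, on_duplicate_keys="error"):
    """Insert (key, properties) pairs into a dict that stands in for a frame.

    Assumed modes for this sketch:
      "error"     - raise on any duplicate key
      "skip"      - ignore rows whose key already exists
      "skip_same" - ignore a duplicate only if all properties match
    """
    for key, props in rows:
        if key not in frame:
            frame[key] = props
        elif on_duplicate_keys == "skip":
            continue
        elif on_duplicate_keys == "skip_same" and frame[key] == props:
            continue
        else:
            raise ValueError(f"duplicate vertex key: {key!r}")
    return frame


frame = insert_vertices({}, [("a", {"age": 1}), ("a", {"age": 2})],
                        on_duplicate_keys="skip")
print(frame)
```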

Changed

  • Improved breadth-first search and PageRank performance in the WGA catalog.

  • Improved type inference for pandas and pyarrow.

  • Further improved type support for Arrow and Parquet.

  • Improved access control for Arrow.

  • Added default_namespace parameter to Connection() that initializes the connection with a default namespace.

  • Improved cached metrics data for frames.

Fixed

  • Fixed bug where leading comments for a query would raise an exception.

  • Fixed bug where header line processing ignored delimiter.

  • Fixed bug introduced in 1.11 where transferring data between client and server no longer worked for all type conversions.

1.11.0 (6/17/2022)

New Features

  • A whole graph algorithm (WGA) catalog callable from TQL. Initial included algorithms are PageRank and Breadth-First Search (BFS).

  • Automatic xGT frame creation using an inferred schema derived from Pandas frames, Arrow tables, or CSV files.

  • Single sign-on Kerberos-based authentication via GSSAPI libraries.

Changed

  • Improved type support for Parquet and Arrow.

  • Improved support for very large strings in Arrow.

  • Support for row-level frames with Parquet files.

  • Ingest transactions now stop on the first encountered error instead of trying to ingest as many rows as possible.

  • Target release changed to RedHat Enterprise Linux (RHEL) v8.

  • Moved to a new version of Google’s protobuf client package to resolve compatibility problems.

  • Improved type support for jobs with immediate results to cover all xGT types.

  • Improved the performance of data transfers between the xGT client and server.

  • Allow for CSV headers with quotes in them.

Fixed

  • Fixed an issue where queries with multiple mandatory MATCH patterns were not reported as erroneous.

  • Fixed issues with supporting frame and namespace names with symbols.

1.10.1 (3/7/2022)

New Features

  • Added the ability for load() to map schema columns in the frame to columns in a Parquet file using the frame_to_file_column_mapping parameter.

Changed

  • The xgtd Arrow Flight server now uses the same port as the xgtd server, 4367.

  • Improved the performance of some queries, notably count(*) queries.

  • Can now read in times that have more than 6 digits of fractional second precision. The extra digits are rounded to 6. This makes inserting time data from pandas dataframes easier.
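
The rounding of extra fractional-second digits noted above can be sketched in Python as follows. The helper parse_time_rounded is hypothetical, and the round-half-up rule on the seventh digit is an assumption, since these notes do not state which rounding mode the server uses.

```python
from datetime import datetime, timedelta


def parse_time_rounded(text: str):
    """Parse HH:MM:SS[.fraction], rounding the fraction to 6 digits."""
    whole, _, frac = text.partition(".")
    base = datetime.strptime(whole, "%H:%M:%S")
    micros = 0
    if frac:
        digits = frac.ljust(7, "0")       # guarantee a seventh digit to inspect
        micros = int(digits[:6])
        if int(digits[6]) >= 5:           # assumed round-half-up
            micros += 1
    # Adding a timedelta lets a carry ripple into seconds, minutes, and hours.
    return (base + timedelta(microseconds=micros)).time()


print(parse_time_rounded("12:34:56.1234567"))
```

A nanosecond-precision value such as those commonly produced by pandas timestamps would pass through this kind of rounding before being stored at microsecond precision.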

Fixed

  • Fixed some corner case bugs with transferring lists to / from the client and when using lists in Arrow flights.

  • Fixed handling of null columns in Arrow flights.

  • Fixed a bug where an edge frame could be created where the type of the source and target columns didn’t match the corresponding vertex key column type.

  • Fixed some issues where symbols in names weren’t always handled correctly.

1.10.0 (2/11/2022)

New Features

  • Added support for writing to Parquet files on the server filesystem (xgtd://).

  • Added support for xGT IP addresses as strings in Parquet files.

  • Egesting to multiple files from a single frame is now supported.

  • Added the ability for load() to map schema columns in the frame to columns in a CSV file using the frame_to_file_column_mapping parameter.

  • The xGT server is now an Arrow Flight endpoint. Arrow Flight clients can connect to it and insert Flight data into frames and get data from frames as flights.

  • Cypher lists can now be inserted into and retrieved from frames in the client.

Changed

  • The logger now rotates the log files and ages off old ones to avoid filling up disk space.

  • Improved performance and reduced memory usage for LIMIT when used with group by or DISTINCT.

  • The configuration setting license.server was renamed to license.location.

Fixed

  • Fixed a bug where vertex labels carried across a WITH statement weren’t allowed in unique_vertices().

  • Fixed a bug where save() incorrectly handled the offset and length parameters.

  • Fixed a bug where, for some queries that use both UNWIND and WITH, an incorrect result was returned.

1.9.1 (1/17/2022)

Fixed

  • Fixed a bug where, for some queries that allow multiple types for an edge in a MATCH pattern ([:REL1 | :REL2]), an incorrect result was returned.

  • Fixed a bug where an exception was incorrectly raised in some cases when combining UNION and WITH.

1.9.0 (11/1/2021)

New Features

  • Added support for UNION and UNION ALL.

  • Added support for reading from Parquet files on the server filesystem (xgtd://).

  • Added support for ingesting integers in hexadecimal notation.

  • Added support for the list concatenation operator: +.

  • Added support for list subscripting and slicing.

  • Added support for the list functions keys(), reverse(), size(), and tail().

  • Added support for the string function size().
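
As an illustration of accepting integers in either decimal or hexadecimal notation during ingest, the Python sketch below dispatches on a 0x prefix. The helper parse_integer is hypothetical, and the accepted syntax (optional sign, case-insensitive prefix) is an assumption rather than a statement of xGT's exact rules.

```python
def parse_integer(text: str) -> int:
    """Parse a decimal or 0x-prefixed hexadecimal integer string."""
    text = text.strip()
    unsigned = text.lstrip("+-")
    base = 16 if unsigned.lower().startswith("0x") else 10
    return int(text, base)


print(parse_integer("0xFF"), parse_integer("42"), parse_integer("-0x1f"))
```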

Changed

  • Immediate results can now be returned with schedule_job(), and they are now stored in the job history.

  • Improved performance of insert() from both Python lists and Pandas frames.

  • Improved performance of ORDER BY when used in conjunction with SKIP or LIMIT.

  • Improved performance of SKIP or LIMIT with large values.

  • Improved performance of group by queries.

  • Improved performance of some queries using a WITH clause.

  • Limited number of files returned in load() error message to prevent extremely large messages.

  • Some improvements to query planning and expression optimization.

Fixed

  • Fixed a bug when using OPTIONAL MATCH.

  • Fixed a bug where missing integer values in Pandas frames raised an exception during insert() instead of creating null values.

1.8.0 (8/30/2021)

New Features

  • Added support for a default namespace.

  • Added support for OPTIONAL MATCH with WHERE <pattern>.

  • Added support for the group by collect aggregator.

  • Added support for collect on lists for both global and group by aggregation.

  • Added support for lists with count, min, and max aggregators, both global and group by.

  • Added support for lists as group by keys.

  • Added support for lists as ORDER BY keys.

  • Added support for lists as DISTINCT keys.

  • Added support for nested lists everywhere lists are currently allowed.

Changed

  • Improved query compile time.

  • Improved query planning.

  • Improved performance of outdegree() and indegree().

  • Improved performance of ORDER BY, especially when used in conjunction with LIMIT.

  • Improved performance of global aggregators.

Fixed

  • Fixed bugs where some aggregator corner case usages, both global and group by, returned the wrong answer and some caused a segfault.

  • Fixed corner case bug in range().

1.7.1 (7/7/2021)

Fixed

  • Fixed bug where some corner case usages of UNWIND caused a segfault.

  • Fixed bug where space around column entries in CSV file caused an error. Now, the surrounding space is ignored.

  • Fixed bug where the combination of LIMIT and global aggregators sometimes caused the wrong number of results.

1.7.0 (6/21/2021)

New Features

  • Added support for non-nested literal lists.

  • Added support for the collect global aggregator to generate non-nested lists.

  • Added support for the UNWIND keyword for non-nested lists.

  • Added support for the range() function.

  • Added support for queries with WHERE <pattern> and WHERE NOT <pattern>.

  • Added support for OPTIONAL MATCH.

  • Added support for multi-edge frame traversals.

  • Added support for Cypher functions rand() and sign(expression).

Changed

  • Improved query planning.

  • Improved error messages for malformed queries.

  • Improved error messages for invalid UTF-8 data.

  • Added line numbers for error messages in ingest.

  • Changed license management to log expiration information.

  • The max threads configuration setting now takes hyperthreading into account.

  • Changed the RPM installation to install the latest xGT Python client.

  • Changed the installation RPM to include a tarball of xGT documentation.

Fixed

  • Fixed bug that could occur with long lines in ingest.

Unresolved

  • The query planner improvements in a few cases increased compile times.

  • There were also a few cases where the query planner chose a plan that reduced query performance compared to 1.6.

1.6.0 (4/2/2021)

New Features

  • Added new mechanism and license management system for core-based licensing through X-Formation’s LM-X license manager suite.

  • Added support for the Cypher functions sum(DISTINCT expression), avg(DISTINCT expression), max(DISTINCT expression), min(DISTINCT expression), toBoolean(), and toFloat().

  • Added support for queries with bounded variable length paths.

  • Added support for Cypher parameters designated in queries by $varName.

  • Added support for Cypher CASE expressions.

  • Added support for ingesting DATE, TIME, and DATETIME, with optional time zone, which is then converted to UTC.

  • Improved support for query patterns without a direct syntactic connection.

  • Added query planner support for WITH clause multi-section queries.
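
The time zone handling described above (values with an optional zone are converted to UTC) can be pictured with this Python sketch. The helper to_utc is hypothetical, and treating zone-less values as already being in UTC is an assumption made for the example.

```python
from datetime import datetime, timezone


def to_utc(text: str) -> datetime:
    """Parse an ISO 8601 datetime and normalize it to UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: a value without a zone is taken to be UTC already.
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(to_utc("2021-04-02T10:00:00-07:00"))
```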

Changed

  • Ended Python 2 support.

  • Changed RPM installation to no longer overwrite xgtd pam.d file if one already exists.

  • Changed xGT server to exit gracefully for unhandled exceptions and print exceptions when possible.

  • Changed ingest to ignore empty lines in CSV files.

  • Improved ingest error messages.

  • Improved logging.

  • Improved performance of toInteger() for date and datetime.

Fixed

  • Fixed potential segfault that could occur in the query scheduler.

  • Fixed potential segfault that could occur with a key to non-key comparison.

1.5.1 (2/19/2021)

Changed

  • Improved performance of group by when using DISTINCT.

Fixed

  • Fixed a performance bug on some queries introduced in 1.5.0.

1.5.0 (2/8/2021)

New Features

  • Added support for access control on a per row basis. Rows may have attached security labels which will cause the row to be accessible or inaccessible by the user depending on the permission labels held by the user.

  • Frames have a row-label universe, which is the set of available labels that can be applied at the row level. This is applied at frame creation and is currently limited to 128 unique labels.

  • Added the sandbox concept for ingest of files into xGT. This restricts valid file paths to be within the sandbox and every ingest or egest must be within the sandbox boundary. This location is configurable by the server admin.

  • Added support for the Cypher keyword WITH.

  • Added support for the Cypher functions count(expression), count(DISTINCT expression), substring(), toInteger(), abs(), ceil(), floor(), and round().

  • Added the preserve_order parameter to save() that allows writing rows in the exact order they occur in the frame.

  • Added the delimiter parameter to load() that allows a user to select the column delimiter character.
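
The sandbox restriction described above amounts to rejecting any path that resolves outside a configured root directory. A minimal Python sketch of such a check follows. The function resolve_in_sandbox and the /srv/sandbox root are hypothetical and POSIX-style paths are assumed; the server's actual checks are not documented here.

```python
import os


def resolve_in_sandbox(sandbox_root: str, requested: str) -> str:
    """Return the resolved path, or raise if it escapes the sandbox root."""
    root = os.path.realpath(sandbox_root)
    # An absolute 'requested' path replaces root in join() and is then rejected.
    candidate = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError(f"{requested!r} is outside the sandbox")
    return candidate


print(resolve_in_sandbox("/srv/sandbox", "data/edges.csv"))
```

Resolving the path before comparing it is what catches both ".." components and symbolic links that point outside the root.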

Changed

  • Changed logging to always log license info.

  • Added audit message on password match failure.

  • Added audit message for permission denied on a protected frame.

  • Changed load and insert errors to be returned as part of the raised exception on the local Python client instead of being in a separate error frame.

  • Improved Python client-side error messages for streaming ingest problems.

  • Eliminated the timeout parameter from schedule_job().

  • Time type constants can now give fractions of a second with less than 6 digits.

  • Queries without MATCH statements are now allowed.

  • Improved performance and reduced memory usage of computing cached metrics data for frames.

Fixed

  • Fixed a bug in the Query Planner causing potential memory corruption.

  • Fixed a potential segfault that could occur during a transaction rollback when acquiring configuration values.

  • Fixed a bug where a metrics data calculation failure due to an out-of-memory error would infinitely rerun and continue failing.

  • Fixed a bug where non-property expressions in an ORDER BY clause caused an error if the expression wasn’t in the RETURN statement.

Unresolved

  • Versions of grpcio after 1.30.0 cause connection issues when forking or using multiprocessing with the Python client. If forking or multiprocessing is required, use version 1.30.0 of grpcio.

1.4.2 (9/28/2020)

Fixed

  • Fixed a bug where PAM authentication could fail when a large number of users were in a group on a system.

1.4.1 (9/18/2020)

New Features

  • Added more parameters to some of the Python API functions to allow customization of behavior. See the Python API documentation for individual functions for more details.

Changed

  • Changed the exception type raised if a duplicate name is given when creating a namespace or frame from XgtValueError to XgtNameError.

  • The default admin group for the “xgtdadmin” user is now “xgtd” to align with the group used for the RPM install.

  • The behavior of the timeout parameter to wait_for_metrics() has changed to unify the behavior with the timeout parameter for other functions. A value of 0 now indicates no timeout and waiting until the metrics computation is completed.

  • Improved performance of computing cached metrics data for frames.

  • Improved performance of deleting rows from a frame.
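
The timeout convention described above for wait_for_metrics(), where 0 means wait without a limit, can be sketched with a generic polling loop. The function wait_until, its polling interval, and its boolean return value are hypothetical and only illustrate the convention; they are not the xGT API.

```python
import time


def wait_until(is_done, timeout=0, poll_interval=0.01):
    """Poll is_done(); timeout=0 means no limit. Returns False on timeout."""
    deadline = None if timeout == 0 else time.monotonic() + timeout
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


# A stand-in for a computation that finishes after a few polls.
remaining = [3]


def finished():
    remaining[0] -= 1
    return remaining[0] <= 0


print(wait_until(finished, timeout=0))
```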

Fixed

  • Items can be removed from the job history using a TQL query.

  • The DiskID program that is used in generating a hardware-locked license now excludes USB devices and is no longer tied to a single file system drive.

1.4.0 (8/24/2020)

New Features

  • Added support for authenticating users via Linux PAM (Pluggable Authentication Modules). Authentication sources can be specified via the PAM configuration for xgtd. Enterprise authentication through LDAP is supported via a PAM module.

  • Added support for access control on a per frame basis. Separate controls are provided for Create, Read, Update and Delete operations on each frame.

  • Access control is based on the concept of security labels which both users and frames possess. The matching of labels from a user to a frame enables or disables access for that user.

  • The combination of authentication and access control enables multi-user support in xGT.

  • Added the notion of a session to represent the duration of an authenticated connection to the xgtd server. All user-initiated operations are rooted in an authenticated session.

  • Added a namespace concept to enable grouping together related data frames. Namespaces are also frames and subject to access control.

  • Added support for security logging and auditing via the Apache log4cxx library. Security events are logged to a configurable location for eventual auditing.

  • Added support for runtime configurable logging to xgtd via the g3log library. Logging levels can be changed while the server is running.

Changed

  • Added Python methods to create and drop namespaces to the Connection object.

  • Added support throughout xGT for specifying fully qualified names for frames. Fully qualified names include the namespace that contains the frame. The separator __ (double underscore) is used to separate the name of the namespace from the name of the frame: e.g. graph__VertexFrame.

  • Modified transactional support inside xGT to unify it with access control. All user-initiated operations are fully transactional and access controlled.

  • Modified configuration support in xGT to make it transactional and access controlled.

  • Improved performance of different TQL aggregator operations (DISTINCT, ORDER BY, group by, etc.).

  • Improved performance of ingest.

  • Added support for querying more internal xGT data structures via TQL: configuration, namespaces, job history.

  • Reduced client-server round trips over the network by moving more functionality server-side (job wait).

  • Added another Jupyter Notebook example for computing Jaccard scores.

  • Made the RETURN and INTO TQL clauses optional by supporting query execution for just its side effects (optional RETURN) and supporting returning small results embedded in the Python Job object (optional INTO).

  • Added support for performance monitoring by counting the number of traversed edges in a TQL query.

  • Improved CSV parsing support by using a third party library.
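
The fully qualified naming convention described above, with a double underscore between the namespace and the frame name, can be handled on the client side with a small helper such as the one sketched below. The function split_qualified_name is hypothetical, and splitting at the first separator is an assumption.

```python
def split_qualified_name(name: str):
    """Split 'namespace__frame' into (namespace, frame).

    Returns (None, name) when no '__' separator is present.
    """
    namespace, separator, frame = name.partition("__")
    if not separator:
        return None, name
    return namespace, frame


print(split_qualified_name("graph__VertexFrame"))
print(split_qualified_name("VertexFrame"))
```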

Fixed

  • Corrected a problem with query rewriting after the query planning algorithm has modified the execution order. Anonymous bound variables were not getting correctly generated for edge steps.

Unresolved

  • There is no way to remove items from the job history. Over time, a growing job history may consume a significant amount of memory. 1.3 suffered from this as well, but 1.4 adds more information to the job history, so it grows faster.

1.3.0 (11/5/2019)

New Features

  • Added the ability to compute metrics about frame data and build a cache of these metrics.

  • Added a query planner that reorders queries to improve query performance. The planner explores many query plans and computes a cost metric for each plan using the cached metrics. The least-cost plan is then used to run the query.

  • Turned on the metrics cache collection and query planning by default. Metrics cache collection may be turned off via an xgtd configuration variable. Query planning can be turned off by calling set_optimization_level().

  • Added the Connection object methods wait_for_metrics() and get_metrics_status() for determining, synchronously or asynchronously, when the queries that fill the metrics cache have finished running.

  • Added Cypher support for the string concatenation operator, +, and the toString() function.

  • A space is now allowed, in addition to T, as the separator between the date and the time in a DATETIME.

  • Added the ability to collect the number of visited edges for a job. The collection is turned on and off via an xgtd configuration variable.

  • Added DRM to the xgtd server to support on-premises installations.
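
These notes do not describe the planner's cost model. The sketch below is only a toy illustration of cost-based plan selection in general: it enumerates candidate orderings, scores each one from cached row counts, and keeps the cheapest. The frame names, the row counts, and the cost formula are all assumptions made for the example and are not taken from xGT.

```python
from itertools import permutations

# Hypothetical cached metrics: number of rows in each frame.
cached_row_counts = {"People": 1_000, "Companies": 50, "Cities": 10}


def plan_cost(order, row_counts):
    """Toy cost: sum of the intermediate result sizes, where each
    intermediate size is the running product of the frame sizes so far."""
    cost = 0
    intermediate = 1
    for frame in order:
        intermediate *= row_counts[frame]
        cost += intermediate
    return cost


def choose_plan(frames, row_counts):
    """Explore every ordering and return the one with the least cost."""
    return min(permutations(frames), key=lambda order: plan_cost(order, row_counts))


best = choose_plan(["People", "Companies", "Cities"], cached_row_counts)
print(best, plan_cost(best, cached_row_counts))
```

Under this toy cost model, starting from the smallest frame keeps the intermediate results small, which is why that ordering is selected.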

Changed

  • Updated the names and format of the xgtd configuration variables given in xgtd.conf. The new format uses multiple levels of JSON objects to group categories of variables.

  • Changed error_frame_name to be a property instead of a method for all frame types.

  • Changed the behavior when a transaction fails: run_job() and wait_for_job() now raise an exception to the Python client. Previously, these functions completed successfully and the user had to check the job status to learn whether the transaction had failed.

  • Added the job status rollback to indicate when a job failed because of a transactional conflict with another job.

  • Improved transactional consistency when querying for frame existence or metadata information about frames.

  • Improved the performance of queries over frames that have had data deleted.

  • Improved the performance and memory usage of some of the aggregators and solution modifiers used in queries.
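
As an illustration of grouping configuration variables in multiple levels of JSON objects, the sketch below parses a small configuration and flattens it into dotted names. The category and variable names used here ("io", "threads", "metrics", "cache", "enabled") are hypothetical placeholders and are not the actual xgtd.conf variables.

```python
import json

# Hypothetical configuration text; the real xgtd.conf variable names are
# not given in these notes.
CONFIG_TEXT = """
{
  "io": {"threads": 8},
  "metrics": {"cache": {"enabled": true}}
}
"""


def flatten(config, prefix=""):
    """Flatten nested JSON objects into a dict keyed by dotted names."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


settings = flatten(json.loads(CONFIG_TEXT))
print(settings)  # {'io.threads': 8, 'metrics.cache.enabled': True}
```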

Fixed

  • Improved the parsing of all input types to better enforce constraints on the types.

  • Added a check to prevent null values for key columns in vertex and edge frames when inserting or loading data.

  • Fixed a bug where two queries writing to the same results table could cause a segfault.

  • Fixed a bug where creating an edge frame with vertex frames that already contained data wasn’t handled properly.

  • Fixed a bug that severely slowed downloads from URLs with libcurl versions earlier than 7.38, which includes installations on CentOS 7.

  • Fixed numerous bugs relating to scheduling and managing jobs.

  • Fixed a segfault that occurred in the xgtd server in some cases when memory was exhausted. The server now gracefully stops the current operation and raises an exception to the user.

  • Fixed a bug where query results could be added to a pre-existing results table even if the transaction failed.

  • Fixed a bug where canceling a job didn’t always terminate the job execution on the server.

  • Fixed a bug where sometimes frames weren’t immediately removed from the server when dropped.

  • Fixed a bug where the server wasn’t always releasing file descriptors.

  • Fixed a bug where using the same column for the source and target keys of an edge wasn’t allowed.

  • Fixed a bug where a floating point value of nan was encoded improperly when being transferred between client and server.

  • Fixed a bug where the header line was being written twice when saving a CSV file of frame data local to the client.

  • Fixed a bug where the unsupported Cypher fragment RETURN * caused a segfault. It now raises an exception to the Python client.

1.2.0 (6/4/2019)

New Features

  • Released the xGT Python interface into open source. The xgt package can now be found on the Python Package Index and installed using pip install xgt.

  • Added support for the Cypher keyword SET for updating non-key properties in frames.

  • Added support for the Cypher keyword CREATE for adding new vertices and edges to a frame.

  • Added support for the Cypher keyword MERGE for adding new vertices to a frame if the vertex key doesn’t already exist.

  • Added support for the Cypher keyword DETACH DELETE for removing vertices and all their incident edges from a frame.

  • Added support for the Cypher keyword DELETE for removing edges from a frame.

  • Added a transactional model. All operations are now transactional and safe to be run concurrently.

  • Added the methods max_user_memory_size() and free_user_memory_size() to Connection so a user can get the maximum and free memory available to them on the server.

  • Added the method error_frame_name() to TableFrame, VertexFrame, and EdgeFrame that returns the name used for the error table frame generated when there are errors on inserting data into a frame.

  • Added the .xgtd.conf configuration variable io_threads, which limits the number of threads used to read a single input file. This improves performance on systems with large thread counts.
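
The semantics described above for MERGE, SET, and DETACH DELETE can be modeled with a toy in-memory graph. This is a plain-Python sketch for intuition only, not xGT code and not Cypher: the function names and the sample data are hypothetical.

```python
# Toy in-memory model: vertices keyed by their key column, edges stored as
# (source_key, target_key) pairs. Not xGT code.
vertices = {"a": {"age": 30}, "b": {"age": 41}}
edges = [("a", "b"), ("b", "a"), ("b", "b")]


def merge_vertex(key, properties):
    """MERGE-like: add the vertex only if its key does not already exist."""
    vertices.setdefault(key, properties)


def set_property(key, name, value):
    """SET-like: update a non-key property of an existing vertex."""
    vertices[key][name] = value


def detach_delete(key):
    """DETACH DELETE-like: remove a vertex and all of its incident edges."""
    del vertices[key]
    edges[:] = [(s, t) for (s, t) in edges if key not in (s, t)]


merge_vertex("a", {"age": 99})  # key "a" exists, so nothing changes
merge_vertex("c", {"age": 25})  # key "c" is new, so the vertex is added
set_property("c", "age", 26)
detach_delete("a")              # also removes ("a", "b") and ("b", "a")

print(vertices)  # {'b': {'age': 41}, 'c': {'age': 26}}
print(edges)     # [('b', 'b')]
```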

Changed

  • Improved handling of out-of-memory situations so that in many cases a transaction that causes an out-of-memory error performs a rollback and leaves the system in a usable state.

  • Improved exception types and messages returned to the user.

  • Improved error detection for vertex keys to catch duplicates earlier.

  • Added a check for unsupported Cypher keywords that raises an exception when one is used.

  • Added a check that the columns in a RETURN statement match the schema of a pre-existing results table. An exception is raised when they don’t.

  • Added a check that the Cypher functions outdegree() and indegree() aren’t being called on an edge. An exception is raised when they are.

  • Removed unneeded files from the xgt library distribution package.

  • Improved performance of downloading frame data from the server to the client.

  • Improved performance of inserting data into frames.

  • Added user documentation for all new features.

Fixed

  • Fixed a bug where the server wasn’t responsive to commands when a query was running.

  • Fixed a bug where using the Cypher keyword STARTS WITH on a null property incorrectly raised an exception.

  • Fixed a bug where a group by on multiple Cypher outdegree() function calls yielded incorrect results.

  • Fixed a bug that occasionally occurred when using ORDER BY in a query.

  • Fixed a bug where downloading frame data was limited to 4MB.

  • Fixed a bug on reconnection that occurred when a server unexpectedly exits during a script.

  • Increased the minimum version of the grpcio Python package required by xgt to 1.20.

  • Python 2 versions below 2.7.10 are now deprecated. The xgt module now issues a deprecation warning at import time when the Python version is too low.

1.1.0 (2/21/2019)

New Features

  • Added support for an SSL-encrypted gRPC connection between client and server.

Changed

  • Removed the REST API from the Python library, server, and documentation.

  • Removed the option to run in ‘local’ (unixsocket) mode because the SSL-encrypted gRPC connection now meets that security need.

  • Added the ability for the client to detect when the server has restarted and notify the user they need to reestablish a connection.

  • Improved the hierarchical organization of exception messages.

  • Improved the text of many errors returned to the Python user.

  • Improved load and insert error reporting. Instead of raising on the first error detected in the data, these operations now return a usable frame with the valid data and an error table with all the invalid data and errors.

  • Improved logging of out-of-memory errors.

  • Added error reporting when a Cypher statement compares a vertex with an edge.

  • Added documentation for xGT data types.

  • Added compiler optimizations to improve performance of some queries.

  • Improved the client-server connection to reduce latency.

  • Improved ingestion performance for large edge frames when implicitly filling the vertex frame.
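
The load and insert error reporting described in this list follows a collect-and-continue pattern, which can be sketched as follows. This is a minimal plain-Python illustration, not the xGT implementation: the function load_rows, the schema representation, and the sample rows are all hypothetical.

```python
def load_rows(rows, schema):
    """Convert each row with the schema; collect failures instead of raising.

    Returns (valid, errors), where errors pairs each rejected row with
    the reason it was rejected.
    """
    valid, errors = [], []
    for row in rows:
        try:
            if len(row) != len(schema):
                raise ValueError(f"expected {len(schema)} columns, got {len(row)}")
            valid.append(tuple(convert(value) for convert, value in zip(schema, row)))
        except ValueError as exc:
            errors.append((row, str(exc)))
    return valid, errors


rows = [["1", "alice"], ["x", "bob"], ["3"]]
valid, errors = load_rows(rows, schema=(int, str))

print(valid)        # [(1, 'alice')]
print(len(errors))  # 2
```

The valid rows remain usable even though two of the three input rows were rejected, and every rejected row is kept alongside its error message.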

Fixed

  • Fixed numerous bugs where Null values weren’t handled correctly or were handled differently from Cypher’s behavior.

  • Fixed numerous bugs where datatype conversion wasn’t being handled correctly.

  • Fixed a bug where memory fragmentation caused xGT to run out of memory when ingesting large edge frames even though plenty of memory was available.

  • Fixed a deadlock that could occur during load or insert operations.

  • Fixed some client-server connection bugs when a Jupyter Notebook is used as the client.

1.0.0 (12/1/2018)

New Features

  • Added ability to query a TableFrame using syntax for querying a VertexFrame.

  • Names can now contain Unicode characters.

Changed

  • Improved stability and performance of the connection between the Python client and the xgtd server by switching from REST to gRPC as the default communication protocol. The REST protocol is now deprecated.

  • Added a check when creating a VertexFrame that the key parameter is given and is a valid column in the schema.

  • Added a check when creating an EdgeFrame that the source_key and target_key parameters are given and are valid columns in the schema.

  • Modified behavior when dropping frames to prevent a user from dropping a frame on which other frames depend.

  • Modified behavior when creating frames to require a name for the frame.

  • Disallowed the creation of frames with names containing periods.

  • Renamed the xgtd configuration file from .gemsconfig to .xgtd.conf. xgtd now looks for it in /etc if the file doesn’t exist in the user’s home directory.

  • In .xgtd.conf, renamed the properties s3_key_id to aws_access_key_id and s3_key to aws_secret_access_key to be consistent with the naming in the AWS credentials file.

  • Errors are now reported when .xgtd.conf contains invalid property names or values of the wrong type.

  • Modified exceptions to use new naming scheme.

  • Improved handling of filename paths passed to I/O commands to make all filename paths unambiguous as to the location of the file.

  • Added support for loading CSV files where some of the lines have more columns than the required number.

  • Improved documentation and tutorials.
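
The configuration file lookup order described in this list (home directory first, then /etc) can be sketched as below. This is an illustrative plain-Python version, not the xgtd implementation: the function find_config is hypothetical, and it assumes the file has the same name in both locations. Temporary directories stand in for the real home directory and /etc.

```python
import tempfile
from pathlib import Path

CONFIG_NAME = ".xgtd.conf"


def find_config(home_dir, system_dir="/etc"):
    """Return the first existing config path, preferring the home directory."""
    for directory in (home_dir, system_dir):
        candidate = Path(directory) / CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None


# Demonstration using temporary directories in place of ~ and /etc.
with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as etc:
    missing = find_config(home, etc)             # no file in either location
    (Path(etc) / CONFIG_NAME).write_text("{}")
    found_in_etc = find_config(home, etc)        # falls back to the system directory
    (Path(home) / CONFIG_NAME).write_text("{}")
    found_in_home = find_config(home, etc)       # home directory takes precedence
    fell_back = found_in_etc.parent == Path(etc)
    preferred_home = found_in_home.parent == Path(home)
```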

Fixed

  • Fixed a rare bug where simultaneously creating and dropping frames caused an error.

  • Fixed a bug where frames with numeric names couldn’t be dropped.

  • Fixed a bug where using a list of tuples as the schema to create a frame had stopped working.

  • Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.

  • Fixed numerous bugs that occurred when loading data.

0.18.0 (10/24/2018)

Changed

  • Modified the API for creating and destroying graphs. See the Python API documentation for details.

  • Simplified the specification of source and target keys by supporting only a single key column.

  • Names must now be unique across all objects rather than only within a particular type of object.

  • Changed the working directory of the AWS Marketplace daemon from /home/ec2-user/ to /srv/xgtd/data.

Fixed

  • Improved type checking in the query compiler to provide better error messages to users for invalid queries.

  • Improved error messages in some cases to make them more understandable.

  • Fixed a bug where exceptions were sometimes reported to the wrong job when multiple jobs were running simultaneously.

  • Fixed a bug where exceptions sometimes had malformed JSON.

  • Fixed a segfault that occasionally occurred in the job manager.

0.17.0 (10/1/2018)

Changed

  • Added additional error messages.

  • Added query annotations to more types of invalid Cypher query error messages.

  • AWS credentials can now be read directly from .aws/credentials.
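
For reference, the AWS shared credentials file is an INI-style file with one section per profile. The sketch below reads the two key fields with Python's configparser. It does not show how xgtd reads the file: the function read_credentials is hypothetical, selecting the "default" profile is an assumption, and the key values are AWS's documented example placeholders.

```python
import configparser

# Example contents in the AWS shared credentials format. The values are
# AWS's documented example placeholders, not real secrets.
CREDENTIALS_TEXT = """
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""


def read_credentials(text, profile="default"):
    """Return (access_key_id, secret_access_key) for the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]


key_id, secret = read_credentials(CREDENTIALS_TEXT)
print(key_id)  # AKIAIOSFODNN7EXAMPLE
```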

Fixed

  • Fixed some Unicode bugs.

  • Fixed a segfault occurring when an exception is raised while ingesting.

0.16.0 (9/1/2018)

Changed

  • The query compiler can now automatically infer the vertex or edge frame when there is a single type.

  • Standardized error message handling between run_job() and wait_for_jobs().

Fixed

  • Improved compatibility between Python 2 and 3.

  • Fixed spurious warnings about numpy datatype incompatibilities.

  • Improved error reporting to give more meaningful error messages.

  • Fixed many cases where errors were not being reported to the user.

  • Fixed numerous bugs.