2.2. Configuring the xGT Server

Most configuration parameters have reasonable defaults that work in many cases, and installations generally won’t need to change them. The main exceptions to this guidance are the security settings:

  1. The defined security labels and “group-label” connections may need to be updated to reflect a site’s specific user base and user-authentication system. Details of this setup are in the Configuring Groups and Labels section.

  2. The AWS credential information has no reasonable defaults for securely accessing data in S3 buckets.

  3. To run with SSL encryption between the Python client and the xGT server, the configuration parameters system.hostname, system.ssl_root_dir, and system.usessl must be provided.

The xGT server is configured with key-value pairs. The pairs can be given either in configuration files written in JSON format or as command-line arguments to xgtd, the xGT server process. Command-line arguments are set by modifying /etc/systemd/system/xgtd.service.

The standard configuration file is located at /etc/xgtd/xgtd.conf. Administrators may split configuration information into multiple files, each of which must be passed to the xGT server as a program argument, for example: -c /path/to/xgtd.part1.conf. Individual key-value pairs can also be specified as program arguments of the form -Dkey=value or -Dkey:value. The -c and -D program arguments can appear multiple times.

xGT establishes the working configuration by processing the -c and -D program arguments in the left-to-right order in which they appear on the ExecStart statement in the /etc/systemd/system/xgtd.service file. By default, the only configuration argument is -c /etc/xgtd/xgtd.conf. For any key that appears multiple times (in any combination of configuration files and -D program arguments), the last value given for that key is used. That is, repeated key entries overwrite previous entries in the working configuration. An example ExecStart statement is:

ExecStart=/usr/bin/numactl --interleave=all -- /opt/xgtd/bin/xgtd -c /etc/xgtd/xgtd.conf -c /etc/xgtd/logging.conf -Dlogging.level.transactions=debug
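The precedence rule can be illustrated with a short Python sketch. It is not xGT's implementation: the names flatten and resolve_config are hypothetical, the file contents in the usage lines are invented for illustration, and the sketch assumes that a nested JSON object and a dotted key name the same setting. Values given with -D are kept as strings here; how xgtd converts them to their declared types is not modeled.

```python
import json


def flatten(nested, prefix=""):
    """Flatten nested JSON objects into dotted keys.

    Assumption: {"system": {"port": 1}} and {"system.port": 1} are equivalent.
    """
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def resolve_config(args, read_file):
    """Process -c and -D arguments left to right; the last value for a key wins.

    read_file is a callable mapping a path to the file's JSON text, so the
    sketch can be exercised without touching the filesystem.
    """
    working = {}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-c":
            working.update(flatten(json.loads(read_file(args[i + 1]))))
            i += 2
        elif arg.startswith("-D"):
            # Accept -Dkey=value and -Dkey:value; split on the first separator.
            body = arg[2:]
            positions = [p for p in (body.find("="), body.find(":")) if p != -1]
            if not positions:
                raise ValueError(f"missing '=' or ':' in {arg!r}")
            sep = min(positions)
            working[body[:sep]] = body[sep + 1:]  # kept as a string in this sketch
            i += 1
        else:
            i += 1  # ignore anything that is not a configuration argument
    return working


# Hypothetical file contents, used only to exercise the rule.
files = {
    "/etc/xgtd/xgtd.conf": '{"system": {"port": 4367}}',
    "/etc/xgtd/logging.conf": '{"logging.level.transactions": "info"}',
}
args = ["-c", "/etc/xgtd/xgtd.conf", "-c", "/etc/xgtd/logging.conf",
        "-Dlogging.level.transactions=debug"]
config = resolve_config(args, files.__getitem__)
print(config["logging.level.transactions"])  # prints: debug
```

Because the -D argument appears last, its value replaces the one read from the hypothetical logging.conf. Moving the -D argument before the -c arguments would let the file value win instead.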

The following keys for configuring xGT are supported. The Boot Only column indicates if the configuration variable can only be set at system startup. Changes to the value of a boot only variable after system startup will be ignored.

| Key | Type | Default | Boot Only | Description |
|---|---|---|---|---|
| audit.config_file | string | | Y | Provides the log4cxx configuration file name. |
| license.feature | string | Standard xgtd license feature | Y | Controls the feature to checkout in a license. |
| license.location | string | /etc/xgtd/licenses/xgtd.lic | Y | Controls the path xgtd will look for when checking out a license. The format is file location. Multiple paths may be given, but should be separated by a colon (:). |
| logging.filepath | string | /tmp/ | Y | Provides the directory path where the log file will be written. If the filepath is “stdout” or “stderr”, then the log stream is written to standard out or standard error. |
| logging.fileprefix | string | xGT | Y | Provides the prefix of the log file name. |
| logging.level.<component> | string | warning | | Sets the level of detail to place into a log stream for the given logging component. The components are described in Logging Components. The levels are: “debug”, “info”, “warning”, “fatal”. |
| metrics.cache | bool | true | | Controls whether the metrics cache is running and generating statistical information about each frame. These statistics are used for query order optimization. |
| security.adminlabel | string | xgtadmin | Y | Value for the administrator label. |
| security.grouplabelfile | string | | Y | The location of the group label file listing the mappings from groups to security labels to be loaded at boot time. |
| security.labelfile | string | | Y | The location of the label file listing the security labels to be loaded at boot time. |
| security.session_refresh_window | int | -1 | | The time in seconds for how often to refresh the life extent of a session. A value of -1 indicates an unlimited time meaning a refresh will always be done. |
| security.session_ttl | int | -1 | | The time-to-live for the server. The maximum amount of time in seconds that a session will stay alive. A value of -1 indicates no limit. |
| system.hostname | string | localhost | Y | Controls the hostname (or IP address) the server will listen on for client connections. An IP address or “localhost” are allowed. |
| system.io_directory | string | /srv/xgtd/data/ | Y | All input/output (ingest/egest) must be done in this filesystem location or below (sandbox directory). |
| system.io_follow_symbolic_links | bool | false | Y | Controls whether to follow symbolic links in the underlying filesystem. |
| system.io_threads | int | 16 | | Controls the number of I/O threads xGT uses per file. The number will never exceed the system cores count including hyperthreads. |
| system.locale | string | C | Y | Locale for comparing strings. Details provided in Configuring the locale. |
| system.max_memory | int | Available system RAM | Y | Controls the maximum amount of memory (in GiB) that xGT will use. The value given is capped at the available system RAM. |
| system.port | int | 4367 | Y | Controls the port the server will listen on for client connections. Valid values are [0, 65535]. |
| system.ssl_root_dir | string | Home directory of the user running xgtd | Y | The path to the root directory that holds the server’s SSL certificates and private keys. |
| system.usessl | bool | false | Y | Controls whether to turn on SSL authentication between the client and server. |
| system.worker_threads | int | Number of cores including hyperthreads on the system | Y | Controls the number of worker threads xGT uses. The number will never exceed the system cores count including hyperthreads. |
| system.pin_threads | bool | false | Y | Internal use only. |

The log filename is created using both logging.filepath and logging.fileprefix. Assuming a fileprefix of “prefix” and a filepath of “/path/to”, the log file will be created with the name /path/to/prefix.log. The logger periodically rotates the log: it renames the current file, compresses the renamed file, and opens a new file with the original name.
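As a small illustration of the naming rule, the Python sketch below combines the two settings the same way. The function name log_destination is hypothetical, and its default arguments are the defaults listed for logging.filepath and logging.fileprefix. The names given to rotated files are not described here, so the sketch does not model rotation.

```python
import posixpath


def log_destination(filepath="/tmp/", fileprefix="xGT"):
    """Return the log file path, or the stream name for "stdout"/"stderr"."""
    if filepath in ("stdout", "stderr"):
        # The log stream goes to standard out or standard error, not to a file.
        return filepath
    return posixpath.join(filepath, fileprefix + ".log")


print(log_destination("/path/to", "prefix"))  # prints: /path/to/prefix.log
```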

Warning

We recommend tunneling Arrow Flight connections using ssh, as xGT does NOT support SSL-encrypted connections on that port.

Consider this example configuration file located at /path/to/xgtd.conf:

{
  "system": {
    "worker_threads": 16,
    "io_threads": 10,
    "port": 4369
  },

  "system.locale": "en_US.UTF-8"
}

Launching the xgtd daemon is done by invoking this command:

$ systemctl start xgtd

and is controlled by the /etc/systemd/system/xgtd.service file. To change the configuration of xgtd, either modify the standard configuration file at /etc/xgtd/xgtd.conf or update the ExecStart statement in the systemd service file.

To launch xgtd using the example configuration file shown above, supplemented with a definition for system.hostname, modify /etc/systemd/system/xgtd.service as follows:

ExecStart=/opt/xgtd/bin/xgtd -c /path/to/xgtd.conf -Dsystem.hostname=127.0.0.1

To launch xgtd using this configuration file but overriding the worker_threads configuration value:

ExecStart=/opt/xgtd/bin/xgtd -c /path/to/xgtd.conf -Dsystem.worker_threads=8

To launch xgtd using multiple configuration files:

ExecStart=/opt/xgtd/bin/xgtd -c /path/to/xgtd_conf1.conf -Dsystem.port=4369 -c /path/to/xgtd_conf2.conf

2.2.1. Configuring Groups and Labels

The collection of groups and labels is used extensively by the Access Control system. Users are not configured in xGT; instead, xGT relies on an external authentication system such as LDAP to manage them. This external authentication system is queried for the list of groups of which an authenticated user is a member. The xGT application maintains the recognized groups, the security labels (one may imagine these to be names of roles), and the record of which groups possess which labels.

These labels and group-to-label relationships are held in CSV files that are read into xGT when it launches. The configuration variables security.labelfile and security.grouplabelfile indicate where xGT should find these CSV files. To establish appropriate security configuration, the content of the two CSV files should be updated. It is recommended that these CSV files contain the “master” copy of the groups and labels information. It is possible to alter the online data frames holding this information (with the appropriate administrator privilege), but the only way this information is remembered across application restarts is if the information is stored into the CSV files.

An authenticated user may belong to several groups, each of which is associated with many security labels. The union of the labels that are reachable through the groups that the user belongs to is called the user’s label set.

An example group and label setup is shown below. The groups listed are those that exist in an external authentication system such as LDAP. In this example, a user that belongs to group1 and group3 has a label set consisting of labels labelA, labelB, and labelD. A user that belongs to group adminGroup has the label xgtadmin in their label set.

Example contents of security.labelfile:

xgtadmin
labelA
labelB
labelC
labelD

Example contents of security.grouplabelfile:

adminGroup, xgtadmin
group1, labelA
group1, labelB
group2, labelC
group2, labelD
group3, labelD
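The label set computation for this example can be sketched in Python. This is an illustration only, not how xGT loads the file: the function names load_group_labels and label_set are hypothetical, and the sketch assumes the two-column "group, label" layout shown above.

```python
import csv
import io

# The example contents of security.grouplabelfile from above.
GROUP_LABEL_CSV = """\
adminGroup, xgtadmin
group1, labelA
group1, labelB
group2, labelC
group2, labelD
group3, labelD
"""


def load_group_labels(text):
    """Parse 'group, label' rows into a mapping from group to its set of labels."""
    mapping = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        group, label = (field.strip() for field in row)
        mapping.setdefault(group, set()).add(label)
    return mapping


def label_set(user_groups, group_labels):
    """Union of the labels reachable through the groups a user belongs to."""
    labels = set()
    for group in user_groups:
        labels |= group_labels.get(group, set())
    return labels


group_labels = load_group_labels(GROUP_LABEL_CSV)
print(sorted(label_set(["group1", "group3"], group_labels)))
# prints: ['labelA', 'labelB', 'labelD']
```

A group reported by the authentication system that does not appear in the file simply contributes no labels in this sketch.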

2.2.2. Configuring Administrator Privileges

Within the xGT application, there is a special security label that gives administrator privilege. This security label can be set by the security.adminlabel configuration variable, and its default is “xgtadmin”. Any user that has the configured security.adminlabel in their label set has administrator privileges.

By default, xGT is configured to authenticate users using their local UNIX credentials. Section User Authentication and Access Control explains in more detail how authentication operates for the xGT server. To configure an xGT site administrator, the group xgtgroup can be created locally on the Linux system running xGT and added to any existing UNIX user on that system. xgtgroup is the default name of the group that maps to the administrator label xgtadmin. A Linux user belonging to that group can now log in to xGT using their regular Linux credentials and will have system administrator privileges inside the xGT server.

The default contents of the security.labelfile provided with the xGT installation are as follows:

xgtadmin

The default contents of the security.grouplabelfile provided with the xGT installation are as follows:

xgtgroup,xgtadmin

Any user that has the security.adminlabel in their label set has unrestricted visibility and access to any data in the server. Any access control check with a user that holds the security.adminlabel behaves as though the user holds all possible labels.
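The effect of the administrator label on an access check can be sketched as follows. This is a hypothetical model, not xGT's access control code: the function name holds_labels is invented, and the rule that a non-administrator must hold every required label is an assumption made for the sake of the example.

```python
def holds_labels(user_labels, required_labels, admin_label="xgtadmin"):
    """Return True if the user passes a check that requires `required_labels`.

    Assumption: a regular user must hold all required labels. A user whose
    label set contains the administrator label is treated as holding every
    possible label, so the check always passes.
    """
    if admin_label in user_labels:
        return True
    return set(required_labels) <= set(user_labels)


print(holds_labels({"xgtadmin"}, {"labelA", "labelC"}))  # prints: True
print(holds_labels({"labelA"}, {"labelA", "labelC"}))    # prints: False
```

The admin_label parameter stands in for the configured security.adminlabel value, so a site that changes that setting would pass the new label name instead of the default.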

2.2.3. Configuring the locale

The system.locale setting (the locale) determines how xGT compares two strings: comparisons follow the lexicographic rules defined by the locale. The default value for the locale is “C”, which essentially means using the standard C library for string comparisons.

The xGT configuration mechanism will check every configured value for semantic correctness. This means that before a configured value is accepted as valid, xGT will try to configure the underlying C++ and xGT runtime systems passing in the configured value. If no exception is thrown, the string is accepted as a valid locale value.
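The same "try it and see whether an exception is thrown" approach can be sketched with Python's standard locale module. This is an analogy, not xGT's code: xGT validates against its C++ runtime, whereas the hypothetical function is_valid_locale below asks the Python process's C library. Which locale names are accepted depends on the locales installed on the host.

```python
import locale


def is_valid_locale(name):
    """Return True if the C library accepts `name` as a collation locale."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return False  # the runtime rejected the value
    finally:
        # Restore the original setting so the probe has no side effects.
        locale.setlocale(locale.LC_COLLATE, previous)
    return True


print(is_valid_locale("C"))  # prints: True
```

A value such as en_US.UTF-8 passes this probe only on hosts where that locale has been generated, so checking the value on the machine that will run xgtd is the meaningful test.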

2.2.4. Using an SSL Secure Channel

By default, xGT uses an insecure channel, but an administrator or user can enable a secure channel using SSL certificates. To run a secure server, pass the flags -s (or --ssl) and -d (or --ssl_root_dir) when starting the xgtd executable. The ssl_root_dir argument should be the path to the root directory that holds the server’s SSL certificates and private keys. Alternatively, set the configuration variable system.usessl to true to turn on the secure server, and set system.ssl_root_dir to specify that root directory. xGT expects the following directory structure:

.
├── certs
│   ├── ca-chain.cert.pem
│   └── server.cert.pem
└── private
    └── server.key.pem

To connect to an xGT server using SSL, the client needs to pass the following flags to xgt.Connection: ssl, ssl_root_dir, and ssl_server_cn. The ssl flag needs to be set to true. The ssl_root_dir flag should be set to the root directory containing the SSL certificates and private keys. The ssl_server_cn flag should be set to the common name for the server listed on the server-side SSL certificate. The xGT client expects the following directory structure for SSL certificates and private keys:

.
├── certs
│   ├── ca-chain.cert.pem
│   └── client.cert.pem
└── private
    └── client.key.pem
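Before enabling SSL, it can help to confirm that a root directory matches the expected layout. The Python sketch below is a hypothetical pre-flight check, not part of xGT or its client library. The names missing_ssl_files and client_ssl_flags are invented, and the dictionary built by client_ssl_flags only collects the three flag names mentioned above; how it is handed to xgt.Connection is not shown here.

```python
from pathlib import Path


def missing_ssl_files(ssl_root_dir, role):
    """List the expected certificate and key files that are absent.

    role is "server" or "client", matching the two layouts shown above.
    """
    if role not in ("server", "client"):
        raise ValueError("role must be 'server' or 'client'")
    root = Path(ssl_root_dir)
    expected = [
        root / "certs" / "ca-chain.cert.pem",
        root / "certs" / f"{role}.cert.pem",
        root / "private" / f"{role}.key.pem",
    ]
    return [str(path) for path in expected if not path.is_file()]


def client_ssl_flags(ssl_root_dir, server_common_name):
    """Collect the three SSL flags named in this section into a dictionary."""
    return {
        "ssl": True,
        "ssl_root_dir": str(ssl_root_dir),
        "ssl_server_cn": server_common_name,
    }
```

An empty list from missing_ssl_files means all three files for that role are present. The check does not verify that the certificates are valid or that the key matches the certificate.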