2.4. Memory and Compute Resources

2.4.1. Monitoring Usage

It is important to understand how much memory is required to hold the data graph(s) and other intermediate data resulting from computations. To see the total memory in the server, a Python script can refer to the xgt.Connection.max_user_memory_size attribute.

import xgt
server = xgt.Connection()
print("Total Available Memory: {.3f}GiB".format(server.max_user_memory_size))

To see the total available (free) memory, a Python script can refer to the xgt.Connection.free_user_memory_size attribute.

import xgt
server = xgt.Connection()
print("Total Free Memory: {.3f}GiB".format(server.free_user_memory_size))

2.4.2. Setup and Configuration On-premises

Setting up memory and compute resources involves establishing limits on memory use, establishing limits on CPU use, and deciding how these resources are to be used within the server platform. There are generally three ways of managing these resources:

  1. A single instance of xgtd running on the server platform that uses the whole system.

  2. A single instance of xgtd running on the server platform that needs to share the platform with other software applications.

  3. Multiple instances of xgtd sharing a server platform.

2.4.3. xgtd Using the Whole Server Platform

In this scenario, where the xGT server will use the entire machine, the memory assigned to the server application will be the smaller of:

  • the configured limit for system.max_memory

  • the amount of available RAM on the server platform.

If system.max_memory is not configured, then xgtd will use the amount of available RAM on the server platform.
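
As an illustration, the rule amounts to taking a simple minimum. This is a minimal sketch with hypothetical values (neither number is read from a live system):

# Minimal sketch of the effective memory limit rule; the values below are hypothetical.
configured_max_memory = 6142   # system.max_memory from xgtd.conf (GiB), or None if not configured
available_ram = 7810           # available RAM on the server platform (GiB)
if configured_max_memory is None:
    effective_limit = available_ram
else:
    effective_limit = min(configured_max_memory, available_ram)
print("Effective memory limit: {} GiB".format(effective_limit))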

If the server hardware has Non-Uniform Memory Access (NUMA), the strategy for allocating data among the NUMA nodes can have a significant impact on performance. The recommended NUMA placement is to distribute data evenly across all NUMA nodes. This can be done with numactl when launching the server:

numactl --interleave=all /path/to/bin/xgtd

The default xgtd.service file for this scenario is:

[Unit]
Description=xgtd
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/numactl --interleave=all -- /opt/xgtd/bin/xgtd -c /etc/xgtd/xgtd.conf -p 4367
User=xgtd
Environment=LD_LIBRARY_PATH=/opt/xgtd/lib
WorkingDirectory=/

[Install]
WantedBy=multi-user.target

If this file is placed at /etc/systemd/system/xgtd.service, the xgtd daemon can be controlled via:

  • $ systemctl status xgtd

  • $ systemctl start xgtd

  • $ systemctl stop xgtd

There are two components in this xgtd.service file that a site may consider changing:

  1. The WorkingDirectory= entry. If no configuration is provided for the system.io_directory setting, this establishes the location where the server will look for data on a load() operation and where it will create CSV files on a save() operation (see Configuring the xGT Server).

  2. The parameters on the ExecStart= statement. Likely candidates for change are the arguments to the numactl program and the port number to listen on. Note that the port number may also be set inside the /etc/xgtd/xgtd.conf file.

2.4.4. xgtd Sharing the Server Platform

In order to have xgtd share a server platform with another application, it is best to set up xgtd to use only a portion of the available RAM and/or a portion of the available CPUs. We will work through an example in which xgtd uses 3/4 of the available RAM and 3/4 of the CPUs, with the remaining resources used by some other application. For purposes of this example, we assume that each server platform has the available licensed cores set appropriately. Detailed information about licensing is provided in xGT License Management System.

We begin with a look at the amount of available memory:

$ free
              total        used        free      shared  buff/cache   available
Mem:     8189337924    22626480  6318278772      121572  1848432672  8165324300

Three-fourths of the total memory is approximately 6,142 GiB; this will become the system.max_memory value below.

It is important to understand more than just the number of available CPUs. On larger server platforms, there is typically a Non-Uniform Memory Access (NUMA) effect in memory references from a specific core. To get the best performance out of such a platform, it is important to align the portions of the CPUs selected with the portions of the RAM reserved. In order to configure xgtd for this kind of sharing, it is necessary to understand the NUMA nodes. The lscpu command supplies this information:

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1152
On-line CPU(s) list:   0-1151
Thread(s) per core:    2
Core(s) per socket:    18
Socket(s):             32
NUMA node(s):          32
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz
Stepping:              3
CPU MHz:               2899.932
CPU max MHz:           3300.0000
CPU min MHz:           1200.0000
BogoMIPS:              5000.51
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-17,576-593
NUMA node1 CPU(s):     18-35,594-611
NUMA node2 CPU(s):     36-53,612-629
NUMA node3 CPU(s):     54-71,630-647
NUMA node4 CPU(s):     72-89,648-665
NUMA node5 CPU(s):     90-107,666-683
NUMA node6 CPU(s):     108-125,684-701
NUMA node7 CPU(s):     126-143,702-719
NUMA node8 CPU(s):     144-161,720-737
NUMA node9 CPU(s):     162-179,738-755
NUMA node10 CPU(s):    180-197,756-773
NUMA node11 CPU(s):    198-215,774-791
NUMA node12 CPU(s):    216-233,792-809
NUMA node13 CPU(s):    234-251,810-827
NUMA node14 CPU(s):    252-269,828-845
NUMA node15 CPU(s):    270-287,846-863
NUMA node16 CPU(s):    288-305,864-881
NUMA node17 CPU(s):    306-323,882-899
NUMA node18 CPU(s):    324-341,900-917
NUMA node19 CPU(s):    342-359,918-935
NUMA node20 CPU(s):    360-377,936-953
NUMA node21 CPU(s):    378-395,954-971
NUMA node22 CPU(s):    396-413,972-989
NUMA node23 CPU(s):    414-431,990-1007
NUMA node24 CPU(s):    432-449,1008-1025
NUMA node25 CPU(s):    450-467,1026-1043
NUMA node26 CPU(s):    468-485,1044-1061
NUMA node27 CPU(s):    486-503,1062-1079
NUMA node28 CPU(s):    504-521,1080-1097
NUMA node29 CPU(s):    522-539,1098-1115
NUMA node30 CPU(s):    540-557,1116-1133
NUMA node31 CPU(s):    558-575,1134-1151
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm abm epb intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts

Since there are 32 NUMA nodes, we need to select a contiguous block of 3/4 of them (24 nodes) to reserve for xgtd. In this example, we will reserve NUMA nodes 8 through 31.

Two configuration files are used to establish this resource sharing: xgtd.conf and xgtd.service.

2.4.4.1. Configuring xgtd.conf

Assuming that the O/S is set up to support hyper-threading, there should be two worker threads for every physical core. With no hyper-threading, there should be one worker thread per core. The number of physical cores in NUMA nodes 8 through 31 is 432 (24 nodes × 18 cores per node). With hyper-threading, that means there should be 864 worker threads.
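
A short script can double-check this arithmetic; the figures below are taken from the lscpu output above:

# Derive the worker-thread count for NUMA nodes 8 through 31 from the lscpu figures above.
numa_nodes_reserved = 24   # nodes 8-31
cores_per_node = 18        # "Core(s) per socket"; one socket per NUMA node on this machine
threads_per_core = 2       # "Thread(s) per core"; hyper-threading is enabled
cores_reserved = numa_nodes_reserved * cores_per_node   # 432
worker_threads = cores_reserved * threads_per_core      # 864
print("system.worker_threads:", worker_threads)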

These configuration variables need to be set up as follows:

  • "system.max_memory" : 6142,

  • "system.worker_threads" : 864,

2.4.4.2. Configuring xgtd.service

For this scenario, the only part of this service file that needs to be updated is the numactl parameter list. This is where the NUMA node restrictions are specified.

The following xgtd.service file shows the numactl parameters for our example, restricting the application to NUMA nodes 8 through 31.

[Unit]
Description=xgtd
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/numactl --cpunodebind=8-31 --interleave=8-31 -- /opt/xgtd/bin/xgtd -c /etc/xgtd/xgtd.conf -p 4367
User=xgtd
Environment=LD_LIBRARY_PATH=/opt/xgtd/lib
WorkingDirectory=/

[Install]
WantedBy=multi-user.target

2.4.5. Multiple xgtd Instances Running on a Server Platform

This scenario is very similar to the previous one where xgtd is sharing a platform with some other application. The major difference is that the sharing is now between multiple instances of the xgtd server. In order to accomplish this, it may be necessary to have multiple xgtd.conf files and multiple xgtd.service files. We recommend placing the xGT configuration files in /etc (e.g., /etc/xgtd/xgtd.conf, /etc/xgtd/xgtd.production.conf). The xgtd.service file is placed into /etc/systemd/system/ by the package installer. Launching multiple instances of xgtd is best handled by replicating the xgtd.service file with unique names that have instance-specific information inside such as a port number. Note that multiple xgtd.service files can use a shared xgtd.conf file if the only difference would be a port number, since the -p program argument can override the configured port.

A common setup is to have multiple xgtd.service files, each with different CPU sets and port numbers.

We show two sample service files that split a machine into two server daemons: one using 3/4 of the machine for production on port 4367, and the other using the remaining 1/4 of the machine for testing on port 4368.

# /etc/systemd/system/xgtd-production.service
[Unit]
Description=xgtd
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/numactl --cpunodebind=8-31 --interleave=8-31 -- /opt/xgtd/bin/xgtd -c /etc/xgtd/xgtd.conf -p 4367
User=xgtd
Environment=LD_LIBRARY_PATH=/opt/xgtd/lib
WorkingDirectory=/

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/xgtd-test.service
[Unit]
Description=xgtd
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/numactl --cpunodebind=0-7 --interleave=0-7 -- /opt/xgtd/bin/xgtd -c /etc/xgtd/xgtd.conf -p 4368
User=xgtd
Environment=LD_LIBRARY_PATH=/opt/xgtd/lib
WorkingDirectory=/

[Install]
WantedBy=multi-user.target
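
Once both daemons are running, a Python client selects an instance by its port. The following is a minimal sketch; the 'localhost' host and the host/port keyword arguments are assumptions to be adjusted for your deployment:

import xgt
# Connect to the production instance (port 4367) and the test instance (port 4368);
# 'localhost' assumes the client runs on the server platform itself.
production = xgt.Connection(host='localhost', port=4367)
test = xgt.Connection(host='localhost', port=4368)
print("Production Free Memory: {:.3f} GiB".format(production.free_user_memory_size))
print("Test Free Memory: {:.3f} GiB".format(test.free_user_memory_size))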