Sqlg mark github is a implementation of Apache TinkerPop on a RDBMS. Currently Postgresql, HSQLDB, H2, MariaDB, MySQL and MSSqlServer are supported.

Sqlg has a github discussions page.

1. Introduction

Sqlg primary challenge is to reduce latency by combining TinkerPop steps into as few as possible database calls. The fine-grained nature of graph traversals makes this crucial, otherwise the database call latency has prohibitively high performance impact.

Sqlg supports various bulk modes to reduce latency when modifying the graph.

Note
Hsqldb and H2 do not suffer the same latency as Postgresql, MSSqlServer and MariaDB as it runs embedded in the jvm.

2. License

MIT MIT

3. TinkerPop supported features

Sqlg version 2.1.6 runs on TinkerPop 3.5.1.

Sqlg passes TinkerPop’s StructureStandardSuite and ProcessStandardSuite test suites.

Graph Features not implemented.
  • Computer

  • ThreadedTransactions

  • Variables

Vertex Features not implemented.
  • MultiProperties

  • MetaProperties

  • UserSuppliedIds

  • NumericIds

  • StringIds

  • UuidIds

  • CustomIds

  • AnyIds

Edge Features not implemented.
  • UserSuppliedIds

  • NumericIds

  • StringIds

  • UuidIds

  • CustomIds

  • AnyIds

Vertex property features not implemented.
  • AddProperty

  • RemoveProperty

  • UserSuppliedIds

  • NumericIds

  • StringIds

  • UuidIds

  • CustomIds

  • AnyIds

  • MapValues

  • MixedListValues

  • SerializableValues

  • UniformListValues

Edge property feature not implemented.
  • MapValues

  • MixedListValues

  • SerializableValues

  • UniformListValues

Note
Sqlg supports user supplied ids but not quite as defined by TinkerPop. This is explained below.

4. Limitations

Postgresql schema, table and column names can not be more than 63 characters long.

5. Getting Started

There are a few ways to initialize Sqlg. The easiest is to include one of the provided connection pools and choose a supported database.

e.g.

C3P0
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-c3p0</artifactId>
    <version>2.1.6</version>
</dependency>
Postgresql
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-postgres</artifactId>
    <version>2.1.6</version>
</dependency>

You can also provide your own custom datasource by implementing org.umlg.sqlg.structure.SqlgDataSource. To let Sqlg know about your custom datasource you need to specify the fully qualified class name in sqlg.properties

sqlg.dataSource=your.custom.SqlgDataSource

Sqlg can also do a JNDI lookup to find the datasource. For this to work the jdbc.url property must begin with jndi:

5.1. Maven coordinates

Sqlg has support for c3p0 and hikari connection pools.

C3P0
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-c3p0</artifactId>
    <version>2.1.6</version>
</dependency>
Hikari
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-hikari</artifactId>
    <version>2.1.6</version>
</dependency>

The various Sqlg supported databases.

Postgresql
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-postgres</artifactId>
    <version>2.1.6</version>
</dependency>
HSQLDB
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-hsqldb</artifactId>
    <version>2.1.6</version>
</dependency>
H2
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-h2</artifactId>
    <version>2.1.6</version>
</dependency>
MariaDB
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mariadb</artifactId>
    <version>2.1.6</version>
</dependency>
MySQL
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mysql</artifactId>
    <version>2.1.6</version>
</dependency>
MSSqlServer
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mssqlserver</artifactId>
    <version>2.1.6</version>
</dependency>

These will include gremlin-groovy. If you have no need for that then only include the dialect.

Postgresql
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-postgres-dialect</artifactId>
    <version>2.1.6</version>
</dependency>
HSQLDB
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-hsqldb-dialect</artifactId>
    <version>2.1.6</version>
</dependency>
H2
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-h2-dialect</artifactId>
    <version>2.1.6</version>
</dependency>
MariaDB
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mariadb-dialect</artifactId>
    <version>2.1.6</version>
</dependency>
MySQL
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mysql-dialect</artifactId>
    <version>2.1.6</version>
</dependency>
MSSqlServer
<dependency>
    <groupId>org.umlg</groupId>
    <artifactId>sqlg-mssqlserver-dialect</artifactId>
    <version>2.1.6</version>
</dependency>

5.2. Start

SqlgGraph is a singleton that can be shared among multiple threads. You instantiate SqlgGraph using the standard TinkerPop static constructors.

  • Graph g = SqlgGraph.open(final Configuration configuration)

  • Graph g = SqlgGraph.open(final String pathToSqlgProperties)

The configuration object requires the following properties.

Postgresql
jdbc.url=jdbc:postgresql://localhost:5432/yourdb
jdbc.username=postgres
jdbc.password=******
HSQLDB
jdbc.url=jdbc:hsqldb:file:/tmp/yourdb
jdbc.username=SA
jdbc.password=
H2
jdbc.url=jdbc:h2:file:target/tmp/yourdb
jdbc.username=SA
jdbc.password=
MariaDB
jdbc.url=jdbc:mariadb://localhost:3306/?useSSL=false
jdbc.username=mariadb
jdbc.password=mariadb
MySQL
jdbc.url=jdbc:mysql://localhost:3306/?useSSL=false
jdbc.username=mysql
jdbc.password=mysql
MSSqlServer
jdbc.url=jdbc:sqlserver://localhost:1433;databaseName=yourdb;
jdbc.username=SA
jdbc.password=*****

In the case of Postgresql and MSSqlServer the database must already exist.

Once you have access to the graph you can use it as per normal.

@Test
public void useAsPerNormal() {
    Vertex person = this.sqlgGraph.addVertex(T.label, "Person", "name", "John");
    Vertex address = this.sqlgGraph.addVertex(T.label, "Address", "street", "13th");
    person.addEdge("livesAt", address, "since", LocalDate.of(2010, 1, 21));
    this.sqlgGraph.tx().commit(); # (1)
    List<Vertex> addresses = this.sqlgGraph.traversal().V().hasLabel("Person").out("livesAt").toList();
    assertEquals(1, addresses.size());
}
  1. It is very important to always commit or rollback the transaction. If you do not, connections to the database will remain open and eventually the connection pool will run out of connections.

5.3. Gremlin Console

Postgresql

pieter@pieter-Precision-7510:~/Downloads/tinkerpop-console/apache-tinkerpop-gremlin-console-3.5.1-bin/apache-tinkerpop-gremlin-console-3.5.1/bin/$ ./gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :install org.umlg sqlg-postgres 2.1.6
==>Loaded: [org.umlg, sqlg-postgres, 2.1.6] - restart the console to use [sqlg.postgres]
gremlin> :x
pieter@pieter-Precision-7510:~/Downloads/tinkerpop-console/apache-tinkerpop-gremlin-console-3.5.1-bin/apache-tinkerpop-gremlin-console-3.5.1/bin/$ ./gremlin.sh
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :plugin list
==>tinkerpop.server[active]
==>tinkerpop.gephi
==>tinkerpop.utilities[active]
==>tinkerpop.sugar
==>tinkerpop.credentials
==>sqlg.postgres
==>tinkerpop.tinkergraph[active]
gremlin> :plugin use sqlg.postgres
==>sqlg.postgres activated
gremlin> graph = SqlgGraph.open('/pathTo/sqlg.properties')
==>sqlggraph[SqlGraph] (jdbc:postgresql://localhost:5432/sqlgraphdb) (user = postgres)
gremlin> g = graph.traversal()
==>sqlggraphtraversalsource[sqlggraph[SqlGraph] (jdbc:postgresql://localhost:5432/sqlgraphdb) (user = postgres), standard]
gremlin> graph.io(GraphSONIo.build(GraphSONVersion.V3_0)).readGraph("/pathTo/grateful-dead-v3d0.json")
==>null
gremlin> g.V().count()
==>808
gremlin>

6. Data types

Table 1. Table Data types
Java Postgresql HSQLDB H2 MariaDB MSSqlServer

Boolean

BOOLEAN

BOOLEAN

BOOLEAN

BOOLEAN

BIT

Byte

Not supported

TINYINT

TINYINT

TINYINT

TINYINT

Short

SMALLINT

SMALLINT

SMALLINT

SMALLINT

SMALLINT

Integer

INTEGER

INTEGER

INT

INTEGER

INT

Long

BIGINT

BIGINT

BIGINT

BIGINT

BIGINT

Float

REAL

Not supported

REAL

Not supported

REAL

Double

DOUBLE PRECISION

DOUBLE

DOUBLE

DOUBLE

DOUBLE PRECISION

String

TEXT

LONGVARCHAR

VARCHAR

LONGTEXT

VARCHAR(2000)

String (fixed length)

VARCHAR(x)

VARCHAR(x)

VARCHAR(x)

VARCHAR(x)

VARCHAR(x)

UUID

UUID

UUID

UUID

Not supported

Not supported

Boolean[]

BOOLEAN[]

BOOLEAN ARRAY DEFAULT ARRAY[]

ARRAY

BOOLEAN ARRAY DEFAULT ARRAY[]

Not supported

Byte[]

BYTEA

LONGVARBINARY

BINARY

BLOB

VARBINARY(max)

Short[]

SMALLINT[]

SMALLINT ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

Integer[]

INTEGER[]

INTEGER ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

Long[]

BIGINT[]

BIGINT ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

Float[]

REAL[]

Not supported

ARRAY

Not supported

Not supported

Double[]

DOUBLE PRECISION[]

DOUBLE ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

String[]

TEXT[]

LONGVARCHAR ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.LocalDateTime

TIMESTAMP

TIMESTAMP

TIMESTAMP

DATETIME(3)

DATETIME2(3)

java.time.LocalDate

DATE

DATE

DATE

DATE

DATE

java.time.LocalTime

TIME

TIME

TIME

TIME

TIME

java.time.ZonedDateTime

TIMESTAMP, TEXT

TIMESTAMP, LONGVARCHAR

TIMESTAMP, VARCHAR

DATETIME(3), TINYTEXT

DATETIME2(3), VARCHAR(255)

java.time.Period

INTEGER, INTEGER, INTEGER

INTEGER, INTEGER, INTEGER

INT, INT, INT

INTEGER, INTEGER, INTEGER

INT, INT, INT

java.time.Duration

BIGINT, INTEGER

BIGINT, INTEGER

BIGINT, INT

BIGINT, INTEGER

BIGINT, INT

java.time.LocalDateTime[]

TIMESTAMP[]

TIMESTAMP ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.LocalDate[]

DATE[]

DATE ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.LocalTime[]

TIME[]

TIME ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.ZonedDateTime[]

TIMESTAMP[], TEXT[]

TIMESTAMP ARRAY DEFAULT ARRAY[], LONGVARCHAR ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.Period[]

INTEGER[], INTEGER[], INTEGER[]

INTEGER ARRAY DEFAULT ARRAY[], INTEGER ARRAY DEFAULT ARRAY[], INTEGER ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

java.time.Duration[]

BIGINT[], INTEGER[]

BIGINT ARRAY DEFAULT ARRAY[], INTEGER ARRAY DEFAULT ARRAY[]

ARRAY

Not supported

Not supported

com.fasterxml.jackson.databind.JsonNode

JSONB

LONGVARCHAR

VARCHAR

LONGTEXT

VARCHAR(max)

com.fasterxml.jackson.databind.JsonNode[]

JSONB[]

ARRAY

ARRAY

Not supported

Not supported

org.postgis.Point

geometry(POINT)

Not supported

Not supported

Not supported

Not supported

org.umlg.sqlg.gis.GeographyPoint

geography(POINT, 4326)

Not supported

Not supported

Not supported

Not supported

org.postgis.LineString

geometry(LINESTRING)

Not supported

Not supported

Not supported

Not supported

org.postgis.Polygon

geometry(POLYGON)

Not supported

Not supported

Not supported

Not supported

org.umlg.sqlg.gis.GeographyPolygon

geography(POLYGON, 4326)

Not supported

Not supported

Not supported

Not supported

Note
java.time.LocalTime drops the nano second precision.

7. Architecture

TinkerPop’s property graph semantics specifies that every vertex and edge has a single label. Modelling this in a RDBMS is trivial. TinkerPop has no notion of cardinality nor of order. Every relationship between vertex labels is modelled as many to many relationship with no specified order.

This realizes itself as a classic many to many relationship in a RDBMS database.

VertexLabel <---- EdgeLabel ----> VertexLabel

7.1. Vertex tables

Every unique vertex label maps to a table. Vertex tables are prefixed with a V_. i.e. V_Person. The vertex table stores the vertex’s properties.

7.2. Edge tables

Every unique edge label maps to a table. Edge tables are prefixed with a E_. i.e. E_friend. The edge table stores each edge’s adjacent vertex ids and the edge properties. The column corresponding to each adjacent vertex id (IN and OUT) has a foreign key to the adjacent vertex’s table. The foreign key is optional, instead just an index on the adjacent vertex id can be used.

Note
By default, Sqlg will use an auto increment ID bigint for the primary key. You can however use the topology interface to define which properties to use as the primary key.
Note
sqlg.properties implement.foreign.keys = false
Edge foreign keys have a significant impact on performance.
Edge foreign keys are enabled by default.

From a rdbms' perspective each edge table is the classic many to many join table between vertices.

7.3. TinkerPop-modern

Taken from TinkerPop

image of tinkerpop-classic

ER Diagram

image of tinkerpop-classic

V_person

image of tinkerpop-classic

V_software

image of tinkerpop-classic

E_knows

image of tinkerpop-classic

E_created

image of tinkerpop-classic

7.4. Namespacing and Schemas

Many RDBMS databases have the notion of a schema as a namespace for tables. Sqlg supports schemas for vertex labels. Distinct schemas for edge tables are unnecessary as edge tables are created in the schema of the adjacent out vertex. By default schemas for vertex tables go into the underlying databases' default schema. For Postgresql, hsqldb and H2 this is the public schema.

To specify the schema for a label Sqlg uses the dot . notation.

@Test
public void testElementsInSchema() {
    Vertex john = this.sqlgGraph.addVertex(T.label, "Manager", "name", "john"); # (1)
    Vertex palace1 = this.sqlgGraph.addVertex(T.label, "continent.House", "name", "palace1"); # (2)
    Vertex corrola = this.sqlgGraph.addVertex(T.label, "fleet.Car", "model", "corrola"); # (3)
    palace1.addEdge("managedBy", john);
    corrola.addEdge("owner", john);
    this.sqlgGraph.tx().commit();
    assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Manager").count().next().intValue()); # (4)
    assertEquals(0, this.sqlgGraph.traversal().V().hasLabel("House").count().next().intValue()); # (5)
    assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("continent.House").count().next().intValue()); (6)
    assertEquals(0, this.sqlgGraph.traversal().V().hasLabel("Car").count().next().intValue());
    assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("fleet.Car").count().next().intValue());
    assertEquals(1, this.sqlgGraph.traversal().E().hasLabel("managedBy").count().next().intValue());
    assertEquals(1, this.sqlgGraph.traversal().E().hasLabel("owner").count().next().intValue());
}
  1. 'Manager' will be in the default 'public' schema.

  2. 'House' will be in the 'continent' schema.

  3. 'Car' will be in the 'fleet' schema.

  4. Vertices in the public schema do not need to be qualified with the schema.

  5. Vertices not in the public schema must be qualified with its schema. In this case 'House' will not be found.

  6. As 'House' is qualified with the 'continent' schema it will be found.

Table V_manager is in the public (default) schema.
Table V_house is in the continent schema.
Table V_car is in the fleet schema.
Table E_managedBy is in the continent schema as its out vertex palace1 is in the continent schema.
Table E_owner is in the fleet schema as its out vertex is in the `fleet`schema.

postgresql schemas

image of tinkerpop-classic image of tinkerpop-classic image of tinkerpop-classic image of tinkerpop-classic

7.4.1. Edge label

An edge label can have many different out vertex labels. This means that its possible for a single edge label to be stored in multiple schemas and tables. One for each distinct out vertex label. Gremlin queries will work as per normal. However it is possible to target the edges per out vertex schema directly.

eg.
@Test
public void testEdgeAcrossSchema() {
    Vertex a = this.sqlgGraph.addVertex(T.label, "A.A");
    Vertex b = this.sqlgGraph.addVertex(T.label, "B.B");
    Vertex c = this.sqlgGraph.addVertex(T.label, "C.C");
    a.addEdge("specialEdge", b);
    b.addEdge("specialEdge", c);
    this.sqlgGraph.tx().commit();
    assertEquals(2, this.sqlgGraph.traversal().E().hasLabel("specialEdge").count().next().intValue()); # (1)
    assertEquals(1, this.sqlgGraph.traversal().E().hasLabel("A.specialEdge").count().next().intValue()); # (2)
    assertEquals(1, this.sqlgGraph.traversal().E().hasLabel("B.specialEdge").count().next().intValue()); # (3)
}
  1. Query 'specialEdge'

  2. Query 'specialEdge' with, out vertex labels in the 'A' schema.

  3. Query 'specialEdge' with, out vertex labels in the 'B' schema.

8. Indexes

8.1. Basic indexing

Sqlg supports adding a unique or non-unique index to any property or properties.

To add an index one has to use Sqlg’s topology interface.

@Test
public void testIndex() {
    VertexLabel personVertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensureVertexLabelExist("Person", new HashMap<String, PropertyType>() {{
        put("name", PropertyType.STRING);
    }}); # (1)
    Optional<PropertyColumn> namePropertyOptional = personVertexLabel.getProperty("name");
    assertTrue(namePropertyOptional.isPresent());
    Index index = personVertexLabel.ensureIndexExists(IndexType.NON_UNIQUE, Collections.singletonList(namePropertyOptional.get())); $ (2)
    this.sqlgGraph.tx().commit(); # (3)

    this.sqlgGraph.addVertex(T.label, "Person", "name", "John");
    List<Vertex> johns = this.sqlgGraph.traversal().V()
            .hasLabel("Person")
            .has("name", "John")
            .toList(); # (4)

    /* This will execute the following sql.
    SELECT
        "public"."V_Person"."ID" AS "alias1",
        "public"."V_Person"."name" AS "alias2"
    FROM
        "public"."V_Person"
    WHERE
        ( "public"."V_Person"."name" = ?)
    */ # (5)

    assertEquals(1, johns.size());
}
  1. Create the 'Person' VertexLabel.

  2. On the 'Person' VertexLabel create a non unique index on the 'name' property.

  3. Index creation is transactional on Postgresql.

  4. The given gremlin query will use the index.

  5. The underlying RDBMS will use the index for the executed sql.

postgresql V_Person sql definition

image of tinkerpop-classic

8.1.1. Composite indexes

It is possible to create composite indexes.

@Test
public void testCompositeIndex() {
    VertexLabel personVertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensureVertexLabelExist("Person", new HashMap<String, PropertyType>() {{
        put("firstName", PropertyType.STRING);
        put("lastName", PropertyType.STRING);
    }}); # (1)
    personVertexLabel.ensureIndexExists(IndexType.NON_UNIQUE, new ArrayList<>(personVertexLabel.getProperties().values())); # (2)
    this.sqlgGraph.tx().commit();
    this.sqlgGraph.addVertex(T.label, "Person", "firstName", "John", "lastName", "Smith");
    List<Vertex> johnSmiths = this.sqlgGraph.traversal().V()
            .hasLabel("Person")
            .has("firstName", "John")
            .has("lastName", "Smith")
            .toList();
    assertEquals(1, johnSmiths.size());
}
  1. Create the 'Person' VertexLabel with 2 properties, 'firstName' and 'lastName'.

  2. Create a composite index on 'firstName' and 'lastName'

postgresql V_Person composite index sql definition

image of tinkerpop-classic

Outside of creating the index Sqlg has no further direct interaction with the index. However gremlin queries with a HasStep targeting a property with an index on it will translate to a sql where clause on that property and the underlying RDBMS will utilize the index.

Note
The index does not need to be created upfront. It can be added any time.

8.2. Full-text indexing

On postgresql full text indexing is supported.

@Test
public void testFullTextIndex() {
    Vertex v0 = this.sqlgGraph.addVertex(T.label, "Sentence", "name", "a fat cat sat on a mat and ate a fat rat");
    Vertex v1 = this.sqlgGraph.addVertex(T.label, "Sentence", "name", "fatal error");
    Vertex v2 = this.sqlgGraph.addVertex(T.label, "Sentence", "name", "error is not fatal");

    VertexLabel vl = this.sqlgGraph.getTopology().getVertexLabel("public", "Sentence").get();
    vl.ensureIndexExists(IndexType.getFullTextGIN("english"), Collections.singletonList(vl.getProperty("name").get())); (1)
    this.sqlgGraph.tx().commit();

    List<Vertex> vts = this.sqlgGraph.traversal()
            .V().hasLabel("Sentence")
            .has("name", FullText.fullTextMatch("english", "fat & rat")) (2)
            .toList();
    Assert.assertEquals(1, vts.size());
    Assert.assertTrue(vts.contains(v0));
}
  1. Create a full-text gin index.

  2. Query the full-text index using Sqlg’s custom FullText predicate.

8.3. Global unique indexing

Global unique indexing is a way of specifying that multiple properties across different labels are unique. For every GlobalUniqueIndex Sqlg maintains a separate table with a unique index defined on it. Every property that partakes in the GlobalUniqueIndex will have its value duplicated in this table. These tables are kept in the gui_schema

@Test
public void testPersonAndDogDoNotHaveTheSameName() {
    Map<String, PropertyType> properties = new HashMap<String, PropertyType>() {{
        put("name", PropertyType.STRING);
    }}; # (1)
    VertexLabel personVertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensureVertexLabelExist("Person", properties); # (2)
    VertexLabel dogVertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensureVertexLabelExist("Dog", properties); # (3)
    PropertyColumn personName = personVertexLabel.getProperty("name").get(); # (4)
    PropertyColumn dogName = dogVertexLabel.getProperty("name").get(); # (5)
    this.sqlgGraph.getTopology().ensureGlobalUniqueIndexExist(new HashSet<PropertyColumn>() {{
        add(personName);
        add(dogName);
    }}); # (6)
    this.sqlgGraph.tx().commit();

    this.sqlgGraph.addVertex(T.label, "Person", "name", "Tyson"); # (7)
    try {
        //This will fail
        this.sqlgGraph.addVertex(T.label, "Dog", "name", "Tyson"); # (8)
        fail("Duplicate key violation suppose to prevent this from executing");
    } catch (RuntimeException e) {
        //swallow
        this.sqlgGraph.tx().rollback();
    }
}
  1. A map of the properties to add.

  2. Create the 'Person' VertexLabel with its properties.

  3. Create the 'Dog' VertexLabel with its properties.

  4. Get the PropertyColumn for the 'name' property of 'Person'.

  5. Get the PropertyColumn for the 'name' property of 'Dog'.

  6. Create the GlobalUniqueIndex on the 'name' property of 'Person' and 'Dog'. This will ensure that 'Person’s and 'Dog’s do not have the same name.

  7. Add a 'Person' with the name "Tyson".

  8. Try to add a 'Dog' with the name "Tyson". This will fail as the `GlobalUniqueIndex' will prevent 'Person’s and 'Dog’s from having the same name.

GlobalUniqueIndexes do not support composite indexes.

9. Multiple JVMs

It is possible to run many Sqlg instances pointing to the same underlying database. These instances can be in the same jvm but is primarily intended for separate jvm(s) pointing to the same underlying database.

To make multiple graphs point to the same underlying database it is important to add in the distributed property to sqlg.properties.

sqlg.properties
distributed = true
Note
Multiple JVMs is only supported for Postgresql. Hsqldb and H2 are primarily intended to run embedded so multiple JVMs do not make sense for them. Multiple JVM support has not been implemented for MariaDB and MSSqlServer.

Postgresql’s notify mechanism is used to distribute the cached schema across multiple JVMs.

10. Gremlin

Sqlg optimizes a gremlin traversal by analyzing the steps and where possible combining them into custom Sqlg steps. This can significantly reduce the number of database calls.

Sqlg has two strategies for optimizing TinkerPop steps.

  • Starting with the GraphStep, consecutive optimizable steps are folded into SqlgGraphStep. This stops at the first unoptimizable step after which the second strategy is used.

  • The second strategy is to barrier the incoming elements to the unoptimizable step. This means to exhaust the traversal/iterator up to the step and cache all the incoming elements for this step. From here the step is executed for all the incoming elements at once. This strategy effectively changes the semantics to a breath first retrieval.

Note
Optimizing gremlin is an ongoing task as gremlin is a large language.
Note
Turn sql logging on by setting log4j.logger.org.umlg.sqlg=debug

10.1. Optimization (strategy 1)

The following steps are optimized. Steps are

The combined step will then in turn generate the sql statements to retrieve the data. It attempts to retrieve the data in as few distinct sql statements as possible.

10.1.1. Graph Step

The Graph Step is the start of any traversal. Sqlg optimizes the graph step by analysing subsequent steps and if possible folding them into a few steps as possible. Often into only one step, SqlgGraphStepCompiled.

@Test
public void showGraphStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b2.addEdge("bc", c2);
    this.sqlgGraph.tx().commit();

    GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .out()
            .out();
    System.out.println(traversal);
    traversal.hasNext();
    System.out.println(traversal);
    List<Vertex> c = traversal.toList();
    assertEquals(2, c.size());
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), VertexStep(OUT,vertex)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel]]

The Before optimization output shows the steps that will execute if no optimization is performed. Without optimization the query this.sqlgGraph.traversal().V().hasLabel("A").out().out() will first get the A s, then for each A the B s and then for each B the C s. In the above example unoptimized it would be at least five round trips to the db. Optimized it is only one trip to the db.

For an embedded db like HSQLDB this is still ok but for a database server like postgresql the performance impact is significant.

After optimization there is only one SqlgGraphStep step. All the steps have been folded into one step.

The SqlgGraphStep will generate the following sql to retrieve the data.

SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"

10.1.2. Vertex Step

Consecutive Vertex Step are folded into the Graph Step.

@Test
public void showVertexStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b2.addEdge("bc", c2);
    this.sqlgGraph.tx().commit();

    GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .out()
            .out();
    System.out.println(traversal);
    traversal.hasNext();
    System.out.println(traversal);
    List<Vertex> c = traversal.toList();
    assertEquals(2, c.size());
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), VertexStep(OUT,vertex)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel]]

This example is the same as the Graph Step. It shows the two Vertex Steps being folded into the SqlgGraphStep.

The SqlgGraphStep will generate the following sql to retrieve the data.

SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"

10.1.3. Has Step

Has Steps are folded into the Graph Step or Vertex Step.

@Test
public void showHasStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b2.addEdge("bc", c2);
    this.sqlgGraph.tx().commit();

    GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .out().has("name", "b1")
            .out();
    System.out.println(traversal);
    traversal.hasNext();
    System.out.println(traversal);
    List<Vertex> c = traversal.toList();
    assertEquals(1, c.size());
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), HasStep([name.eq(b1)]), VertexStep(OUT,vertex)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel]]

This example is similar to the Graph Step example except for an additional HasStep. It shows the two Vertex Step s and the Has Step being folded into the SqlgGraphStep.

The SqlgGraphStep will generate the following sql to retrieve the data.

SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
	( "public"."V_B"."name" = ?) (1)
  1. The Has Step realizes itself as a sql where clause.

10.1.4. Or Step

Or Steps are folded into the Graph Step or Vertex Step.

@Test
public void showOrStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    Vertex c4 = this.sqlgGraph.addVertex(T.label, "C", "name", "c4");
    Vertex c5 = this.sqlgGraph.addVertex(T.label, "C", "name", "c5");
    Vertex c6 = this.sqlgGraph.addVertex(T.label, "C", "name", "c6");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b2.addEdge("bc", c2);
    b2.addEdge("bc", c3);
    b2.addEdge("bc", c4);
    b2.addEdge("bc", c5);
    b2.addEdge("bc", c6);
    this.sqlgGraph.tx().commit();

    GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .out()
            .out()
            .or(
                    __.has("name", "c1"),
                    __.has("name", "c3"),
                    __.has("name", "c6")
            );

    System.out.println(traversal);
    traversal.hasNext();
    System.out.println(traversal);
    List<Vertex> c = traversal.toList();
    assertEquals(3, c.size());
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), OrStep([[HasStep([name.eq(c1)])], [HasStep([name.eq(c3)])], [HasStep([name.eq(c6)])]])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel]]

This example is similar to the Graph Step example except for an additional Or Step which in turn contains three Has Step s. It shows the two Vertex Step s the Or Step and the Has Step s being folded into the SqlgGraphStep.

The SqlgGraphStep will generate the following sql to retrieve the data.

SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
(("public"."V_C"."name" = ?) OR ("public"."V_C"."name" = ?) OR ("public"."V_C"."name" = ?)
) (1)
  1. The Or Step realizes itself as a sql where clause.

10.1.5. And Step

And Steps are folded into the Graph Step or Vertex Step.

@Test
public void showAndStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1", "surname", "x", "address", "y");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2", "surname", "x", "address", "y");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3", "surname", "x", "address", "y");
    Vertex c4 = this.sqlgGraph.addVertex(T.label, "C", "name", "c4", "surname", "x", "address", "y");
    Vertex c5 = this.sqlgGraph.addVertex(T.label, "C", "name", "c5", "surname", "x", "address", "y");
    Vertex c6 = this.sqlgGraph.addVertex(T.label, "C", "name", "c6", "surname", "x", "address", "y");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b2.addEdge("bc", c2);
    b2.addEdge("bc", c3);
    b2.addEdge("bc", c4);
    b2.addEdge("bc", c5);
    b2.addEdge("bc", c6);
    this.sqlgGraph.tx().commit();

    GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .out()
            .out()
            .and(
                    __.has("name", "c1"),
                    __.has("surname", "x"),
                    __.has("address", "y")
            );

    System.out.println(traversal);
    traversal.hasNext();
    System.out.println(traversal);
    List<Vertex> c = traversal.toList();
    assertEquals(1, c.size());
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), AndStep([[HasStep([name.eq(c1)])], [HasStep([surname.eq(x)])], [HasStep([address.eq(y)])]])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel]]

This example is similar to the Graph Step example except for an additional And Step which in turn contains three Has Step s. It shows the two Vertex Step s the And Step and the Has Step s being folded into the SqlgGraphStep.

The SqlgGraphStep will generate the following sql to retrieve the data.

SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."address" AS "alias2",
	"public"."V_C"."surname" AS "alias3",
	"public"."V_C"."name" AS "alias4"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
(("public"."V_C"."name" = ?) AND ("public"."V_C"."surname" = ?) AND ("public"."V_C"."address" = ?)
) (1)
  1. The And Step realizes itself as a sql where clause.

10.1.6. Not Step

Not Steps are folded into the Graph Step or Vertex Step.

10.1.7. Repeat Step

Sqlg optimizes the RepeatStep so long as the until modulator is not present. RepeatStep can be optimized with the modulator emit and times.

Repeat Step with emit first
@Test
public void showRepeatStepEmitFirst() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b1.addEdge("bc", c2);
    b1.addEdge("bc", c3);
    this.sqlgGraph.tx().commit();

    List<Path> paths = this.sqlgGraph.traversal().V().hasLabel("A")
            .emit()
            .times(2)
            .repeat(
                    __.out()
            )
            .path().by("name")
            .toList();
    for (Path path : paths) {
        System.out.println(path);
    }
}
output
[a1, b1, c3]
[a1, b1, c2]
[a1, b1, c1]
[a1]
[a2]
[a1, b1]
[a1, b2]
sql
SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4",
	"public"."V_B"."ID" AS "alias5",
	"public"."V_B"."name" AS "alias6",
	"public"."E_ab"."ID" AS "alias7"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID" (1)

SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A" (2)

SELECT
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4",
	"public"."E_ab"."ID" AS "alias5"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" (3)
  1. Get the 'A’s to emit.

  2. Get the 'B’s to emit.

  3. Get the 'C’s to emit.

Repeat Step with emit last
@Test
public void showRepeatStepEmitLast() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b1.addEdge("bc", c2);
    b1.addEdge("bc", c3);
    this.sqlgGraph.tx().commit();

    List<Path> paths = this.sqlgGraph.traversal().V().hasLabel("A")
            .repeat(
                    __.out()
            )
            .emit()
            .times(2)
            .path().by("name")
            .toList();
    for (Path path : paths) {
        System.out.println(path);
    }
}
output
[a1, b1, c3]
[a1, b1, c2]
[a1, b1, c1]
[a1, b1]
[a1, b2]
sql
SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4",
	"public"."V_B"."ID" AS "alias5",
	"public"."V_B"."name" AS "alias6",
	"public"."E_ab"."ID" AS "alias7",
	"public"."E_bc"."ID" AS "alias8"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID" (1)

SELECT
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4",
	"public"."E_ab"."ID" AS "alias5"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" (2)
  1. Get the 'C’s to emit.

  2. Get the 'B’s to emit.

10.1.8. Optional Step

Sqlg optimizes the OptionalStep.

@Test
public void showOptionalStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b1.addEdge("bc", c2);
    b1.addEdge("bc", c3);
    this.sqlgGraph.tx().commit();

    List<Path> paths = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .optional(
                    __.out().optional(
                            __.out()
                    )
            )
            .path().by("name")
            .toList();
    for (Path path : paths) {
        System.out.println(path);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), OptionalStep([VertexStep(OUT,vertex), OptionalStep([VertexStep(OUT,vertex)])]), PathStep([value(name)])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], PathStep([value(name)])]
output
[a1, b1, c3]
[a1, b1, c2]
[a1, b1, c1]
[a2]
[a1, b2]
sql
SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4",
	"public"."V_B"."ID" AS "alias5",
	"public"."V_B"."name" AS "alias6"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID" (1)

SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A" LEFT JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O"
WHERE
	("public"."E_ab"."public.A__O" IS NULL) (2)

SELECT
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" LEFT JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O"
WHERE
	("public"."E_bc"."public.B__O" IS NULL) (3)
  1. Get the 'C’s

  2. Get the 'A’s that do not have 'B’s

  3. Get the 'B’s that do not have 'C’s

10.1.9. Choose Step

@Test
public void showChooseStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, Path> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .choose(__.out(), __.out())
            .path().by("name");

    printTraversalForm(traversal);

    List<Path> paths = traversal.toList();
    for (Path path : paths) {
        System.out.println(path);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), ChooseStep([VertexStep(OUT,vertex), HasNextStep],{false=[[IdentityStep, EndStep]], true=[[VertexStep(OUT,vertex), EndStep]]}), PathStep([value(name)])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], PathStep([value(name)])]
output
[a1, b1]
[a1, b2]
[a2]
sql
SELECT
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2",
	"public"."V_A"."ID" AS "alias3",
	"public"."V_A"."name" AS "alias4"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID"
DEBUG 2018-08-12 19:31:50,944 [main] org.umlg.sqlg.strategy.SqlgSqlExecutor:
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A" LEFT JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O"
WHERE
	("public"."E_ab"."public.A__O" IS NULL)

10.1.10. Order Step

Sqlg optimizes the OrderGlobalStep if the data that the order applies to can be retrieved in one sql statement. If not then order the ordering occurs in java via the OrderGlobalStep as per normal.

@Test
public void testOrderBy() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a", "surname", "a");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a", "surname", "b");
    Vertex a3 = this.sqlgGraph.addVertex(T.label, "A", "name", "a", "surname", "c");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "A", "name", "b", "surname", "a");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "A", "name", "b", "surname", "b");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "A", "name", "b", "surname", "c");
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V().hasLabel("A")
            .order().by("name", Order.incr).by("surname", Order.decr);
    printTraversalForm(traversal);

    List<Vertex> vertices = traversal.toList();
    for (Vertex v : vertices) {
        System.out.println(v.value("name") + " " + v.value("surname"));
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), OrderGlobalStep([[value(name), incr], [value(surname), decr]])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathOrderRangeLabel]]
output
a c
a b
a a
b c
b b
b a
sql
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."surname" AS "alias2",
	"public"."V_A"."name" AS "alias3"
FROM
	"public"."V_A"
ORDER BY
	 "alias3" ASC,
	 "alias2" DESC

10.1.11. Range Step

Sqlg optimizes the RangeGlobalStep

@Test
public void testRangeOnVertexLabels() {
    for (int i = 0; i < 100; i++) {
        this.sqlgGraph.addVertex(T.label, "Person", "name", "person" + i);
    }
    this.sqlgGraph.tx().commit();
    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("Person")
            .order().by("name")
            .range(1, 4)
            .values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), OrderGlobalStep([[value(name), incr]]), RangeGlobalStep(1,4), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathOrderRangeLabel], PropertiesStep([name],value)]
output
person1
person10
person11
sql
SELECT
	"public"."V_Person"."ID" AS "alias1",
	"public"."V_Person"."name" AS "alias2"
FROM
	"public"."V_Person"
ORDER BY
	 "alias2" ASC
LIMIT 3 OFFSET 1

10.1.12. Limit Step

Sqlg optimizes .limit(x)

@Test
public void testLimitOnVertexLabels() {
    for (int i = 0; i < 100; i++) {
        this.sqlgGraph.addVertex(T.label, "Person", "name", "person" + i);
    }
    this.sqlgGraph.tx().commit();
    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("Person")
            .order().by("name")
            .limit(3)
            .values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), OrderGlobalStep([[value(name), incr]]), RangeGlobalStep(0,3), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathOrderRangeLabel], PropertiesStep([name],value)]
output
person0
person1
person10
sql
SELECT
	"public"."V_Person"."ID" AS "alias1",
	"public"."V_Person"."name" AS "alias2"
FROM
	"public"."V_Person"
ORDER BY
	 "alias2" ASC
LIMIT 3 OFFSET 0

10.1.13. Drop Step

@Test
public void testsDropStepTrivial() {
    this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a3");
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V().hasLabel("A").drop();
    printTraversalForm(traversal);

    traversal.iterate();
    this.sqlgGraph.tx().commit();

    assertEquals(0, this.sqlgGraph.traversal().V().hasLabel("A").count().next(), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), DropStep]
After optimization
[SqlgGraphStep(vertex,[]), SqlgDropStepBarrier]
sql
TRUNCATE ONLY "public"."V_A" (1)
  1. As vertex label 'A' has no in or out edges nor are there any predicates the TRUNCATE command is used.

@Test
public void testsDropStepWithHas() {
    this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a3");
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .has("name", P.within("a1", "a2"))
            .drop();
    printTraversalForm(traversal);

    traversal.iterate();
    this.sqlgGraph.tx().commit();

    assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("A").count().next(), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A), name.within([a1, a2])]), DropStep]
After optimization
[SqlgGraphStep(vertex,[]), SqlgDropStepBarrier]
sql
WITH todelete AS (
SELECT
	"public"."V_A"."ID" AS "alias1"
FROM
	"public"."V_A"
WHERE
	( "public"."V_A"."name" in (?, ?))
)
DELETE FROM "public"."V_A" a USING todelete
WHERE a."ID" = todelete."alias1" (1)
  1. DELETE with a where clause.

@Test
public void testDropStepWithEdges() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal().V().hasLabel("A").out().drop();
    printTraversalForm(traversal);

    traversal.iterate();
    this.sqlgGraph.tx().commit();

    assertEquals(0, this.sqlgGraph.traversal().V().hasLabel("B").count().next(), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), VertexStep(OUT,vertex), DropStep]
After optimization
[SqlgGraphStep(vertex,[]), SqlgDropStepBarrier]
sql
SET CONSTRAINTS ALL DEFERRED (1)

WITH todelete AS (
SELECT
	"public"."V_B"."ID" AS "alias1"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID"
)
DELETE FROM "public"."V_B" a USING todelete
WHERE a."ID" = todelete."alias1" (2)

WITH todelete AS (
SELECT
	"public"."E_ab"."ID" AS "alias1"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" LEFT JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID"
WHERE
	("public"."V_B"."ID" IS NULL) AND
	("public"."E_ab"."public.B__I" IS NOT NULL)
)
DELETE FROM "public"."E_ab" a USING todelete
WHERE a."ID" = todelete."alias1" (3)

SET CONSTRAINTS ALL IMMEDIATE (4)
  1. On postgresql we defer (disable) the foreign key constraints.

  2. Delete the 'B' vertices first. As the edge constraints are disabled this is possible.

  3. Delete the edges. <4>. Enable the foreign key constraints.

10.1.14. Reducing Steps

Min Step
@Test
public void testMin() {
    this.sqlgGraph.addVertex(T.label, "Person", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 0);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Integer> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().hasLabel("Person").values("age").min();
    Assert.assertEquals(0, traversal.next(), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), PropertiesStep([age],value), MinGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgPropertiesStep([age],value), SqlgMinGlobalStep]
sql
SELECT
	MIN("public"."V_Person"."age") AS "alias1"
FROM
	"public"."V_Person"
Max Step
@Test
public void testMax() {
    this.sqlgGraph.addVertex(T.label, "Person", "age", 1, "x", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 2, "x", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 3, "x", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 0, "x", 1);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Integer> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().hasLabel("Person").values("age").max();
    printTraversalForm(traversal);
    Assert.assertEquals(3, traversal.next(), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), PropertiesStep([age],value), MaxGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgPropertiesStep([age],value), SqlgMaxGlobalStep]
sql
SELECT
	MAX("public"."V_Person"."age") AS "alias1"
FROM
	"public"."V_Person"
Sum Step
@Test
public void testSum() {
    this.sqlgGraph.addVertex(T.label, "Person", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 0);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Long> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().hasLabel("Person").values("age").sum();
    printTraversalForm(traversal);
    Assert.assertEquals(6, traversal.next(), 0L);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), PropertiesStep([age],value), SumGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgPropertiesStep([age],value), SqlgSumGlobalStep]
sql
SELECT
	SUM("public"."V_Person"."age") AS "alias1"
FROM
	"public"."V_Person"
Mean Step
@Test
public void testMean() {
    this.sqlgGraph.addVertex(T.label, "Person", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "age", 0);
    this.sqlgGraph.tx().commit();
    DefaultTraversal<Vertex, Double> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().hasLabel("Person").values("age").mean();
    printTraversalForm(traversal);
    Double d = traversal.next();
    Assert.assertEquals(1.5, d, 0D);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), PropertiesStep([age],value), MeanGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgPropertiesStep([age],value), SqlgAvgGlobalStep]
sql
SELECT
	AVG("public"."V_Person"."age") AS "alias1", COUNT(1) AS "alias1_weight"
FROM
	"public"."V_Person"
Count Step
@Test
public void testCount() {
    this.sqlgGraph.addVertex(T.label, "A", "name", "a");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a");
    this.sqlgGraph.addVertex(T.label, "A", "name", "a");
    this.sqlgGraph.tx().commit();
    DefaultTraversal<Vertex, Long> traversal = (DefaultTraversal<Vertex, Long>) this.sqlgGraph.traversal().V().count();
    printTraversalForm(traversal);
    Assert.assertEquals(4, traversal.next(), 0);
}
Before optimization
[GraphStep(vertex,[]), CountGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgPropertiesStep([count],value), SqlgCountGlobalStep]
sql
SELECT
	COUNT(1)
FROM
	"public"."V_A"
Group By

Group Step's are optimized with sql’s group by clause.

Group By and Min Step
@Test
public void testGroupOverOnePropertyMin() {
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 4);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Map<String, Integer>> traversal = (DefaultTraversal) sqlgGraph.traversal()
            .V().hasLabel("Person")
            .<String, Integer>group().by("name").by(__.values("age").min());
    printTraversalForm(traversal);
    Map<String, Integer> result = traversal.next();
    Assert.assertFalse(traversal.hasNext());
    Assert.assertTrue(result.containsKey("A"));
    Assert.assertTrue(result.containsKey("B"));
    Assert.assertEquals(1, result.get("A"), 0);
    Assert.assertEquals(2, result.get("B"), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), GroupStep(value(name),[PropertiesStep([age],value), MinGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	"public"."V_Person"."name" AS "alias1",
	MIN("public"."V_Person"."age") AS "alias2"
FROM
	"public"."V_Person"
GROUP BY
	"public"."V_Person"."name"
Group By and Max Step
    @Test
    public void testGroupByLabelMax() {
        this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 10);
        this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 20);
        this.sqlgGraph.addVertex(T.label, "Person", "name", "C", "age", 100);
        this.sqlgGraph.addVertex(T.label, "Person", "name", "D", "age", 40);

        this.sqlgGraph.addVertex(T.label, "Dog", "name", "A", "age", 10);
        this.sqlgGraph.addVertex(T.label, "Dog", "name", "B", "age", 200);
        this.sqlgGraph.addVertex(T.label, "Dog", "name", "C", "age", 30);
        this.sqlgGraph.addVertex(T.label, "Dog", "name", "D", "age", 40);

        this.sqlgGraph.tx().commit();

        DefaultTraversal<Vertex, Map<String, Integer>> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().<String, Integer>group().by(T.label).by(__.values("age").max());
        printTraversalForm(traversal);

        Map<String, Integer> result = traversal.next();
        Assert.assertFalse(traversal.hasNext());
        Assert.assertEquals(2, result.size());
        Assert.assertTrue(result.containsKey("Person"));
        Assert.assertTrue(result.containsKey("Dog"));
        Assert.assertEquals(100, result.get("Person"), 0);
        Assert.assertEquals(200, result.get("Dog"), 0);
    }
Before optimization
[GraphStep(vertex,[]), GroupStep(label,[PropertiesStep([age],value), MaxGlobalStep]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	MAX("public"."V_Person"."age") AS "alias1"
FROM
	"public"."V_Person"

SELECT
	MAX("public"."V_Dog"."age") AS "alias1"
FROM
	"public"."V_Dog"
@Test
public void testGroupOverTwoPropertiesWithValues() {
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "surname", "C", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "surname", "D", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "surname", "C", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "surname", "E", "age", 4);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "C", "surname", "E", "age", 5);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Map<List<String>, Integer>> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V().hasLabel("Person")
            .<List<String>, Integer>group()
            .by(__.values("name", "surname").fold())
            .by(__.values("age").max());

    printTraversalForm(traversal);

    Map<List<String>, Integer> result = traversal.next();
    Assert.assertTrue(result.containsKey(Arrays.asList("A", "C")) || result.containsKey(Arrays.asList("C", "A")));
    Assert.assertTrue(result.containsKey(Arrays.asList("B", "D")) || result.containsKey(Arrays.asList("D", "B")));
    Assert.assertTrue(result.containsKey(Arrays.asList("B", "E")) || result.containsKey(Arrays.asList("E", "B")));
    Assert.assertTrue(result.containsKey(Arrays.asList("C", "E")) || result.containsKey(Arrays.asList("E", "C")));
    Assert.assertEquals(4, result.size());
    Assert.assertFalse(traversal.hasNext());

    if (result.containsKey(Arrays.asList("A", "C"))) {
        Assert.assertEquals(3, result.get(Arrays.asList("A", "C")), 0);
    } else {
        Assert.assertEquals(3, result.get(Arrays.asList("C", "A")), 0);
    }
    if (result.containsKey(Arrays.asList("B", "D"))) {
        Assert.assertEquals(2, result.get(Arrays.asList("B", "D")), 0);
    } else {
        Assert.assertEquals(2, result.get(Arrays.asList("D", "B")), 0);
    }
    if (result.containsKey(Arrays.asList("B", "E"))) {
        Assert.assertEquals(4, result.get(Arrays.asList("B", "E")), 0);
    } else {
        Assert.assertEquals(4, result.get(Arrays.asList("E", "B")), 0);
    }
    if (result.containsKey(Arrays.asList("C", "E"))) {
        Assert.assertEquals(5, result.get(Arrays.asList("C", "E")), 0);
    } else {
        Assert.assertEquals(5, result.get(Arrays.asList("E", "C")), 0);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), GroupStep([PropertiesStep([name, surname],value), FoldStep],[PropertiesStep([age],value), MaxGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	"public"."V_Person"."surname" AS "alias1",
	"public"."V_Person"."name" AS "alias2",
	MAX("public"."V_Person"."age") AS "alias3"
FROM
	"public"."V_Person"
GROUP BY
	"public"."V_Person"."name",
	"public"."V_Person"."surname"
Group By and Sum Step
@Test
public void testGroupOverOnePropertySum() {
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 4);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Map<String, Long>> traversal = (DefaultTraversal) sqlgGraph.traversal()
            .V().hasLabel("Person")
            .<String, Long>group().by("name").by(__.values("age").sum());
    printTraversalForm(traversal);
    Map<String, Long> result = traversal.next();
    Assert.assertFalse(traversal.hasNext());
    Assert.assertTrue(result.containsKey("A"));
    Assert.assertTrue(result.containsKey("B"));
    Assert.assertEquals(4, result.get("A"), 0L);
    Assert.assertEquals(6, result.get("B"), 0L);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), GroupStep(value(name),[PropertiesStep([age],value), SumGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	"public"."V_Person"."name" AS "alias1",
	SUM("public"."V_Person"."age") AS "alias2"
FROM
	"public"."V_Person"
GROUP BY
	"public"."V_Person"."name"
Group By and Mean Step
@Test
public void testGroupOverOnePropertyMean() {
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 1);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 2);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "A", "age", 3);
    this.sqlgGraph.addVertex(T.label, "Person", "name", "B", "age", 4);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Map<String, Double>> traversal = (DefaultTraversal) sqlgGraph.traversal()
            .V().hasLabel("Person")
            .<String, Double>group().by("name").by(__.values("age").mean());
    printTraversalForm(traversal);
    Map<String, Double> result = traversal.next();
    Assert.assertFalse(traversal.hasNext());
    Assert.assertTrue(result.containsKey("A"));
    Assert.assertTrue(result.containsKey("B"));
    Assert.assertEquals(2.0, result.get("A"), 0D);
    Assert.assertEquals(3.0, result.get("B"), 0D);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(Person)]), GroupStep(value(name),[PropertiesStep([age],value), MeanGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	"public"."V_Person"."name" AS "alias1",
	AVG("public"."V_Person"."age") AS "alias2", COUNT(1) AS "alias2_weight"
FROM
	"public"."V_Person"
GROUP BY
	"public"."V_Person"."name"
Group By and Count Step
@Test
public void testGroupByCount() {
    this.sqlgGraph.addVertex(T.label, "A", "name", "a", "age", 1);
    this.sqlgGraph.addVertex(T.label, "A", "name", "a", "age", 2);
    this.sqlgGraph.addVertex(T.label, "A", "name", "b", "age", 3);
    this.sqlgGraph.tx().commit();
    DefaultTraversal<Vertex, Map<Object, Long>> traversal = (DefaultTraversal<Vertex, Map<Object, Long>>) this.sqlgGraph.traversal().V().hasLabel("A")
            .<Object, Long>group().by("name").by(__.count());
    List<Map<Object, Long>> result = traversal.toList();
    Assert.assertEquals(1, result.size());
    Assert.assertTrue(result.get(0).containsKey("a"));
    Assert.assertTrue(result.get(0).containsKey("b"));
    Assert.assertEquals(2L, result.get(0).get("a"), 0);
    Assert.assertEquals(1L, result.get(0).get("b"), 0);
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), GroupStep(value(name),[CountGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	COUNT(1) AS "count",
	"public"."V_A"."name" AS "alias1"
FROM
	"public"."V_A"
GROUP BY
	"public"."V_A"."name"
@Test
public void testDuplicatePathGroupCountQuery() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1", "age", 1);
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b", "age", 1);
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b", "age", 2);
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b", "age", 3);
    Vertex b4 = this.sqlgGraph.addVertex(T.label, "B", "name", "b", "age", 3);
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "b", "age", 1);
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "b", "age", 2);
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "b", "age", 3);
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    a1.addEdge("ab", b4);
    a1.addEdge("ac", c1);
    a1.addEdge("ac", c2);
    a1.addEdge("ac", c3);
    this.sqlgGraph.tx().commit();

    DefaultTraversal<Vertex, Map<String, Long>> traversal = (DefaultTraversal) this.sqlgGraph.traversal().V(a1).out("ab", "ac").group().by("name").by(__.count());
    Assert.assertEquals(2, traversal.getSteps().size());
    Assert.assertTrue(traversal.getSteps().get(0) instanceof SqlgGraphStep);
    Assert.assertTrue(traversal.getSteps().get(1) instanceof SqlgGroupStep);
    Map<String, Long> result = traversal.next();
    Assert.assertEquals(1, result.size());
    Assert.assertTrue(result.containsKey("b"));
    Assert.assertEquals(7, result.get("b"), 0);
    Assert.assertFalse(traversal.hasNext());
}
Before optimization
[GraphStep(vertex,[v[public.A:::1]]), VertexStep(OUT,[ab, ac],vertex), GroupStep(value(name),[CountGlobalStep])]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathTempFakeLabel], SqlgGroupStep]
sql
SELECT
	COUNT(1) AS "count",
	"public"."V_C"."name" AS "alias1"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ac" ON "public"."V_A"."ID" = "public"."E_ac"."public.A__O" INNER JOIN
	"public"."V_C" ON "public"."E_ac"."public.C__I" = "public"."V_C"."ID"
WHERE
	( "public"."V_A"."ID" = ?)
GROUP BY
	"public"."V_C"."name";

SELECT
	COUNT(1) AS "count",
	"public"."V_B"."name" AS "alias1"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID"
WHERE
	( "public"."V_A"."ID" = ?)
GROUP BY
	"public"."V_B"."name";

10.2. Optimization (strategy 2)

The following steps are optimized. Steps are

The combined step will then in turn generate the sql statements to retrieve the data. It attempts to retrieve the data in as few distinct sql statements as possible.

10.2.1. Vertex Step

@Test
public void testStrategy2VertexStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex a3 = this.sqlgGraph.addVertex(T.label, "A", "name", "a3");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    a1.addEdge("ab", b1);
    a2.addEdge("ab", b2);
    a3.addEdge("ab", b3);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> t = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .limit(2)
            .out()
            .values("name");
    printTraversalForm(t);
    List<String> result = t.toList();
    for (String name : result) {
        System.out.println(name);
    }
}
output
b1
b2
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), RangeGlobalStep(0,2), VertexStep(OUT,vertex), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathOrderRangeLabel], SqlgVertexStep@[sqlgPathFakeLabel], PropertiesStep([name],value)]

after optimization shows that there is a SqlgVertexStep after the SqlgGraphStep. The SqlgVertexStep will barrier the incoming A s and execute the next traversal for all the incoming elements in one sql statement.

SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"
LIMIT 2 OFFSET 0 (1)

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index" (2)
  1. Get all the A s.

  2. For all the previously fetched A s get the B s.

10.2.2. Repeat Step

@Test
public void testStrategy2RepeatStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    Vertex b4 = this.sqlgGraph.addVertex(T.label, "B", "name", "b4");
    Vertex b5 = this.sqlgGraph.addVertex(T.label, "B", "name", "b5");
    Vertex b6 = this.sqlgGraph.addVertex(T.label, "B", "name", "b6");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    Vertex c4 = this.sqlgGraph.addVertex(T.label, "C", "name", "c4");
    Vertex x = this.sqlgGraph.addVertex(T.label, "X", "name", "hallo");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    a2.addEdge("ab", b4);
    a2.addEdge("ab", b5);
    a2.addEdge("ab", b6);

    b1.addEdge("bx", x);

    b4.addEdge("bc", c1);
    b4.addEdge("bc", c2);
    b4.addEdge("bc", c3);

    c1.addEdge("cx", x);

    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> t = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .repeat(__.out())
            .until(__.out().has("name", "hallo"))
            .values("name");
    printTraversalForm(t);

    List<String> names = t.toList();
    for (String name: names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until([VertexStep(OUT,vertex), HasStep([name.eq(hallo)])]),emit(false)), PropertiesStep([name],value)]
After optimization
post-strategy:[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgRepeatStepBarrier([SqlgVertexStep@[sqlgPathFakeLabel], SqlgRepeatEndStepBarrier],until([SqlgVertexStep@[sqlgPathFakeLabel]]),emit(false)), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A" (1)

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index" (2)

SELECT
	"index" as "index",
	"public"."V_X"."ID" AS "alias1",
	"public"."V_X"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bx" ON "public"."V_B"."ID" = "public"."E_bx"."public.B__O" INNER JOIN
	"public"."V_X" ON "public"."E_bx"."public.X__I" = "public"."V_X"."ID" INNER JOIN
	(VALUES(3, 1),(2, 2),(1, 3),(6, 4),(5, 5),(4, 6)) AS tmp ("tmpId", "index") ON "public"."V_B"."ID" = tmp."tmpId"
WHERE
	( "public"."V_X"."name" = ?)
ORDER BY
	"index" (3)

SELECT
	"index" as "index",
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID" INNER JOIN
	(VALUES(3, 1),(2, 2),(1, 3),(6, 4),(5, 5),(4, 6)) AS tmp ("tmpId", "index") ON "public"."V_B"."ID" = tmp."tmpId"
WHERE
	( "public"."V_C"."name" = ?)
ORDER BY
	"index" (4)

SELECT
	"index" as "index",
	"public"."V_X"."ID" AS "alias1",
	"public"."V_X"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bx" ON "public"."V_B"."ID" = "public"."E_bx"."public.B__O" INNER JOIN
	"public"."V_X" ON "public"."E_bx"."public.X__I" = "public"."V_X"."ID" INNER JOIN
	(VALUES(3, 3),(2, 4),(6, 5),(5, 6),(4, 7)) AS tmp ("tmpId", "index") ON "public"."V_B"."ID" = tmp."tmpId"
ORDER BY
	"index" (5)

SELECT
	"index" as "index",
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID" INNER JOIN
	(VALUES(3, 3),(2, 4),(6, 5),(5, 6),(4, 7)) AS tmp ("tmpId", "index") ON "public"."V_B"."ID" = tmp."tmpId"
ORDER BY
	"index" (6)

SELECT
	"index" as "index",
	"public"."V_X"."ID" AS "alias1",
	"public"."V_X"."name" AS "alias2"
FROM
	"public"."V_C" INNER JOIN
	"public"."E_cx" ON "public"."V_C"."ID" = "public"."E_cx"."public.C__O" INNER JOIN
	"public"."V_X" ON "public"."E_cx"."public.X__I" = "public"."V_X"."ID" INNER JOIN
	(VALUES(3, 7),(2, 8),(1, 9)) AS tmp ("tmpId", "index") ON "public"."V_C"."ID" = tmp."tmpId"
WHERE
	( "public"."V_X"."name" = ?)
ORDER BY
	"index" (7)

SELECT
	"index" as "index",
	"public"."V_X"."ID" AS "alias1",
	"public"."V_X"."name" AS "alias2"
FROM
	"public"."V_C" INNER JOIN
	"public"."E_cx" ON "public"."V_C"."ID" = "public"."E_cx"."public.C__O" INNER JOIN
	"public"."V_X" ON "public"."E_cx"."public.X__I" = "public"."V_X"."ID" INNER JOIN
	(VALUES(3, 8),(2, 9)) AS tmp ("tmpId", "index") ON "public"."V_C"."ID" = tmp."tmpId"
ORDER BY
	"index" (8)
  1. Get all the A s.

  2. Get all the B s for the incoming A s. This represent the first out iteration of the repeat.

  3. The until traversal executed for all the incoming B s going out to X.

  4. The until traversal executed for all the incoming B s going out to C.

  5. Get all the X for the incoming B s. This is the second out iteration of the repeat.

  6. Get all the C for the incoming B s. This is the second out iteration of the repeat.

  7. The until traversal executed for all the incoming C s going out to X.

  8. Get all the X for the incoming C s. This is the third out iteration of the repeat.

output
b1
c1

10.2.3. Optional Step

@Test
public void testStrategy2OptionalStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);


    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .optional(
                __.repeat(
                        __.out()
                ).times(2)
            )
            .values("name");
    printTraversalForm(traversal);
    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), OptionalStep([RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false))]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgOptionalStepBarrier([SqlgRepeatStepBarrier([SqlgVertexStep@[sqlgPathFakeLabel], SqlgRepeatEndStepBarrier],until(loops(2)),emit(false))]), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"

SELECT
	3 as "index",
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
	"public"."V_B"."ID" = 2
ORDER BY
	"index"

SELECT
	4 as "index",
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
	"public"."V_B"."ID" = 1
ORDER BY
	"index"
output
a2
c1

10.2.4. Choose Step

@Test
public void testStrategy2ChooseStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "a3");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "a4");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V()
            .hasLabel("A")
            .choose(
                    v -> v.label().equals("A"),
                    __.out(),
                    __.in()
            ).values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization

After optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), ChooseStep([LambdaFilterStep(lambda), HasNextStep],{false=[[VertexStep(IN,vertex), EndStep]], true=[[VertexStep(OUT,vertex), EndStep]]}), PropertiesStep([name],value)]
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgChooseStepBarrier([LambdaFilterStep(lambda)],{false=[[SqlgVertexStep, EndStep]], true=[[SqlgVertexStep@[~gremlin.incidentToAdjacent], EndStep]]}), PropertiesStep([name],value)]
output
a4
a3

10.2.5. Local Step

@Test
public void testStrategy2LocalStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    Vertex c11 = this.sqlgGraph.addVertex(T.label, "C", "name", "c11");
    Vertex c12 = this.sqlgGraph.addVertex(T.label, "C", "name", "c12");
    Vertex c13 = this.sqlgGraph.addVertex(T.label, "C", "name", "c13");
    Vertex c21 = this.sqlgGraph.addVertex(T.label, "C", "name", "c21");
    Vertex c22 = this.sqlgGraph.addVertex(T.label, "C", "name", "c22");
    Vertex c23 = this.sqlgGraph.addVertex(T.label, "C", "name", "c23");
    Vertex c31 = this.sqlgGraph.addVertex(T.label, "C", "name", "c31");
    Vertex c32 = this.sqlgGraph.addVertex(T.label, "C", "name", "c32");
    Vertex c33 = this.sqlgGraph.addVertex(T.label, "C", "name", "c33");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    b1.addEdge("bc", c11);
    b1.addEdge("bc", c12);
    b1.addEdge("bc", c13);
    b2.addEdge("bc", c21);
    b2.addEdge("bc", c22);
    b2.addEdge("bc", c23);
    b3.addEdge("bc", c31);
    b3.addEdge("bc", c32);
    b3.addEdge("bc", c33);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V(a1)
            .local(
                    __.out().limit(1).out()
            ).values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[v[public.A:::1]]), LocalStep([VertexStep(OUT,vertex), RangeGlobalStep(0,1), VertexStep(OUT,vertex)]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], LocalStep([SqlgVertexStep@[sqlgPathOrderRangeLabel], SqlgVertexStep@[sqlgPathFakeLabel]]), PropertiesStep([name],value)]
output
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"
WHERE
	( "public"."V_A"."ID" = ?)

SELECT
	1 as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID"
WHERE
	"public"."V_A"."ID" = 1
ORDER BY
	"index"
LIMIT 1 OFFSET 0 (1)

SELECT
	1 as "index",
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_B" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
	"public"."V_B"."ID" = 1
ORDER BY
	"index"
  1. In this case the query is simple enough for the LIMIT to be executed on the database.

10.2.6. And Step

@Test
public void testStrategy2AndStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    a1.addEdge("ab", b1);
    a1.addEdge("abb", b1);
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    a2.addEdge("abb", b2);
    Vertex a3 = this.sqlgGraph.addVertex(T.label, "A", "name", "a3");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    a3.addEdge("abbb", b3);

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal().V().hasLabel("A").and(
            __.out("ab"),
            __.out("abb")
    ).values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), AndStep([[VertexStep(OUT,[ab],vertex)], [VertexStep(OUT,[abb],vertex)]]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgAndStepBarrier([[SqlgVertexStep@[sqlgPathFakeLabel]], [SqlgVertexStep@[sqlgPathFakeLabel]]]), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2),(3, 3)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_abb" ON "public"."V_A"."ID" = "public"."E_abb"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_abb"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2),(3, 3)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"
output
a1

10.2.7. Or Step

@Test
public void testStrategy2OrStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    a1.addEdge("ab", b1);
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    a2.addEdge("abb", b2);
    Vertex a3 = this.sqlgGraph.addVertex(T.label, "A", "name", "a3");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    a3.addEdge("abbb", b3);
    Vertex a4 = this.sqlgGraph.addVertex(T.label, "A", "name", "a4");
    Vertex b4 = this.sqlgGraph.addVertex(T.label, "B", "name", "b4");
    a4.addEdge("abbbb", b4);


    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .or(
                    __.out("ab"),
                    __.out("abb"),
                    __.out("abbb")
            ).values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), OrStep([[VertexStep(OUT,[ab],vertex)], [VertexStep(OUT,[abb],vertex)], [VertexStep(OUT,[abbb],vertex)]]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgOrStepBarrier([[SqlgVertexStep@[sqlgPathFakeLabel]], [SqlgVertexStep@[sqlgPathFakeLabel]], [SqlgVertexStep@[sqlgPathFakeLabel]]]), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2),(3, 3),(4, 4)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_abb" ON "public"."V_A"."ID" = "public"."E_abb"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_abb"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(2, 1),(3, 2),(4, 3)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_abbb" ON "public"."V_A"."ID" = "public"."E_abbb"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_abbb"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(3, 1),(4, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"
output
a1
a2
a3

10.2.8. Not Step

@Test
public void testStrategy2NotStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    a1.addEdge("ab", b1);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .not(
                    __.out()
            ).values("name");

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), NotStep([VertexStep(OUT,vertex)]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgNotStepBarrier([[SqlgVertexStep@[sqlgPathFakeLabel]]]), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"
output
a2

10.2.9. Where Step

@Test
public void testStrategy2WhereStep() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex a2 = this.sqlgGraph.addVertex(T.label, "A", "name", "a2");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = this.sqlgGraph.addVertex(T.label, "B", "name", "b3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    a2.addEdge("ab", b1);
    this.sqlgGraph.tx().commit();

    Traversal<Vertex, String> traversal = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .where(
                    __.out()
            ).values("name");
    printTraversalForm(traversal);

    List<String> names = traversal.toList();
    for (String name : names) {
        System.out.println(name);
    }
}
Before optimization
[GraphStep(vertex,[]), HasStep([~label.eq(A)]), TraversalFilterStep([VertexStep(OUT,vertex)]), PropertiesStep([name],value)]
After optimization
[SqlgGraphStep(vertex,[])@[sqlgPathFakeLabel], SqlgTraversalFilterStepBarrier([SqlgVertexStep@[sqlgPathFakeLabel]]), PropertiesStep([name],value)]
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."name" AS "alias2"
FROM
	"public"."V_A"

SELECT
	"index" as "index",
	"public"."V_B"."ID" AS "alias1",
	"public"."V_B"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	(VALUES(1, 1),(2, 2)) AS tmp ("tmpId", "index") ON "public"."V_A"."ID" = tmp."tmpId"
ORDER BY
	"index"

10.3. Predicates

TinkerPop’s Compare and Contains predicates are optimized to execute on the database.

10.3.1. Compare predicate

@Test
public void showComparePredicates() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = this.sqlgGraph.addVertex(T.label, "B", "name", "b2");
    Vertex c1 = this.sqlgGraph.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = this.sqlgGraph.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = this.sqlgGraph.addVertex(T.label, "C", "name", "c3");
    Vertex c4 = this.sqlgGraph.addVertex(T.label, "C", "name", "c4");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    b1.addEdge("bc", c1);
    b1.addEdge("bc", c2);
    b2.addEdge("bc", c3);
    b2.addEdge("bc", c4);
    this.sqlgGraph.tx().commit();

    List<String> result = this.sqlgGraph.traversal()
            .V().hasLabel("A")
            .out().has("name", P.eq("b1"))
            .out().has("name", P.eq("c2")) (1)
            .<String>values("name")
            .toList();
    for (String name : result) {
        System.out.println(name);
    }
}
  1. The P predicates will resolve on the database as a sql where clause.

sql
SELECT
	"public"."V_C"."ID" AS "alias1",
	"public"."V_C"."name" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	"public"."E_ab" ON "public"."V_A"."ID" = "public"."E_ab"."public.A__O" INNER JOIN
	"public"."V_B" ON "public"."E_ab"."public.B__I" = "public"."V_B"."ID" INNER JOIN
	"public"."E_bc" ON "public"."V_B"."ID" = "public"."E_bc"."public.B__O" INNER JOIN
	"public"."V_C" ON "public"."E_bc"."public.C__I" = "public"."V_C"."ID"
WHERE
	( "public"."V_B"."name" = ?) AND ( "public"."V_C"."name" = ?)

The same pattern is used for all the Compare predicates.

10.3.2. Contains predicate

Sqlg’s implementation of Contains is slightly more complex.

For Postgresql, MSSqlServer and HSQLDB a join onto a values expression is used.

For H2 and MariaDB a regular in clause is used.

@Test
public void showContainsPredicate() {
    List<Integer> numbers = new ArrayList<>(10000);
    for (int i = 0; i < 10000; i++) {
        this.sqlgGraph.addVertex(T.label, "A", "number", i);
        numbers.add(i);
    }
    this.sqlgGraph.tx().commit();

    List<Vertex> persons = this.sqlgGraph.traversal().V()
            .hasLabel("A")
            .has("number", P.within(numbers))
            .toList();

    assertEquals(10000, persons.size());
}
sql
SELECT
	"public"."V_A"."ID" AS "alias1",
	"public"."V_A"."number" AS "alias2"
FROM
	"public"."V_A" INNER JOIN
	(VALUES (0::INTEGER), (1::INTEGER), ... (9998::INTEGER), (9999::INTEGER)) as tmp1(within) on "public"."V_A"."number" = tmp1.within

This pattern makes P.within and p.without very fast even with millions of values being passed into the query. For the case of there being only one value Sqlg will use an equals instead of a values statement or an in statement.

10.3.3. Text predicate

Note
Sqlg assumes a case-sensitive collation. MSSqlServer does not default to a case-sensitive collation. Create the database with CREATE DATABASE sqlgraphdb COLLATE sql_latin1_general_cp1_cs_as

Sqlg includes its own Text predicate for full text queries.

  • Text.contains (case sensitive string contains)

  • Text.ncontains (case sensitive string does not contain)

  • Text.containsCIS (case insensitive string contains)

  • Text.ncontainsCIS (case insensitive string does not contain)

  • Text.startsWith (case sensitive string starts with)

  • Text.nstartsWith (case sensitive string does not start with)

  • Text.endsWith (case sensitive string ends with)

  • Text.nendsWith (case sensitive string does not end with)

@Test
public void showTextPredicate() {
    Vertex john = this.sqlgGraph.addVertex(T.label, "Person", "name", "John XXX Doe");
    Vertex peter = this.sqlgGraph.addVertex(T.label, "Person", "name", "Peter YYY Snow");
    this.sqlgGraph.tx().commit();

    List<Vertex> persons = this.sqlgGraph.traversal().V()
            .hasLabel("Person")
            .has("name", Text.contains("XXX")).toList();

    assertEquals(1, persons.size());
    assertEquals(john, persons.get(0));
}
sql
SELECT
	"public"."V_Person"."ID" AS "alias1",
	"public"."V_Person"."name" AS "alias2"
FROM
	"public"."V_Person"
WHERE
	( "public"."V_Person"."name" like ?)

Full text search is supported on postgresql.

This is shown under full text indexing

10.3.5. DateTime queries

LocalDateTime, LocalDate and LocalTime queries are supported.

@Test
public void showSearchOnLocalDateTime() {
    LocalDateTime born1 = LocalDateTime.of(1990, 1, 1, 1, 1, 1);
    LocalDateTime born2 = LocalDateTime.of(1990, 1, 1, 1, 1, 2);
    LocalDateTime born3 = LocalDateTime.of(1990, 1, 1, 1, 1, 3);
    Vertex john = this.sqlgGraph.addVertex(T.label, "Person", "name", "John", "born", born1);
    Vertex peter = this.sqlgGraph.addVertex(T.label, "Person", "name", "Peter", "born", born2);
    Vertex paul = this.sqlgGraph.addVertex(T.label, "Person", "name", "Paul", "born", born3);
    this.sqlgGraph.tx().commit();

    List<Vertex> persons = this.sqlgGraph.traversal().V().hasLabel("Person")
            .has("born", P.eq(born1))
            .toList();
    assertEquals(1, persons.size());
    assertEquals(john, persons.get(0));

    persons = this.sqlgGraph.traversal().V().hasLabel("Person")
            .has("born", P.between(LocalDateTime.of(1990, 1, 1, 1, 1, 1), LocalDateTime.of(1990, 1, 1, 1, 1, 3)))
            .toList();
    //P.between is inclusive to exclusive
    assertEquals(2, persons.size());
    assertTrue(persons.contains(john));
    assertTrue(persons.contains(peter));
}
sql
SELECT
	"public"."V_Person"."ID" AS "alias1",
	"public"."V_Person"."born" AS "alias2",
	"public"."V_Person"."name" AS "alias3"
FROM
	"public"."V_Person"
WHERE
	( "public"."V_Person"."born" >= ?) AND ( "public"."V_Person"."born" < ?)

11. Batch Mode

Sqlg supports 3 distinct batch modes. Normal, streaming and streaming with lock. Batch modes are only implemented on Postgresql. Batch mode is activated on the transaction object itself. After every commit the batchMode needs to be reactivated.

Sqlg introduces an extra method on the transaction, flush().

  • In normal batch mode flush() will send all the data to Postgresql, assign id(s) and clear the cache.

  • In streaming mode flush() will close the OutputStream that the data has been written to.

  • In streaming mode with lock flush() will close the OutputStream that the data has been written to and assign id(s).

The Postgresql 'copy' command is used to bulk insert data.

11.1. Normal batch mode

In normal batch mode the standard TinkerPop modification api can be used. Normal batch mode caches all modifications in memory and on commit() or flush() sends the modifications to the server.

Because all modifications are held in memory it is important to call commit() or flush() to prevent an OutOfMemoryError.

In batch mode vertices and edges returned from Graph.addVertex and vertex.addEdge respectively do not yet have their id(s) assigned to them. This is because the new vertices and edges are cached in memory and are only sent to Postgresql on commit() or flush(). After commit() or flush() the new vertices and edges have their id(s) assigned.

The transaction must be manually placed in normal batch mode. i.e. SqlgGraph.tx().normalBatchModeOn() must occur before any batch processing. After every commit() the transaction reverts to a regular transaction and must be placed in normal batch mode again for batch processing to continue.

Vertices and edges can be created and updated and removed as per normal making normal batch mode easy to use.

Note
Sqlg does not query the cache. If a gremlin query is executed while in batch mode the batch is first flushed. Take care not to query the graph while in batch mode as flushing often will defeat the purpose of batching in the first place.
custom api
sqlgGraph.tx().normalBatchModeOn();
sqlgGraph.tx().flush();

Create 10 000 000 Persons each with a car. 20 000 000 vertices and 10 000 000 edges in total.

@Test
public void showNormalBatchMode() {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    this.sqlgGraph.tx().normalBatchModeOn();
    for (int i = 1; i <= 10_000_000; i++) {
        Vertex person = this.sqlgGraph.addVertex(T.label, "Person", "name", "John" + i);
        Vertex car = this.sqlgGraph.addVertex(T.label, "Car", "name", "Dodge" + i);
        person.addEdge("drives", car);
        if (i % 100_000 == 0) { # (1)
            this.sqlgGraph.tx().flush(); # (1)
        }
    }
    this.sqlgGraph.tx().commit();
    stopWatch.stop();
    System.out.println(stopWatch.toString());
}
  1. To preserve memory commit or flush every so often.

output without edge foreign keys
Time taken: 0:05:48.889
output with edge foreign keys
Time taken: 0:02:33.313
memory

image of tinkerpop-classic

11.2. Streaming batch mode

Streaming batch writes any new vertex or edge immediately to Postgresql via its stdin api. I.e. the data is written directly to a Postgresql jdbc driver OutputStream.

Streaming batch mode does not use the Graph.addVertex method. Instead SqlgGraph.streamVertex is defined.

The transaction must be placed in streaming batch mode manually before any streaming batch modification can happen. SqlgGraph.tx().streamingBatchModeOn() After every commit() the transaction reverts to normal mode and must be placed into streaming batch mode again for streaming batch mode to continue.

The benefit of streaming mode is that the memory consumption is very low as nothing is cached. It is also somewhat faster than the normal batch mode (+/- 25% faster).

However the caveat is that, per transaction/thread only one label/table can be written between consecutive calls to SqlgTransaction.flush(). Further it is not possible to assign an id to the vertex or element. As such the SqlgGraph.streamVertex method returns void.

custom api
sqlgGraph.tx().streamingBatchModeOn();

Create 10 000 000 Persons and 10 000 000 cars.

@Test
public void showStreamingBatchMode() {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    //enable streaming mode
    this.sqlgGraph.tx().streamingBatchModeOn();
    for (int i = 1; i <= 10_000_000; i++) {
        this.sqlgGraph.streamVertex(T.label, "Person", "name", "John" + i);
    }
    this.sqlgGraph.tx().flush(); # (1)
    for (int i = 1; i <= 10_000_000; i++) {
        this.sqlgGraph.streamVertex(T.label, "Car", "name", "Dodge" + i);
    }
    this.sqlgGraph.tx().commit();
    stopWatch.stop();
    System.out.println(stopWatch.toString());
}
  1. flushing is needed before starting streaming Car. Only only one label/table can stream at a time.

output
Time taken: 0:00:42.014
memory

image of tinkerpop-classic

11.3. Bulk edge creation

To create an edge via the normal api a handle to the Vertex is needed. This is not always the case. In particula if the SqlgGraph.streamVertex api is used no handle to the Vertex is returned.

For this scenario there is a bulk edge creation method.

public <L, R> void bulkAddEdges(String outVertexLabel, String inVertexLabel, String edgeLabel, Pair<String, String> idFields, Collection<Pair<L, R>> uids) {
  • outLabel and inLabel specifies the out and in vertex labels that the edges will be between.

  • edgeLabel is the label of the edges to be created.

  • idFields specifies the fields that uniquely identify the out and in vertex.

  • uids are the actual unique identifies for each out/in vertex pairing.

Sqlg will then first copy the uids into a temporary table. Then it joins the temporary table on the out and in vertex tables to retrieve the in and out ids. These ids are then inserted into the edge table. All this happens on Postgresql, having minimal processing and memory impact on the java process.

The unique identifiers still have to be kept in memory, but its is not necessary to have the actual out and in vertices in memory.

Note
The unique identifiers do not need to be the vertices’s id. It can be any property as long as it is unique.
@Test
public void showBulkEdgeCreation() {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    int count = 0;
    for (int i = 1; i <= 10; i++) {
        List<Pair<String, String>> identifiers = new ArrayList<>();
        this.sqlgGraph.tx().streamingBatchModeOn();
        for (int j = 1; j <= 1_000_000; j++) {
            this.sqlgGraph.streamVertex(T.label, "Person", "name", "John" + count, "personUid", String.valueOf(count));
        }
        this.sqlgGraph.tx().flush();
        for (int j = 1; j <= 1_000_000; j++) {
            this.sqlgGraph.streamVertex(T.label, "Car", "name", "Dodge" + count, "carUid", String.valueOf(count));
            identifiers.add(Pair.of(String.valueOf(count), String.valueOf(count++)));
        }
        this.sqlgGraph.tx().flush();
        this.sqlgGraph.bulkAddEdges("Person", "Car", "drives", Pair.of("personUid", "carUid"), identifiers);
        this.sqlgGraph.tx().commit();
    }
    stopWatch.stop();
    System.out.println("Time taken: " + stopWatch.toString());
}
output (with edge foreign keys)
Time taken: 0:10:03.397
output (without edge foreign keys)
Time taken: 0:03:45.951
memory

image of tinkerpop-classic

11.4. Streaming with lock batch mode

Streaming with lock batch mode is similar to streaming batch mode. The difference being that the label/table being written to is locked. Locking the table ensures that no concurrent changes will occur on the table. This allows Sqlg to query the id sequence and assigned ids to the elements.

This means that the normal Vertex vertex = graph.addVertex(…​) method can be used. This is useful if a pointer to the new vertices are needed.

The transaction must be placed into streaming with lock batch mode manually before any streaming with lock batch modification can happen. SqlgGraph.tx().streamingWithLockBatchModeOn() After every commit() the transaction reverts to normal mode and must be placed into streaming batch mode again for streaming batch mode to continue.

custom api
sqlgGraph.tx().streamingWithLockBatchModeOn();
@Test
public void showStreamingWithLockBulkEdgeCreation() {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    int count = 0;
    for (int i = 1; i <= 10; i++) {
        List<Vertex> persons = new ArrayList<>();
        this.sqlgGraph.tx().streamingWithLockBatchModeOn();
        for (int j = 1; j <= 1_000_000; j++) {
            Vertex person = this.sqlgGraph.addVertex(T.label, "Person", "name", "John" + count);
            persons.add(person);
        }
        this.sqlgGraph.tx().flush();
        List<Vertex> cars = new ArrayList<>();
        for (int j = 1; j <= 1_000_000; j++) {
            Vertex car = this.sqlgGraph.addVertex(T.label, "Car", "name", "Dodge" + count++);
            cars.add(car);
        }
        this.sqlgGraph.tx().flush();
        Iterator<Vertex> carIter = cars.iterator();
        for (Vertex person : persons) {
            person.addEdge("drives", carIter.next());
        }
        this.sqlgGraph.tx().commit();
    }
    stopWatch.stop();
    System.out.println(stopWatch.toString());
}
output without edge foreign keys
Time taken: 0:02:42.363
memory

image of tinkerpop-classic

12. Topology

Sqlg stores the graph’s topology information in the graph itself as a graph. The topology is stored in the sqlg_schema schema.

UML diagram of Sqlg’s topology.

image of Sqlg’s topology

TinkerPop has no notion of schema or topology. However any TinkerPop graph has an implicit schema. Sqlg manages the schema as a first class construct.

Sqlg follows the normal TinkerPop semantics in that the schema does not need to be defined upfront. Every graph modification first checks to see if the element’s schema (label,name) exists. If not, it will create the element’s schema. For Postgresql this works well as it supports transactional schema creation/modification.

Warning
Hsqldb, H2 and MariaDb do not support transactional schema creation/modification. They will both silently commit the transaction and continue. This breaks the user’s transaction boundaries. For Hsqldb, H2 and MariaDb it is recommended to create the schema upfront.

It is possible to query and traverse the topology as a normal TinkerPop graph. To query the topology the TopologyStrategy is used. To facilitate ease of use, SqlgGraph.topology() method is added to enable the strategy. Being able to query the topology is helpful to understand a graph’s structure.

@Test
public void showTopologyTraversals() {
    Io.Builder<GraphSONIo> builder = GraphSONIo.build(GraphSONVersion.V3_0); (1)
    final GraphReader reader = sqlgGraph.io(builder).reader().create();
    try (final InputStream stream = AbstractGremlinTest.class.getResourceAsStream("/tinkerpop-modern-v3d0.json")) {
        reader.readGraph(stream, sqlgGraph);
    } catch (IOException e) {
        Assert.fail(e.getMessage());
    }
    System.out.println("//All vertex labels");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL) # (2)
            .forEachRemaining(
                    v -> System.out.println(v.<String>value(Topology.SQLG_SCHEMA_VERTEX_LABEL_NAME))
            );

    System.out.println("//All edge labels");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL)
            .out(Topology.SQLG_SCHEMA_OUT_EDGES_EDGE) # (3)
            .forEachRemaining(
                    v -> System.out.println(v.<String>value(Topology.SQLG_SCHEMA_EDGE_LABEL_NAME))
            );

    System.out.println("//'person' properties");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL)
            .has(Topology.SQLG_SCHEMA_VERTEX_LABEL_NAME, "person") # (4)
            .out(Topology.SQLG_SCHEMA_VERTEX_PROPERTIES_EDGE) # (5)
            .forEachRemaining(
                    v -> {
                        System.out.print(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_NAME) + " : ");
                        System.out.println(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_TYPE));
                    }
            );

    System.out.println("//'software' properties");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL)
            .has(Topology.SQLG_SCHEMA_VERTEX_LABEL_NAME, "software")
            .out(Topology.SQLG_SCHEMA_VERTEX_PROPERTIES_EDGE)
            .forEachRemaining(
                    v -> {
                        System.out.print(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_NAME) + " : ");
                        System.out.println(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_TYPE));
                    }
            );

    System.out.println("//'created' properties");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL) # (6)
            .out(Topology.SQLG_SCHEMA_OUT_EDGES_EDGE) # (7)
            .has(Topology.SQLG_SCHEMA_EDGE_LABEL_NAME, "created") # (8)
            .out(Topology.SQLG_SCHEMA_EDGE_PROPERTIES_EDGE) # (9)
            .forEachRemaining(
                    v -> {
                        System.out.print(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_NAME) + " : ");
                        System.out.println(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_TYPE));
                    }
            );

    System.out.println("//'knows' properties");
    sqlgGraph.topology().V()
            .hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_VERTEX_LABEL)
            .out(Topology.SQLG_SCHEMA_OUT_EDGES_EDGE)
            .has(Topology.SQLG_SCHEMA_EDGE_LABEL_NAME, "knows")
            .out(Topology.SQLG_SCHEMA_EDGE_PROPERTIES_EDGE)
            .forEachRemaining(
                    v -> {
                        System.out.print(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_NAME) + " : ");
                        System.out.println(v.<String>value(Topology.SQLG_SCHEMA_PROPERTY_TYPE));
                    }
            );

}
  1. Use TinkerPop’s i.o. infrastructure to load the modern graph.

  2. Find all VertexLabels, they are in sqlg_schema.vertex

  3. Traverse out on the out_edges edge to find all the edges. 'WARNING' this may produce duplicates as a single edge label may have many different distinct out vertex labels.

  4. Find the person vertex.

  5. Traverse out on the vertex_property edge to find the 'person' vertex labels properties.

  6. Find all vertex labels. i.e. vertices in sqlg_schema.vertex

  7. Traverse the out_edges edge.

  8. Filter the out edges for only the 'created' edges.

  9. Traverse the edge_properties edge to find the 'created' edge’s properties.

output
//All vertex labels
person
software
//All edge labels
knows
created
//'person' properties
name : STRING
age : INTEGER
//'software' properties
name : STRING
lang : STRING
//'created' properties
weight : DOUBLE
//'knows' properties
weight : DOUBLE

12.1. Topology eager creation

It is often useful to create the topology upfront. The topology creation api is accessed via the Topology object. It is a singleton. Topology topology = sqlgGraph.getTopology(); To create new topology objects use the ensureXXX methods. They will return a topology object representing the specific topology element. i.e. Schema, VertexLabel, EdgeLabel, PropertyColumn, Index

Note
The ensureXXX methods will create the topology object if it does not exists. If it does exist it will simply return the relevant topology object. On any topology object one can call isCommitted or isUncommitted to check the state of the object. committed indicates that it already exists. uncommitted indicates that it has been created in the current active transaction.
eg
@Test
public void createModernTopology() {
    Topology topology = this.sqlgGraph.getTopology(); # (1)
    VertexLabel personVertexLabel = topology.ensureVertexLabelExist("public", "person", new HashMap<String, PropertyType>() {{
        put("name", PropertyType.STRING);
        put("age", PropertyType.INTEGER);
    }}); # (2)
    VertexLabel softwareVertexLabel = topology.ensureVertexLabelExist("public", "software", new HashMap<String, PropertyType>() {{
        put("name", PropertyType.STRING);
        put("lang", PropertyType.STRING);
    }});
    EdgeLabel createdEdgeLabel = personVertexLabel.ensureEdgeLabelExist("created", softwareVertexLabel, new HashMap<String, PropertyType>() {{
        put("weight", PropertyType.DOUBLE);
    }}); # (3)
    EdgeLabel knowsEdgeLabel = personVertexLabel.ensureEdgeLabelExist("knows", personVertexLabel, new HashMap<String, PropertyType>() {{
        put("weight", PropertyType.DOUBLE);
    }});
    this.sqlgGraph.tx().commit(); # (4)
}
  1. Get the Topology object.

  2. Create the 'person' VertexLabel. The HashMap<String, PropertyType> defines the 'person''s properties.

  3. Create the 'created' EdgeLabel. The format is outVertexLabel.ensureEdgeLabelExist(name, inVertexLabel, properties)

  4. Be sure to commit the transaction. Postgresql and MSSqlServer supports transactional schema creation. Hsqldb,H2 and MariaDB do not.

@Test
public void generalTopologyCreationWithSchema() {
    Schema schema = this.sqlgGraph.getTopology().ensureSchemaExist("Humans"); # (1)
    VertexLabel personVertexLabel = schema.ensureVertexLabelExist("Person", new HashMap<String, PropertyType>() {{
        put("name", PropertyType.STRING);
        put("date", PropertyType.LOCALDATE);
    }}); # (2)
    this.sqlgGraph.tx().commit();
}
  1. Create the 'Humans' schema

  2. Create the 'Person' VertexLabel via the Schema object.

Sqlg keeps an in-memory cache of the graphs entire topology. It is possible query this cache directly.

@Test
public void queryCache() {
    loadModern();
    Optional<Schema> publicSchema = this.sqlgGraph.getTopology().getSchema(this.sqlgGraph.getSqlDialect().getPublicSchema()); # (1)
    assertTrue(publicSchema.isPresent());
    Schema publicSchemaViaShortCut = this.sqlgGraph.getTopology().getPublicSchema(); # (2)
    Optional<VertexLabel> personVertexLabel = publicSchema.get().getVertexLabel("person"); # (3)
    assertTrue(personVertexLabel.isPresent());
    Optional<EdgeLabel> createEdgeLabel = personVertexLabel.get().getOutEdgeLabel("created"); # (4)
    assertTrue(createEdgeLabel.isPresent());
    Optional<EdgeLabel> knowsEdgeLabel = personVertexLabel.get().getOutEdgeLabel("knows"); # (5)
    assertTrue(knowsEdgeLabel.isPresent());

    Optional<PropertyColumn> namePropertyColumn = personVertexLabel.get().getProperty("name"); # (6)
    assertTrue(namePropertyColumn.isPresent());
    assertEquals(PropertyType.STRING, namePropertyColumn.get().getPropertyType()); # (7)
    Optional<PropertyColumn> agePropertyColumn = personVertexLabel.get().getProperty("age");
    assertTrue(agePropertyColumn.isPresent());
    assertEquals(PropertyType.INTEGER, agePropertyColumn.get().getPropertyType());
    Optional<PropertyColumn> weightPropertyColumn = createEdgeLabel.get().getProperty("weight");
    assertTrue(weightPropertyColumn.isPresent());
    assertEquals(PropertyType.DOUBLE, weightPropertyColumn.get().getPropertyType());
}
  1. Get the 'public' schema object.

  2. Because the 'public' schema will always exist there is a shortcut method to get it.

  3. Use the 'Schema' object the get the 'person' VertexLabel

  4. Use the 'person' VertexLabel to get its 'created' out edge.

  5. Use the 'person' VertexLabel to get its 'knows' out edge.

  6. Use the 'person' VertexLabel to get its 'name' property. Properties are represented by the PropertyColumn class.

  7. On the PropertyColumn object one can get the PropertyType. PropertyType is an enum representing all data types supported by Sqlg.

12.2. User supplied identifiers

You can define your own identifiers for a VertexLabel or EdgeLabel. This will result in Sqlg generating primary keys on the specified identifiers instead of using an auto generated sequence.

eg.
@Test
public void testUserSuppliedIds() {
    VertexLabel personVertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensureVertexLabelExist(
            "Person",
            new LinkedHashMap<>() {{
                put("name", PropertyType.STRING);
                put("surname", PropertyType.STRING);
                put("nickname", PropertyType.STRING);
            }},
            ListOrderedSet.listOrderedSet(Arrays.asList("name", "surname")) # (1)
    );
    personVertexLabel.ensureEdgeLabelExist(
            "marriedTo",
            personVertexLabel,
            new LinkedHashMap<>() {{
                put("place", PropertyType.STRING);
                put("when", PropertyType.LOCALDATETIME);
            }},
            ListOrderedSet.listOrderedSet(List.of("place", "when")) # (2)
    );
    this.sqlgGraph.tx().commit();

    Vertex john = this.sqlgGraph.addVertex(T.label, "Person", "name", "John", "surname", "Longfellow", "nickname", "Longboy");
    Vertex sue = this.sqlgGraph.addVertex(T.label, "Person", "name", "Sue", "surname", "Pretty");
    john.addEdge("marriedTo", sue, "place", "Timbuck2", "when", LocalDateTime.now());
    this.sqlgGraph.tx().commit();

    List<Vertex> marriedTo = this.sqlgGraph.traversal().V().hasLabel("Person")
            .has("name", "John")
            .out("marriedTo")
            .toList();
    Assert.assertEquals(1, marriedTo.size());
    Assert.assertEquals(sue, marriedTo.get(0));
}
  1. Specify the name and surname properties as the primary key for the Person vertex label.

  2. Specify the place and when properties as the primary key for the marriedTo edge label.

This will generate a table with name and surname, and place and when as composite primary keys.

CREATE TABLE public."V_Person"
(
    name text COLLATE pg_catalog."default" NOT NULL,
    surname text COLLATE pg_catalog."default" NOT NULL,
    nickname text COLLATE pg_catalog."default",
    CONSTRAINT "V_Person_pkey" PRIMARY KEY (name, surname)
);

CREATE TABLE public."E_marriedTo"
(
    place text COLLATE pg_catalog."default" NOT NULL,
    "when" timestamp without time zone NOT NULL,
    "public.Person.name__I" text COLLATE pg_catalog."default",
    "public.Person.surname__I" text COLLATE pg_catalog."default",
    "public.Person.name__O" text COLLATE pg_catalog."default",
    "public.Person.surname__O" text COLLATE pg_catalog."default",
    CONSTRAINT "E_marriedTo_pkey" PRIMARY KEY (place, "when"),
    CONSTRAINT "E_marriedTo_public.Person.name__I_public.Person.surname__I_fkey" FOREIGN KEY ("public.Person.name__I", "public.Person.surname__I")
        REFERENCES public."V_Person" (name, surname) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE,
    CONSTRAINT "E_marriedTo_public.Person.name__O_public.Person.surname__O_fkey" FOREIGN KEY ("public.Person.name__O", "public.Person.surname__O")
        REFERENCES public."V_Person" (name, surname) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE
)

The gremlin query will execute the following sql,

SELECT
	a2."alias1", a2."alias2", a2."alias3"
FROM (
SELECT
	"public"."E_marriedTo"."public.Person.name__I" AS "public.E_marriedTo.public.Person.name__I",
	"public"."E_marriedTo"."public.Person.surname__I" AS "public.E_marriedTo.public.Person.surname__I"
FROM
	"public"."V_Person" INNER JOIN
	"public"."E_marriedTo" ON "public"."V_Person"."name" = "public"."E_marriedTo"."public.Person.name__O" AND "public"."V_Person"."surname" = "public"."E_marriedTo"."public.Person.surname__O"
WHERE
	( "public"."V_Person"."name" = ?)
) a1 INNER JOIN (
SELECT
	"public"."V_Person"."name" AS "alias1",
	"public"."V_Person"."surname" AS "alias2",
	"public"."V_Person"."nickname" AS "alias3"
FROM
	"public"."V_Person"
) a2 ON a1."public.E_marriedTo.public.Person.name__I" = a2."alias1" AND a1."public.E_marriedTo.public.Person.surname__I" = a2."alias2"

12.3. Lock the topology

It is possible to lock the topology. This will prevent any creation of topology elements without explicitly unlocking the topology. The topology can only be locked globally. However, there are two ways to unlock the topology. Either globally or per transaction. Unlocking the topology on the transaction unlocks the topology only for the current transaction. This means that globally the topology remains locked and that other threads (transactions) will not be able to change the topology. To do so they too will have to first unlock the topology. There is no need to lock the topology again for the transaction. It will automatically do so on commit or rollback.

@Test
public void topologyGlobalLockUnlock() {
    this.sqlgGraph.getTopology().lock(); # (1)
    try {
        this.sqlgGraph.addVertex(T.label, "A"); # (2)
        Assert.fail("Expected IllegalStateException");
    } catch (IllegalStateException e) {
        //The topology is locked so an IllegalStateException is thrown.
    }
    this.sqlgGraph.getTopology().unlock(); # (3)
    this.sqlgGraph.addVertex(T.label, "A"); # (4)
    this.sqlgGraph.tx().commit();
    Assert.assertTrue(this.sqlgGraph.getTopology()
            .getPublicSchema()
            .getVertexLabel("A")
            .isPresent());
}
  1. Globally lock the topology.

  2. Try to create a VertexLabel 'A', it will fail with an IllegalStateException as the topology is locked.

  3. Globally unlock the topology.

  4. Create a VertexLabel 'A'. This time it will succeed as the topology is not locked.

@Test
public void topologyGlobalLockTransactionUnlock() {
    this.sqlgGraph.getTopology().lock(); # (1)
    this.sqlgGraph.tx().unlockTopology(); # (2)
    this.sqlgGraph.addVertex(T.label, "A"); # (3)
    this.sqlgGraph.tx().commit(); # (4)
    Assert.assertTrue(this.sqlgGraph.getTopology()
            .getPublicSchema()
            .getVertexLabel("A")
            .isPresent());
}
  1. Globally lock the topology.

  2. Unlock the topology only for the current transaction.

  3. Create a VertexLabel 'A'. This will succeed for the current transaction.

  4. On commit the unlockTopology is automatically no longer valid.

13. Postgresql Partitioning

Sqlg supports postgresql partitioning. To partition a table it needs to be created upfront using the Topology api. Sqlg currently supports RANGE, LIST and HASH partitions.

13.1. Range partitioning

@Test
public void testPartitioningRange() {
    Schema publicSchema = this.sqlgGraph.getTopology().getPublicSchema(); # (1)
    VertexLabel partitionedVertexLabel = publicSchema.ensurePartitionedVertexLabelExist(
            "Measurement",
            new LinkedHashMap<String, PropertyType>() {{
                put("date", PropertyType.LOCALDATE);
                put("temp", PropertyType.INTEGER);
            }},
            ListOrderedSet.listOrderedSet(Collections.singletonList("date")),
            PartitionType.RANGE, #(2)
            "date"); # (3)
    partitionedVertexLabel.ensureRangePartitionExists("measurement1", "'2016-07-01'", "'2016-08-01'"); # (4)
    partitionedVertexLabel.ensureRangePartitionExists("measurement2", "'2016-08-01'", "'2016-09-01'"); # (5)
    this.sqlgGraph.tx().commit();

    LocalDate localDate1 = LocalDate.of(2016, 7, 1);
    this.sqlgGraph.addVertex(T.label, "Measurement", "date", localDate1);
    LocalDate localDate2 = LocalDate.of(2016, 8, 1);
    this.sqlgGraph.addVertex(T.label, "Measurement", "date", localDate2);
    this.sqlgGraph.tx().commit();

    Assert.assertEquals(2, this.sqlgGraph.traversal().V().hasLabel("Measurement").count().next(), 0);
    Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Measurement").has("date", localDate1).count().next(), 0);
    Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Measurement").has("date", localDate2).count().next(), 0);

    Partition partition = this.sqlgGraph.getTopology().getPublicSchema().getVertexLabel("Measurement").get().getPartition("measurement1").get(); # (6)
    partition.remove(); # (7)
    this.sqlgGraph.tx().commit();

    Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Measurement").count().next(), 0);
    Assert.assertEquals(0, this.sqlgGraph.traversal().V().hasLabel("Measurement").has("date", localDate1).count().next(), 0);
    Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Measurement").has("date", localDate2).count().next(), 0);

    Assert.assertEquals(1, this.sqlgGraph.topology().V().hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_PARTITION).count().next(), 0); # (8)
}
  1. Get the 'public' schema object.

  2. Indicates a RANGE partition.

  3. Create a VertexLabel with a range partition on the date field.

  4. Create a named partition for the range '2016-07-01' to '2016-08-01'.

  5. Create a named partition for the range '2016-08-01' to '2016-09-01'.

  6. Using the Topology api get the measurement1 partition.

  7. Remove the measurement1 partition.

  8. Assert that `Sqlg`s topology only has one partition.

13.2. List partitioning

//the partitionExpression 'left(lower(name), 1)' is to complex for the query planner to optimize.
//i.e. select * from Cities where name = 'asdasd' willscan all partitions.
@Test
public void testPartitioningList() {
    Schema publicSchema = this.sqlgGraph.getTopology().getPublicSchema();
    VertexLabel partitionedVertexLabel = publicSchema.ensurePartitionedVertexLabelExist("Cities",
            new LinkedHashMap<String, PropertyType>() {{
                put("name", PropertyType.STRING);
                put("population", PropertyType.LONG);
            }},
            ListOrderedSet.listOrderedSet(Collections.singletonList("name")),
            PartitionType.LIST, # (1)
            "left(lower(name), 1)"); # (2)
    partitionedVertexLabel.ensureListPartitionExists("Cities_a", "'a'"); # (3)
    partitionedVertexLabel.ensureListPartitionExists("Cities_b", "'b'");
    partitionedVertexLabel.ensureListPartitionExists("Cities_c", "'c'");
    partitionedVertexLabel.ensureListPartitionExists("Cities_d", "'d'");
    this.sqlgGraph.tx().commit();

    this.sqlgGraph.tx().normalBatchModeOn();
    for (int i = 0; i < 100; i++) {
        this.sqlgGraph.addVertex(T.label, "Cities", "name", "aasbc", "population", 1000L);
    }
    this.sqlgGraph.addVertex(T.label, "Cities", "name", "basbc", "population", 1000L);
    for (int i = 0; i < 100; i++) {
        this.sqlgGraph.addVertex(T.label, "Cities", "name", "casbc", "population", 1000L);
    }
    this.sqlgGraph.addVertex(T.label, "Cities", "name", "dasbc", "population", 1000L);
    this.sqlgGraph.tx().commit();

    Assert.assertEquals(202, this.sqlgGraph.traversal().V().hasLabel("Cities").count().next(), 0);
    Assert.assertEquals(100, this.sqlgGraph.traversal().V().hasLabel("Cities").has("name", "aasbc").count().next(), 0);
    Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("Cities").has("name", "basbc").count().next(), 0);
    Assert.assertEquals(100, this.sqlgGraph.traversal().V().hasLabel("Cities").has("name", "casbc").count().next(), 0);

    Partition partition = this.sqlgGraph.getTopology().getPublicSchema().getVertexLabel("Cities").get().getPartition("Cities_a").get();
    partition.remove();
    this.sqlgGraph.tx().commit();

    Assert.assertEquals(102, this.sqlgGraph.traversal().V().hasLabel("Cities").count().next(), 0);
    Assert.assertEquals(3, this.sqlgGraph.topology().V().hasLabel(Topology.SQLG_SCHEMA + "." + Topology.SQLG_SCHEMA_PARTITION).count().next(), 0);
}
  1. Indicates a LIST partition.

  2. The partition expression.

  3. Create a named partition for the list entry 'a'.

13.3. Hash partitioning

@Test
public void testPartitioningHash() {
        VertexLabel vertexLabel = this.sqlgGraph.getTopology().getPublicSchema().ensurePartitionedVertexLabelExist(
                "A",
                new LinkedHashMap<>() {{
                    put("uid1", PropertyType.INTEGER);
                    put("name", PropertyType.STRING);
                    put("surname", PropertyType.STRING);
                }},
                ListOrderedSet.listOrderedSet(List.of("uid1")),
                PartitionType.HASH, (1)
                "\"uid1\"" (2)
        );
        for (int i = 0; i < 10; i++) {
            vertexLabel.ensureHashPartitionExists("hashPartition" + i, 10, i); (3)
        }
        this.sqlgGraph.tx().commit();
        for (int i = 0; i < 1000; i++) {
            this.sqlgGraph.addVertex(T.label, "A", "uid1", i, "name", "name" + i, "surname", "surname" + i);
        }
        this.sqlgGraph.tx().commit();
        Assert.assertEquals(1000, this.sqlgGraph.traversal().V().hasLabel("A").count().next(), 0);

        Connection connection = this.sqlgGraph.tx().getConnection();
        try (Statement s = connection.createStatement()) {
            ResultSet rs = s.executeQuery("select tableoid::regclass as partition_name, count(*) from \"V_A\" group by 1 order by 1;"); (4)
            int count = 0;
            Map<String, Long> partitionDistributionCount = new HashMap<>();
            while (rs.next()) {
                count++;
                partitionDistributionCount.put(rs.getString(1), rs.getLong(2));
            }
            Assert.assertEquals(10, count); (5)
            Assert.assertEquals(10, partitionDistributionCount.size());
            for (int i = 0; i < 10; i++) {
                Assert.assertTrue(partitionDistributionCount.containsKey("\"hashPartition" + i + "\""));
            }
            Assert.assertEquals(100, partitionDistributionCount.get("\"hashPartition0\""), 0);
            Assert.assertEquals(92, partitionDistributionCount.get("\"hashPartition1\""), 0);
            Assert.assertEquals(103, partitionDistributionCount.get("\"hashPartition2\""), 0);
            Assert.assertEquals(88, partitionDistributionCount.get("\"hashPartition3\""), 0);
            Assert.assertEquals(113, partitionDistributionCount.get("\"hashPartition4\""), 0);
            Assert.assertEquals(90, partitionDistributionCount.get("\"hashPartition5\""), 0);
            Assert.assertEquals(119, partitionDistributionCount.get("\"hashPartition6\""), 0);
            Assert.assertEquals(92, partitionDistributionCount.get("\"hashPartition7\""), 0);
            Assert.assertEquals(100, partitionDistributionCount.get("\"hashPartition8\""), 0);
            Assert.assertEquals(103, partitionDistributionCount.get("\"hashPartition9\""), 0);
        } catch (SQLException throwables) {
            Assert.fail(throwables.getMessage());
        }

}
  1. Indicates a HASH partition.

  2. The partition expression.

  3. Create a named partition for the hash entry with it modulus and remainder.

  4. Fetch the partitions for the assertion

  5. Assert that there are as many partitions as the modulus

14. Postgresql Foreign Data Wrappers

Sqlg supports postgres-fdw. This allows the graph to scale horizontally over multiple databases.

Note
  • postgres_fdw does not support distributed transactions.

  • It is not possible to use SERIAL id fields for inserting data into a remote database via postgres_fdw

  • It is possible to read VertexLabel(s) and EdgeLabel(s) with a SERIAL id field from a remote database via postgres_fdw.

14.1. How it works

Sqlg imports the foreign graph schema/topology, either complete schemas or a subset of VertexLabel(s) and EdgeLabel(s) This allows Sqlg to refer to parts of the foreign graph as though it is local. The rest, postgresql takes care of by forwarding sql statements to the remote server.

14.2. Prepare

The user is responsible to prepare the environment. i.e.

CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER "sqlgraph_fwd_server" FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'sqlgraphdb_fdw', port '5432')
CREATE USER MAPPING FOR postgres SERVER "sqlgraph_fwd_server" OPTIONS (user 'postgres', password 'postgres');

14.3. Import a foreign schema

First we need to create 2 separate graphs pointing to different databases.

Note

We are creating the VertexLabel and EdgeLabel using identifiers as it is not possible to insert into a SERIAL auto generated primary key.

@Test
public void testImportForeignSchema() {
    PropertiesConfiguration properties = new PropertiesConfiguration();
    properties.setProperty("jdbc.url", "jdbc:postgresql://localhost:5432/sqlgraphdb");
    properties.setProperty("jdbc.username", "postgres");
    properties.setProperty("jdbc.password", "postgres");
    SqlgGraph sqlgGraph = SqlgGraph.open(properties); (1)

    PropertiesConfiguration propertiesForeign = new PropertiesConfiguration();
    propertiesForeign.setProperty("jdbc.url", "jdbc:postgresql://localhost:5432/sqlgraphdb_fdw");
    propertiesForeign.setProperty("jdbc.username", "postgres");
    propertiesForeign.setProperty("jdbc.password", "postgres");
    SqlgGraph sqlgGraphForeign = SqlgGraph.open(propertiesForeign); (2)

    Schema foreignSchemaA = sqlgGraphForeign.getTopology().ensureSchemaExist("A"); (3)
    VertexLabel aVertexLabel = foreignSchemaA.ensureVertexLabelExist("A", new HashMap<>() {{
                put("ID", PropertyType.UUID);
                put("name", PropertyType.STRING);
            }},
            ListOrderedSet.listOrderedSet(List.of("ID"))
    ); (4)
    VertexLabel bVertexLabel = foreignSchemaA.ensureVertexLabelExist("B", new HashMap<>() {{
                put("ID", PropertyType.UUID);
                put("name", PropertyType.STRING);
            }},
            ListOrderedSet.listOrderedSet(List.of("ID"))
    ); (5)
    aVertexLabel.ensureEdgeLabelExist("ab", bVertexLabel, new HashMap<>() {{
                put("ID", PropertyType.UUID);
                put("name", PropertyType.STRING);
            }}, ListOrderedSet.listOrderedSet(Set.of("ID"))
    ); (6)
    sqlgGraphForeign.tx().commit();

    Connection connection = sqlgGraph.tx().getConnection();
    try (Statement statement = connection.createStatement()) {
        String sql = String.format(
                "CREATE SCHEMA \"%s\";",
                "A"
        ); (7)
        statement.execute(sql);
        sql = String.format(
                "IMPORT FOREIGN SCHEMA \"%s\" FROM SERVER \"%s\" INTO \"%s\";",
                "A",
                "sqlgraph_fwd_server",
                "A"
        ); (8)
        statement.execute(sql);
    } catch (SQLException e) {
        Assert.fail(e.getMessage());
    }
    sqlgGraph.tx().commit();
    sqlgGraph.getTopology().importForeignSchemas(Set.of(foreignSchemaA)); (9)

    Assert.assertTrue(sqlgGraph.getTopology().getSchema("A").isPresent());
    Assert.assertTrue(sqlgGraph.getTopology().getSchema("A").orElseThrow().getVertexLabel("A").isPresent());
    Assert.assertTrue(sqlgGraph.getTopology().getSchema("A").orElseThrow().getVertexLabel("B").isPresent());
    Assert.assertTrue(sqlgGraph.getTopology().getSchema("A").orElseThrow().getEdgeLabel("ab").isPresent()); (10)

    Vertex aVertex = sqlgGraph.addVertex(T.label, "A.A", "ID", UUID.randomUUID(), "name", "John");
    Vertex bVertex = sqlgGraph.addVertex(T.label, "A.B", "ID", UUID.randomUUID(), "name", "Joe");
    aVertex.addEdge("ab", bVertex, "ID", UUID.randomUUID(), "name", "myEdge");
    sqlgGraph.tx().commit();
    Assert.assertEquals(1L, sqlgGraph.traversal().V()
            .hasLabel("A.A")
            .has("name", P.eq("John"))
            .out("ab")
            .count().next(), 0); (11)
}
  1. A SqlgGraph pointing to sqlgraphdb

  2. A SqlgGraph pointing to sqlgraphdb_fwd

  3. Create a Schema A in `sqlgrapdb_fdw

  4. Create a VertexLabel A in sqlgraphdb_fdw

  5. Create a VertexLabel B in sqlgraphdb_fdw

  6. Create a EdgeLabel ab in sqlgraphdb_fdw

  7. Create the schema A in sqlgraphdb

  8. Import the foreign schema A in sqlgraphdb_fdw into A in sqlgraphdb

  9. Inform SqlgGraph of sqlgraphdb about the foreign elements it can refer to as local.

  10. Assert that the foreign schema elements are available to the local SqlgGraph pointing to sqlgraphdb.

  11. Insert and query data into sqlgraphdb_fdw using the local SqlgGraph on sqlgraphdb

@Test
public void testImportForeignSchemaWithSERIAL() {
    PropertiesConfiguration properties = new PropertiesConfiguration();
    properties.setProperty("jdbc.url", "jdbc:postgresql://localhost:5432/sqlgraphdb");
    properties.setProperty("jdbc.username", "postgres");
    properties.setProperty("jdbc.password", "postgres");
    SqlgGraph sqlgGraph = SqlgGraph.open(properties); (1)

    PropertiesConfiguration propertiesForeign = new PropertiesConfiguration();
    propertiesForeign.setProperty("jdbc.url", "jdbc:postgresql://localhost:5432/sqlgraphdb_fdw");
    propertiesForeign.setProperty("jdbc.username", "postgres");
    propertiesForeign.setProperty("jdbc.password", "postgres");
    SqlgGraph sqlgGraphForeign = SqlgGraph.open(propertiesForeign); (2)

    Schema foreignSchemaA = sqlgGraphForeign.getTopology().ensureSchemaExist("A"); (3)
    VertexLabel aVertexLabel = foreignSchemaA.ensureVertexLabelExist("A",
            new HashMap<>() {{
                put("name", PropertyType.STRING);
            }}
    ); (4)
    VertexLabel bVertexLabel = foreignSchemaA.ensureVertexLabelExist("B",
            new HashMap<>() {{
                put("name", PropertyType.STRING);
            }}
    ); (5)
    aVertexLabel.ensureEdgeLabelExist("ab", bVertexLabel,
            new HashMap<>() {{
                put("name", PropertyType.STRING);
            }}
    ); (6)
    sqlgGraphForeign.tx().commit();

    Connection connection = sqlgGraph.tx().getConnection();
    try (Statement statement = connection.createStatement()) {
        String sql = String.format(
                "CREATE SCHEMA \"%s\";",
                "A"
        ); (7)
        statement.execute(sql);
        sql = String.format(
                "IMPORT FOREIGN SCHEMA \"%s\" FROM SERVER \"%s\" INTO \"%s\";",
                "A",
                "sqlgraph_fwd_server",
                "A"
        ); (8)
        statement.execute(sql);
    } catch (SQLException e) {
        Assert.fail(e.getMessage());
    }
    sqlgGraph.tx().commit();
    sqlgGraph.getTopology().importForeignSchemas(Set.of(foreignSchemaA)); (9)

    Vertex aVertex = sqlgGraphForeign.addVertex(T.label, "A.A", "name", "John");
    Vertex bVertex = sqlgGraphForeign.addVertex(T.label, "A.B", "name", "Joe");
    aVertex.addEdge("ab", bVertex, "name", "myEdge");
    sqlgGraphForeign.tx().commit(); (10)

    Assert.assertEquals(1L, sqlgGraph.traversal().V()
            .hasLabel("A.A")
            .has("name", P.eq("John"))
            .out("ab")
            .count().next(), 0); (11)
}
  1. A SqlgGraph pointing to sqlgraphdb

  2. A SqlgGraph pointing to sqlgraphdb_fwd

  3. Create a Schema A in `sqlgrapdb_fdw

  4. Create a VertexLabel A with a SERIAL id in sqlgraphdb_fdw

  5. Create a VertexLabel B with a SERIAL id in sqlgraphdb_fdw

  6. Create a EdgeLabel ab with a SERIAL id in sqlgraphdb_fdw

  7. Create the schema A in sqlgraphdb

  8. Import the foreign schema A in sqlgraphdb_fdw into A in sqlgraphdb

  9. Inform SqlgGraph of sqlgraphdb about the foreign elements it can refer to as local.

  10. Insert the data directly into sqlgraphdb_fdw

  11. Read the data from sqlgraphdb via postgres_fdw

15. Sqlg ui

Sqlg includes a basic ui to visualize and delete/remove schema elements.

Warning
Sqlg' ui is very dangerous as it allows for deletion of schema elements including RDBMS schemas. Use with care and ideally do not expose to a wide audience.

15.1. Startup

The ui uses Sparkjava as its web framework.

There are two ways in which to start the ui.

Embedded

To use a completely standalone Jetty you can run the following code,

//SqlgUI.initialize(); (1)
SqlgUI.initialize(8181);
SqlgUI.set(sqlgGraph);
  1. The default port is 4567

To use an existing embedded Jetty you can use the following setup.

 //Define your filer
FilterHolder filterHolderSqlgUI = contextHandler.addFilter("spark.servlet.SparkFilter", "/sqlg/*", EnumSet.of(DispatcherType.REQUEST));
filterHolderSqlgUI.setInitParameter("applicationClass", "org.umlg.sqlg.ui.SparkResources");

//Websocket servlet
ServletHolder websocketServletHolder = new ServletHolder(new SqlgWebsocketServlet());
websocketServletHolder.setName("Sqlg-ui websocket servlet");
contextHandler.addServlet(websocketServletHolder, "/sqlg/data/v1/websocket");

...

SqlgUI.set(sqlgGraph);
Webserver
<filter>
  <filter-name>SparkFilter</filter-name>
  <filter-class>spark.servlet.SparkFilter</filter-class>
  <init-param>
    <param-name>applicationClass</param-name>
    <param-value>com.company.YourApplication</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>SparkFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
Warning
The webserver mode has not been tested, nor do I know how the websocket will be made to work.

The ui is accessible at

http://ip:port/sqlg/v1/

15.2. Authentication

The ui uses cookie authentication.

To define the users that are allowed to use the ui there must be a corresponding property in sqlg.properties

sqlg.ui.username.john=john_password
sqlg.ui.username.peter=peter_password

For the user to be allowed to do any editing there must be the following property in sqlg.properties

sqlg.ui.username.john.edit=true
sqlg.ui.username.peter.edit=true

There is one additional property which specifies how long the cookie remains valid for.

sqlg.ui.cookie.expiry=3600

16. Gremlin server

As sqlg implements TinkerPop you can access it via gremlin server.

Warning
At present non of Sqlg’s custom features are available via gremlin server. i.e. No batch mode and no postgresql partitions.

16.1. Minimal config

gremlin-server-min.yaml

host: localhost
port: 8182
graphs: {
  graph: src/test/resources/conf/sqlg.properties}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.umlg.sqlg.groovy.plugin.SqlgPostgresGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [src/test/resources/conf/gremlin-server-sqlg-postgres.groovy]} } } }
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { ioRegistries: [org.umlg.sqlg.structure.SqlgIoRegistryV3] } }        # application/vnd.graphbinary-v1.0
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.umlg.sqlg.structure.SqlgIoRegistryV3] } }        # application/json

gremlin-server-postgresql.yaml

host: gremlin-server
port: 8182
evaluationTimeout: 30000
# channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer
graphs: {
  graph: /path/to/gremlin-server-sqlg-postgres.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.umlg.sqlg.groovy.plugin.SqlgPostgresGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.groovy.jsr223.GroovyCompilerGremlinPlugin: {enableThreadInterrupt: true},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [/path/to/gremlin-server-sqlg-postgres.groovy]}
    }
  }
}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}            # application/vnd.gremlin-v3.0+gryo
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}                                                                      # application/vnd.gremlin-v3.0+gryo-stringd
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}        # application/json
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 }                                                                                                           # application/vnd.graphbinary-v1.0
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }}                                                                 # application/vnd.graphbinary-v1.0-stringd
metrics: {
  consoleReporter: {enabled: false, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: true, interval: 180000}
}
strictTransactionManagement: false
idleConnectionTimeout: 0
keepAliveInterval: 0
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64

gremlin-server-postgresql.groovy

def globals = [:]

globals << [g : graph.traversal()]