⚠

EXPERIMENTAL: These features are experimental and should not be used in production systems.

Graph Library: Creating Graphs

Goal

This guide demonstrates how to use the Graph Library’s graph constructor modules, load graphs from external data sources, such as a CSV file, and create graphs from knowledge graphs stored in RelationalAI’s Relational Knowledge Graph System.

Introduction

Graphs are modules containing node, edge, and is_directed relations. For convenience, the Graph Library provides graph constructors that build modules with the correct schema. See Graphs: Schema for details.

Edge sets are usually built from base relations in a database. In most cases, edge sets and graphs are persisted to a model so that they can be maintained separately from, and reused in, other queries. This offers performance benefits and enables automatic updates to the graph when data is inserted into or deleted from the base relations from which the graph is built.

Graph Constructors

The Graph Library provides two graph constructors:

undirected_graph for creating undirected graphs.
directed_graph for creating directed graphs.

Constructors are template modules. They are always instantiated with an edge set provided as an argument. The result is a module containing the appropriate node, edge, and is_directed relations.

Undirected Graphs

To create an undirected graph, apply the undirected_graph constructor module to a binary edge relation:

// read query
 
// Declare an edge set.
def my_edges = {(1, 2); (2, 3)}
 
// Create an undirected graph with edges from my_edges and display the contents.
def my_undirected_graph = undirected_graph[my_edges]
def output = my_undirected_graph

Note that is_directed does not appear in the graph module since the graph is undirected.

Edges in an undirected graph represent a bidirectional relationship between nodes. In Rel, edge sets of undirected graphs are modeled as symmetric relations. In other words, if (x, y) is a tuple in the graph’s edge set, then so is (y, x). This is why my_undirected_graph:edge contains four tuples. See Undirected Edges for more information.

⚠

Always use the num_edges relation to count the edges in a graph. Using count[my_graph:edge] may double count edges in undirected graphs. See Count Nodes and Edges for details.
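
As a minimal sketch of the difference, the following read query compares the raw tuple count with num_edges for a two-edge undirected graph. The names my_graph and my_graphlib are chosen here for illustration only. Because each undirected edge is stored as two tuples, the first value should be 4 and the second should be 2:

```rel
// read query

// Declare an edge set with two edges.
def my_edges = {(1, 2); (2, 3)}

// Build an undirected graph and instantiate the Graph Library on it.
def my_graph = undirected_graph[my_edges]

@inline
def my_graphlib = rel:graphlib[my_graph]

// First value: number of tuples in the symmetric edge relation.
// Second value: number of edges reported by the Graph Library.
def output = count[my_graph:edge], my_graphlib:num_edges
```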

The relation my_undirected_graph is equivalent to the following module declaration:

// read query
 
// Declare an undirected graph using module syntax.
module my_undirected_graph
    def node = {1; 2; 3}
    def edge = {(1, 2); (2, 3)}
    def edge = transpose[edge]  // Ensure edge is symmetric.
end

The rule def edge = transpose[edge] uses the Standard Library’s transpose relation to ensure that edge includes the reverse of each tuple on the previous line.

Directed Graphs

To create a directed graph, apply the directed_graph constructor module to a binary edge relation:

// read query
 
// Declare an edge set.
def my_edges = {(1, 2); (2, 3)}
 
// Create a directed graph with edges from `my_edges` and display the contents.
def my_directed_graph = directed_graph[my_edges]
def output = my_directed_graph

In a directed graph, edges indicate a unidirectional relationship between nodes. Unlike in an undirected graph, each edge in a directed graph corresponds to a single tuple in the edge set.

The relation my_directed_graph is equivalent to the following module declaration:

// read query
 
// Declare a directed graph using module syntax.
module my_directed_graph
    def node = {1; 2; 3}
    def edge = {(1, 2); (2, 3)}
    def is_directed = true
end

💡

The relation true is equivalent to {()}, the relation containing a single empty tuple. See Boolean Constant Relations for more information.

Common Scenarios

Data for graphs may come from base relations, models, or be included directly in a read query. In most scenarios, however, data is loaded into a base relation. The graph module and the instantiated library module are then persisted to a model, and act as views over the underlying source data. This architecture offers two key advantages:

Re-usable relations with better performance, thanks to pre-computed data.
Automatic updates to the graph when base data are altered.

In this section, you’ll learn two ways to create and persist unlabeled graphs to a database starting from semantically rich data: from a CSV file and from a knowledge graph.

Create a Graph From a CSV File

Graphs may be created from data in a CSV file by loading the CSV data into a base relation and declaring an edge set and a graph in a model. CSV files may be uploaded to the RAI Console or hosted with a cloud storage provider. See Data I/O: CSV Import for more details about loading CSV files.

Typically, CSV files are loaded from the cloud. For example, the following write query loads a CSV file from Azure. The CSV file has two columns, node_from and node_to, which contain data about the edges in a directed graph. The data from the CSV file is inserted into a base relation called graph_csv:

// write query
 
// Declare a configuration module with the path to the CSV file and the schema for the columns.
module config
    def path = "https://raidocs.blob.core.windows.net/datasets/power-grid/opsahl-powergrid.csv"
    def schema = {(:node_from, "int"); (:node_to, "int")}
end
 
// Load the CSV file and insert the data into a base relation called graph_csv.
def insert:graph_csv = load_csv[config]

💡

CSV data is loaded in Graph Normal Form. Each value in the CSV file is represented by a tuple of the form (:column_name, row_number, value). See Working With CSV Data for more information.

In order to create an edge relation from the CSV data, you must write some Rel that matches each node in the node_from column with the node in the node_to column on the same row. For the best performance, the edge relation, the graph constructed from it, and the instantiated rel:graphlib module are all persisted to a model:

// model
 
// Declare an edge set containing pairs of nodes (u, v) that appear in the same row
// of the node_from and node_to columns of graph_csv.
def my_edges(u, v) = {
    exists(row in FilePos:
        graph_csv(:node_from, row, u)
        and graph_csv(:node_to, row, v)
    )
}
 
// Build a directed graph by passing the edge set to the directed_graph constructor module.
def my_graph = directed_graph[my_edges]
 
// Instantiate rel:graphlib on my_graph.
@inline
def my_graphlib = rel:graphlib[my_graph]

The relations my_graph and my_graphlib can now be re-used in other queries:

// read query
 
// Display the number of nodes and edges in my_graph.
def output = my_graphlib:num_nodes, my_graphlib:num_edges

For information on loading data for a large graph that is split among several CSV files, see Loading Multiple Files in the Same Relation.

Create a Graph From a Knowledge Graph

The Graph Library can be used to perform graph analytics on a relational knowledge graph. To do so, build an edge set by selecting all or some portion of edges in the knowledge graph and apply one of the graph constructors to the edge set.

⚠

This section assumes that you have some familiarity with knowledge graphs. See My First Knowledge Graph and Elements of a Relational Knowledge Graph for details.

For example, the following knowledge graph contains people and the products they have purchased:

// model
 
module MyKG
    // The Person entity node type.
    entity type Person = String
    def Person = {
        ^Person["Ava"];
        ^Person["Barnard"];
        ^Person["Cynthia"]
    }
 
    // The Product entity node type.
    entity type Product = String
    def Product = {
        ^Product["Laptop"];
        ^Product["Toaster Oven"];
        ^Product["Toothbrush"];
        ^Product["Smartphone"]
    }
 
    // The PurchaseDate value node type.
    value type PurchaseDate = Date
 
    // Property edge that maps people to the products they have purchased.
    def purchased = {
        (^Person["Ava"], ^Product["Laptop"]);
        (^Person["Barnard"], ^Product["Smartphone"]);
        (^Person["Barnard"], ^Product["Toaster Oven"]);
        (^Person["Cynthia"], ^Product["Toothbrush"])
    }
 
    // Property edge that maps people to the products they have saved.
    def saved = {
        (^Person["Ava"], ^Product["Laptop"]);
        (^Person["Cynthia"], ^Product["Laptop"])
    }
 
    // Property hyperedge with triples that describe who bought what product and when.
    def purchased_by_on = {
        (^Product["Laptop"], ^Person["Ava"], ^PurchaseDate[2023-01-03]);
        (^Product["Smartphone"], ^Person["Barnard"], ^PurchaseDate[2023-03-12]);
        (^Product["Toaster Oven"], ^Person["Barnard"], ^PurchaseDate[2023-03-12]);
        (^Product["Toothbrush"], ^Person["Cynthia"], ^PurchaseDate[2023-02-04])
    }
end

🔎

Graphs built from knowledge graphs usually contain non-integer nodes. For the best performance, convert the nodes to integers before creating the graph. See Performance Tips for more information.

The following sections describe common ways to build edge sets for an unlabeled graph from edges in a knowledge graph.

From a Property Edge

To create a graph with the same nodes and edges as a binary property edge in a knowledge graph, apply a graph constructor directly to the property edge relation. For instance, the following model creates a directed graph of people and the products they have purchased using MyKG:purchased as the edge set:

// model
 
// Construct a directed graph with nodes and edges from MyKG:purchased.
def purchase_graph = directed_graph[MyKG:purchased]
 
// Instantiate rel:graphlib on purchase_graph.
@inline
def purchase_graph_lib = rel:graphlib[purchase_graph]

There are seven nodes — one for each of the three people and four products in the knowledge graph:

// read query
 
def output = purchase_graph:node

The nodes of the graph are hashes created from the ^Person and ^Product entity type constructors. In practice, it is best to assign nodes in the knowledge graph to integers before applying a graph constructor to the edge set.
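
One possible way to do this, sketched below under stated assumptions, is to number the nodes with the Standard Library's sort relation and rewrite each edge in terms of those numbers. The names purchase_nodes, node_index, purchase_edges_int, and purchase_graph_int are hypothetical, and the sketch assumes that sort can order the entity values in MyKG. The Performance Tips page may recommend a different approach:

```rel
// model

// Collect every node that appears at either end of an edge in MyKG:purchased.
def purchase_nodes = {first[MyKG:purchased]; last[MyKG:purchased]}

// Pair each node with an integer position: (index, node).
// Assumption: sort is able to order the entity values.
def node_index = sort[purchase_nodes]

// Rewrite each edge (u, v) as the pair of integer positions (i, j).
def purchase_edges_int(i, j) {
    exists(u, v:
        MyKG:purchased(u, v)
        and node_index(i, u)
        and node_index(j, v)
    )
}

// Build a directed graph whose nodes are integers.
def purchase_graph_int = directed_graph[purchase_edges_int]
```

Keeping node_index in the model makes it possible to translate integer results from the Graph Library back to the original entities.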

From Multiple Property Edges

You may build edge sets from the union of multiple property edges in the knowledge graph. The following model builds a graph of people and products that they have either purchased or saved:

// model
 
// Edges for the graph are the union of the purchased and saved property edges in MyKG.
def purchased_and_saved_edges = {MyKG:purchased; MyKG:saved}
 
// Construct a directed graph from purchased_and_saved_edges.
def purchased_and_saved_graph = directed_graph[purchased_and_saved_edges]

Note that even if the same edge appears in both MyKG:purchased and MyKG:saved, it only appears once in purchased_and_saved_graph:edge. The graph has five edges, not six:

// read query
 
// Display the edges of purchased_and_saved_graph.
def output = purchased_and_saved_graph:edge
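
To check the deduplication, a read query along the following lines counts the tuples in each property edge and in their union. It is a sketch that assumes the MyKG and purchased_and_saved_edges definitions above are installed. With the example data, the three counts should be 4, 2, and 5, because the edge from Ava to Laptop appears in both relations:

```rel
// read query

// Tuples in purchased, tuples in saved, and tuples in their union.
def output = count[MyKG:purchased], count[MyKG:saved], count[purchased_and_saved_edges]
```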

From a Hyperedge

Although the Graph Library does not support graphs with hyperedges, you can still build graphs from hyperedges in a knowledge graph — such as the MyKG:purchased_by_on hyperedge relation in the preceding example — by selecting pairs of nodes from each tuple.

For instance, the following query creates a graph of products and their purchase dates:

// model
 
// Build an edge set by selecting products and dates from the MyKG:purchased_by_on hyperedge.
def purchased_on_edges(product, date) { MyKG:purchased_by_on(product, _, date) }
 
// Construct a directed graph with nodes and edges from purchased_on_edges.
def purchased_on_graph = directed_graph[purchased_on_edges]

Note that purchased_on_graph has nodes with mixed data types. Some are entities and others are value types:

// read query
 
def output = purchased_on_graph:node

Graphs created from knowledge graphs often contain nodes with mixed data types. Assigning nodes to integers before applying a graph constructor to the edge set avoids mixed types and offers better performance.
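
The same pattern applies to any pair of positions in the hyperedge. As an illustrative sketch, the following model selects people and purchase dates instead of products and dates. The names person_date_edges and person_date_graph are hypothetical. Note that projecting out a position can merge tuples: Barnard bought two products on the same date, so the four hyperedge tuples yield only three edges:

```rel
// model

// Build an edge set by selecting people and dates from the MyKG:purchased_by_on hyperedge.
// The product position is ignored with the _ wildcard.
def person_date_edges(person, date) { MyKG:purchased_by_on(_, person, date) }

// Construct a directed graph with nodes and edges from person_date_edges.
def person_date_graph = directed_graph[person_date_edges]
```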
