EXPERIMENTAL: These features are experimental and should not be used in production systems.
Graph Library: Creating Graphs
Goal
This guide demonstrates how to use the Graph Library's graph constructor modules, load graphs from external data sources, such as a CSV file, and create graphs from knowledge graphs stored in RelationalAI's Relational Knowledge Graph System.
Introduction
Graphs are modules containing node, edge, and is_directed relations. For convenience, the Graph Library provides graph constructors that build modules with the correct schema. See Graphs: Schema for details.
Edge sets are usually built from base relations in a database. In most cases, edge sets and graphs are persisted to a model so that they can be maintained separately from and reused in other queries. This offers performance benefits and enables automatic updates to the graph when data are inserted into or deleted from the base relations from which the graph is built.
Graph Constructors
The Graph Library provides two graph constructors:
- undirected_graph for creating undirected graphs.
- directed_graph for creating directed graphs.
Constructors are template modules. They are always instantiated with an edge set provided as an argument. The result is a module containing the appropriate node, edge, and is_directed relations.
Undirected Graphs
To create an undirected graph, apply the undirected_graph constructor module to a binary edge relation:
// read query
// Declare an edge set.
def my_edges = {(1, 2); (2, 3)}
// Create an undirected graph with edges from my_edges and display the contents.
def my_undirected_graph = undirected_graph[my_edges]
def output = my_undirected_graph
Note that is_directed does not appear in the graph module since the graph is undirected. Edges in an undirected graph represent a bidirectional relationship between nodes. In Rel, edge sets of undirected graphs are modeled as symmetric relations. In other words, if (x, y) is a tuple in the graph's edge set, then so is (y, x). This is why my_undirected_graph:edge contains four tuples: (1, 2), (2, 1), (2, 3), and (3, 2). See Undirected Edges for more information.
Always use the num_edges relation to count the edges in a graph. Using count[my_graph:edge] may double count edges in undirected graphs. See Count Nodes and Edges for details.
The relation my_undirected_graph is equivalent to the following module declaration:
// read query
// Declare an undirected graph using module syntax.
module my_undirected_graph
    def node = {1; 2; 3}
    def edge = {(1, 2); (2, 3)}
    def edge = transpose[edge] // Ensure edge is symmetric.
end
The rule def edge = transpose[edge] uses the Standard Library's transpose relation to ensure that edge includes the reverse of each tuple on the previous line.
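As an illustration only, the symmetric edge relation can be modeled in plain Python. This is a sketch of the semantics described above, not the Graph Library's implementation; the names undirected_graph_model and count_undirected_edges are hypothetical helpers introduced for this example.

```python
def undirected_graph_model(edges):
    """Plain-Python model of an undirected graph module (illustration only)."""
    edges = set(edges)
    # Mirror every tuple so the edge relation is symmetric, as transpose[edge] does.
    symmetric = edges | {(v, u) for (u, v) in edges}
    # Every endpoint of an edge is a node.
    nodes = {n for edge in symmetric for n in edge}
    return {"node": nodes, "edge": symmetric}


def count_undirected_edges(graph):
    """Count each undirected edge once, however many tuples represent it."""
    return len({frozenset(edge) for edge in graph["edge"]})


graph = undirected_graph_model({(1, 2), (2, 3)})
print(sorted(graph["edge"]))          # [(1, 2), (2, 1), (2, 3), (3, 2)]
print(len(graph["edge"]))             # 4 tuples
print(count_undirected_edges(graph))  # 2 edges
```

Counting tuples gives four while counting edges gives two, which is the discrepancy that num_edges is meant to avoid.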
Directed Graphs
To create a directed graph, apply the directed_graph constructor module to a binary edge relation:
// read query
// Declare an edge set.
def my_edges = {(1, 2); (2, 3)}
// Create a directed graph with edges from `my_edges` and display the contents.
def my_directed_graph = directed_graph[my_edges]
def output = my_directed_graph
In a directed graph, edges indicate a unidirectional relationship between nodes. Unlike in an undirected graph, each edge in a directed graph corresponds to a single tuple in the edge set.
The relation my_directed_graph is equivalent to the following module declaration:
// read query
// Declare a directed graph using module syntax.
module my_directed_graph
    def node = {1; 2; 3}
    def edge = {(1, 2); (2, 3)}
    def is_directed = true
end
The relation true is equivalent to {()}, the relation containing a single empty tuple. See Boolean Constant Relations for more information.
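For comparison with the undirected case, here is a minimal Python sketch of the directed graph module. It is a model of the schema described above, not the library's code; directed_graph_model is a hypothetical name, and representing true as a set holding one empty tuple follows the {()} description.

```python
def directed_graph_model(edges):
    """Plain-Python model of a directed graph module (illustration only)."""
    edges = set(edges)
    # Each edge stays a single tuple; no reverse tuples are added.
    nodes = {n for edge in edges for n in edge}
    # Model the relation `true` as the set containing one empty tuple.
    return {"node": nodes, "edge": edges, "is_directed": {()}}


graph = directed_graph_model({(1, 2), (2, 3)})
print(sorted(graph["edge"]))       # [(1, 2), (2, 3)]
print((2, 1) in graph["edge"])     # False
print(len(graph["is_directed"]))   # 1
```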
Common Scenarios
Data for graphs may come from base relations or models, or may be included directly in a read query. In most scenarios, however, data are loaded into a base relation. The graph module and the instantiated library module are then persisted to a model and act as views over the underlying source data. This architecture offers two key advantages:
- Re-usable relations with better performance, thanks to pre-computed data.
- Automatic updates to the graph when base data are altered.
In this section, you'll learn two ways to create and persist unlabeled graphs to a database starting from semantically rich data, such as a CSV file or a knowledge graph.
Create a Graph From a CSV File
Graphs may be created from data in a CSV file by loading the CSV data into a base relation and declaring an edge set and a graph in a model. CSV files may be uploaded to the RAI Console or hosted with a cloud storage provider. See Data I/O: CSV Import for more details about loading CSV files.
Typically, CSV files are loaded from the cloud. For example, the following write query loads a CSV file from Azure. The CSV file has two columns, node_from and node_to, which contain data about the edges in a directed graph. The data from the CSV file is inserted into a base relation called graph_csv:
// write query
// Declare a configuration module with the path to the CSV file and the schema for the columns.
module config
    def path = "https://raidocs.blob.core.windows.net/datasets/power-grid/opsahl-powergrid.csv"
    def schema = {(:node_from, "int"); (:node_to, "int")}
end
// Load the CSV file and insert the data into a base relation called graph_csv.
def insert:graph_csv = load_csv[config]
CSV data is loaded in Graph Normal Form. Each value in the CSV file is represented by a tuple of the form (:column_name, row_number, value). See Working With CSV Data for more information.
To create an edge relation from the CSV data, write a Rel rule that matches each node in the node_from column with the node in the node_to column on the same row. For the best performance, the edge relation, the graph constructed from it, and the instantiated rel:graphlib module are all persisted to a model:
// model
// Declare an edge set containing pairs of nodes (u, v) that appear in the same row
// of the node_from and node_to columns of graph_csv.
def my_edges(u, v) = {
    exists(row in FilePos:
        graph_csv(:node_from, row, u)
        and graph_csv(:node_to, row, v)
    )
}
// Build a directed graph by passing the edge set to the directed_graph constructor module.
def my_graph = directed_graph[my_edges]
// Instantiate rel:graphlib on my_graph.
@inline
def my_graphlib = rel:graphlib[my_graph]
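The row-matching step in my_edges can be pictured with a small Python model. This is a sketch under stated assumptions: the sample triples are made up to stand in for graph_csv, and edges_from_gnf is a hypothetical helper, not part of Rel or the Graph Library.

```python
# Hypothetical sample standing in for graph_csv: (column_name, row_number, value) triples.
graph_csv = {
    ("node_from", 1, 10), ("node_to", 1, 20),
    ("node_from", 2, 20), ("node_to", 2, 30),
}


def edges_from_gnf(triples):
    """Pair the node_from and node_to values that share a row number."""
    sources = {row: value for (col, row, value) in triples if col == "node_from"}
    targets = {row: value for (col, row, value) in triples if col == "node_to"}
    # Keep only rows that have a value in both columns.
    return {(sources[row], targets[row]) for row in sources.keys() & targets.keys()}


my_edges = edges_from_gnf(graph_csv)
print(sorted(my_edges))  # [(10, 20), (20, 30)]
```

The row number plays the same role as the existentially quantified row variable in the Rel rule: it joins the two columns and is then discarded.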
The relations my_graph and my_graphlib can now be re-used in other queries:
// read query
// Display the number of nodes and edges in my_graph.
def output = my_graphlib:num_nodes, my_graphlib:num_edges
For information on loading data for a large graph that is split among several CSV files, see Loading Multiple Files in the Same Relation.
Create a Graph From a Knowledge Graph
The Graph Library can be used to perform graph analytics on a relational knowledge graph. To do so, build an edge set by selecting all or some portion of edges in the knowledge graph and apply one of the graph constructors to the edge set.
This section assumes that you have some familiarity with knowledge graphs. See My First Knowledge Graph and Elements of a Relational Knowledge Graph for details.
For example, the following knowledge graph contains people and the products they have purchased:
// model
module MyKG
    // The Person entity node type.
    entity type Person = String
    def Person = {
        ^Person["Ava"];
        ^Person["Barnard"];
        ^Person["Cynthia"]
    }
    // The Product entity node type.
    entity type Product = String
    def Product = {
        ^Product["Laptop"];
        ^Product["Toaster Oven"];
        ^Product["Toothbrush"];
        ^Product["Smartphone"]
    }
    // The PurchaseDate value node type.
    value type PurchaseDate = Date
    // Property edge that maps people to the products they have purchased.
    def purchased = {
        (^Person["Ava"], ^Product["Laptop"]);
        (^Person["Barnard"], ^Product["Smartphone"]);
        (^Person["Barnard"], ^Product["Toaster Oven"]);
        (^Person["Cynthia"], ^Product["Toothbrush"])
    }
    // Property edge that maps people to the products they have saved.
    def saved = {
        (^Person["Ava"], ^Product["Laptop"]);
        (^Person["Cynthia"], ^Product["Laptop"])
    }
    // Property hyperedge with triples that describe who bought what product and when.
    def purchased_by_on = {
        (^Product["Laptop"], ^Person["Ava"], ^PurchaseDate[2023-01-03]);
        (^Product["Smartphone"], ^Person["Barnard"], ^PurchaseDate[2023-03-12]);
        (^Product["Toaster Oven"], ^Person["Barnard"], ^PurchaseDate[2023-03-12]);
        (^Product["Toothbrush"], ^Person["Cynthia"], ^PurchaseDate[2023-02-04])
    }
end
Graphs built from knowledge graphs usually contain non-integer nodes. For the best performance, convert the nodes to integers before creating the graph. See Performance Tips for more information.
The following sections describe common ways to build edge sets for an unlabeled graph from edges in a knowledge graph.
From a Property Edge
To create a graph with the same nodes and edges as a binary property edge in a knowledge graph, apply a graph constructor directly to the property edge relation. For instance, the following model creates a directed graph of people and the products they have purchased using MyKG:purchased as the edge set:
// model
// Construct a directed graph with nodes and edges from MyKG:purchased.
def purchase_graph = directed_graph[MyKG:purchased]
// Instantiate rel:graphlib on purchase_graph.
@inline
def purchase_graph_lib = rel:graphlib[purchase_graph]
There are seven nodes, one for each of the three people and four products in the knowledge graph:
// read query
def output = purchase_graph:node
The nodes of the graph are hashes created from the ^Person and ^Product entity type constructors. In practice, it is best to assign nodes in the knowledge graph to integers before applying a graph constructor to the edge set.
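The idea of assigning nodes to integers can be sketched in Python. This only illustrates the mapping step; it is not how Rel or the Graph Library performs the conversion. The helper assign_integer_ids and the (type, name) tuples standing in for entity hashes are assumptions made for this example.

```python
def assign_integer_ids(edges):
    """Give each distinct node an integer ID and rewrite the edges with those IDs."""
    # Sort by repr so the numbering is deterministic even with mixed node types.
    nodes = sorted({n for edge in edges for n in edge}, key=repr)
    node_id = {node: i for i, node in enumerate(nodes, start=1)}
    int_edges = {(node_id[u], node_id[v]) for (u, v) in edges}
    return node_id, int_edges


# Stand-ins for the entity nodes in MyKG:purchased.
purchased = {
    (("Person", "Ava"), ("Product", "Laptop")),
    (("Person", "Barnard"), ("Product", "Smartphone")),
    (("Person", "Barnard"), ("Product", "Toaster Oven")),
    (("Person", "Cynthia"), ("Product", "Toothbrush")),
}

node_id, int_edges = assign_integer_ids(purchased)
print(len(node_id))    # 7 nodes
print(len(int_edges))  # 4 edges
```

Keeping the node_id mapping around makes it possible to translate results on the integer graph back to the original people and products.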
From Multiple Property Edges
You may build edge sets from the union of multiple property edges in the knowledge graph. The following model builds a graph of people and products that they have either purchased or saved:
// model
// Edges for the graph are the union of the purchased and saved property edges in MyKG.
def purchased_and_saved_edges = {MyKG:purchased; MyKG:saved}
// Construct a directed graph from purchased_and_saved_edges.
def purchased_and_saved_graph = directed_graph[purchased_and_saved_edges]
Note that even if the same edge appears in both MyKG:purchased and MyKG:saved, it only appears once in purchased_and_saved_graph:edge.
The graph has five edges, not six:
// read query
// Display the edges of purchased_and_saved_graph.
def output = purchased_and_saved_graph:edge
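The edge count follows from set semantics, which a short Python sketch can make concrete. Plain name strings stand in for the entity values here; that simplification is an assumption for illustration only.

```python
# Stand-ins for MyKG:purchased and MyKG:saved, using plain strings instead of entities.
purchased = {
    ("Ava", "Laptop"),
    ("Barnard", "Smartphone"),
    ("Barnard", "Toaster Oven"),
    ("Cynthia", "Toothbrush"),
}
saved = {
    ("Ava", "Laptop"),
    ("Cynthia", "Laptop"),
}

# A set union keeps one copy of any tuple that appears in both inputs.
purchased_and_saved_edges = purchased | saved
shared = purchased & saved

print(len(purchased), len(saved))      # 4 2
print(sorted(shared))                  # [('Ava', 'Laptop')]
print(len(purchased_and_saved_edges))  # 5
```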
From a Hyperedge
Although the Graph Library does not support graphs with hyperedges, you can still build graphs from hyperedges in a knowledge graph, such as the MyKG:purchased_by_on hyperedge relation in the preceding example, by selecting pairs of nodes from each tuple.
For instance, the following query creates a graph of products and their purchase dates:
// model
// Build an edge set by selecting products and dates from the MyKG:purchased_by_on hyperedge.
def purchased_on_edges(product, date) { MyKG:purchased_by_on(product, _, date) }
// Construct a directed graph with nodes and edges from purchased_on_edges.
def purchased_on_graph = directed_graph[purchased_on_edges]
Note that purchased_on_graph has nodes with mixed data types. Some are entities and others are value types:
// read query
def output = purchased_on_graph:node
Graphs created from knowledge graphs often contain nodes with mixed data types. Assigning nodes to integers before applying a graph constructor to the edge set avoids mixed types and offers better performance.
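Selecting pairs from a hyperedge is a projection, which the following Python sketch models. Strings and datetime.date values stand in for the entity and value types; these stand-ins are assumptions for illustration and not how Rel represents them.

```python
from datetime import date

# Stand-in for MyKG:purchased_by_on: (product, person, purchase date) triples.
purchased_by_on = {
    ("Laptop", "Ava", date(2023, 1, 3)),
    ("Smartphone", "Barnard", date(2023, 3, 12)),
    ("Toaster Oven", "Barnard", date(2023, 3, 12)),
    ("Toothbrush", "Cynthia", date(2023, 2, 4)),
}

# Keep the first and third positions of each triple and drop the person.
purchased_on_edges = {(product, day) for (product, _person, day) in purchased_by_on}

# The node set mixes two kinds of values: product names and dates.
nodes = {n for edge in purchased_on_edges for n in edge}
node_types = {type(n).__name__ for n in nodes}

print(len(purchased_on_edges))  # 4
print(len(nodes))               # 7
print(sorted(node_types))       # ['date', 'str']
```

There are seven nodes rather than eight because two purchases share the same date.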
See Also
See Nodes and Edges to learn more about working with nodes and edges in graphs. See the Overview of Graph Algorithms to explore all of the algorithms implemented in Rel's Graph Library.