Query composition

Review

In the previous sections we’ve:

  1. Setup TypeDB and Studio (or Console)

  2. Created a schema

    1. Created a database called get_started.

    2. Defined the social network schema:

      Details
      #!test[schema]
      define
      attribute username,
       value string;
      attribute start-date,
       value datetime;
      attribute status,
       value string;
      attribute student-count,
       value integer;
      entity user,
        owns status @card(0..),
        owns username @key,
        plays enrolment:student,
        plays friendship:friend;
      entity organization,
        owns status,
        owns username,
        plays friendship:friend;
      entity company,
        sub organization;
      entity university,
        sub organization,
        owns student-count,
        plays enrolment:university;
      relation friendship,
        relates friend @card(2..2),
        owns start-date @card(0..1);
      relation enrolment,
        relates student,
        relates university;
  3. Loaded some data using a insert and match-insert queries:

Details
#!test[write]
insert $u0 isa user, has username "alice";
#!test[write]
insert
$u1 isa user, has username "bob";
$u2 isa user;
$u2 has username "charlene";  # can split `isa` and `has` across different statements
#!test[write]
insert
$u3 isa user, has username "delta";
$u4 isa user, has username "echo";
$u5 isa user, has username "fizzbuzz";
friendship (friend: $u3, friend: $u4);
friendship (friend: $u3, friend: $u5);
#!test[write, count=1]
match
$x isa user, has username "alice";
$y isa user, has username "bob";
insert
friendship (friend: $x, friend: $y), has start-date 2015-01-01T00:00:00;

Overview

In this page, we’ll look at another useful query operation, reduce, and how to compose small queries into larger query pipelines by chaining clauses together. Query pipelines are extremely powerful, and can be to express much more interesting and complex logic in single queries.

We’ll then introduce functions, which are another way to compose modular and reusable pieces of queries together.

Finally, we’ll cover the fetch clause, which lets you format your data as JSON and look up additional information before returning results.

Reduce

The reduce operation works by collapsing the answers provided by a query using an aggregation operation, such as count or sum.

We call reduce an operator rather than a clause, since it does not read or write to the database state - it only transforms the answers it receives.

  • Studio

  • Console

Using a read transaction, let’s count the number of users in the database:

match
$user isa user;
reduce $count = count;

We see the following result:

------------
$count | 6
------------

Notice that the user variable $user has been consumed by reduce, and a new variable $count has been created.

Reductions can also use groupby to aggregate within groups:

match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

Returning:

------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000005
$username    | isa username "fizzbuzz"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000001
$username    | isa username "bob"
------------------
$num-friends | 2
$user        | isa user, iid 0x1e00000000000000000003
$username    | isa username "delta"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000000
$username    | isa username "alice"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000004
$username    | isa username "echo"
------------------

Here we see the number of friends for each user, grouped by unique pairs of (user, username). Of course, we know that each user has a unique username, so we don’t actually have to keep both!

Notice that the variables in the grouping are not consumed by reduce and are returned in the results.

Open a read transaction with transaction read get-started to count the number of users in the database:

#!test[read, count=1]
match
$user isa user;
reduce $count = count;

We see the following result:

------------
$count | 6
------------

Notice that the user variable $user has been consumed by reduce, and a new variable $count has been created.

Reductions can also use groupby to aggregate within groups:

#!test[read, count=5]
match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

Returning:

------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000005
$username    | isa username "fizzbuzz"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000001
$username    | isa username "bob"
------------------
$num-friends | 2
$user        | isa user, iid 0x1e00000000000000000003
$username    | isa username "delta"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000000
$username    | isa username "alice"
------------------
$num-friends | 1
$user        | isa user, iid 0x1e00000000000000000004
$username    | isa username "echo"
------------------

Here we see the number of friends for each user, grouped by unique pairs of (user, username). Of course, we know that each user has a unique username, so we don’t actually have to keep both!

Notice that the variables in the grouping are not consumed by reduce and are returned in the results.

If this feels familiar, it’s because you’ve likely encountered list, iterator, or stream reduce operations in programming languages. Keep that idea in mind as we move forward.

Chaining clauses into pipelines

Let’s say we’d like to write a query that does the following:

Make every user friends with the most popular user, if they aren’t already friends.

Conceptually, this requires three steps:

  1. Find the most popular user

  2. Find all users that aren’t friends with the most popular user

  3. Make them friends with the most popular user

We can build this query step by step by chaining together query pieces into an extended query pipeline.

Let’s start with the first step:

  • Studio

  • Console

We’ll build on top of the reduce query we wrote previously. The most popular user is the one with the highest number of friends, so we’ll sort the previous reduce results by number of friends, and limit to the first answer.

We can do that with two new operators: sort and limit.

sort and limit are stream operators like reduce, since they manipulate answer stream provided by the previous clause. They are probably also familiar to you since most query languages have some sort of Iterator or Stream construct, which you can sort and limit! So you can see that TypeQL queries are really similar to Streams in programming languages.

We generally say that we construct query stream pipelines using query stages, and each stage is either a clause (if it reads or writes to the database), or a stream operator (if it only manipulates the stream).

A good mental model of TypeQL queries is that any stage produces a stream of answers. Any stage can be chained onto any other kind of stage to manipulate the stream, or read and write to the database.

This is a very powerful idea for composing queries, as we’ll see next! Let’s go back to our example:

match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;
sort $num-friends desc;
limit 1;

Which returns:

------------------
$num-friends | 2
$user        | isa user, iid 0x1e00000000000000000003
$username    | isa username "delta"
------------------

With this query pipeline, we’ve identified the most popular user using a match clause, and the reduce, sort, and limit operators!

We’ll build on top of the reduce query we wrote previously. The most popular user is the one with the highest number of friends, so we’ll sort the previous reduce results by number of friends, and limit to the first answer.

We can do that with two new operators: sort and limit.

sort and limit are stream operators like reduce, since they manipulate answer stream provided by the previous clause. They are probably also familiar to you since most query languages have some sort of Iterator or Stream construct, which you can sort and limit! So you can see that TypeQL queries are really similar to Streams in programming languages.

We generally say that we construct query stream pipelines using query stages, and each stage is either a clause (if it reads or writes to the database), or a stream operator (if it only manipulates the stream).

A good mental model of TypeQL queries is that any stage produces a stream of answers. Any stage can be chained onto any other kind of stage to manipulate the stream, or read and write to the database.

This is a very powerful idea for composing queries, as we’ll see next! Let’s go back to our example:

#!test[read, count=1]
match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;
sort $num-friends desc;
limit 1;

Which returns:

------------------
$num-friends | 2
$user        | isa user, iid 0x1e00000000000000000003
$username    | isa username "delta"
------------------

With this query pipeline, we’ve identified the most popular user using a match clause, and the reduce, sort, and limit operators!

We’ll first simplify a bit by eliminating the num-friends value from each answer to avoid confusion. This is achieved by using select to include only the user and username answers.

Next we need to find users that aren’t friends with this user. For this we’ll use another match clause, which can use answers provided by the previous stage:

  • Studio

  • Console

We need not statements to filter out users that are already friends with the most popular user, and also to eliminate the most-popular user from appearing again:

match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

sort $num-friends desc;
limit 1;
select $user, $username;

match
$other-user isa user, has username $other-username;
not { $user is $other-user; };
not { friendship ($user, $other-user); };

This returns:

---------------------
$other-user     | isa user, iid 0x1e00000000000000000000
$other-username | isa username "alice"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------
$other-user     | isa user, iid 0x1e00000000000000000001
$other-username | isa username "bob"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------
$other-user     | isa user, iid 0x1e00000000000000000002
$other-username | isa username "charlene"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------

At the end of this pipeline, we have a stream where each answer contains the most popular user and a different user who isn’t friends with them yet.

We need not statements to filter out users that are already friends with the most popular user, and also to eliminate the most-popular user from appearing again:

#!test[read, count=3]
match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

sort $num-friends desc;
limit 1;
select $user, $username;

match
$other-user isa user, has username $other-username;
not { $user is $other-user; };
not { friendship ($user, $other-user); };

This returns:

---------------------
$other-user     | isa user, iid 0x1e00000000000000000000
$other-username | isa username "alice"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------
$other-user     | isa user, iid 0x1e00000000000000000001
$other-username | isa username "bob"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------
$other-user     | isa user, iid 0x1e00000000000000000002
$other-username | isa username "charlene"
$user           | isa user, iid 0x1e00000000000000000003
$username       | isa username "delta"
---------------------

At the end of this pipeline, we have a stream where each answer contains the most popular user and a different user who isn’t friends with them yet.

Let’s stack one more insert clause to create the desired friendships. Note that this means this query now is a write query, and requires a write transaction.

  • Studio

  • Console

match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

sort $num-friends desc;
limit 1;
select $user, $username;

match
$other-user isa user, has username $other-username;
not { $user is $other-user; };
not { friendship ($user, $other-user); };

insert
friendship ($other-user, $user);

After you execute the query once, every user will be friends with the most popular user.

#!test[write, count=3]
match
$user isa user, has username $username;
friendship ($user, $friend);
reduce $num-friends = count groupby $user, $username;

sort $num-friends desc;
limit 1;
select $user, $username;

match
$other-user isa user, has username $other-username;
not { $user is $other-user; };
not { friendship ($user, $other-user); };

insert
friendship ($other-user, $user);

After you execute the query once, every user will be friends with the most popular user.

Fantastic! We’ve written a single query that identifies the most popular user, finds all users who aren’t friends with them, and creates the necessary friendships.

Pipelines are a very powerful tool for composing queries. They help eliminate round trips to the server, since even complex operations can be performed within the database.

They are also easy to read and understand, since the data flow is explicit in the order of the query stages. At the same time, query pipelines are very easy to debug: truncate your pipeline at any point and inspect intermediate results by executing the truncated query.

Functions

In the query we just wrote, we might want to reuse the "find the most popular user" part of the pipeline.

We can do this by declaring a function, which can be committed to the database as part of the schema.

Note that functions can contain full query pipelines, but cannot have side effects (e.g., they must be read-only).

  • Studio

  • Console

define

# note: if there are multiple "most popular users", tiebreak on the username
fun most_popular_user() -> user, username:
  match
  $user isa user, has username $username;
  friendship ($user, $friend);
  reduce $num-friends = count groupby $user, $username;
  sort $num-friends desc, $username asc;
  return first $user, $username;

This function returns the first user and username using the same logic as the long pipeline we just wrote, with a tiebreaker on the username if there are multiple "most popular users".

Don’t forget to commit so the function becomes available outside the transaction.

#!test[schema]
define

# note: if there are multiple "most popular users", tiebreak on the username
fun most_popular_user() -> user, username:
  match
  $user isa user, has username $username;
  friendship ($user, $friend);
  reduce $num-friends = count groupby $user, $username;
  sort $num-friends desc, $username asc;
  return first $user, $username;

This function returns the first user and username using the same logic as the long pipeline we just wrote, with a tiebreaker on the username if there are multiple "most popular users".

Don’t forget to commit so the function becomes available outside the transaction.

Now we can invoke the function in a pipeline.

Let’s try to find the users who are friends with our popular user:

  • Studio

  • Console

match
let $most-popular-user, $most-popular-username = most_popular_user();
$other-user isa user, has username $other-username;
not { $most-popular-user is $other-user; };
friendship ($other-user, $most-popular-user);

This query is now much shorter, since we’ve reused the logic from the function to find the most popular user!

We correctly find every user is friends with the user delta:

----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000004
$other-username        | isa username "echo"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000005
$other-username        | isa username "fizzbuzz"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000000
$other-username        | isa username "alice"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000001
$other-username        | isa username "bob"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000002
$other-username        | isa username "charlene"
----------------------------
#!test[read, count=5]
match
let $most-popular-user, $most-popular-username = most_popular_user();
$other-user isa user, has username $other-username;
not { $most-popular-user is $other-user; };
friendship ($other-user, $most-popular-user);

This query is now much shorter, since we’ve reused the logic from the function to find the most popular user!

We correctly find every user is friends with the user delta:

----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000004
$other-username        | isa username "echo"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000005
$other-username        | isa username "fizzbuzz"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000000
$other-username        | isa username "alice"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000001
$other-username        | isa username "bob"
----------------------------
$most-popular-user     | isa user, iid 0x1e00000000000000000003
$most-popular-username | isa username "delta"
$other-user            | isa user, iid 0x1e00000000000000000002
$other-username        | isa username "charlene"
----------------------------

Congratulations! You’re now ready to write sophisticated, modular queries.

We’ll cover one last incredibly useful feature: formatting TypeDB answers as JSON and incorporating additional data into JSON responses.

Fetch

TypeQL features one special clause: fetch. This clause can only be used at the end of a pipeline.

Fetch converts a standard TypeDB answer into a JSON document. You can use it to organize data into a specific structure, look up additional attributes, and even execute subqueries and nest their answers.

The fetch clause structure is designed to look exactly like the JSON document it will return, let’s take a look:

  • Studio

  • Console

match
$user isa user;
limit 3;
fetch {
  # get the username if it exists - if not, the JSON value will be set to null
  "username": $user.username,

  # run a subquery, and collect the answers into a list value
  "friends": [
    match
    $other-user isa user;
    friendship ($user, $other-user);
    fetch {
      "friend": $other-user.username
    };
  ]
};

This query now returns a sequence of JSON documents:

{
  "username": "alice",
  "friends": [
    {
      "friend": "bob"
    },
    {
      "friend": "delta"
    }
  ]
}
{
  "username": "bob",
  "friends": [
    {
      "friend": "alice"
    },
    {
      "friend": "delta"
    }
  ]
}
{
  "username": "charlene",
  "friends": [
    {
      "friend": "delta"
    }
  ]
}
#!test[read, documents]
match
$user isa user;
limit 3;
fetch {
  # get the username if it exists - if not, the JSON value will be set to null
  "username": $user.username,

  # run a subquery, and collect the answers into a list value
  "friends": [
    match
    $other-user isa user;
    friendship ($user, $other-user);
    fetch {
      "friend": $other-user.username
    };
  ]
};

This query now returns a sequence of JSON documents:

{
  "username": "alice",
  "friends": [
    {
      "friend": "bob"
    },
    {
      "friend": "delta"
    }
  ]
}
{
  "username": "bob",
  "friends": [
    {
      "friend": "alice"
    },
    {
      "friend": "delta"
    }
  ]
}
{
  "username": "charlene",
  "friends": [
    {
      "friend": "delta"
    }
  ]
}

The fetch clause lets you lay out the exact JSON document you need for your application, making it easy to retrieve additional data such as optional attributes or answers from subqueries.

What next?

There is much more to explore in the world of TypeDB, including building custom query logic using disjunctions and negations, the deep exploration of data connection using functional recursion, and more.

This is just the beginning of your journey. If you have any questions, don’t hesitate to reach out!