Data Oriented

Language-integrated queries specify the logic in SQL-like syntax to process data and events. They are easy to write and understand due to the simplicity of the syntax. See how Ballerina provides first-class support for writing queries that process data.

As of now, language-integrated queries are supported for iterator implementations such as an array, map, stream, and table. There are two kinds of integrated queries that can be written in Ballerina:

Query Expressions

Query expressions allow you to generate a list, stream, table, string, or XML. The sections below look at the clauses you can use in a query expression.

Query Clauses

Query expressions contain a set of clauses similar to SQL to process the data. They must start with the from clause and can perform various operations such as filter, join, sort, limit, and projection. There are various SQL-like clauses below to perform these operations.

‘from’ Clause

The from clause is used to define the input iterator source that is considered for processing the data. Similar to the for each statement, you can specify any iterator as the expression. Similar to a list and stream, you can use all kinds of iterators as the input in the from clause.

‘where’ Clause

The where clause allows you to filter by condition. You can define any conditional expression, which returns a boolean. A where clause contains logical operators, equality, and range checks.

‘let’ Clause

The let clause allows you to define variables that can be used only within the query expression scope. These variables cannot be accessed out of the query expression. You can define one or more variables in the let clause and use them within query expressions.

‘join’ Clause

A join clause performs an inner or left outer equijoin. In the join clause, there are two input iterators. During the joining process, each value of an iterator is matched against all the values in the other iterator based on the given condition, and the output values are generated for all the matching event pairs. Here, you can only perform equality checks as the joining condition.

‘order by’ Clause

The order by clause allows ordering the result in the ascending and/or descending order based on the specified attributes. Ordering will be done in an ascending manner by default. You can use the descendingthe keyword to order in a descending manner. Here, attributes that are considered for the order by operations are order-keys, which should be an ordered type. You can define more than one order key in the order by clause and it’s possible to have more than one order by clause.

‘limit’ Clause

The limit clause limits the number of frames/values emitted by a query pipeline. You should define an integer value to specify the number of output values.

‘select’ Clause

The select clause is a mandatory clause in query expressions that is used for projection. You can use this clause to create the values required to generate iterators such as list, table, XML, string, and stream.

‘on conflict’ Clause

An on conflict clause is only allowed for a query expression that constructs a table with a key sequence. The expression is evaluated when the select clause emits a value that conflicts with a previous value, in the sense, that both values have the same key value in the table. The on conflict clause gets executed when the select clause emits a row that has the same key as a row that it emitted earlier. It gives an onConflictError error if there is a key conflict.

Query Actions

Query actions are executed in the same way as the clauses in the query expression. However, it doesn’t generate an output such as a list. Rather, it executes a set of statements defined by you. The block inside the do clause is executed in each iteration.

Query Action Example

var x = from var person in personList
        from var dept in deptList
        let string hrDepartment = "Human Resource"
        do {
            if ( == "HR") {
                Employee employee = {firstName: person.firstName, lastName: person.lastName, deptAccess: hrDepartment};
                employeeList[employeeList.length()] = employee;

Writing Integrated Queries

The example below provides in-depth knowledge on how to utilize the capabilities of the Ballerina query expressions. This example explains the use case of finding popular books in a store.

import ballerina/io;
type Author record {|
   readonly int id;
   string name;
type Category record {|
   readonly int id;
   string name;
type Book record {|
   readonly int id;
   string title;
   int year;
   float price;
   Author[] authors;
   Category[] categories;
type Sale record {|
   int bookId;
   int qty;
type AuthorTable table<Author> key(id);
type CategoryTable table<Category> key(id);
type BookTable table<Book> key(id);
error onConflictError = error("Key Conflict", message = "record with same key exists.");

In the above code, the necessary custom types and variables are created to perform the data manipulation.

function loadAuthors() returns AuthorTable|error {
   json[] authors = [
       {"id": 1, "name": "Giada De Laurentiis"},
       {"id": 2, "name": "J. K. Rowling"},
       {"id": 3, "name": "Henrique C. M. Andrade"},
       {"id": 4, "name": "Buğra Gedik"},
       {"id": 5, "name": "Deepak S. Turaga"}
   // Iterates through `authors` JSON array, constructs `Author` records, and collects them into a table.
   return table key(id) from var author in authors
       select check author.cloneWithType(Author)
       on conflict onConflictError;

Here, the author details are represented as JSON elements. In this method, the respective author JSON elements are converted into tabular data using Ballerina query expressions for further processing.

function loadCategories() returns CategoryTable|error {
   xml categories = xml `<categories>
   // Iterates through `categories` xml array, constructs `Category` records, and collects them into a table.
   return table key(id) from var category in categories/<category>
       select {
           id: check 'int:fromString((category/**/<id>/*).toString()),
           name: (category/**/<name>/*).toString()
       on conflict onConflictError;

The above method iterates through a categories XML array and constructs a table with the category ID and name. Query expression clauses such as from, select, and on conflict are used in it.

function getCategories(CategoryTable categories, int[] ids) returns Category[] {
   // Query `categories` for each id, and collects matching categories into an array.
   return from int id in ids
       join Category c in categories on id equals
       select c;

The above method returns the book category names according to the given category IDs. As written in the query expression, a join clause iterates through the categories array to find the respective category name according to the ID.

function getAuthors(AuthorTable authors, int[] ids) returns Author[] {
   // Query `authors` for each id, and collects matching authors into an array.
   return from int id in ids
       join Author a in authors on id equals
       select a;

Here, the author names are identified according to the author IDs. As similar to the previous method, query joins are utilized to create the authors array.

function loadBooks() returns BookTable|error {
   map<anydata>[] books = [
       {"id": 1, "title": "Everyday Italian", "year": 2005, "price": 30.00, "authors": [1], "categories": [1]},
       {"id": 2, "title": "Harry Potter", "year": 1997, "price": 29.99, "authors": [2], "categories": [2, 3]},
       {"id": 3, "title": "Fundamentals of Stream Processing", "year": 2014, "price": 123.99, "authors": [3, 4, 5], "categories": [4]},
       {"id": 4, "title": "Fantastic Beasts", "year": 2001, "price": 29.99, "authors": [2], "categories": [2, 3]}
   // Iterates through `books` map<anydata> array, constructs `Book` records, and collects them into a table.
   return table key(id) from var book in books
       let AuthorTable authors = check loadAuthors()
       let CategoryTable categories = check loadCategories()
       select {
           id: <int>book["id"],
           title: <string>book["title"],
           year: <int>book["year"],
           price: <float>book["price"],
           authors: getAuthors(authors, check book["authors"].cloneWithType()),
           categories: getCategories(categories, check book["categories"].cloneWithType())
       on conflict onConflictError;

Now, the author details and book category details exist as individual table values. The book details exist as an array. Further, this array has the necessary keys to identify the respective authors and book categories. Hence, there is a requirement to generate a table that contains thenrelevant book details, author names, and category names. As shown in the above example, Ballerina query expressions can be utilized to create such table values.

function getPopularBooks(stream<Sale> sales, int minSales, int 'limit) returns Book[]|error {
   // Join each sale value from `sales` stream with retrieved `BookTable`
   // filter books with at least `minSales` number of sales,
   // order filtered books by sales quantity,
   // and collect `'limit` number of books into an array.
   return from Sale s in sales
       join Book b in check loadBooks() on s.bookId equals
       where s.qty >= minSales
       order by s.qty descending
       limit 'limit
       select b;

Now, you have the book details as a table value and there is a requirement to identify the popular books based on the number of sales copies. As shown in the above example, you have to combine the sales array and the books table to identify the popular books. In this example, Ballerina query clauses such as join, where, order by, and limit are utilized to cater to the requirement.

public function main() returns error? {
   Sale[] sales = [
       {bookId: 1, qty: 150},
       {bookId: 2, qty: 3500},
       {bookId: 3, qty: 1250},
       {bookId: 4, qty: 2100}
   // Retrieve 2 most popular books with at least 1000 sales.
   Book[]|error mostPopular2Books = getPopularBooks(sales.toStream(), 1000, 2);

The main method is responsible for identifying the popular books according to the number of copies sold. Hence, the respective getPopularBooks method gets called with the required parameters.