Specification: Ballerina SQL Library

Owners: @daneshk @niveathika
Reviewers: @daneshk
Created: 2022/01/13
Updated: 2022/07/07
Edition: Swan Lake

Introduction

This is the specification for the SQL standard library of Ballerina language, which provides the generic interface and functionality to interact with a SQL database.

The SQL library specification has evolved and may continue to evolve in the future. The released versions of the specification can be found under the relevant GitHub tag.

If you have any feedback or suggestions about the library, start a discussion via a GitHub issue or in the Slack channel. Based on the outcome of the discussion, the specification and implementation can be updated. Community feedback is always welcome. Any accepted proposal, which affects the specification is stored under /docs/proposals. Proposals under discussion can be found with the label type/proposal in GitHub.

The conforming implementation of the specification is released and included in the distribution. Any deviation from the specification is considered a bug.

Contents

  1. Overview
  2. Client
    2.1. Handle connection pools
    2.2. Closing the Client
  3. Queries and Values
    3.1. ParameterizedQuery and Values
    3.2. ParameterizedCallQuery and Parameters
    3.3. Query concatenation
  4. Database operations
    4.1. Query
    4.2. Query row
    4.3. Execute
    4.4. Batch execute
    4.5. Call
  5. Metadata operations
    5.1. Schema client
    5.2. Metadata types
    5.3. Metadata methods
  6. Errors

1. Overview

This specification elaborates on the generic Client interface used in ballerina SQL connectors to interface with a relational database such as MySQL, MSSQL, Postgresql and OracleDB.

Client supports five database operations as follows,

  1. Executes the query, which may return multiple results.
  2. Executes the query, which is expected to return at most one row of the result.
  3. Executes the SQL query. Only the metadata of the execution is returned.
  4. Executes the SQL query with multiple sets of parameters in a batch. Only the metadata of the execution is returned.
  5. Executes a SQL query, which calls a stored procedure. This can either return results or nil.

All the above operations make use of ParameterizedQuery object, backtick surrounded string template to pass SQL statements to the database. ParameterizedQuery supports passing of Ballerina basic types or Typed SQL Values such as CharValue, BigIntValue, etc. to indicate parameter types in SQL statements.

2. Client

Each client represents a pool of connections to the database. The pool of connections is maintained throughout the lifetime of the client.

2.1. Handle connection pools

Configuration available for tweaking the connection pool properties:

There are three possible scenarios for connection pool handling,

  1. Global, shareable, default connection pool

    If the poolOptions field is NOT provided when creating the database client, a globally-shareable pool will be created for the database connection unless a connection pool matching with the properties provided already exists. The JDBC module sample below shows how the global connection pool is used.

  2. Client-owned, un-sharable connection pool

    If the connectionPool field is defined inline when creating the database client with the sql:ConnectionPool type, an un-sharable connection pool will be created. The JDBC module sample below shows how the global connection pool is used.

  3. Local, shareable connection pool

    If a record of the sql:ConnectionPool type is created and reused in the configuration of multiple clients, for each set of clients that connects to the same database instance with the same set of properties, a shared connection pool will be created. The JDBC module sample below shows how the global connection pool is used.

2.2. Close the client

Once all the database operations are performed, the client can be closed by invoking the close() operation. This will close the corresponding connection pool if it is not shared by any other database clients.

3. Queries and values

3.1. ParameterizedQuery and values

The sql:ParameterizedQuery is used to construct the SQL query to be executed by the client. It is backtick string template which allows dynamic values for query parameters.

Query with constant values

Query with dynamic values

All primitive Ballerina types are supported for parameter values in sql:ParameterizedQuery. Following mapping is used to map ballerina types to SQL types,

Ballerina typeSQL type
StringVARCHAR
intINTEGER
booleanBOOLEAN
floatREAL
decimalDECIMAL
byte[]BINARY
xmlSQLXML
record {}STRUCT
ArrayARRAY

In addition to the above time module constructs are used to represent SQL Time data types.

time constructsSQL type
time:DateDATE
time:TimeOfDayTIME
time:CivilDATETIME
time:UtcTIMESTAMP

Furthermore, typed values are used to map values to a specific SQL data type such as BIGINT,

List of typed values:

  1. VarcharValue
  2. NVarcharValue
  3. CharValue
  4. NCharValue
  5. TextValue
  6. ClobValue
  7. NClobValue
  8. SmallIntValue
  9. IntegerValue
  10. BigIntValue
  11. NumericValue
  12. DecimalValue
  13. RealValue
  14. FloatValue
  15. DoubleValue
  16. BitValue
  17. BooleanValue
  18. BinaryValue
  19. VarBinaryValue
  20. BlobValue
  21. DateValue
  22. TimeValue
  23. DateTimeValue
  24. TimestampValue
  25. RefValue
  26. StructValue
  27. RowValue

3.2. ParameterizedCallQuery and parameters

The sql:ParameterizedCallQuery is used to construct the SQL CALL Statement to be executed by the client. It is backtick string template which allows dynamic values for query parameters. In addition to Values supported by sql:ParameterizedQuery, sql:ParameterizedCallQuery supports following SQL Parameters,

  1. InOutParameter
  2. Typed OutParameters

These types can be used to retrieve values from SQL stored procedures using the get() method.

Type of the returned value is inferred from LHS of the expression.

3.3. Query concatenation

sql:ParameterizedQuery can be concatenated using util methods such as sql:queryConcat() and sql:arrayFlattenQuery() which makes it easier to create a dynamic/constant complex query.

The sql:queryConcat() is used to create a parameterized query by concatenating a set of parameterized queries.

The query with the IN operator can be created using the sql:ParameterizedQuery(). Here the values in the array are flattened with a comma separator to add it in the query.

The util function sql:arrayFlattenQuery() can be used to make the array flatten easier. It makes the inclusion of varying array elements into the query easier by flattening the array to return a parameterized query.

4. Database operations

The client supports five remote methods, each for one Database operation

4.1. Query

query() remote method execute the SQL query and returns multiple results.

Here the returned stream can consist of following types of records,

  1. Open record
    The property name in the open record type will be the same as how the column is defined in the database.

  2. Typed record
    A ballerina record type is created to represent the returned result set. the SELECT query is executed via the query remote function of the client. Once the query is executed, each data record can be retrieved by looping the result set. The stream returned by the select operation holds a pointer to the actual data in the database, and it loads data from the table only when it is accessed. This stream can be iterated only once.

    This record can be defined as an open or a closed record according to the requirement. If an open record is defined, the returned stream type will include both defined fields in the record and additional database columns fetched by the SQL query that are not defined in the record. Additional column names added to the returned record as in the SQL query. If the record is defined as a close record, only defined fields in the record are returned or gives an error when additional columns present in the SQL query.

    Note the mapping of the database column to the returned record's property is case-insensitive if it is defined in the record(i.e., the ID column in the result can be mapped to the id property in the record).

    sql:Column annotation can be used to map database columns to record fields of different name. This annotation should be attached to record fields.

    The above annotation will map the database column first_name to the Ballerina record field firstName. If the query() function does not return first_name column, the field will not be populated.

    Multiple table columns can be matched to a single Ballerina record within a returned record. For instance if the query returns data from multiple tables as follows, schema Both TEACHERS.id and TEACHERS.name can be grouped to another Typed record such as Teacher type.

    In the above scenario also, sql:Column annotation can be used to rename field name such as,

The returned stream needs to be closed properly to release resources. The stream is automatically closed if either it is iterated fully or consists of an error. If result is accessed one by one using next() method, it should be closed after the required results are accessed.

Usage:

4.2. Query row

queryRow() remote method executes the SQL query return at most one row of the result.

The provided return type(inferred from LHS) can be of 2 types,

  1. Ballerina record Returns only the first row retrieved by the query as a record.

  2. Ballerina primitive type Return the value of the first column of the first row retrieved by the query.

sql:NoRowsError is returned if the query does not return at most of one result.

4.3. Execute

execute() remote method executes the SQL query return metadata of the execution.

The metadata is returned as sql:ExecutionResult record,

This sample demonstrates modifying data by executing an UPDATE statement via the execute remote function of the client.

4.4. Batch execute

batchExecute() remote method executes the SQL query with multiple sets of parameters in a batch.

This sample demonstrates how to insert multiple records with a single INSERT statement that is executed via the batchExecute remote function of the client. This is done by creating a table with multiple records and parameterized SQL query as same as the above execute operations.

4.5. Call

call() remote method executes a SQL query, which calls a stored procedure. This can either return results or nil.

The returned results are returned as of type sql:ProcedureCallResult,

Here getNextQueryResult() can be used to update the pointer to iterate through the returned result sets.

This sample demonstrates how to execute a stored procedure with a single INSERT statement that is executed via the call remote function of the client.

The close operation needs to be explicitly invoked on the sql:ProcedureCallResult to release the connection resources and avoid a connection leak as shown above.

5. Metadata operations

5.1. Schema client

The SchemaClient is used to query the database to retrieve the relevant metadata. On initialization, the user would have to provide credentials to access the relevant metadata tables. It is also required to provide the name of the database regarding which the metadata should be retrieved.

This client will be extended by each SQL connector to customize the implementation as necessary.

5.2. Metadata types

New record-types would be introduced to represent the metadata which may be retrieved.

  • TableDefinition
  • ColumnDefinition
  • CheckConstraintDefinition
  • ReferentialConstraintDefinition
  • RoutineDefinition
  • ParameterDefinition

These record types contain only fields which are common to relational databases. However, they may be inherited and then extended by each SQL connector to provide support for additional database-specific fields.

Table definition

The columns field is optional as there are use cases where the user would not want to retrieve information regarding the columns of a table.

The TableType type is an enum, which can take one of two values:

  • BASE_TABLE
  • VIEW

Column definition

The referentialConstraints and checkConstraints fields are optional as there may be use cases where the user does not want to retrieve this information.

Referential constraint

The ReferentialRule type is an enum with four possible values:

  • NO_ACTION
  • RESTRICT
  • CASCADE
  • SET_NULL
  • SET_DEFAULT

Check constraint

Routine definition

The RoutineType type is an enum which can take one of two values

  • PROCEDURE
  • FUNCTION

Parameter definition

The ParameterMode type is an enum which can take one of three values

  • IN
  • OUT
  • INOUT

5.3. Metadata methods

The SchemaClient contains six methods which may be used to retrieve metadata.

  • listTables()
  • getTableInfo()
  • listRoutines()
  • getRoutineInfo()

All of these methods will be implemented by each SQL connector.

Retrieve all tables

This would fetch the names of all the tables in the database.

Retrieve a single table

This would fetch all relevant information from a given table from the database. Based on the option provided for the include parameter, relevant column information would also be retrieved.

Retrieve all routines

This would fetch the names of all the routines created in the database.

Retrieve a single routine

This would fetch all relevant information regarding the provided routine (including the parameter data).

6. Errors

sql package consists of following Errors,

All errors include detailed error message.

Include error details along with this sql:DatabaseError as follows.

sql:BatchExecuteError includes detail of the metadata of the executed queries before the error occurred.