Specification: Ballerina SerDes Library

Owners: @MohamedSabthar @shafreenAnfar @ThisaruGuruge
Reviewers: @shafreenAnfar @ThisaruGuruge
Created: 2022/08/01
Updated: 2022/08/01
Edition: Swan Lake

Introduction

This is the specification for the SerDes standard library of the Ballerina language, which is used for serializing and deserializing subtypes of the Ballerina anydata type.

The SerDes library specification has evolved and may continue to evolve in the future. The released versions of the specification can be found under the relevant GitHub tag.

If you have any feedback or suggestions about the library, start a discussion via a GitHub issue or in the Discord server. Based on the outcome of the discussion, the specification and implementation can be updated. Community feedback is always welcome. Any accepted proposal that affects the specification is stored under /docs/proposals. Proposals under discussion can be found with the label type/proposal in GitHub.

The conforming implementation of the specification is released and included in the distribution. Any deviation from the specification is considered a bug.

Contents

  1. Overview
  2. Schema
  3. Proto3Schema
  4. Ballerina anydata to proto3 mapping

1. Overview

This specification elaborates on functionalities provided by the SerDes library and how the SerDes library maps the Ballerina anydata to a protocol buffer type.

2. Schema

The Schema object defines the API for serializing and deserializing Ballerina anydata values. You can include this Schema object in a class and implement your own serialization and deserialization logic. The Schema object definition is as follows.

public type Schema object {

  public isolated function serialize(anydata data) returns byte[]|Error;

  public isolated function deserialize(byte[] encodedMessage, typedesc<anydata> T = <>) returns T|Error;
}

2.1 serialize function

Serializes the value passed as the argument and returns byte[] on successful serialization or an Error on failure.

2.2 deserialize function

Deserializes the provided byte[] argument and returns the Ballerina value or an Error on failure.

3. Proto3Schema

The Proto3Schema class includes the Schema object and provides the implementation to perform serialization and deserialization of Ballerina anydata using protocol buffers (proto3). The class definition of Proto3Schema is as follows.

public class Proto3Schema {
  *Schema;

  public isolated function init(typedesc<anydata> ballerinaDataType) returns Error? {
    check generateSchema(self, ballerinaDataType);
  }

  // Implementation of serialize(), deserialize() functions goes here
}

3.1 init function

Generates a proto3 message definition for the given typedesc<anydata> when instantiating a Proto3Schema object.

3.2 serialize function

Serializes the value passed as the argument and returns byte[] on successful serialization or an Error on failure. The underlying implementation uses the previously generated proto3 message definition to serialize the provided value. Passing a value that doesn't match the type provided during the instantiation of the Proto3Schema object may result in a serialization failure. The following code shows an example of performing serialization.

import ballerina/serdes;

// Define a type which is a subtype of anydata.
type Student record {
    int id;
    string name;
    decimal fees;
};

public function main() returns error? {

    // Assign the value to the variable
    Student student = {
        id: 7894,
        name: "Liam",
        fees: 24999.99
    };

    // Create a serialization object by passing the typedesc.
    // This creates an underlying protocol buffer schema for the typedesc.
    serdes:Proto3Schema serdes = check new (Student);

    // Serialize the record value to bytes.
    byte[] bytes = check serdes.serialize(student);
}

3.3 deserialize function

Deserializes the provided byte[] argument and returns the Ballerina value with the type represented by the typedesc value provided during the Proto3Schema object instantiation. The underlying implementation uses the generated proto3 message definition to deserialize the provided bytes. Passing a byte[] that is not a serialized value of the specified type may result in a deserialization failure or a garbage value. The following code shows an example of performing deserialization.

import ballerina/io;
import ballerina/serdes;

// Define a type which is a subtype of anydata.
type Student record {
    int id;
    string name;
    decimal fees;
};

public function main() returns error? {

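    // Create the schema object with the same typedesc that was used for serialization.
    serdes:Proto3Schema serdes = check new (Student);

    // readSerializedDataToByteArray() is a placeholder for code that obtains
    // the previously serialized bytes (for example, from a file or the network).
    byte[] bytes = readSerializedDataToByteArray();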

    // Deserialize the record value from bytes. 
    Student student = check serdes.deserialize(bytes);

    // Print deserialized data.
    io:println(student);
}
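
The serialize and deserialize steps described in 3.2 and 3.3 can be combined into a round trip, as in the following sketch. The helper name roundTrip and the sample values are illustrative and not part of the library. Whether the mismatched call returns an Error depends on the implementation, so the code only checks for that case rather than assuming it.

```ballerina
import ballerina/io;
import ballerina/serdes;

// Define a type which is a subtype of anydata.
type Student record {
    int id;
    string name;
    decimal fees;
};

// Hypothetical helper: serializes a Student and reads it back
// using the same schema object.
function roundTrip(Student student) returns Student|error {
    serdes:Proto3Schema schema = check new (Student);
    byte[] bytes = check schema.serialize(student);
    Student result = check schema.deserialize(bytes);
    return result;
}

public function main() returns error? {
    Student original = {id: 7894, name: "Liam", fees: 24999.99};
    Student restored = check roundTrip(original);
    io:println(restored);

    // A string does not match the Student typedesc, so this call
    // may return a serdes:Error instead of bytes.
    serdes:Proto3Schema schema = check new (Student);
    byte[]|serdes:Error mismatch = schema.serialize("not a student");
    if mismatch is serdes:Error {
        io:println("Serialization failed: ", mismatch.message());
    }
}
```

Creating the schema object once and reusing it for both directions avoids generating the proto3 message definition more than once.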

4. Ballerina anydata to proto3 mapping

As specified before, the Proto3Schema dynamically generates a proto3 message definition for the given subtype of Ballerina anydata. The following sections define the mapping for each subtype.

4.1 Ballerina primitives

Ballerina Proto message
int age = 24;
message IntValue {
  sint64 atomicField = 1;
}
float mass = 24.5;
message FloatValue {
  double atomicField = 1;
}
boolean isOpen = true;
message BooleanValue {
  bool atomicField = 1;
}
byte count = 3;
message ByteValue {
  bytes atomicField = 1;
}
string package = "serdes";
message StringValue {
    string atomicField = 1;
}
decimal salary = 2e5;
message DecimalValue {
    uint32 scale = 1;
    uint32 precision = 2;
    bytes value = 3;
}

4.2 Array

  1. Simple arrays
Ballerina Proto message
type IntArray int[];
message ArrayBuilder {
    repeated sint64 arrayField = 1;
}
type FloatArray float[];
message ArrayBuilder {
    repeated double arrayField = 1;
}
type DecimalArray decimal[];
message ArrayBuilder {
  message DecimalValue {
     uint32 scale  = 1;
     uint32 precision  = 2;
     bytes value  = 3;
  }
  repeated DecimalValue arrayField  = 1;
}
  1. Multidimensional arrays
Ballerina Proto message
type String2DArray string[][];
message ArrayBuilder {
  message ArrayBuilder {
     repeated string arrayField  = 1;
  }
  repeated ArrayBuilder arrayField  = 1;
}
type Decimal3DArray decimal[][][];
message ArrayBuilder {
  message ArrayBuilder {
    message ArrayBuilder {
      message DecimalValue {
         uint32 scale  = 1;
         uint32 precision  = 2;
         bytes value  = 3;
      }
      repeated DecimalValue arrayField  = 1;
    }
    repeated ArrayBuilder arrayField  = 1;
  }
  repeated ArrayBuilder arrayField  = 1;
}

4.3 Union

  1. Union with primitive types
Ballerina Proto message
type PrimitiveUnion  int|byte|float|decimal|string?;
message UnionBuilder {
  message DecimalValue {
     uint32 scale  = 1;
     uint32 precision  = 2;
     bytes value  = 3;
  }
  sint64 int___unionField  = 1;
  bytes byte___unionField  = 2;
  double float___unionField  = 3;
  DecimalValue decimal___unionField  = 4;
  string string___unionField  = 5;
  bool nullField  = 6;
}
The `<type>___` prefix is added to avoid name collisions during protobuf schema generation.
  1. Union of multidimensional arrays
Ballerina Proto message
type UnionWithArray int[][]|float[]|string[][][];
message UnionBuilder {
  message int___ArrayBuilder_1 {
     repeated sint64 arrayField  = 1;
  }
  message string___ArrayBuilder_2 {
    message ArrayBuilder {
       repeated string arrayField  = 1;
    }
    repeated ArrayBuilder arrayField  = 1;
  }
  repeated int___ArrayBuilder_1 int___arrayField_2___unionField  = 1;
  repeated double float___arrayField_1___unionField  = 2;
  repeated string___ArrayBuilder_2 string___arrayField_3___unionField  = 3;
}
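A (union) member array uses the following name formats for the message field name and the nested message name:
  • Field name format: `<type>___arrayField_<dimension>___unionField`
  • Nested message name format: `<type>___ArrayBuilder_<dimension>`
Here, `<type>` and `<dimension>` are used to avoid name collisions during protobuf schema generation. In the example above, the suffix of the nested message name is one less than the dimension of the member array (for instance, int[][] yields int___ArrayBuilder_1).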
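
As a small illustration of the field name format, the following sketch builds the names that appear in the UnionWithArray example message above. The function unionArrayFieldName is a hypothetical helper written for this note and is not part of the SerDes library.

```ballerina
import ballerina/io;

// Hypothetical helper: builds the field name of a union member array
// from its element type name and its number of dimensions.
function unionArrayFieldName(string elementType, int dimension) returns string {
    return string `${elementType}___arrayField_${dimension}___unionField`;
}

public function main() {
    // int[][] member of the union
    io:println(unionArrayFieldName("int", 2)); // int___arrayField_2___unionField
    // string[][][] member of the union
    io:println(unionArrayFieldName("string", 3)); // string___arrayField_3___unionField
}
```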
  1. Union of union-arrays
Ballerina Proto message
type IntOrString int|string;

type FloatOrNill float?;

type UnionArray IntOrString[]|FloatOrNill[];
message UnionBuilder {
  message IntOrString___UnionBuilder {
     sint64 int___unionField  = 1;
     string string___unionField  = 2;
  }
  message FloatOrNill___UnionBuilder {
     double float___unionField  = 1;
     bool nullField  = 2;
  }
  repeated IntOrString___UnionBuilder IntOrString___arrayField_1___unionField  = 1;
  repeated FloatOrNill___UnionBuilder FloatOrNill___arrayField_1___unionField  = 2;
}
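
From the caller's side, a union type is handled like any other typedesc. The following sketch assumes the IntOrString type from the example above. The helper name roundTripUnion and the sample values are illustrative only.

```ballerina
import ballerina/io;
import ballerina/serdes;

type IntOrString int|string;

// Hypothetical helper: one schema object covers every member type of the union.
function roundTripUnion(serdes:Proto3Schema schema, IntOrString value) returns IntOrString|error {
    byte[] bytes = check schema.serialize(value);
    IntOrString result = check schema.deserialize(bytes);
    return result;
}

public function main() returns error? {
    serdes:Proto3Schema schema = check new (IntOrString);

    IntOrString[] samples = [42, "forty-two"];
    foreach IntOrString sample in samples {
        IntOrString restored = check roundTripUnion(schema, sample);
        io:println(restored);
    }
}
```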

4.4 Record

  1. Simple record with primitive types
Ballerina Proto message
type Employee record {
    string name;
    byte age;
    int weight;
    float height;
    boolean isMarried;
    decimal salary;
};
message Employee {
  message DecimalValue {
     uint32 scale  = 1;
     uint32 precision  = 2;
     bytes value  = 3;
  }
  string name  = 1;
  bytes age  = 2;
  sint64 weight  = 3;
  double height  = 4;
  bool isMarried  = 5;
  DecimalValue salary  = 6;
}
The proto message name and field names are the same as the Ballerina record type name and field names.
  1. Record with array fields
Ballerina Proto message
type RecordWithSimpleArrays record {
    string[] stringArray;
    int[] intArray;
    float[] floatArray;
    boolean[] boolArray;
    byte[] byteArray;
};
message RecordWithSimpleArrays {
   repeated string stringArray  = 1;
   repeated sint64 intArray  = 2;
   repeated double floatArray  = 3;
   repeated bool boolArray  = 4;
   bytes byteArray  = 5;
}
type RecordWithMultidimentionalArrays record {
    string[][][] string3DArray;
    decimal[][] decimal2DArray;
};
message RecordWithMultidimentionalArrays {
  message decimal2DArray___ArrayBuilder {
    message DecimalValue {
       uint32 scale  = 1;
       uint32 precision  = 2;
       bytes value  = 3;
    }
    repeated DecimalValue arrayField  = 1;
  }
  message string3DArray___ArrayBuilder {
    message ArrayBuilder {
       repeated string arrayField  = 1;
    }
    repeated ArrayBuilder arrayField  = 1;
  }
  repeated string3DArray___ArrayBuilder string3DArray  = 1;
  repeated decimal2DArray___ArrayBuilder decimal2DArray  = 2;
}
  1. Record with union fields
Ballerina Proto message
type RecordWithUnion record {
    int|string? data;
};
message RecordWithUnion {
  message data___UnionBuilder {
     bool nullField  = 1;
     sint64 int___unionField  = 2;
     string string___unionField  = 3;
  }
  data___UnionBuilder data  = 1;
}
Nested message names of union messages are prefixed with the Ballerina record field name to avoid name collisions. Generally, the union message name follows the form `<recordFieldName>___UnionBuilder`.
  1. Record with cyclic references
Ballerina Proto message
type Nested1 record {
    string name;
    Nested2? nested;
};

type Nested2 record {
    string name;
    Nested3? nested;
};

type Nested3 record {
    string name;
    Nested1? nested;
};
message Nested1 {
  message nested___UnionBuilder {
    message Nested2 {
      message nested___UnionBuilder {
        message Nested3 {
          message nested___UnionBuilder {
             Nested1 Nested1___unionField  = 1;
             bool nullField  = 2;
          }
          string name  = 1;
          nested___UnionBuilder nested  = 2;
        }
        Nested3 Nested3___unionField  = 1;
        bool nullField  = 2;
      }
      string name  = 1;
      nested___UnionBuilder nested  = 2;
    }
    Nested2 Nested2___unionField  = 1;
    bool nullField  = 2;
  }
  string name  = 1;
  nested___UnionBuilder nested  = 2;
}

4.5 Map

  1. Map with primitive types
Ballerina Proto message
type MapInt map<int>;
message MapBuilder {
  message MapFieldEntry {
     string key  = 1;
     sint64 value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}
type MapDecimal map<decimal>;
message MapBuilder {
  message MapFieldEntry {
    message DecimalValue {
       uint32 scale  = 1;
       uint32 precision  = 2;
       bytes value  = 3;
    }
    string key  = 1;
    DecimalValue value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}
  1. Map with records
Ballerina Proto message
type Status record {
    int code;
    string message?;
};

type MapRecord map<Status>;
message MapBuilder {
  message MapFieldEntry {
    message Status {
       sint64 code  = 1;
       string message  = 2;
    }
    string key  = 1;
    Status value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}
  1. Map with arrays
Ballerina Proto message
type IntMatrix int[][];

type MapArray map<IntMatrix>;
message MapBuilder {
  message MapFieldEntry {
    message ArrayBuilder {
       repeated sint64 arrayField  = 1;
    }
    string key  = 1;
    repeated ArrayBuilder value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}
  1. Map with unions
Ballerina Proto message
type Status record {
    int code;
    string message?;
};

type IntMatrix int[][];

type MapUnion map<Status|IntMatrix>;
message MapBuilder {
  message MapFieldEntry {
    message value___UnionBuilder {
      message Status {
         sint64 code  = 1;
         string message  = 2;
      }
      message int___ArrayBuilder_1 {
        repeated sint64 arrayField  = 1;
      }
      Status Status___unionField  = 1;
      repeated int___ArrayBuilder_1 int___arrayField_2___unionField  = 2;
    }
    string key  = 1;
    value___UnionBuilder value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}
  1. Map with maps
Ballerina Proto message
type Status record {
    int code;
    string message?;
};

type IntMatrix int[][];

type MapUnion map<Status|IntMatrix>;

type MapOfMaps map<MapUnion>;
message MapBuilder {
  message MapFieldEntry {
    message MapBuilder {
      message MapFieldEntry {
        message value___UnionBuilder {
          message Status {
             sint64 code  = 1;
             string message  = 2;
          }
          message int___ArrayBuilder_1 {
            repeated sint64 arrayField  = 1;
          }
          Status Status___unionField  = 1;
          repeated int___ArrayBuilder_1 int___arrayField_2___unionField  = 2;
        }
        string key  = 1;
        value___UnionBuilder value  = 2;
      }
      repeated MapFieldEntry mapField  = 1;
    }
    string key  = 1;
    MapBuilder value  = 2;
  }
  repeated MapFieldEntry mapField  = 1;
}

4.6 Table

  1. Table with Map constraint
Ballerina Proto message
type Score map<int>;

type ScoreTable table<Score>;
message TableBuilder {
  message MapBuilder {
    message MapFieldEntry {
       string key  = 1;
       sint64 value  = 2;
    }
    repeated MapFieldEntry mapField  = 1;
  }
  repeated MapBuilder tableEntry  = 1;
}
  1. Table with record constraint
Ballerina Proto message
type Row record {
    int id;
    string name;
};

type RecordTable table<Row>;
message TableBuilder {
  message Row {
     sint64 id  = 1;
     string name  = 2;
  }
  repeated Row tableEntry  = 1;
}

4.7 Tuple

  1. Tuple with primitive type elements
Ballerina Proto message
type PrimitiveTuple [byte, int, float, boolean, string, decimal];
message TupleBuilder {
  message DecimalValue {
     uint32 scale  = 1;
     uint32 precision  = 2;
     bytes value  = 3;
  }
  bytes element_1  = 1;
  sint64 element_2  = 2;
  double element_3  = 3;
  bool element_4  = 4;
  string element_5  = 5;
  DecimalValue element_6  = 6;
}
  1. Tuple with Union elements
Ballerina Proto message
type TupleWithUnion [byte|string, decimal|boolean];
message TupleBuilder {
  message element_1___UnionBuilder {
     bytes byte___unionField  = 1;
     string string___unionField  = 2;
  }
  message element_2___UnionBuilder {
    message DecimalValue {
      uint32 scale  = 1;
      uint32 precision  = 2;
      bytes value  = 3;
    }
    bool boolean___unionField  = 1;
    DecimalValue decimal___unionField  = 2;
  }
  element_1___UnionBuilder element_1  = 1;
  element_2___UnionBuilder element_2  = 2;
}
  1. Tuple with array elements
Ballerina Proto message
type UnionTupleElement byte|string;
type TupleWithArray [string[], boolean[][], int[][][], UnionTupleElement[]];
message TupleBuilder {
  message int___ArrayBuilder_2 {
    message ArrayBuilder {
       repeated sint64 arrayField  = 1;
    }
    repeated ArrayBuilder arrayField  = 1;
  }
  message UnionTupleElement___UnionBuilder {
    bytes byte___unionField  = 1;
    string string___unionField  = 2;
  }
  message boolean___ArrayBuilder_1 {
    repeated bool arrayField  = 1;
  }
  repeated string element_1  = 1;
  repeated boolean___ArrayBuilder_1 element_2  = 2;
  repeated int___ArrayBuilder_2 element_3  = 3;
  repeated UnionTupleElement___UnionBuilder element_4  = 4;
}
  1. Tuple with record elements
Ballerina Proto message
type Student record {
    string name;
    int courseId;
    decimal fees;
};

type Teacher record {
    string name;
    int courseId;
    decimal salary;
};

type TupleWithRecord [Student, Teacher];
message TupleBuilder {
  message Teacher {
    message DecimalValue {
       uint32 scale  = 1;
       uint32 precision  = 2;
       bytes value  = 3;
    }
    sint64 courseId  = 1;
    string name  = 2;
    DecimalValue salary  = 3;
  }
  message Student {
    message DecimalValue {
      uint32 scale  = 1;
      uint32 precision  = 2;
      bytes value  = 3;
    }
    sint64 courseId  = 1;
    DecimalValue fees  = 2;
    string name  = 3;
  }
  Student element_1  = 1;
  Teacher element_2  = 2;
}
  1. Tuple with tuple elements
Ballerina Proto message
type PrimitiveTuple [byte, int, float, boolean, string, decimal];

type TupleWithUnion [byte|string, decimal|boolean];

type TupleOfTuples [PrimitiveTuple, TupleWithUnion];
message TupleBuilder {
  message element_2___TupleBuilder {
    message element_1___UnionBuilder {
       bytes byte___unionField  = 1;
       string string___unionField  = 2;
    }
    message element_2___UnionBuilder {
      message DecimalValue {
        uint32 scale  = 1;
        uint32 precision  = 2;
        bytes value  = 3;
      }
      bool boolean___unionField  = 1;
      DecimalValue decimal___unionField  = 2;
    }
    element_1___UnionBuilder element_1  = 1;
    element_2___UnionBuilder element_2  = 2;
  }
  message element_1___TupleBuilder {
    message DecimalValue {
       uint32 scale  = 1;
       uint32 precision  = 2;
       bytes value  = 3;
    }
    bytes element_1  = 1;
    sint64 element_2  = 2;
    double element_3  = 3;
    bool element_4  = 4;
    string element_5  = 5;
    DecimalValue element_6  = 6;
  }
  element_1___TupleBuilder element_1  = 1;
  element_2___TupleBuilder element_2  = 2;
}

4.8 Enum

A Ballerina enum is syntactic sugar for a union of constant strings, so an enum is handled as a union at the protobuf level.

Ballerina Proto message
enum Color {
    RED = "red",
    GREEN,
    BLUE
}
message UnionBuilder {
   string string___unionField  = 1;
}
const OPEN = "open";
const CLOSE = "close";
type STATE OPEN|CLOSE;
message UnionBuilder {
   string string___unionField  = 1;
}
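
Because an enum is treated as a union of strings, an enum value can be passed to a Proto3Schema in the same way as any other anydata value. The following sketch assumes the Color enum from the example above. The helper name roundTripColor is illustrative and not part of the library.

```ballerina
import ballerina/io;
import ballerina/serdes;

enum Color {
    RED = "red",
    GREEN,
    BLUE
}

// Hypothetical helper: serializes an enum member and reads it back.
function roundTripColor(Color color) returns Color|error {
    serdes:Proto3Schema schema = check new (Color);
    byte[] bytes = check schema.serialize(color);
    Color result = check schema.deserialize(bytes);
    return result;
}

public function main() returns error? {
    Color restored = check roundTripColor(GREEN);
    io:println(restored);
}
```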