Specification: Ballerina XmlData Library

Owners: @daneshk @kalaiyarasiganeshalingam @MadhukaHarith92
Reviewers: @daneshk
Created: 2021/12/10
Updated: 2022/06/07
Edition: Swan Lake

Introduction

This is the specification for the Xmldata standard library of the Ballerina language, which provides APIs to perform conversions between XML and JSON/Ballerina records.

The Xmldata library specification has evolved and may continue to evolve in the future. The released versions of the specification can be found under the relevant GitHub tag.

If you have any feedback or suggestions about the library, start a discussion via a GitHub issue or in the Slack channel. Based on the outcome of the discussion, the specification and implementation can be updated. Community feedback is always welcome. Any accepted proposal that affects the specification is stored under /docs/proposals. Proposals under discussion can be found with the label type/proposal in GitHub.

The conforming implementation of the specification is released and included in the distribution. Any deviation from the specification is considered a bug.

Contents

  1. Overview
  2. Data Structure
  3. Rules
  4. Operations

1. Overview

This specification elaborates on the functionalities available in the Xmldata library.

This package considers the JSON, XML, Ballerina record, and map data structures. It defines mappings between them that preserve their information and structure, and provides the following conversions:

  • XML to JSON Conversion
  • XML to Ballerina record Conversion
  • JSON to XML Conversion
  • Ballerina record/Map to XML Conversion
  • XML to Ballerina record/Map Conversion

2. Data Structure

2.1. JSON

JSON is a textual format for representing a single value or a collection of the following values:

  • a simple value (string, number, boolean, null)
  • an array of values
  • an object

2.2. XML

An XML value is a sequence representing the parsed content of an XML element. Values are sequences of zero or more items, where an item is one of the following:

  • element
  • text item consisting of characters
  • processing instruction
  • comment

2.3. Map

A map is an unordered collection of key-value pairs, each of which maps a key to a value. The key must be a string. The value can be of a primitive or complex data type.

2.4. Record

A record is just a collection of fields. Record equality works the same as map equality. A record type descriptor describes a type of mapping value by specifying a type separately for the value of each field.

A record can be defined as open or closed, as required. If a closed record is defined, the returned data contains only the defined fields, with the defined types. If an open record is defined, the returned data includes both the fields defined in the record and any additional fields produced by the conversion that are not defined in the record.
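The open/closed distinction can be made concrete with a minimal sketch. The type names ClosedPerson and OpenPerson and the field values are hypothetical and chosen only for illustration.

```ballerina
import ballerina/io;

// Closed record: only the declared field is allowed.
type ClosedPerson record {|
    string name;
|};

// Open record: undeclared fields are accepted in addition to the declared one.
type OpenPerson record {
    string name;
};

public function main() {
    ClosedPerson closedValue = {name: "Anne"};
    // "id" is not declared in OpenPerson, so it is supplied with a quoted key.
    OpenPerson openValue = {name: "Anne", "id": "1"};
    io:println(closedValue);
    io:println(openValue);
}
```

Adding the "id" entry to the ClosedPerson value would be rejected at compile time, which is the behavior the closed-record rule above relies on.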

3. Rules

Every conversion follows a set of rules to preserve the information and structure of both the input and the output.

3.1. Rules for XML to JSON Conversion

The following rules are used during the conversion process:

  • Namespaces are preserved or omitted depending on the preserveNamespaces setting.
  • Attributes and namespaces are treated as regular JSON properties, and their keys carry a string prefix to differentiate them from regular JSON properties.
  • Sequences of two or more similar elements are converted to a JSON array.
  • Text nodes are converted into a JSON property with the key #content.
  • Processing instructions (PI) and comments in the XML are omitted.

The following table maps the different forms of XML to the corresponding JSON representation, based on the above rules.

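XML Type: Empty element
XML Sample:
<e/>
JSON Representation Type: JSON key-value pair whose value is ""
JSON Representation of XML:
{"e":""}

XML Type: Text item
XML Sample:
value
JSON Representation Type: String
JSON Representation of XML:
value

XML Type: Comment
XML Sample:
<!-- value -->
JSON Representation Type: Empty JSON, because comments are not considered in this mapping
JSON Representation of XML:
{}

XML Type: PI
XML Sample:
<?doc document="book.doc"?>
JSON Representation Type: Empty JSON, because processing instructions are not considered in this mapping
JSON Representation of XML:
{}

XML Type: Empty sequence
XML Sample: (empty)
JSON Representation Type: Empty
JSON Representation of XML: (empty)

XML Type: XML sequence with elements having distinct keys
XML Sample:
<key>
  <key1>value1</key1>
  <key2>value2</key2>
</key>
JSON Representation Type: JSON object
JSON Representation of XML:
{
  "key":{
    "key1":"value1",
    "key2":"value2"
  }
}

XML Type: XML sequence with elements having identical keys
XML Sample:
<keys>
  <key>value1</key>
  <key>value2</key>
  <key>value3</key>
</keys>
JSON Representation Type: JSON object that contains a JSON array
JSON Representation of XML:
{
  "keys":{
    "key":["value1","value2","value3"]
  }
}

XML Type: XML sequence containing items of type Element and Text
XML Sample:
<key>
  value1 Value2
  <key1>value3</key1>
  <key2>value4</key2>
</key>
JSON Representation Type: JSON object in which the text value has the key '#content'
JSON Representation of XML:
{
  "key":{
    "#content":"value1 Value2",
    "key1":"value3",
    "key2":"value4"
  }
}

XML Type: XML with attribute
XML Sample:
<foo key="value">5</foo>
JSON Representation Type: JSON object. Here, the attribute has the '@' prefix.
JSON Representation of XML:
{
  "foo": {
    "@key": "value",
    "#content": "5"
  }
}

XML Type: XML with attribute and namespace
XML Sample:
<foo key="value" xmlns:ns0="http://sample.com">5</foo>
JSON Representation Type: JSON object. Here, the attribute and namespace have the '@' prefix.
JSON Representation of XML:
{
  "foo":{
    "@key":"value",
    "@xmlns:ns0":"http://sample.com",
    "#content":"5"
  }
}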

3.2. Rules for XML to Record Conversion

This conversion follows all the rules applied during the XML to JSON conversion, except the rule for attributes and namespaces. Here, attribute and namespace keys are given the prefix _ in the record.

The following table maps XML with an attribute and a namespace to a record.

XML Type: XML with attribute
XML Sample:
<foo key="value">5</foo>
Record Representation Type: JSON object. Here, the attribute has the '_' prefix.
Record Representation of XML:
{
  "foo": {
    "_key": "value",
    "#content": "5"
  }
}

XML Type: XML with attribute and namespace
XML Sample:
<foo key="value" xmlns:ns0="http://sample.com">5</foo>
Record Representation Type: JSON object. Here, the attribute and namespace have the '_' prefix.
Record Representation of XML:
{
  "foo":{
    "_key":"value",
    "_xmlns:ns0":"http://sample.com",
    "#content":"5"
  }
}

3.3. Rules for JSON to XML Conversion

The following rules are used during the conversion process:

  • A default root element is created in the following scenarios:
    • When the JSON is a JSON array
    • When the JSON data contains multiple key-value pairs
  • JSON array entries are converted to individual XML elements.
  • A JSON primitive value is converted to the text node of the XML element.
  • JSON properties whose keys start with the attributePrefix value defined in JsonOptions are handled as attributes and namespaces in the XML.

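The following table maps the different forms of JSON to the corresponding XML representation, based on the above rules.

JSON Type: JSON object with a single key-value pair whose value is ""
JSON Sample:
{"e":""}
XML Representation Type: Empty element
XML Representation of JSON:
<root><e/></root>

JSON Type: Empty JSON
JSON Sample: (empty)
XML Representation Type: Empty sequence
XML Representation of JSON: (empty)

JSON Type: Single value (string, number, boolean)
JSON Sample:
value
XML Representation Type: XML text
XML Representation of JSON:
value

JSON Type: Null
JSON Sample:
null
XML Representation Type: Empty sequence
XML Representation of JSON: (empty)

JSON Type: JSON object with a single key-value pair
JSON Sample:
{
  "Store": {
    "name": "Anne",
    "address": {
      "street": "Main",
      "city": "94"
    }
  }
}
XML Representation Type: XML sequence
XML Representation of JSON:
<root>
  <Store>
    <name>Anne</name>
    <address>
      <street>Main</street>
      <city>94</city>
    </address>
  </Store>
</root>

JSON Type: JSON object with distinct keys
JSON Sample:
{
  "key1":"value1",
  "key2":"value2"
}
XML Representation Type: XML sequence with root tag
XML Representation of JSON:
<root>
  <key1>value1</key1>
  <key2>value2</key2>
</root>

JSON Type: JSON array
JSON Sample:
[
  {
    "key": "value1"
  },
  "value2"
]
XML Representation Type: XML sequence with root tag
XML Representation of JSON:
<root>
  <item>
    <key>value1</key>
  </item>
  <item>value2</item>
</root>

JSON Type: JSON object with the key "#content"
JSON Sample:
{"#content":"value1"}
XML Representation Type: XML text
XML Representation of JSON:
value1

JSON Type: JSON object with keys prefixed with '@'
JSON Sample:
{
  "foo": {
    "@key": "value",
    "@xmlns:ns0":"http://sample.com"
  }
}
XML Representation Type: XML element with attribute and namespace
XML Representation of JSON:
<root>
  <foo key="value" xmlns:ns0="http://sample.com"/>
</root>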

3.4. Rules between the Map and XML Conversions

The following table shows how XML maps to the different forms of map representation.

XML Type: XML Element
XML Sample: <key>value</key>
Map Type: map<BALLERINA_PRIMITIVE_TYPE>
Output: {key: "VALUE_IN_DEFINED_TYPE"}

XML Type: XML Element
XML Sample: <key>value</key>
Map Type: map<BALLERINA_PRIMITIVE_TYPE_ARRAY>
Output: {key: "VALUE_IN_DEFINED_ARRAY_TYPE"}

XML Type: XML Element
XML Sample: <key>value</key>
Map Type: map<xml>
Output: {#content: <key>value</key>}

XML Type: XML Element
XML Sample: <key>value</key>
Map Type: map<json>
Output: {key: "value"}

XML Type: XML Sequence
XML Sample: <keys><key>value</key></keys>
Map Type: map<BALLERINA_PRIMITIVE_TYPE>
Output: ERROR

XML Type: XML Sequence
XML Sample: <keys><key>value</key></keys>
Map Type: map<BALLERINA_PRIMITIVE_TYPE_ARRAY>
Output: ERROR

XML Type: XML Sequence
XML Sample: <keys><key>value</key></keys>
Map Type: map<json>
Output: {keys: {key: "value"}}

XML Type: XML Sequence
XML Sample: <keys><key>value</key></keys>
Map Type: map<xml>
Output: {#content: <keys><key>value</key></keys>}

XML Type: XML Sequence
XML Sample: <keys><key>value</key></keys>
Map Type: map<table<map<string>>>
Output: {keys: table [{key: "value"}]}

The following table shows how map data maps to the corresponding XML representation.

Map Type: map<BALLERINA_PRIMITIVE_TYPE>
Map Sample: {key1: value1, key2: value2}
XML:
<root>
  <key1>value1</key1>
  <key2>value2</key2>
</root>

Map Type: map<BALLERINA_PRIMITIVE_ARRAY_TYPE>
Map Sample: {key1: [value1, value2], key2: [value3, value4]}
XML:
<root>
  <key1>value1</key1>
  <key1>value2</key1>
  <key2>value3</key2>
  <key2>value4</key2>
</root>

Map Type: map<json>
Map Sample: {keys: {key1: value1, key2: value2}}
XML:
<root>
  <keys>
    <key1>value1</key1>
    <key2>value2</key2>
  </keys>
</root>

Map Type: map<xml>
Map Sample: {keys: xml <key>value</key>}
XML:
<root>
  <keys>
    <key>value</key>
  </keys>
</root>

Map Type: map<table<map<string>>>
Map Sample: {keys: table [{key: "value"}]}
XML:
<root>
  <keys>
    <key>value</key>
  </keys>
</root>

Map Type: map<json[]>
Map Sample: {keys: [{key1: value1},{key2: value2}]}
XML:
<root>
  <keys>
    <key1>value1</key1>
    <key2>value2</key2>
  </keys>
</root>

Map Type: map<xml[]>
Map Sample: {keys: [xml <key1>value1</key1>, xml <key2>value2</key2>]}
XML:
<root>
  <keys>
    <key1>value1</key1>
    <key2>value2</key2>
  </keys>
</root>

3.5. Rules between the Ballerina record and XML Conversions

Basic Conversion

The following Ballerina record definitions are consistent with the OpenAPI definitions and map records to XML without any additional configuration.

Record with single field

Ballerina Record Definition:
type Root record {
 string key?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    key:
     type: string

XML format:
<Root>
 <key>string</key>
</Root>

Record with multiple keys

Ballerina Record Definition:
type Root record {
 string key1?;
 string key2?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    key1:
     type: string
    key2:
     type: string

XML format:
<Root>
 <key1>string</key1>
 <key2>string</key2>
</Root>

Nested record

Ballerina Record Definition:
type Root record {
 Store store?;
};

type Store record {
 string name?;
 Address address?;
};

type Address record {
 string street?;
 int city?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    store:
     type: object
     properties:
      name:
       type: string
      address:
       type: object
       properties:
        street:
         type: string
        city:
         type: integer

XML format:
<Root>
 <store>
  <name>string</name>
  <address>
   <street>string</street>
   <city>0</city>
  </address>
 </store>
</Root>

Array

Ballerina Record Definition:
type Root record {
 string[] key?;
};

OpenAPI Definition:
Root:
 type: object
 properties:
  key:
   type: array
   items:
    type: string

XML format:
<Root>
 <key>string</key>
 <key>string</key>
</Root>

Record field type as XML

Ballerina Record Definition:
type Root record {
 xml key?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    key:
     type: object

XML format:
<Root>
 <key>
  xml object
 </key>
</Root>

Record field type as table

Ballerina Record Definition:
table<map<string>> t = table [{key: "value"}];

type Root record {
 table<map<string>> key?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    key:
     type: array
     items:
      type: object

XML format:
<Root>
 <key>xml object</key>
 <key>xml object</key>
</Root>

Required field

Ballerina Record Definition:
type Root record {
 int id;
 string uname;
 string name?;
};

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    id:
     type: integer
    uname:
     type: string
    name:
     type: string
   required:
    - id
    - uname

XML format:
<Root>
 <id>0</id>
 <uname>string</uname>
 <name>string</name>
</Root>

Closed record

Ballerina Record Definition:
type Person record {|
 string name;
|};

OpenAPI Definition:
components:
 schemas:
  Person:
   type: object
   properties:
    name:
     type: string
   required:
    - name
   additionalProperties: false

XML format:
<Person>
 <name>string</name>
</Person>

Open record

Ballerina Record Definition:
type Person record {
 string name;
};

OpenAPI Definition:
components:
 schemas:
  Person:
   type: object
   properties:
    name:
     type: string
   required:
    - name
   additionalProperties: true

XML format:
<Person>
 <name>string</name>
 <id>string</id>
</Person>

Union type field

Ballerina Record Definition:
type Location record {
 string|Address address?;
};

type Address record {
 int id;
 string uname;
 string name?;
};

OpenAPI Definition:
components:
 schemas:
  Location:
   type: object
   properties:
    address:
     oneOf:
      - $ref: '#/components/schemas/Address'
      - type: string
  Address:
   type: object
   properties:
    id:
     type: integer
    uname:
     type: string
    name:
     type: string
   required:
    - id
    - uname

XML format:
<Location>
 <address>
  <id>0</id>
  <uname>string</uname>
  <name>string</name>
 </address>
</Location>

OR

<Location>
 <address>string</address>
</Location>

Conversion with Attributes and Namespaces

The OpenAPI definition has metadata objects that allow for more fine-tuned XML model definitions. See https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.0.3.md#fixed-fields-22

Ballerina therefore introduces annotations to support this metadata.

OpenAPI metadata: XML name replacement

OpenAPI Definition:
components:
 schemas:
  animals:
   type: object
   properties:
    id:
     type: integer
     xml:
      name: ID
   xml:
    name: animal

Ballerina Record Definition with annotation:
@xmldata:name {
 value: "animal"
}
type animals record {
 @xmldata:name {
  value: "ID"
 }
 int id?;
};

XML format:
<animal>
 <ID>0</ID>
</animal>

OpenAPI metadata: XML attribute

OpenAPI Definition:
components:
 schemas:
  Pline:
   type: object
   properties:
    discount:
     type: string
     xml:
      attribute: true

Ballerina Record Definition with annotation:
type Pline record {
 @xmldata:attribute
 string discount?;
};

XML format:
<Pline discount="string">
</Pline>

OpenAPI metadata: XML namespace

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   xml:
    prefix: ns0
    namespace: 'http://www.w3.org/'

Ballerina Record Definition with annotation:
@xmldata:namespace {
 prefix: "ns0",
 uri: "http://www.w3.org/"
}
type Root record {};

XML format:
<ns0:Root xmlns:ns0="http://www.w3.org/">
</ns0:Root>

OpenAPI metadata: XML namespace and prefix

OpenAPI Definition:
components:
 schemas:
  Pline:
   type: object
   xml:
    prefix: 'ns0'
    namespace: 'http://www.w3.org/'
   properties:
    foo:
     type: string
     xml:
      prefix: 'ns0'

Ballerina Record Definition with annotation:
@xmldata:namespace {
 prefix: "ns0",
 uri: "http://www.w3.org/"
}
type Pline record {
 @xmldata:namespace {
  prefix: "ns0"
 }
 string foo;
};

XML format:
<ns0:Pline xmlns:ns0="http://www.w3.org/">
 <ns0:foo></ns0:foo>
</ns0:Pline>

OpenAPI metadata: XML prefix with namespaces

Note: The OpenAPI Specification does not support multiple XML namespaces within a single element. As a workaround, additional namespaces can be defined as regular attributes (that is, schema properties with xml.attribute=true).

OpenAPI Definition:
components:
 schemas:
  Root:
   type: object
   properties:
    key:
     type: string
    xmlns:asd:
     enum:
      - 'http://www.w3.org/'
     xml:
      attribute: true
   xml:
    prefix: 'ns0'
    namespace: 'http://www.w3.org/'

Ballerina Record Definition with annotation:
@xmldata:namespace {
 prefix: "ns0",
 uri: "http://www.w3.org/"
}
type Root record {
 string key?;
 @xmldata:attribute
 string xmlns:asd = "http://www.w3.org/";
};

XML format:
<ns0:Root xmlns:ns0="http://www.w3.org/" xmlns:asd="http://www.w3.org/">
 <key>string</key>
</ns0:Root>

OpenAPI metadata: wrapped (signifies whether the array is wrapped or not)

Either of the OpenAPI definitions below can be used to define the Ballerina record array field, so no new annotation is needed for the wrapped metadata.

OpenAPI Definition:
1. Unwrapped array definition
components:
 schemas:
  root:
   type: object
   properties:
    root:
     type: array
     items:
      type: string

2. Wrapped array definition
components:
 schemas:
  root:
   type: array
   items:
    type: string
   xml:
    wrapped: true

Ballerina Record Definition with annotation:
type root record {
 string[] root?;
};

XML format:
<root>
 <root>string</root>
</root>

Convert XML element with attributes (Unsupported in OpenAPI)

OpenAPI does not support XML that has elements with attributes. For more information, see this issue: https://github.com/OAI/OpenAPI-Specification/issues/630

However, this use case is common in XML. Ballerina therefore supports it through the special field name #content, as shown below.

Ballerina Record Definition:
type PLine record {
  ItemCode itemCode?;
};

type ItemCode record {
  string discount?;
  int #content?; // If the value doesn't have a key, it can be initialized with the default key name #content.
};

XML Sample:
<PLine>
 <itemCode discount="22%">
  200777
 </itemCode>
</PLine>

4. Operations

4.1. XML to JSON Conversion

XML to JSON conversion maps the different forms of XML to a corresponding JSON representation. The following API returns the JSON data for the given XML structure, as configured by XmlOptions.

XmlOptions configures the attribute and namespace prefix and whether namespaces are kept or removed in the JSON data.

4.1.1. Sample

The JSON representation of the above XML with the default configuration of the above API.

The JSON representation of the above XML when attributePrefix is & and preserveNamespaces is false.
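As a rough illustration of this operation, the sketch below converts one of the XML samples from section 3.1 with default and custom options. It assumes the function is exposed as xmldata:toJson and takes an XmlOptions value as its second argument. The field names attributePrefix and preserveNamespaces come from the text above, while the function name and call shape are assumptions, since the signature is not reproduced in this section.

```ballerina
import ballerina/io;
import ballerina/xmldata;

public function main() returns error? {
    // Sample input taken from the table in section 3.1.
    xml input = xml `<foo key="value" xmlns:ns0="http://sample.com">5</foo>`;

    // Default XmlOptions (assumed call shape).
    json withDefaults = check xmldata:toJson(input);
    io:println(withDefaults);

    // Custom XmlOptions: "&" as the attribute prefix, namespaces not preserved.
    json custom = check xmldata:toJson(input, {attributePrefix: "&", preserveNamespaces: false});
    io:println(custom);
}
```

Going by the table in section 3.1, the first call should produce keys prefixed with @ for both the attribute and the namespace. The second call should use & as the prefix and leave the namespace out.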

4.2. XML to Record Conversion

This conversion maps the different forms of XML to a corresponding Ballerina record representation. The following API returns the record for the given XML structure, as configured by preserveNamespaces and returnType.

4.2.1. Sample

The record representation of the above XML with the default configuration of this API.

The record representation of the above XML when preserveNamespaces is false.
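A minimal sketch of an XML to record conversion follows. It assumes the function is exposed as xmldata:toRecord, that the target type is inferred from the declared type of the receiving variable, and that the root element name becomes a field of the outer record. The Root, Store, and Address types are hypothetical and mirror the nested record example in section 3.5, with string fields only.

```ballerina
import ballerina/io;
import ballerina/xmldata;

type Address record {
    string street;
    string city;
};

type Store record {
    string name;
    Address address;
};

// The outer record holds the root element as a field (assumption).
type Root record {
    Store store;
};

public function main() returns error? {
    xml input = xml `<store><name>Anne</name><address><street>Main</street><city>94</city></address></store>`;

    // returnType is assumed to be inferred from the variable type.
    Root root = check xmldata:toRecord(input);
    io:println(root.store.name);

    // preserveNamespaces passed explicitly as a named argument.
    Root withoutNamespaces = check xmldata:toRecord(input, preserveNamespaces = false);
    io:println(withoutNamespaces.store.address.city);
}
```

If the XML carried attributes or namespaces, the rules in section 3.2 suggest the corresponding record fields would need the _ prefix.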

4.3. JSON to XML Conversion

This conversion maps the different forms of JSON to a corresponding XML representation. The following API returns the XML data for the given JSON structure, as configured by JsonOptions.

JsonOptions configures the attribute prefix for the JSON and the root and array entry tags for the XML. The array entry tag names the element created for each entry of a JSON array that has no key.

4.3.1. Sample1

The XML representation of the above JSON with the default configuration of this API.

4.3.2. Sample2

The XML representation of the above JSON when attributePrefix is & and arrayEntryTag is list.
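The following sketch shows how a JSON to XML call might look, using JSON samples from the table in section 3.3. It assumes the function is exposed as xmldata:fromJson, accepts a JsonOptions value as its second argument, and returns an optional XML value. The field names attributePrefix and arrayEntryTag come from the text above, and the rest of the call shape is an assumption.

```ballerina
import ballerina/io;
import ballerina/xmldata;

public function main() returns error? {
    // JSON object with a single key-value pair, as in section 3.3.
    json store = {"Store": {"name": "Anne", "address": {"street": "Main", "city": "94"}}};
    xml? withDefaults = check xmldata:fromJson(store);
    io:println(withDefaults);

    // JSON array without keys: each entry is wrapped in the array entry tag.
    json entries = [{"key": "value1"}, "value2"];
    xml? custom = check xmldata:fromJson(entries, {attributePrefix: "&", arrayEntryTag: "list"});
    io:println(custom);
}
```

With arrayEntryTag set to list, the array entries would be expected to appear as list elements in place of the item elements shown in section 3.3.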

4.4. Ballerina record/Map to XML Conversion

This conversion maps the different forms of Ballerina record/Map to a corresponding XML representation. The following API returns the XML data for the given Ballerina record/Map. Records can carry annotations to configure namespaces and attributes, while maps cannot.

The following annotations are used to configure the name, namespace, and attribute.

4.4.1. Sample1

The XML representation of the above Record.

4.4.2. Sample2

The XML representation of the above map.
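A minimal sketch of this conversion without annotations is shown below. It assumes the function is exposed as xmldata:toXml and accepts both map and record values. The Root type and the sample values are hypothetical.

```ballerina
import ballerina/io;
import ballerina/xmldata;

type Root record {
    string key1;
    string key2;
};

public function main() returns error? {
    // Map input: section 3.4 shows the entries wrapped in a root element.
    map<string> data = {"key1": "value1", "key2": "value2"};
    xml fromMap = check xmldata:toXml(data);
    io:println(fromMap);

    // Record input: section 3.5 shows the record name used as the root element.
    Root rec = {key1: "value1", key2: "value2"};
    xml fromRecord = check xmldata:toXml(rec);
    io:println(fromRecord);
}
```

The sketch leaves the annotations out on purpose, because their exact names are not confirmed in this section.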

4.5. XML to Ballerina record/Map Conversion

This conversion maps the different forms of XML to a corresponding Ballerina record/Map representation. The following API returns the record/map for the given XML structure. Namespaces and attributes are not treated as a special case.

4.5.1. Sample1

The record representation of the above XML with the returned record type as Commercial.

4.5.2. Sample2

The map representation of the above XML.
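As a sketch of this conversion, the example below converts an XML sequence from section 3.4 into two map types. It assumes the function is exposed as xmldata:fromXml and infers the target type from the declared type of the receiving variable.

```ballerina
import ballerina/io;
import ballerina/xmldata;

public function main() returns error? {
    // XML sequence sample from the first table in section 3.4.
    xml input = xml `<keys><key>value</key></keys>`;

    // Target type map<json>: nested elements become nested JSON values.
    map<json> asJsonMap = check xmldata:fromXml(input);
    io:println(asJsonMap);

    // Target type map<xml>: the XML is kept as a value under #content.
    map<xml> asXmlMap = check xmldata:fromXml(input);
    io:println(asXmlMap);
}
```

According to the table in section 3.4, requesting a map of a primitive type for this input should result in an error, so a caller would handle the error return instead of using check.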