GM Serialization

Serialization helpers for the Gajumaru.

For an overview of the static serializer, see this document.

Build

$ rebar3 compile

Test

$ rebar3 eunit

Dynamic encoding

The module gmser_dyn offers dynamic encoding support, encoding most 'regular' Erlang data types into an internal RLP representation.

Main API:

  • encode(term()) -> iolist()

  • encode_typed(template(), term()) -> iolist()

  • decode(iolist()) -> term()

  • serialize(term()) -> binary()

  • serialize_typed(template(), term()) -> binary()

  • deserialize(binary()) -> term()

The examples below use the encode and decode functions, since their list output shows how the type information is represented. The fully serialized form is produced by the serialize functions.
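As a rough illustration of how the two API levels relate, the following shell session round-trips a term through both. It assumes the gmser_dyn module from this repository is on the code path; the sample term is made up, and the round-trip equality is the expected behaviour rather than something verified here.

```erlang
%% Assumes gmser_dyn (from this repository) is compiled and on the code path.
%% The sample term is purely illustrative.
Term = #{name => <<"alice">>, tags => [1, 2, 3], active => true}.

%% serialize/1 produces a binary; deserialize/1 is expected to invert it.
Bin = gmser_dyn:serialize(Term).
true = is_binary(Bin).
Term = gmser_dyn:deserialize(Bin).

%% encode/1 and decode/1 do the same at the intermediate list level.
Enc = gmser_dyn:encode(Term).
Term = gmser_dyn:decode(Enc).
```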

The basic types supported by the encoder are:

  • integer() (anyint, code: 246)
  • neg_integer() (negint, code: 247)
  • non_neg_integer() (int, code: 248)
  • binary() (binary, code: 249)
  • boolean() (bool, code: 250)
  • list() (list, code: 251)
  • map() (map, code: 252)
  • tuple() (tuple, code: 253)
  • gmser_id:id() (id, code: 254)
  • atom() (label, code: 255)

(The range of codes is chosen because the gmser_chain_objects codes range from 10 to 200, and also to stay within 1 byte.)
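The code table above can be written down as a plain Erlang map for experimentation. This is only a sketch: the Tags and TagBin names are invented for the example and are not part of gmser_dyn.

```erlang
%% Tag table copied from the list above (name -> code).
%% Tags and TagBin are illustrative names, not library API.
Tags = #{anyint => 246, negint => 247, int => 248, binary => 249,
         bool => 250, list => 251, map => 252, tuple => 253,
         id => 254, label => 255}.

%% Each code appears in the encoding as a one-byte binary, e.g. <<248>> for int.
TagBin = fun(Name) -> <<(maps:get(Name, Tags)):8>> end.

%% All codes sit above the 10..200 range and fit in a single byte.
true = lists:all(fun(C) -> C > 200 andalso C =< 255 end, maps:values(Tags)).
<<248>> = TagBin(int).
<<255>> = TagBin(label).
```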

When encoding map types, the map elements are first sorted.

When specifying a map type for template-driven encoding, use the #{items => [{Key, ValueType} | {opt, Key, ValueType}]} construct. The key names are included in the encoding and are matched against the item specs during decoding. If a key name doesn't match, decoding fails, unless the item spec is an optional {opt, K, V} item, in which case that item spec is skipped.

T = #{items => [{a,int},{opt,b,int},{c,int}]}
E1 = gmser_dyn:encode_typed(T, #{a => 1, b => 2, c => 3}) ->
    [<<0>>,<<1>>,[<<252>>,
     [[[<<255>>,<<97>>],[<<248>>,<<1>>]],
      [[<<255>>,<<98>>],[<<248>>,<<2>>]],
      [[<<255>>,<<99>>],[<<248>>,<<3>>]]]]]
E2 = gmser_dyn:encode_typed(T, #{a => 1, c => 3}) ->
    [<<0>>,<<1>>,[<<252>>,
     [[[<<255>>,<<97>>],[<<248>>,<<1>>]],
      [[<<255>>,<<99>>],[<<248>>,<<3>>]]]]]
gmser_dyn:decode_typed(T,E2) ->
    #{c => 3,a => 1}
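The matching rule for optional items can be sketched in a few lines of plain Erlang. This is a toy model, not the library's implementation: it works on already-decoded {Key, Value} pairs, ignores the value types, and the Match name is made up for the example.

```erlang
%% Toy model of matching item specs against decoded {Key, Value} pairs.
%% Value types are ignored; only key matching and 'opt' skipping are shown.
Match = fun M([{K, _T} | Specs], [{K, V} | KVs], Acc) ->
                M(Specs, KVs, Acc#{K => V});          % required key present
            M([{opt, K, _T} | Specs], [{K, V} | KVs], Acc) ->
                M(Specs, KVs, Acc#{K => V});          % optional key present
            M([{opt, _K, _T} | Specs], KVs, Acc) ->
                M(Specs, KVs, Acc);                   % optional key absent: skip spec
            M([], [], Acc) ->
                {ok, Acc};
            M(_, _, _) ->
                error                                 % required key missing or extra data
        end.

Items = [{a, int}, {opt, b, int}, {c, int}].
{ok, #{a := 1, b := 2, c := 3}} = Match(Items, [{a, 1}, {b, 2}, {c, 3}], #{}).
{ok, #{a := 1, c := 3}} = Match(Items, [{a, 1}, {c, 3}], #{}).
error = Match(Items, [{a, 1}, {b, 2}], #{}).
```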

Labels

Labels correspond to (existing) atoms in Erlang. Decoding of a label results in a call to binary_to_existing_atom/2, so will fail if the corresponding atom does not already exist.

This behavior can be modified using the option #{missing_labels => fail | create | convert}, where fail is the default, as described above, convert means that missing atoms are converted to binaries, and create means that the atom is created dynamically.

The option can be passed e.g.:

gmser_dyn:deserialize(Binary, set_opts(#{missing_labels => convert}))

or

gmser_dyn:deserialize(Binary, set_opts(#{missing_labels => convert}, Types))

By calling gmser_dyn:register_types/1, after having added options to the type map, the options can be made to take effect automatically.
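The three missing_labels modes can be pictured with the standard atom conversion BIFs. The ToLabel fun below is a hypothetical helper written for this illustration; it only mirrors the described semantics and does not show how gmser_dyn implements them.

```erlang
%% Hypothetical helper mirroring the fail | create | convert semantics.
ToLabel = fun(Bin, fail) ->
                  %% Raises badarg if the atom does not already exist.
                  binary_to_existing_atom(Bin, utf8);
             (Bin, create) ->
                  %% Creates the atom if needed (atoms are never garbage collected).
                  binary_to_atom(Bin, utf8);
             (Bin, convert) ->
                  %% Falls back to returning the binary unchanged.
                  try binary_to_existing_atom(Bin, utf8)
                  catch error:badarg -> Bin
                  end
          end.

ok = ToLabel(<<"ok">>, fail).
<<"no_such_label_9f3a">> = ToLabel(<<"no_such_label_9f3a">>, convert).
```

The create mode should be used with care on untrusted input, since every previously unseen label permanently adds an entry to the atom table.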

It's possible to cache labels for more compact encoding. Note that when caching labels, the same cache mapping needs to be used on the decoder side.

Labels are encoded as [<<255>>, << AtomToBinary/binary >>]. If a cached label is used, the encoding becomes [<<255>>, [Ix]], where Ix is the integer-encoded index value of the cached label.
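The difference between the two label forms can be sketched as follows. The cache contents and the EncLabel fun are invented for the example, and the byte encoding of Ix (an unsigned big-endian integer) is an assumption rather than something stated above.

```erlang
%% Hypothetical label cache: atom -> index. Not a real gmser_dyn cache.
Cache = #{connect => 1, disconnect => 2}.

%% Cached labels become [<<255>>, [Ix]]; others carry the atom name.
%% Assumption: Ix is encoded as an unsigned big-endian integer.
EncLabel = fun(Atom) ->
                   case maps:find(Atom, Cache) of
                       {ok, Ix} -> [<<255>>, [binary:encode_unsigned(Ix)]];
                       error    -> [<<255>>, atom_to_binary(Atom, utf8)]
                   end
           end.

[<<255>>, [<<1>>]] = EncLabel(connect).
[<<255>>, <<"other">>] = EncLabel(other).
```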

Examples

Dynamically encoded objects have the basic structure [<<0>>,V,Obj], where V is the integer-coded version, and Obj is the top-level encoding of the form [Tag,Data].

E = fun(T) -> io:fwrite("~w~n", [gmser_dyn:encode(T)]) end.

E(17)        -> [<<0>>,<<1>>,[<<248>>,<<17>>]]
E(<<"abc">>) -> [<<0>>,<<1>>,[<<249>>,<<97,98,99>>]]
E(true)      -> [<<0>>,<<1>>,[<<250>>,<<1>>]]
E(false)     -> [<<0>>,<<1>>,[<<250>>,<<0>>]]
E([1,2])     -> [<<0>>,<<1>>,[<<251>>,[[<<248>>,<<1>>],[<<248>>,<<2>>]]]]
E({1,2})     -> [<<0>>,<<1>>,[<<253>>,[[<<248>>,<<1>>],[<<248>>,<<2>>]]]]
E(#{a=>1, b=>2}) ->
  [<<0>>,<<1>>,[<<252>>,[[[<<255>>,<<97>>],[<<248>>,<<1>>]],[[<<255>>,<<98>>],[<<248>>,<<2>>]]]]]
E(gmser_id:create(account,<<1:256>>)) ->
  [<<0>>,<<1>>,[<<254>>,<<1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1>>]]

Note that tuples and lists are encoded the same way, except for the initial type tag. Maps are encoded as [<<252>>, [KV1, KV2, ...]], where [KV1, KV2, ...] is the sorted list of key-value tuples from maps:to_list(Map), but with the tuple type tag omitted.
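To make the structure concrete, here is a toy re-implementation of the tagging scheme in plain Erlang. It is a sketch written for this README, not the gmser_dyn code: it handles only non-negative integers, binaries, booleans, atoms, lists, tuples and maps, and it assumes integers are written as unsigned big-endian bytes and that map items are sorted in standard Erlang term order.

```erlang
%% Toy encoder reproducing the tagged structure shown in the examples.
%% Not the library implementation; ids and negative integers are left out.
Enc = fun E(I) when is_integer(I), I >= 0 -> [<<248>>, binary:encode_unsigned(I)];
          E(B) when is_binary(B) -> [<<249>>, B];
          E(true)  -> [<<250>>, <<1>>];
          E(false) -> [<<250>>, <<0>>];
          E(A) when is_atom(A)  -> [<<255>>, atom_to_binary(A, utf8)];
          E(L) when is_list(L)  -> [<<251>>, [E(X) || X <- L]];
          E(T) when is_tuple(T) -> [<<253>>, [E(X) || X <- tuple_to_list(T)]];
          E(M) when is_map(M)   ->
              %% Key-value pairs are sorted and emitted without a tuple tag.
              [<<252>>, [[E(K), E(V)] || {K, V} <- lists:sort(maps:to_list(M))]]
      end.

%% Top-level wrapper: [<<0>>, Version, Obj] with version 1.
Wrap = fun(Term) -> [<<0>>, <<1>>, Enc(Term)] end.

[<<0>>,<<1>>,[<<248>>,<<17>>]] = Wrap(17).
[<<0>>,<<1>>,[<<251>>,[[<<248>>,<<1>>],[<<248>>,<<2>>]]]] = Wrap([1,2]).
[<<0>>,<<1>>,[<<252>>,[[[<<255>>,<<97>>],[<<248>>,<<1>>]],
                       [[<<255>>,<<98>>],[<<248>>,<<2>>]]]]] = Wrap(#{b => 2, a => 1}).
```

The clauses for true and false come before the generic atom clause, since booleans are atoms in Erlang and would otherwise be encoded as labels.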

Template-driven encoding

Templates can be provided to the encoder by either naming an already registered type, or by passing a template directly. In both cases, the encoder will enforce the type information in the template.

If the template has been registered, the encoder omits inner type tags (still inserting the top-level tag), leading to some compression of the output. This also means that the serialized term cannot be decoded without the same schema information on the decoder side.

In some cases, the type tags will still be emitted. These are when alternative types appear, and for enumerated map types (#{items => ...}). In the latter case, it is due to the support for optional items.

In the case of a directly provided template, all type information is inserted, such that the serialized term can be decoded without any added type information. The template types are still enforced during encoding.

ET = fun(Type,Term) -> io:fwrite("~w~n", [gmser_dyn:encode_typed(Type,Term)]) end.

ET([{int,int}], [{1,2}]) -> [<<0>>,<<1>>,[<<251>>,[[[<<248>>,<<1>>],[<<248>>,<<2>>]]]]]

gmser_dyn:register_type(1000,lt2i,[{int,int}]).
ET(lt2i, [{1,2}]) -> [<<0>>,<<1>>,[<<3,232>>,[[<<1>>,<<2>>]]]]
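The top-level tag <<3,232>> in the registered-type output is the type code 1000 written as an unsigned big-endian integer, which can be checked with the standard binary module:

```erlang
%% 1000 = 3 * 256 + 232, so the registered code occupies two bytes.
<<3,232>> = binary:encode_unsigned(1000).
1000 = binary:decode_unsigned(<<3,232>>).
```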

Alternative types

The dynamic encoder supports three additions to the gmserialization template language: any, #{alt => [AltTypes]} and #{switch => [AltTypes]}.

any

The any type doesn't have an associated code, but enforces dynamic encoding.

alt

The #{alt => [Type]} construct also enforces dynamic encoding, and will try to encode as each type in the list, in the specified order, until one matches.

gmser_dyn:encode_typed(#{alt => [negint,int]}, -5) -> [<<0>>,<<1>>,[<<247>>,<<5>>]]
gmser_dyn:encode_typed(#{alt => [negint,int]}, 5) -> [<<0>>,<<1>>,[<<248>>,<<5>>]]

gmser_dyn:encode_typed(anyint,-5) -> [<<0>>,<<1>>,[<<246>>,[<<247>>,<<5>>]]]
gmser_dyn:encode_typed(anyint,5)  -> [<<0>>,<<1>>,[<<246>>,[<<248>>,<<5>>]]]
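The "try each type in order" behaviour of alt can be modelled with a small recursive fun. This is a simplified sketch covering only negint, int and binary; EncAs and EncAlt are names made up for the example, and the top-level [<<0>>,<<1>>,...] wrapper is left out.

```erlang
%% Toy per-type encoders: return {ok, Encoding} or nomatch.
EncAs = fun(negint, I) when is_integer(I), I < 0 ->
                {ok, [<<247>>, binary:encode_unsigned(-I)]};
           (int, I) when is_integer(I), I >= 0 ->
                {ok, [<<248>>, binary:encode_unsigned(I)]};
           (binary, B) when is_binary(B) ->
                {ok, [<<249>>, B]};
           (_, _) ->
                nomatch
        end.

%% Try the alternatives left to right; the first match wins.
EncAlt = fun Try([T | Ts], V) ->
                 case EncAs(T, V) of
                     {ok, Enc} -> Enc;
                     nomatch   -> Try(Ts, V)
                 end;
             Try([], V) ->
                 error({no_matching_alt, V})
         end.

[<<247>>, <<5>>] = EncAlt([negint, int], -5).
[<<248>>, <<5>>] = EncAlt([negint, int], 5).
```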

switch

The switch type allows for encoding a 'tagged' object, where the tag determines the type.

E1 = gmser_dyn:encode_typed(#{switch => #{name => binary, age => int}}, #{age => 29}) ->
    [<<0>>,<<1>>,[<<252>>,[[[<<255>>,<<97,103,101>>],[<<248>>,<<29>>]]]]]
gmser_dyn:decode_typed(#{switch => #{name => binary, age => int}}, E1) ->
    #{age => 29}
E2 = gmser_dyn:encode_typed(#{switch => #{name => binary, age => int}}, #{name => <<"Ulf">>}) ->
    [<<0>>,<<1>>,[<<252>>,[[[<<255>>,<<110,97,109,101>>],[<<249>>,<<85,108,102>>]]]]]
gmser_dyn:decode_typed(#{switch => #{name => binary, age => int}}, E2) ->
    #{name => <<"Ulf">>}
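Conceptually, switch looks at the single key of the map and picks the value type from it. The following sketch models only that selection step; the Select fun is hypothetical and performs no encoding.

```erlang
%% The switch mapping from the example above: tag -> value type.
Switch = #{name => binary, age => int}.

%% Pick the type based on the single tag in a 1-element map.
Select = fun(Msg) when is_map(Msg), map_size(Msg) =:= 1 ->
                 [{Tag, Value}] = maps:to_list(Msg),
                 case maps:find(Tag, Switch) of
                     {ok, Type} -> {Tag, Type, Value};
                     error      -> error({unknown_tag, Tag})
                 end
         end.

{age, int, 29} = Select(#{age => 29}).
{name, binary, <<"Ulf">>} = Select(#{name => <<"Ulf">>}).
```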

A practical use of switch would be in a protocol schema:

t_msg(_) ->
    #{switch => #{ call         => t_call
                 , reply        => t_reply
                 , notification => t_notification }}.

t_call(_) ->
    #{items => [ {id, anyint}
               , {req, t_req} ]}.

t_reply(_) ->
    #{alt => [#{items => [ {id, anyint}
                         , {result, t_result} ]},
              #{items => [ {id, anyint}
                         , {code, anyint}
                         , {message, binary} ]}
             ]}.

In this scenario, messages are 'tagged' as 1-element maps, e.g.:

async_request(Msg) ->
    Id = erlang:unique_integer(),
    gmmp_cp:to_server(
      whereis(gmmp_core_connector),
      #{call => #{ id => Id
                 , req => Msg }}),
    Id.
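On the receiving side, decoded messages of this shape can be dispatched by pattern matching on the tag. The Dispatch fun and its return values are illustrative only; they follow the t_call and t_reply templates above but are not part of any Gajumaru module.

```erlang
%% Illustrative dispatcher for decoded 1-element message maps.
Dispatch = fun(#{call := #{id := Id, req := Req}}) ->
                   {call, Id, Req};
              (#{reply := #{id := Id, result := Result}}) ->
                   {reply, Id, {ok, Result}};
              (#{reply := #{id := Id, code := Code, message := Msg}}) ->
                   {reply, Id, {error, Code, Msg}};
              (#{notification := Note}) ->
                   {notification, Note}
           end.

{call, 7, ping} = Dispatch(#{call => #{id => 7, req => ping}}).
{reply, 7, {error, 404, <<"not found">>}} =
    Dispatch(#{reply => #{id => 7, code => 404, message => <<"not found">>}}).
```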

Notes

Note that anyint is a standard type. The static serializer supports only non-negative integers (int), as negative numbers are forbidden on-chain. For dynamic encoding, e.g. in messaging protocols, handling negative numbers can be useful, so the negint type was added as a dynamic type. To encode a full-range integer, the alt construct is needed.

(Floats are not supported, as they are non-deterministic. Rationals and fixed-point numbers could easily be handled as high-level types, e.g. as {int,int}.)
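As a small illustration of the {int,int} idea, a fixed-point value can be kept as a {Mantissa, Scale} pair, meaning Mantissa / 10^Scale, and manipulated with integer arithmetic only. The representation and the Add helper are one possible convention chosen for this sketch, not something defined by the library.

```erlang
%% 12.34 represented as {Mantissa, Scale}: 1234 / 10^2.
Price = {1234, 2}.

%% Adding two values with the same scale needs only integer arithmetic.
Add = fun({M1, S}, {M2, S}) -> {M1 + M2, S} end.

{1300, 2} = Add(Price, {66, 2}).
```

Since int covers only non-negative integers, a signed quantity would need anyint for the mantissa.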
