GM Serialization

Serialization helpers for the Gajumaru.

Build

$ rebar3 compile

Test

$ rebar3 eunit

Dynamic encoding

The module gmser_dyn offers dynamic encoding support, encoding most 'regular' Erlang data types into an internal RLP representation.

Main API:

  • encode(term()) -> iolist()

  • encode_typed(template(), term()) -> iolist()

  • decode(iolist()) -> term()

  • serialize(term()) -> binary()

  • serialize_typed(template(), term()) -> binary()

  • deserialize(binary()) -> term()

The basic types supported by the encoder are:

  • non_neg_integer() (int, code: 248)
  • binary() (binary, code: 249)
  • boolean() (bool, code: 250)
  • list() (list, code: 251)
  • map() (map, code: 252)
  • tuple() (tuple, code: 253)
  • gmser_id:id() (id, code: 254)
  • atom() (label, code: 255)

When encoding map types, the map elements are first sorted.

When specifying a map type for template-driven encoding, use the #{items => [{Key, Value}]} construct.

Labels

Labels correspond to (existing) atoms in Erlang. Decoding a label results in a call to binary_to_existing_atom/2, so decoding fails if the corresponding atom does not already exist on the decoding node.
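The failure mode described above comes from binary_to_existing_atom/2 itself, and can be seen without the library. The following is a minimal sketch: the module name label_atoms and the helper decode_label/1 are hypothetical and are not part of gmser_dyn.

```erlang
-module(label_atoms).
-export([decode_label/1]).

%% Hypothetical helper: turn a label binary into an atom, but only if
%% that atom already exists in the running VM.
decode_label(Bin) when is_binary(Bin) ->
    try binary_to_existing_atom(Bin, utf8) of
        Atom -> {ok, Atom}
    catch
        %% binary_to_existing_atom/2 raises badarg for unknown atoms.
        error:badarg -> {error, unknown_atom}
    end.
```

A decoder built this way never creates new atoms from untrusted input, which is presumably why the library relies on the same BIF.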

It's possible to cache labels for more compact encoding. Note that when caching labels, the same cache mapping needs to be used on the decoder side.

Labels are encoded as [<<255>>, << AtomToBinary/binary >>]. If a cached label is used, the encoding becomes [<<255>>, [Ix]], where Ix is the integer-encoded index value of the cached label.
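The two label forms can be illustrated with a toy model. This is a sketch only, not the gmser_dyn implementation: the module name label_sketch is hypothetical, the cache is assumed to be a plain map from atom to index, and the index is assumed to be encoded with binary:encode_unsigned/1.

```erlang
-module(label_sketch).
-export([encode_label/2]).

%% Cache is assumed to be a map such as #{account => 1, contract => 2}.
encode_label(Atom, Cache) when is_atom(Atom), is_map(Cache) ->
    case maps:find(Atom, Cache) of
        {ok, Ix} ->
            %% Cached label: the index, wrapped in a list.
            [<<255>>, [binary:encode_unsigned(Ix)]];
        error ->
            %% Uncached label: the atom name as a binary.
            [<<255>>, atom_to_binary(Atom, utf8)]
    end.
```

Because the cached form carries only an index, a decoder with a different cache mapping would resolve it to a different atom, or to none at all.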

Examples

Dynamically encoded objects have the basic structure [<<0>>,V,Obj], where V is the integer-coded version, and Obj is the top-level encoding of the form [Tag,Data].

E = fun(T) -> io:fwrite("~w~n", [gmser_dyn:encode(T)]) end.

E(17)        -> [<<0>>,<<1>>,[<<248>>,<<17>>]]
E(<<"abc">>) -> [<<0>>,<<1>>,[<<249>>,<<97,98,99>>]]
E(true)      -> [<<0>>,<<1>>,[<<250>>,<<1>>]]
E(false)     -> [<<0>>,<<1>>,[<<250>>,<<0>>]]
E([1,2])     -> [<<0>>,<<1>>,[<<251>>,[[<<248>>,<<1>>],[<<248>>,<<2>>]]]]
E({1,2})     -> [<<0>>,<<1>>,[<<253>>,[[<<248>>,<<1>>],[<<248>>,<<2>>]]]]
E(#{a=>1, b=>2}) ->
  [<<0>>,<<1>>,[<<252>>,[[[<<255>>,<<97>>],[<<248>>,<<1>>]],[[<<255>>,<<98>>],[<<248>>,<<2>>]]]]]
E(gmser_id:create(account,<<1:256>>)) ->
  [<<0>>,<<1>>,[<<254>>,<<1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1>>]]

Note that tuples and lists are encoded the same way, except for the initial type tag. Maps are encoded as [<<252>>, [KV1, KV2, ...]], where [KV1, KV2, ...] is the sorted list of key-value tuples from maps:to_list(Map), but with the tuple type tag omitted.
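The encodings listed above can be reproduced with a small toy model, which may help when reading encoded output. It is a sketch, not the gmser_dyn implementation: the module name dyn_sketch is hypothetical, the version is fixed at <<1>> as in the examples, integers are assumed to be encoded with binary:encode_unsigned/1, map entries are assumed to be ordered with lists:sort/1, and ids (tag 254) are left out.

```erlang
-module(dyn_sketch).
-export([encode/1]).

%% Top-level structure: [<<0>>, Version, Obj].
encode(Term) ->
    [<<0>>, <<1>>, enc(Term)].

%% Each value is encoded as [Tag, Data].
enc(I) when is_integer(I), I >= 0 ->
    [<<248>>, binary:encode_unsigned(I)];
enc(B) when is_binary(B) ->
    [<<249>>, B];
enc(true) ->
    [<<250>>, <<1>>];
enc(false) ->
    [<<250>>, <<0>>];
enc(L) when is_list(L) ->
    [<<251>>, [enc(E) || E <- L]];
enc(M) when is_map(M) ->
    %% Sorted key-value pairs, without a tuple tag around each pair.
    [<<252>>, [[enc(K), enc(V)] || {K, V} <- lists:sort(maps:to_list(M))]];
enc(T) when is_tuple(T) ->
    [<<253>>, [enc(E) || E <- tuple_to_list(T)]];
enc(A) when is_atom(A) ->
    %% Uncached label form only.
    [<<255>>, atom_to_binary(A, utf8)].
```

The boolean clauses come before the atom clause so that true and false are tagged as bool rather than label.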

Template-driven encoding

Templates can be provided to the encoder either by naming an already registered type or by passing a template directly. The template is then enforced and used to make the encoding slightly more compact.

In the following example, as the encoder knows that {11,12} is encoded as a tuple of two integers, it can omit the inner type tags.

ET = fun(Type,Term) -> io:fwrite("~w~n", [gmser_dyn:encode_typed(Type,Term)]) end.

ET({int,int}, {11,12}) -> [<<0>>,<<1>>,[<<253>>,[<<11>>,<<12>>]]]
ET({int,int}, {11,a}) ->
** exception error: {illegal,int,a} ...
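The effect of a template on the encoding can be modelled in a few lines. The sketch below covers only tuple templates whose elements are all int, which is enough to reproduce the two calls above. It is not the gmser_dyn implementation: the module name typed_sketch is hypothetical, and integers are assumed to be encoded with binary:encode_unsigned/1.

```erlang
-module(typed_sketch).
-export([encode_typed/2]).

encode_typed(Template, Term) ->
    [<<0>>, <<1>>, top(Template, Term)].

%% The outer tuple tag is kept; element tags are implied by the template.
top(Types, Term) when is_tuple(Types), is_tuple(Term),
                      tuple_size(Types) =:= tuple_size(Term) ->
    [<<253>>, lists:zipwith(fun inner/2,
                            tuple_to_list(Types),
                            tuple_to_list(Term))];
top(Type, Term) ->
    error({illegal, Type, Term}).

%% Only the int type is modelled here; anything else is rejected.
inner(int, I) when is_integer(I), I >= 0 ->
    binary:encode_unsigned(I);
inner(Type, Term) ->
    error({illegal, Type, Term}).
```

Compared with the untyped tuple encoding, each element drops its [<<248>>, ...] wrapper, which is where the size saving comes from.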