Add some pages

Peter Harpending 2025-03-20 13:46:04 -07:00
parent 008b4b6963
commit c162caabc4
6 changed files with 156 additions and 1 deletions

2
.gitignore vendored Normal file

@@ -0,0 +1,2 @@
*.swp
*.swo

148
API-Encoding.md Normal file

@@ -0,0 +1,148 @@
# API Encoding (`xy_ABCD` strings)
When you are interacting with Gajumaru you often encounter garbage strings like
```
cb_OwQELwGfAKAgjs50MOABi6flmiNru6qg/5U9bjKymkvlywP7RtPnWASG88h7
th_2H8EreT7LNw43jEG9yL7yvSw3AHbG2TTimMgVTYVFwqh6ucSeV
ak_2CNR6NcNj5cFUa28wmVNyptiadtqcsqhG8qoQpPwULfyLiFHD6
```
These are called "API strings" or "API encoding" in official lingo. Suppose you
have a string `xy_ABCD`:
- The `xy` prefix (`cb`, `th`, etc.) indicates what sort of data is contained
in the rest of the string, and whether it is Base64 or Base58. (See [[
BaseN ]]).
Generally, anything that is both fixed-length and likely to be input
manually (e.g. public keys) is going to be Base58, else it will be Base64.
- The `ABCD` part is binary data (plus some check bytes) that is encoded
either in Base64 or Base58.
```erlang
add_check_bytes(Bin) when is_binary(Bin) ->
    % The check bytes are the first 4 bytes of the double SHA-256 of the data.
    <<CheckBytes:4/bytes, _/binary>> = crypto:hash(sha256, crypto:hash(sha256, Bin)),
    <<Bin/binary, CheckBytes/binary>>.
```
When you decode the `ABCD` stuff, it decodes to the binary data suffixed by
the 4 check bytes (i.e. the **output** of the `add_check_bytes/1` function
above).
- Sometimes the binary data is plain data (e.g. account public keys).
Sometimes it's compound data (e.g. a transaction).
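The bullets above can be sketched as a pair of helpers: one that splits the `xy` prefix from the payload, and one that undoes `add_check_bytes/1`. This is a minimal sketch, not code from Gajumaru itself; the module name `api_check` and the functions `split_prefix/1` and `strip_check_bytes/1` are hypothetical. Decoding the payload from Base58 or Base64 happens between the two steps and is left out here.

```erlang
-module(api_check).
-export([add_check_bytes/1, strip_check_bytes/1, split_prefix/1]).

%% Same function as above: append the first 4 bytes of sha256(sha256(Bin)).
add_check_bytes(Bin) when is_binary(Bin) ->
    <<CheckBytes:4/bytes, _/binary>> = crypto:hash(sha256, crypto:hash(sha256, Bin)),
    <<Bin/binary, CheckBytes/binary>>.

%% Hypothetical helper: split "xy_ABCD" into {ok, "xy", "ABCD"}.
%% Splits on the first underscore only.
split_prefix(Str) when is_list(Str) ->
    case string:split(Str, "_") of
        [Prefix, Payload] -> {ok, Prefix, Payload};
        _                 -> {error, no_prefix}
    end.

%% Hypothetical helper: take already-decoded bytes (data followed by the
%% 4 check bytes), recompute the check bytes, and return the data on a match.
strip_check_bytes(Checked) when is_binary(Checked), byte_size(Checked) >= 4 ->
    DataSize = byte_size(Checked) - 4,
    <<Data:DataSize/bytes, _Check:4/bytes>> = Checked,
    case add_check_bytes(Data) of
        Checked -> {ok, Data};
        _       -> {error, bad_checksum}
    end;
strip_check_bytes(_) ->
    {error, too_short}.
```

Returning tagged tuples instead of crashing is a choice made for the sketch; the checksum logic is the same either way.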
## Example: Decoding Public Keys
For instance, account public keys are encoded in Base58 and have the prefix
`ak_`.
To decode one, extract the raw bytes of the public key, and verify the check
bytes, I wrote this:
```erlang
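do(["akd", AkStr]) ->
    % Strip the "ak_" prefix; what remains is Base58.
    "ak_" ++ Base58Shit = AkStr,
    % Decodes to the data bytes followed by the 4 check bytes.
    CheckedBytes = gw_b58:dec(Base58Shit),
    % First 4 bytes of the double SHA-256 of the given bytes.
    ShaSha4 =
        fun(Bytes) ->
            <<CheckBytes:4/bytes, _/binary>> = crypto:hash(sha256, crypto:hash(sha256, Bytes)),
            CheckBytes
        end,
    % Split off the trailing 4 check bytes from the data.
    <<DataBytes:(byte_size(CheckedBytes) - 4)/bytes,
      CheckBytes:4/bytes>> = CheckedBytes,
    io:format("~p~n", [DataBytes]),
    % Recompute the check bytes and compare against the ones we were given.
    case ShaSha4(DataBytes) =:= CheckBytes of
        true  -> io:format("checksum: passed~n", []);
        false -> io:format("checksum: failed~n", [])
    end;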
```
```
[~] % gw akd ak_A3aMregStEULMXyPzNXWEfq1u75yM7BaQ5k8qVhCCpcvCr9Rx
<<20,137,82,130,217,195,19,25,115,137,60,225,221,88,168,194,156,8,88,244,17,30,
121,7,114,180,61,27,194,44,94,166>>
checksum: passed
```
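Going the other way, from raw bytes to an `ak_...` string, is the same steps in reverse: append the check bytes, Base58-encode, and prepend the prefix. The sketch below is self-contained so that it runs without `gw_b58`. It assumes the Bitcoin Base58 alphabet, which is an assumption and not something confirmed here (see [[ BaseN ]]). The module name `ak_sketch` and every function name in it are hypothetical.

```erlang
-module(ak_sketch).
-export([encode_ak/1, decode_ak/1, b58_enc/1, b58_dec/1]).

%% ASSUMPTION: Bitcoin-style Base58 alphabet (no 0, O, I, l).
-define(ALPHABET, "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz").

%% Raw public key bytes -> "ak_..." string.
encode_ak(Pubkey) when is_binary(Pubkey) ->
    "ak_" ++ b58_enc(add_check_bytes(Pubkey)).

%% "ak_..." string -> {ok, Pubkey} | {error, bad_checksum}.
%% Crashes on malformed input (wrong prefix, invalid character, too short).
decode_ak("ak_" ++ Base58) ->
    Checked = b58_dec(Base58),
    DataSize = byte_size(Checked) - 4,
    <<Data:DataSize/bytes, _/binary>> = Checked,
    case add_check_bytes(Data) of
        Checked -> {ok, Data};
        _       -> {error, bad_checksum}
    end.

add_check_bytes(Bin) ->
    <<Check:4/bytes, _/binary>> = crypto:hash(sha256, crypto:hash(sha256, Bin)),
    <<Bin/binary, Check/binary>>.

%% Leading zero bytes become leading '1' characters; the rest is read as one
%% big-endian integer and written out in base 58.
b58_enc(Bin) ->
    Zeros = leading_zeros(Bin, 0),
    lists:duplicate(Zeros, $1) ++ digits(binary:decode_unsigned(Bin), []).

%% Inverse of b58_enc/1.
b58_dec(Str) ->
    {Ones, Rest} = lists:splitwith(fun(C) -> C =:= $1 end, Str),
    N = lists:foldl(fun(C, Acc) -> Acc * 58 + index(C, ?ALPHABET, 0) end, 0, Rest),
    Body = case N of
               0 -> <<>>;
               _ -> binary:encode_unsigned(N)
           end,
    <<(binary:copy(<<0>>, length(Ones)))/binary, Body/binary>>.

leading_zeros(<<0, Rest/binary>>, N) -> leading_zeros(Rest, N + 1);
leading_zeros(_, N) -> N.

digits(0, Acc) -> Acc;
digits(N, Acc) -> digits(N div 58, [lists:nth(N rem 58 + 1, ?ALPHABET) | Acc]).

%% Position of a character in the alphabet; crashes on an invalid character.
index(C, [C | _], I) -> I;
index(C, [_ | T], I) -> index(C, T, I + 1).
```

If the alphabet assumption holds, `ak_sketch:decode_ak/1` applied to the `ak_...` string from the session above should return the same 32 bytes. That is untested against a real node.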
## Compound Data (e.g. transactions)
If it's compound data (i.e. it has fields), then the binary data you get
out of the decode process is going to be encoded in RLP. (See
[[ RLP ]].)
RLP-decode is going to give you a list-of-lists-of-...-of-binaries.
RLP data you get from Gajumaru is going to be of the format
```erlang
[Tag, Version | Fields]
```
The `Tag` and `Version` are going to be binaries that you need to pretend are
integers.
The tag and version tell you what the format of the fields is going to be.
These formats are documented in [[ Serialization ]].
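A minimal sketch of the "pretend they are integers" step, assuming you already have the RLP-decoded list in hand (the RLP decoding itself belongs in [[ RLP ]]). The module and function names are hypothetical, and reading the binaries as big-endian unsigned integers is an assumption.

```erlang
-module(rlp_head).
-export([split/1]).

%% Hypothetical helper: take the RLP-decoded list [Tag, Version | Fields]
%% and return the tag and version as integers alongside the raw fields.
%% ASSUMPTION: tag and version are big-endian unsigned integers.
split([Tag, Version | Fields]) when is_binary(Tag), is_binary(Version) ->
    {binary:decode_unsigned(Tag), binary:decode_unsigned(Version), Fields}.
```

With the integers in hand you can pattern-match on `{Tag, Version}` to pick the right field layout from [[ Serialization ]].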
## The Representation Problem
One conceptual piece of data has 4 different representations that you have to
juggle in your code. I call this **The Representation Problem**. It is the
hardest practical problem to deal with as a developer using Gajumaru.
There is no real consistency or convention about when you use one
representation over another.
Take, for instance, the public key I showed you above. When you encode it as an
`ak_...` string, you use just the 32-byte public key.
But when you are encoding that same key as a field to use in a spend
transaction, you need to encode it as `<<1:8, Pubkey/binary>>`.
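The difference between the two forms is a single leading byte. A minimal sketch, using only the leading `1` byte shown above (any other leading-byte values are not covered here); the module and function names are hypothetical.

```erlang
-module(id_sketch).
-export([account_field/1, split_field/1]).

%% 32-byte public key -> the 33-byte form used as a transaction field.
%% Crashes if the key is not exactly 32 bytes.
account_field(<<Pubkey:32/bytes>>) ->
    <<1:8, Pubkey/binary>>.

%% 33-byte field -> {LeadingByte, Pubkey}.
split_field(<<Leading:8, Pubkey:32/bytes>>) ->
    {Leading, Pubkey}.
```

Matching on the exact sizes means that passing the wrong representation fails immediately instead of silently producing a malformed field.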
Some functions you call from Erlang code take the public key as an argument.
There is no consistency about whether they want the `ak_...` string or if they
want the 32 bytes, etc.
Sometimes the function you call wants some weird data structure. The `1:8`
thing above, for example, corresponds to a convention about what sort of object the public
key points to (because you can spend to a normal account, to a contract, to a
name, etc; the number in the first byte indicates which type this is). There
was some function I called the other day which needed that information, but
instead of sending it the 33-byte augmented public key, it wanted a tuple that
was something like `{{type, account_pubkey}, {key, Bytes}}`.
There is no consistency across different modules about which Erlang atoms are
used. Sometimes it's `account_pubkey`, sometimes it's `account`. It's hell.
In practice you have to just look at the source code of the function you're
calling and see what it expects.
You also can't expect error messages that tell you what mistake you made
and where, particularly when you're interacting with a node over HTTP
(generally, HTTP wants API-encoded versions of things).
The Representation Problem is a (moderate) annoyance to excellent programmers.
The worst thing it can do is nerdsnipe you and foolishly make you think the
problem has some elegant solution and then you spend months trying to solve it
before you realize that there is no solution because the problem itself is
wrong.
This problem absolutely cripples average programmers trying to use Gajumaru.
It's an open question whether or not that's a good thing (average
programmers probably shouldn't be anywhere near code that handles people's
money), but also a question that is out of scope.
There are many layers to the onion. Also there are multiple different onion
schema that sometimes collapse down to the same inner onion and sometimes you
have to traverse down one onion and up another and also the onion hates you and
it's rotten and poisonous.
All I can really do here is give you a field guide to how to deal with this
problem in practice. The best practical solution in general is to quit trying
to make sense of computers and give up on this whole programming thing and
become a beekeeper instead.

0
BaseN.md Normal file

@@ -1,3 +1,8 @@
# Gajumaru Wiki: Home
## Quick Reference
1. [[ API Encoding ]] (`ak_...` garbage)
2. [[ BaseN ]]
3. [[ RLP ]]
4. [[ Serialization ]]

0
RLP.md Normal file

0
Serialization.md Normal file