Peter Harpending b1f3e0de89 update readme
2025-09-23 16:09:32 -07:00

gex = gajumaru exchange

Currently the repository contains only one component: the Gajumaru HTTP Daemon (gex_httpd).

How to run gex_httpd

Last updated: September 23, 2025 (PRH).

Install Erlang and zx/zomp

Source: Building Erlang 26.2.5 on Ubuntu 24.04

Adapt this to your Linux distribution.

  1. Install necessary build tools

    sudo apt update
    sudo apt upgrade
    sudo apt install \
        gcc curl g++ dpkg-dev build-essential automake autoconf \
        libncurses-dev libssl-dev flex xsltproc libwxgtk3.2-dev \
        wget vim git
    
  2. Put Kerl somewhere in your $PATH. This is a tool to build Erlang releases.

    ## make sure the target directory exists first
    mkdir -p ~/bin
    wget -O ~/bin/kerl https://raw.githubusercontent.com/kerl/kerl/master/kerl
    chmod u+x ~/bin/kerl
    
  3. Build Erlang from source using Kerl

    kerl update releases
    ## use the most recent one that looks stable
    ## you do need to type the number twice, that's not a typo
    kerl build 28.1 28.1
    kerl install 28.1 ~/.erts/28.1
    
  4. Put Erlang in your $PATH

    Update .bashrc or .zshrc or whatever with the following line:

    . $HOME/.erts/28.1/activate
    
  5. Install zx

    wget -q https://zxq9.com/projects/zomp/get_zx && bash get_zx
    
  6. Test zx works

    zx installs itself to ~/bin, so make sure that's in your $PATH.

    zx run erltris
    

Notes

Convention: [brackets] for jargon

  • You know how sometimes people will intermix technical jargon which has a very specific context-local definition with common parlance?
  • Our notational convention is to put [jargon terms] in square brackets to warn the reader that the word is meant in some extremely precise technical sense, and doesn't necessarily mean what the dictionary says it means.
  • Specifically, [supervisor] is a jargon term that is standard in Erlang.
  • Do not confuse [supervisor] with [manager]. [Manager] is AFAIK Craig-specific nomenclature.
  • A [supervisor] is (roughly) a process that is in charge of a bunch of child processes. It is responsible for restarting processes that crash. (Yes Craig I know it's more nuanced than that).
  • The other common pattern is a [gen_server].
  • If all you take away from this document is that erlang has things called [supervisor]s and things called [gen_server]s, consider that a good day.

Big Picture: telnet chat server -> HTTP server

  • The default project (see initial commit) is a telnet echo server. It's about the most bare-bones chat server imaginable.

    screenshot of gh_httpd running

    Lefty and middley can chat just like normal.

    However, righty (curl) foolishly thinks he is talking to an HTTP server. His request is echoed to lefty and middley.

    Curl failed because, instead of a valid HTTP response, he got back something like

    Message from YOU: GET / HTTP/1.1
    
  • We make this into an HTTP server by replacing the "echo my message to everyone else" logic with "parse this message as an HTTP request and send back an HTTP response" logic.

  • Our "application logic" or "business logic" or whatever is contained in that process of how the request is mapped to a response.

  • It really is not more complicated than that.

Basics of Erlang Processes

These are heuristics that are good starting points

  • each module ~= 1 process

  • it helps to think of erlang as an operating system, and erlang modules as shell scripts that run in that operating system.

  • some modules correspond to fungible processes, some are non-fungible

  • in Observer (observer:start())

    • named processes are non-fungible (e.g. gh_client_sup)
    • the name can be anything, but conventionally it's the module name
    • fungible processes only have numbers (PIDs) (e.g. the gh_client processes, which are the Erlang processes on the other end of the conversation with the telnet windows)
    • named processes also have PIDs, they just also have names

  • get in the habit, any time you read code, of asking which process context the code is running in.

  • it is NOT the case that all code in module foo runs inside the context of process foo. It is very important that you make sure you understand that distinction, and always know where code is running.
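
To make the module-versus-process distinction concrete, here is a minimal sketch. The module name ctx_demo and its functions are made up for illustration and are not part of gex_httpd. The same function, whoami/0, is run once directly and once inside a spawned process, and it reports a different pid each time.

```erlang
-module(ctx_demo).
-export([whoami/0, run/0]).

%% Returns the pid of whichever process happens to call it.
whoami() ->
    self().

run() ->
    Caller = self(),
    %% Direct call: whoami/0 runs in *our* process.
    Direct = whoami(),
    %% Same function, same module, but executed inside a fresh process.
    _ = spawn(fun() -> Caller ! {spawned, whoami()} end),
    Spawned = receive {spawned, Pid} -> Pid after 1000 -> timeout end,
    {Direct =:= Caller, Spawned =:= Caller}.
```

Calling ctx_demo:run() returns {true, false}: the code lives in one module, but where it runs depends entirely on who calls it.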

Following the call chain of gex_httpd:listen(8080)

  • Reference commit: 49a09d192c6f2380c5186ec7d81e98785d667214

  • By default, the telnet server doesn't occupy a port

  • gex_httpd:listen(8080) tells it to listen on port 8080

    %% gex_httpd.erl
    -spec listen(PortNum) -> Result
        when PortNum :: inet:port_number(),
             Result  :: ok
                      | {error, {listening, inet:port_number()}}.
    %% @doc
    %% Make the server start listening on a port.
    %% Returns an {error, Reason} tuple if it is already listening.
    
    listen(PortNum) ->
        gh_client_man:listen(PortNum).
    
    
    %% gh_client_man:listen(8080)
    listen(PortNum) ->
        gen_server:call(?MODULE, {listen, PortNum}).
    
  • So this is a bit tricky.

  • The code inside that function runs in the context of the gex_httpd process (or whatever calling process)

    • the effect of that code is to send a message to the gh_client_man process (= ?MODULE)
    • that message is {listen, 8080}
    • in general, it's gen_server:call(PID, Message)
    • every process has a "mailbox" of messages. the process usually just sits there doing nothing until it gets a message, and then does something deterministically in response to the message.
    • gen_server, [supervisor], etc. are all standard-library abstractions that factor out common patterns of process behavior
    • The low-level primitive to receive messages is receive. We'll see its use later when we look at the gh_client code.
    • See the pingpong example for a simplified illustration.
    • All of this gen_server nonsense is a bunch of boilerplate that rewrites to a bunch of receives
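
As a rough illustration of what sits underneath gen_server:call, here is a hand-rolled request/reply using nothing but send (!) and receive. This is a sketch written for this walkthrough, not the pingpong example referenced above; the module name pingpong_sketch and its message shapes are assumptions.

```erlang
-module(pingpong_sketch).
-export([start/0, ping/1]).

%% Spawn a process that sits in a receive loop, waiting on its mailbox.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {ping, From, Ref} ->
            %% Reply to whoever asked, tagging the answer with their reference.
            From ! {pong, Ref},
            loop();
        stop ->
            ok
    end.

%% A hand-rolled "call": send a message, then block until the reply arrives.
%% Runs in the context of the calling process, not the loop process.
ping(Pid) ->
    Ref = make_ref(),
    Pid ! {ping, self(), Ref},
    receive
        {pong, Ref} -> pong
    after 1000 ->
        timeout
    end.
```

The unique reference lets the caller pick out the one reply that belongs to this request, which is the same trick the From :: {pid(), reference()} argument of handle_call/3 hints at.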

Very Important: casts v. calls

  • Sometimes you will also see gen_server:cast(PID, Message).

    Make sure you understand the difference between a call and a cast.

  • So in our example

    • gex_httpd makes a call to gh_client_man
    • gex_httpd sends the message {listen, 8080} to gh_client_man
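    • he then sits there and waits until gh_client_man sends him a message back. This is what makes a call a call. If gh_client_man never responds, gex_httpd stays blocked: with gen_server:call/2 he gives up after the default timeout of 5 seconds and exits with a timeout error, and with an explicit timeout of infinity he waits forever and never moves on with his life.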
  • in a cast, the message is sent and you move on with your day

  • think of calls like actual phone calls, where the other person has to answer. If they don't, there's no voicemail: you're stuck listening to the phone ring. By default gen_server:call/2 hangs up after 5 seconds and exits with a timeout; pass infinity as the timeout to gen_server:call/3 and the phone rings forever, like Sisyphus.

  • casts are like text messages. You send them. maybe you get a text back. who knows.
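
Here is a minimal sketch of a gen_server that offers one cast and one call, to show the difference side by side. The module counter_sketch and its functions are hypothetical and exist only for this illustration; only the gen_server functions themselves are standard.

```erlang
-module(counter_sketch).
-behaviour(gen_server).
-export([start_link/0, bump/0, read/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, none, []).

%% Cast: returns ok immediately; nobody waits for an answer.
bump() ->
    gen_server:cast(?MODULE, bump).

%% Call: blocks until the server replies (default timeout: 5000 ms).
read() ->
    gen_server:call(?MODULE, read).

%% The callbacks below run in the server process, not in the caller.
init(none) ->
    {ok, 0}.

handle_call(read, _From, Count) ->
    {reply, Count, Count}.

handle_cast(bump, Count) ->
    {noreply, Count + 1}.
```

After counter_sketch:start_link(), two calls to counter_sketch:bump() each return ok right away, and a following counter_sketch:read() blocks briefly and returns 2. Messages from one process to another arrive in the order they were sent, so both casts are handled before the call.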

Continuing

  • Inside of gh_client_man's own process context, it listens for calls in the handle_call function

    %% gh_client_man.erl
    
    -spec handle_call(Message, From, State) -> Result
        when Message  :: term(),
             From     :: {pid(), reference()},
             State    :: state(),
             Result   :: {reply, Response, NewState}
                       | {noreply, State},
             Response :: ok
                       | {error, {listening, inet:port_number()}},
             NewState :: state().
    %% @private
    %% The gen_server:handle_call/3 callback.
    %% See: http://erlang.org/doc/man/gen_server.html#Module:handle_call-3
    
    handle_call({listen, PortNum}, _, State) ->
        {Response, NewState} = do_listen(PortNum, State),
        {reply, Response, NewState};
    handle_call(Unexpected, From, State) ->
        ok = io:format("~p Unexpected call from ~tp: ~tp~n", [self(), From, Unexpected]),
        {noreply, State}.
    
  • following the call chain, we look at gh_client_man:do_listen/2

    This is running inside the gh_client_man process context

    %% gh_client_man.erl
    
    -spec do_listen(PortNum, State) -> {Result, NewState}
        when PortNum  :: inet:port_number(),
             State    :: state(),
             Result   :: ok
                       | {error, Reason :: {listening, inet:port_number()}},
             NewState :: state().
    %% @private
    %% The "doer" procedure called when a "listen" message is received.
    
    do_listen(PortNum, State = #s{port_num = none}) ->
        SocketOptions =
            [inet6,
             {packet,    line},
             {active,    once},
             {mode,      binary},
             {keepalive, true},
             {reuseaddr, true}],
        {ok, Listener} = gen_tcp:listen(PortNum, SocketOptions),
        {ok, _} = gh_client:start(Listener),
        {ok, State#s{port_num = PortNum, listener = Listener}};
    do_listen(_, State = #s{port_num = PortNum}) ->
        ok = io:format("~p Already listening on ~p~n", [self(), PortNum]),
        {{error, {listening, PortNum}}, State}.
    
    • If we're already listening (i.e. our state already has a port), we refuse: the calling process gets back {error, {listening, PortNum}} and the state is left unchanged.

    • If we don't have a port number, we

      • make a TCP listen socket on that port
      • start a gh_client process which spawns an acceptor socket on the listen socket (kind of a "subsocket")
      • the gh_client process is the erlang process that talks to either the telnet chat clients, or eventually web browsers.
      • send ok back to whomever called us
  • Next let's look at how clients are started up

  • gh_client is called gh_client because from the perspective of our HTTP daemon, that is a client. gh_client is the representation of clients within the context of our HTTP server.

  • analogously, to disambiguate directionality re "encode"/"decode", usually the directionality is from the perspective of the program. This can be counterintuitive, because the program's perspective is usually the opposite of a human's; e.g. Binary data is clear to a program, but its representation as plain text is opaque.

  • In Erlang you always need to think about perspective

  • The last call in the call chain was gh_client:start(Listener). We expect it to return {ok, PID}, where PID is the process that talks to clients.

%% gh_client.erl

-spec start(ListenSocket) -> Result
    when ListenSocket :: gen_tcp:socket(),
         Result       :: {ok, pid()}
                       | {error, Reason},
         Reason       :: {already_started, pid()}
                       | {shutdown, term()}
                       | term().
%% @private
%% How the gh_client_man or a prior gh_client kicks things off.
%% This is called in the context of gh_client_man or the prior gh_client.

start(ListenSocket) ->
    gh_client_sup:start_acceptor(ListenSocket).



  • gh_client_sup is the [supervisor] in charge of the client processes. (Note that the child spec we'll see below marks them temporary, so a client that crashes is not restarted.)

  • he is tasked at this moment with starting one. Let's see how that goes

    %% gh_client_sup.erl
    
    -spec start_acceptor(ListenSocket) -> Result
        when ListenSocket :: gen_tcp:socket(),
             Result       :: {ok, pid()}
                           | {error, Reason},
             Reason       :: {already_started, pid()}
                           | {shutdown, term()}
                           | term().
    %% @private
    %% Spawns the first listener at the request of the gh_client_man when
    %% gex_httpd:listen/1 is called, or the next listener at the request of the
    %% currently listening gh_client when a connection is made.
    %%
    %% Error conditions, supervision strategies and other important issues are
    %% explained in the supervisor module docs:
    %% http://erlang.org/doc/man/supervisor.html
    
    start_acceptor(ListenSocket) ->
        supervisor:start_child(?MODULE, [ListenSocket]).
    
  • If we look in the configuration for gh_client_sup, we see this:

    -spec init(none) -> {ok, {supervisor:sup_flags(), [supervisor:child_spec()]}}.
    %% @private
    %% The OTP init/1 function.
    
    init(none) ->
        RestartStrategy = {simple_one_for_one, 1, 60},
        Client    = {gh_client,
                     {gh_client, start_link, []},
                     temporary,
                     brutal_kill,
                     worker,
                     [gh_client]},
        {ok, {RestartStrategy, [Client]}}.
    
  • my eyes are drawn to {gh_client, start_link, []}

  • probably that's what's called to spawn one of the worker processes

  • let's look

    %% gh_client.erl
    
    -spec start_link(ListenSocket) -> Result
        when ListenSocket :: gen_tcp:socket(),
             Result       :: {ok, pid()}
                           | {error, Reason},
             Reason       :: {already_started, pid()}
                           | {shutdown, term()}
                           | term().
    %% @private
    %% This is called by the gh_client_sup. While start/1 is called to initiate a startup
    %% (essentially requesting a new worker be started by the supervisor), this is
    %% actually called in the context of the supervisor.
    
    start_link(ListenSocket) ->
        proc_lib:start_link(?MODULE, init, [self(), ListenSocket]).
    
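  • Any time you see a 3-tuple of {Module, FunctionName, ArgumentList}, that's probably information about how to call some function.

  • In the supervisor's child spec, {gh_client, start_link, []} combined with the [ListenSocket] passed to supervisor:start_child/2 means "to start a gh_client, call gh_client:start_link(ListenSocket)". In turn, the arguments to proc_lib:start_link/3 say "spawn a new process and have it run gh_client:init(SupervisorPID, ListenSocket)".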

    Let's take a look

    %% gh_client.erl
    
    -spec init(Parent, ListenSocket) -> no_return()
        when Parent       :: pid(),
             ListenSocket :: gen_tcp:socket().
    %% @private
    %% This is the first code executed in the context of the new worker itself.
    %% This function does not have any return value, as the startup return is
    %% passed back to the supervisor by calling proc_lib:init_ack/2.
    %% We see the initial form of the typical arity-3 service loop form here in the
    %% call to listen/3.
    
    init(Parent, ListenSocket) ->
        ok = io:format("~p Listening.~n", [self()]),
        Debug = sys:debug_options([]),
        ok = proc_lib:init_ack(Parent, {ok, self()}),
        listen(Parent, Debug, ListenSocket).
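
The start_link/init pair above follows the proc_lib handshake. As a stripped-down sketch of just that handshake (the module ack_sketch is made up for illustration and leaves out sockets, sys debugging and supervision):

```erlang
-module(ack_sketch).
-export([start_link/0, init/1]).

%% Runs in the caller's process. proc_lib:start_link/3 spawns a new linked
%% process running ?MODULE:init(self()) and blocks until that process acks.
start_link() ->
    proc_lib:start_link(?MODULE, init, [self()]).

%% First code executed inside the new process.
init(Parent) ->
    %% Whatever is passed here becomes the return value of start_link/0.
    ok = proc_lib:init_ack(Parent, {ok, self()}),
    loop().

loop() ->
    receive
        stop -> ok;
        _    -> loop()
    end.
```

ack_sketch:start_link() returns {ok, Pid}, where Pid is the new process rather than the caller. That is how the supervisor learns the pid of the worker it just started.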
    
  • Ok let's look at the listen/3 function

    -spec listen(Parent, Debug, ListenSocket) -> no_return()
        when Parent       :: pid(),
             Debug        :: [sys:dbg_opt()],
             ListenSocket :: gen_tcp:socket().
    %% @private
    %% This function waits for a TCP connection. The owner of the socket is still
    %% the gh_client_man (so it can still close it on a call to gh_client_man:ignore/0),
    %% but the only one calling gen_tcp:accept/1 on it is this process. Closing the socket
    %% is one way a manager process can gracefully unblock child workers that are blocking
    %% on a network accept.
    %%
    %% Once it makes a TCP connection it will call start/1 to spawn its successor.
    
    listen(Parent, Debug, ListenSocket) ->
        case gen_tcp:accept(ListenSocket) of
            {ok, Socket} ->
                {ok, _} = start(ListenSocket),
                {ok, Peer} = inet:peername(Socket),
                ok = io:format("~p Connection accepted from: ~p~n", [self(), Peer]),
                ok = gh_client_man:enroll(),
                State = #s{socket = Socket},
                loop(Parent, Debug, State);
            {error, closed} ->
                ok = io:format("~p Retiring: Listen socket closed.~n", [self()]),
                exit(normal)
        end.
    
  • The lines that jump out to me are

                ok = gh_client_man:enroll(),
                State = #s{socket = Socket},
                loop(Parent, Debug, State);
    
  • The gh_client_man module is responsible for keeping track of all the running clients. So probably gh_client_man:enroll() is just informing gh_client_man that this gh_client instance exists.

    If we look, that's precisely what's happening

    %% gh_client_man.erl
    %% remember, enroll/0 is running in the context of the calling code, and
    %% do_enroll/2 is running in the context of the gh_client_man process
    
    -spec enroll() -> ok.
    %% @doc
    %% Clients register here when they establish a connection.
    %% Other processes can enroll as well.
    
    enroll() ->
        gen_server:cast(?MODULE, {enroll, self()}).
    
    %% ...
    -spec do_enroll(Pid, State) -> NewState
        when Pid      :: pid(),
             State    :: state(),
             NewState :: state().
    
    do_enroll(Pid, State = #s{clients = Clients}) ->
        case lists:member(Pid, Clients) of
            false ->
                Mon = monitor(process, Pid),
                ok = io:format("Monitoring ~tp @ ~tp~n", [Pid, Mon]),
                State#s{clients = [Pid | Clients]};
            true ->
                State
        end.
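
do_enroll/2 sets up a monitor on each client. For readers who haven't seen monitors, this small sketch (hypothetical module monitor_sketch, unrelated to the gex_httpd code) shows what the monitoring process receives when the monitored process exits. Presumably gh_client_man handles a message of this shape somewhere not shown here; that is an assumption.

```erlang
-module(monitor_sketch).
-export([run/0]).

run() ->
    %% A throwaway process that exits normally once told to stop.
    Pid = spawn(fun() -> receive stop -> ok end end),
    Mon = monitor(process, Pid),
    Pid ! stop,
    %% When the monitored process dies, a 'DOWN' message lands in our mailbox.
    receive
        {'DOWN', Mon, process, Pid, Reason} -> {down, Reason}
    after 1000 ->
        timeout
    end.
```

monitor_sketch:run() returns {down, normal}. Unlike a link, a monitor is one-directional: the watcher is informed, but is not taken down along with the watched process.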
    
  • Next line is loop(Parent, Debug, State). Let's look at gh_client:loop/3

    -spec loop(Parent, Debug, State) -> no_return()
        when Parent :: pid(),
             Debug  :: [sys:dbg_opt()],
             State  :: state().
    %% @private
    %% The service loop itself. This is the service state. The process blocks on receive
    %% of Erlang messages, TCP segments being received themselves as Erlang messages.
    
    loop(Parent, Debug, State = #s{socket = Socket}) ->
        ok = inet:setopts(Socket, [{active, once}]),
        receive
            {tcp, Socket, <<"bye\r\n">>} ->
                ok = io:format("~p Client saying goodbye. Bye!~n", [self()]),
                ok = gen_tcp:send(Socket, "Bye!\r\n"),
                ok = gen_tcp:shutdown(Socket, read_write),
                exit(normal);
            {tcp, Socket, Message} ->
                ok = io:format("~p received: ~tp~n", [self(), Message]),
                ok = gh_client_man:echo(Message),
                loop(Parent, Debug, State);
            {relay, Sender, Message} when Sender == self() ->
                ok = gen_tcp:send(Socket, ["Message from YOU: ", Message]),
                loop(Parent, Debug, State);
            {relay, Sender, Message} ->
                From = io_lib:format("Message from ~tp: ", [Sender]),
                ok = gen_tcp:send(Socket, [From, Message]),
                loop(Parent, Debug, State);
            {tcp_closed, Socket} ->
                ok = io:format("~p Socket closed, retiring.~n", [self()]),
                exit(normal);
            {system, From, Request} ->
                sys:handle_system_msg(Request, From, Parent, ?MODULE, Debug, State);
            Unexpected ->
                ok = io:format("~p Unexpected message: ~tp", [self(), Unexpected]),
                loop(Parent, Debug, State)
        end.
    
  • I'll let you figure this one out

  • I think the picture is clear. There's a lot of moving parts, but the basic principle is as follows:

    • gh_client instances are like infinity-spawn receptionists. Every time a web browser wants to talk to our server, we spawn a gh_client instance that talks to the web browser.
    • gh_client_man is responsible for any logic that spans across different gh_client instances (e.g. relaying messages).
    • Everything else is boilerplate
  • So our task is to remove the relay-messages logic, and replace it with http parse/respond logic.
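
As a starting point for that replacement, here is a rough sketch of "parse a request line, produce a response". It is not the project's implementation: the module http_sketch and its functions are hypothetical, headers and request bodies are ignored, and a real handler would also have to read the remaining header lines before answering. The only library call doing real work is erlang:decode_packet/3 with the http_bin packet type.

```erlang
-module(http_sketch).
-export([respond/1]).

%% Turn one request line (as delivered by a {packet, line} socket in binary
%% mode) into a complete HTTP/1.1 response, returned as an iolist.
respond(Line) ->
    case erlang:decode_packet(http_bin, Line, []) of
        {ok, {http_request, 'GET', {abs_path, Path}, _Version}, _Rest} ->
            reply(200, <<"OK">>, <<"You asked for ", Path/binary, "\n">>);
        _ ->
            %% Anything that is not a well-formed GET line is rejected.
            reply(400, <<"Bad Request">>, <<"Bad request\n">>)
    end.

%% Build status line, minimal headers and body.
reply(Code, Reason, Body) ->
    [<<"HTTP/1.1 ">>, integer_to_binary(Code), <<" ">>, Reason, <<"\r\n">>,
     <<"Content-Type: text/plain\r\n">>,
     <<"Content-Length: ">>, integer_to_binary(byte_size(Body)), <<"\r\n">>,
     <<"Connection: close\r\n\r\n">>,
     Body].
```

In gh_client:loop/3, the {tcp, Socket, Message} clause could then pass the result of http_sketch:respond(Message) to gen_tcp:send/2 instead of calling gh_client_man:echo(Message). gen_tcp:send/2 accepts iolists, so no flattening is needed.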
