- zomp nodes must report the realms they provide to the upstream nodes to which they connect.
  On redirect, a list of hosts of the form [{zx:host(), [zx:realm()]}] must be provided to the downstream client.
  This is the only way a downstream client can determine which redirect hosts are useful to it.

- Change zx_daemon request references to counters. Count everything.
  This will be the only way to track stats other than log analysis. Log analysis sucks.

- Double-indexing must happen everywhere so that anything can be discovered without traversing the entire state of zx_daemon whenever a client or conn crashes.

- zx_daemon request() types have been changed to flat, wide tuples as in the main zx module.
  This change is not yet reflected in the using code.

- Write a logging process. Pick a log rotation scheme.
  Eventually make it able to shed log load in the event it gets out of hand.

- Create zx_daemon primes. See below.


New Feature: ZX Universal lock

Cross-instance communication

We don't want multiple zx_daemons doing funny things to the filesystem at the same time.
We DO want to be able to run multiple ZX programs at the same time.
The solution is to guarantee that there is only ever one zx_daemon acting as primary at a given time.
IPC is messy to get straight across various host systems, but network sockets to localhost work in a normal way.

On startup a zx_daemon will check for a lock file in the $ZOMP_HOME directory.
If there is no lock file present it will write a lock file with a timestamp, begin listening on a local port, and then update the lock file with the listening port number.
If it finds a lock file in $ZOMP_HOME it will attempt to connect to the running zx_daemon on the port indicated in the lock file.
If only a timestamp exists with no port number then it will wait until two seconds past the time indicated in the timestamp before re-reading the lock file. If the second read still contains no port number it will remove the lock file and assume the primary role for the system itself.
If a connection cannot be established then it will assume the instance that wrote the lock file failed to delete it for Bad Reasons, remove the lock file, open a port, and write its own lock file, taking over as the lead zx_daemon.

Any new zx_daemons that come up in the system will establish a connection to the zx_daemon that wrote the lock file, and proxy all requests through that primary zx_daemon.

When a zx_daemon that is acting as primary for the system retires it will complete all ongoing actions first, then begin queueing requests without acting on them, designate the oldest peer as the new leader, get confirmation that the new leader has a port open, update the original lock file with the new port number, redirect all connections to the new zx_daemon, and then retire.

    init(_Args) ->
        Stuff = do_stuff(),
        Tries = 3,
        Path = lock_file(),
        case check_for_leader(Tries, Path) of
            no_leader       -> start_as_leader(Stuff);          % placeholder, not yet written
            {found, Socket} -> start_as_proxy(Stuff, Socket)    % placeholder, not yet written
        end.

    % Returns no_leader or {found, Socket}.
    check_for_leader(0, _) ->
        no_leader;
    check_for_leader(Tries, Path) ->
        % An exclusive open only succeeds if the lock file does not exist yet.
        case file:open(Path, [write, exclusive]) of
            {ok, FD}        -> become_leader(FD);    % must also return no_leader
            {error, eexist} -> contact_leader(Tries, Path)
        end.

    contact_leader(Tries, Path) ->
        case file:consult(Path) of
            {ok, Data} ->
                connect_to_leader(Data);    % placeholder, not yet written
            {error, enoent} ->
                % The lock file vanished between the open and the read: try again.
                check_for_leader(Tries - 1, Path)
        end.
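
The lock file rules described above (connect if a port is present, wait out the two-second grace period if only a timestamp is present, otherwise take over) could be captured in one pure function that is easy to test apart from the filesystem and the network. The following is only a sketch under assumptions that are not in these notes: the module name zx_lock_sketch is hypothetical, and the lock file is assumed to hold file:consult/1-readable terms of the form {timestamp, Seconds}. and {port, PortNumber}., which is not a settled format.

```erlang
-module(zx_lock_sketch).
-export([classify/2]).

%% Grace period from the notes: a lock file with no port number is given
%% two seconds, measured from its timestamp, before it is considered dead.
-define(GRACE_SECONDS, 2).

%% Decide what a starting zx_daemon should do with the contents of the
%% lock file.
%%
%% Terms is the list returned by file:consult/1 (ASSUMED format:
%% [{timestamp, Seconds}] or [{timestamp, Seconds}, {port, Port}]).
%% Now is the current time in seconds, taken from the same clock that
%% wrote the timestamp.
-spec classify(Terms, Now) -> Action
    when Terms  :: [{atom(), term()}],
         Now    :: integer(),
         Action :: {connect, inet:port_number()}
                 | {wait, pos_integer()}
                 | takeover.

classify(Terms, Now) ->
    Port  = proplists:get_value(port, Terms),
    Stamp = proplists:get_value(timestamp, Terms),
    case {Port, Stamp} of
        {Port, _} when is_integer(Port) ->
            % A leader has published its port: try to connect to it.
            {connect, Port};
        {undefined, Stamp} when is_integer(Stamp), Now - Stamp < ?GRACE_SECONDS ->
            % Timestamp only, still inside the grace period: wait, then re-read.
            {wait, ?GRACE_SECONDS - (Now - Stamp)};
        _ ->
            % No port after the grace period, or an unreadable file: take over.
            takeover
    end.
```

For example, zx_lock_sketch:classify([{timestamp, 100}], 101) returns {wait, 1}, and the same call with a Now of 102 returns takeover. A lock file whose timestamp is already older than the grace period on the first read leads straight to takeover here, which is a small simplification of the "wait, then re-read" wording above. Removing the stale file, opening the port, and handling a failed connection are left to the caller.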