blah blah blah

This commit is contained in:
Craig Everett 2018-03-08 10:05:36 +09:00
parent 62385cd088
commit 3dfa14d20c
2 changed files with 384 additions and 347 deletions

TODO

@@ -2,16 +2,50 @@
On redirect a list of hosts of the form [{zx:host(), [zx:realm()]}] must be provided to the downstream client.
This is the only way that downstream clients can determine which redirect hosts are useful to it.
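For illustration only (the concrete shapes of zx:host() and zx:realm() are assumed here, not taken from the code), such a redirect list might look like:

%% Hypothetical redirect host list: each host is paired with the realms it serves.
%% Host names, the port number, and realm names are made up for this example.
Hosts = [{{"zomp.example.org", 9999}, ["otpr"]},
         {{"mirror.example.net", 9999}, ["otpr", "some_realm"]}].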
- The ZX daemon should be able to retry requests that were submitted but did not receive a response before the relevant
connection was terminated for whatever reason.
The most obvious way to do this would be to keep a set or queue of references in each connection monitor's section,
clearing them when they receive responses, and pushing them back into the action queue (from the responses reference map)
when they fail.
- The same issue as the one above, but with subscriptions. Currently there is no obvious way to track what subscriptions flow
through which connections, and on termination or change of a connection there is no way to ensure that the subscription request
finds its way back into the action queue for resubmission once a realm becomes available again.
- The request tracking ref list is currently passing through the MX record. That is exactly the wrong place to put it.
It should DEFINITELY be in the CX record.
MOVE IT.
- Connect attempts and established connections should have different report statuses.
An established connection does not necessarily have other attempts being made concurrently, so a termination should initiate 3 new connect attempts, as at init.
An attempt is just an attempt. It can fall flat and be replaced only 1:1 (a sketch of the rule follows).
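A minimal sketch of that replacement rule, assuming connections are tracked with a made-up status atom; the function name is hypothetical:

%% How many fresh connect attempts to queue when a tracked connection goes down:
%% losing an established connection re-seeds the attempt pool as at init, while a
%% failed attempt is only ever replaced one-for-one.
attempts_to_start(established) -> 3;
attempts_to_start(attempt)     -> 1.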

New Feature: ZX Universal lock
Cross-instance communication
We don't want multiple zx_daemons doing funny things to the filesystem at the same time.
We DO want to be able to run multiple ZX programs at the same time.
The solution is to guarantee that there is only ever one zx_daemon running at a given time.
IPC is messy to get straight across various host systems, but network sockets to localhost work in a normal way.
The first zx_daemon to start up will check for a lock file in the $ZOMP_HOME directory.
If there is no lock file present it will write a lock file with a timestamp, begin listening on a local port, and then update the lock file with the listening port number.
If it finds a lock file in $ZOMP_HOME it will attempt to connect to the running zx_daemon on the port indicated in the lock file. If only a timestamp exists with no port number then it will wait two seconds from the time indicated in the timestamp in the lock file before re-reading it -- if the second read still contains no port number it will remove the lock file and assume the primary role for the system itself.
If a connection cannot be established then it will assume the instance that had written the lock file failed to delete it for Bad Reasons, remove the lock file, open a port, and write its own lock file, taking over as the lead zx_daemon.
Any new zx_daemons that come up in the system will establish a connection to the zx_daemon that wrote the lock file, and proxy all requests through that primary zx_daemon.
When a zx_daemon that is acting as primary for the system retires it will complete all ongoing actions first, then begin queueing requests without acting on them, designate the oldest peer as the new leader, get confirmation that the new leader has a port open, update the original lock file with the new port number, redirect all connections to the new zx_daemon, and then retire.
%% Sketch of the startup / leader discovery flow described above.
%% do_stuff/0, lock_file/0, become_leader/1, proxy_through/1 and
%% check_file/1 are placeholders for the real work.
init(_Args) ->
    _Stuff = do_stuff(),
    Tries  = 3,
    Path   = lock_file(),
    case check_for_leader(Tries, Path) of
        no_leader       -> become_leader(Path);
        {found, Socket} -> proxy_through(Socket)
    end.

check_for_leader(0, _Path) ->
    no_leader;
check_for_leader(Tries, Path) ->
    %% An exclusive write only succeeds when no lock file exists yet,
    %% so success means nobody else has claimed the lead role.
    case file:open(Path, [write, exclusive]) of
        {ok, FD} ->
            ok = file:close(FD),
            no_leader;
        {error, eexist} ->
            %% A lock file is already there; read it to find the leader,
            %% retrying if it vanished between open/2 and consult/1.
            case file:consult(Path) of
                {ok, _Data}     -> check_file(Path);
                {error, enoent} -> check_for_leader(Tries - 1, Path)
            end
    end.
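Of the placeholders above, become_leader/1 and check_file/1 carry the actual lock procedure. One possible shape for them is sketched below; the term format of the lock file, the helper names, and the socket options are assumptions rather than settled decisions, and error handling (for example a dead leader that left a stale lock file behind) is omitted.

%% Hypothetical fill-ins for the helpers used in the sketch above. The lock file
%% is written as Erlang terms so file:consult/1 can read it back.
become_leader(Path) ->
    %% Write a timestamp first, open a localhost listener, then update the
    %% lock file with the port number once it is known.
    Now = erlang:system_time(second),
    ok = write_lock(Path, [{timestamp, Now}]),
    {ok, Listener} = gen_tcp:listen(0, [binary, {ip, {127,0,0,1}}, {active, false}]),
    {ok, Port} = inet:port(Listener),
    ok = write_lock(Path, [{timestamp, Now}, {port, Port}]),
    {leader, Listener}.

check_file(Path) ->
    %% A lock file with a timestamp but no port means another zx_daemon is still
    %% starting. Wait out two seconds from its timestamp, re-read, and take over
    %% (by deleting the file and reporting no_leader) if the port never appears.
    {ok, Terms} = file:consult(Path),
    case proplists:get_value(port, Terms) of
        undefined ->
            Elapsed = erlang:system_time(second) - proplists:get_value(timestamp, Terms, 0),
            ok = timer:sleep(max(0, 2 - Elapsed) * 1000),
            case file:consult(Path) of
                {ok, Terms2} ->
                    case proplists:get_value(port, Terms2) of
                        undefined -> ok = file:delete(Path), no_leader;
                        Port      -> {found, connect_local(Port)}
                    end;
                {error, enoent} ->
                    no_leader
            end;
        Port ->
            {found, connect_local(Port)}
    end.

connect_local(Port) ->
    {ok, Socket} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    Socket.

write_lock(Path, Terms) ->
    file:write_file(Path, [io_lib:format("~p.~n", [T]) || T <- Terms]).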

File diff suppressed because it is too large