

Last updated: [98/07/07 Bill]
[98/9/15 Bill] Added link to E Data Comm System Through Firewalls.

Author: Bill Frantz


This system performs the basic byte array transport between "vats" for the New E runtime. It is also responsible for connection establishment and teardown, and for the confidentiality and authentication of the data sent on the connections.

A vat is the part of the Neocosm implementation that has a unique network identity. We expect that, under normal circumstances, there will be only one vat running on a particular machine at one time. Neocosm currently (28 May 1998) supports only one avatar per vat.

Related Documents

See Comm System Overview for information about the Comm System used in the 1998 alpha release, version r167.

See DataComm Startup Protocol for information on the start up protocol.

See E Data Comm System Through Firewalls for some thoughts on working through firewalls.

Hal Finney's review of an earlier version of VatTP.


Requirements

The basic requirements of the Data Comm system are: connection management, reliability, ordering, encryption, authentication, and network location independence. These, and other requirements, are discussed in more detail below.

  • Connection management: The Data Comm system will maintain a list of current connections. It will build new connections as needed. It will tear down connections when requested. It will accept new connections from other vats.
  • Wire protocol: The Data Comm system will define the basic wire protocol. Certain parts of the protocol (e.g. insertion of the message type numbers) may be implemented by users of the Data Comm system.
  • Data Streaming: The Data Comm system will provide a primitive sufficient to support a streaming protocol for long messages. This will be designed with the needs of art downloading in mind.
  • Reliability and ordering: The Data Comm system will provide "reliable" message transport with sufficient ordering to implement the E ordering guarantees.
  • Exception handling: The Data Comm system will provide sufficient error reporting so that the E runtime can correctly implement its "broken promise" logic. In addition, it will maintain a log of unusual events to aid in debugging the communication problems that will inevitably be encountered by Neocosm users.
  • Multicast support: The Data Comm system will cooperate with other software to support a "send to multiple objects" abstraction. This cooperation includes point to point message delivery, and link failure notification.
  • Secure Authenticated Links: The Data Comm system will implement authenticated, confidential links between E "vats" running on the same or different machines.
  • Network location independence: The Data Comm system will cooperate with one or more PLSs to allow an identity to move from one network address to another.
  • Firewall support: The Data Comm system will allow users to operate through certain kinds of firewall.
  • Kill Dead Connections: Connections will be periodically checked to ensure they are still working.
  • Build an authenticated connection to the vat at an arbitrary IP address. The initial user of this feature will be PLS registration.


Conceptual objects

  • VatIdentity, the serializable class which holds the public/private keys that define the identity of the vat. Its getVatTPMgr method is how the VatTPMgr is created. (Thanks to Arturo Bejar for the suggestion of separating the infrequently changing vat identity code, which must be saved, from the more frequently changing communications code, which can be recreated after every restart of the application.)
  • VatTPMgr which manages all the connections.
  • VatTPConnection which implements a single (TCP) connection.
  • ListenThread which listens for new incoming connections.

Connection establishment and tear down

    The VatTPMgr implements a method: public VatTPConnection getConnection(String registrarID, String[] plsList) which either returns an existing VatTPConnection, or creates a new one and starts it building a network connection. registrarID is the hash of the public key of the desired vat, and plsList is an array of PLS locations to try in attempting to locate that vat. The PLS locations are <IP address:port number> e.g. "".

    Because of the need to build an authenticated connection to the vat at an arbitrary IP address, the VatTPMgr also implements Promise connectToVatAt(String ipPort). ipPort is <IP address:port number> as above. If a remote vat is listening at that IP:port address, then the Promise will be forwarded to the VatTPConnection when the remote vat has been identified. If a connection error occurs before the remote vat is identified, then the Promise will be "smashed". Since this connection is a full fledged E-vat to E-vat connection, the NewConnectionNoticer registered with registerNewConnectionNoticer() will be called to hook up the normal E connection services.

    The VatTPMgr implements a registration method public void registerNewConnectionNoticer(NewConnectionNoticer noticer). It will call the object registered for each new VatTPConnection with public void noticeNewConnection(VatTPConnection connection). This call is designed to allow higher levels to register their message handlers on new inbound connections.

    To allow all the necessary "plumbing" to be connected when a connection is established, the VatTPConnection object will not send or receive any high level data until the noticeNewConnection method of the registered NewConnectionNoticer has returned. This method should ensure that all the necessary message handlers have been registered.

    Each VatTPConnection implements a method: public void shutDownConnection() which closes the connection. After this call the VatTPConnection object is no longer usable and should be discarded so it can be garbage collected.
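
    The reuse-or-create behaviour of getConnection and the noticer callback can be sketched roughly as follows. This is only an illustration: SketchMgr, SketchConnection and SketchNoticer are hypothetical stand-ins, not the real classes. The sketch also calls the noticer as soon as the connection object is created, whereas the design above calls it once the startup protocol has identified the remote vat.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for VatTPConnection; the real class does far more.
class SketchConnection {
    final String registrarID;
    boolean open = true;
    SketchConnection(String registrarID) { this.registrarID = registrarID; }
}

// Hypothetical stand-in for NewConnectionNoticer.
interface SketchNoticer {
    void noticeNewConnection(SketchConnection connection);
}

// Hypothetical stand-in for VatTPMgr: one connection per remote registrarID.
class SketchMgr {
    private final Map<String, SketchConnection> connections = new HashMap<>();
    private SketchNoticer noticer;

    synchronized void registerNewConnectionNoticer(SketchNoticer noticer) {
        this.noticer = noticer;
    }

    // Return the existing connection for registrarID, or create a new one.
    // plsList is unused here; a real manager would start the search with it.
    synchronized SketchConnection getConnection(String registrarID, String[] plsList) {
        SketchConnection c = connections.get(registrarID);
        if (c == null || !c.open) {
            c = new SketchConnection(registrarID);
            connections.put(registrarID, c);
            // Simplification: notify immediately instead of after startup.
            if (noticer != null) noticer.noticeNewConnection(c);
        }
        return c;
    }

    // After shutdown the connection object is dropped from the table.
    synchronized void shutDownConnection(SketchConnection c) {
        c.open = false;
        connections.remove(c.registrarID, c);
    }
}
```

    Two calls with the same registrarID return the same object until that connection is shut down; the next call after shutdown builds a fresh one and notifies the noticer again.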

Sending Data

    Each VatTPConnection implements two methods: public void sendMsg(byte[] message) throws IOException and public void sendMsg(byte[] message, Runnable notification, Runner placeToRun) throws IOException.

    In both cases, message is the message to be sent. It must not be altered after the call to sendMsg. The first byte of message is the message type and must be chosen from the types defined in the definition class Msg. (These two restrictions allow sendMsg to avoid copying the message into a private buffer.)

    A further restriction will be that there must be a handler registered for the message type. This restriction will allow sendMsg to use the handler data structure to validate sends. If this restriction is a problem it is easy to remove, but it seems to be a reasonable one given the symmetric nature of the communication protocols.

    /* Message type codes. */
        /* Connection admin */
        static final String[] Version     = {"A", "A"}; // E protocol versions supported
        static final byte PROTOCOL_VERSION  = 1;  // Initial message followed by version string above
        static final byte STARTUP           = 2;  // Conn. startup protocol msg
        static final byte PROTOCOL_ACCEPTED = 3;  // Followed by version string of the selected protocol version
        static final byte SUSPEND           = 4;  // Take down physical connection
                                                  //   leaving logical connection intact
        static final byte PING              = 5;  // Check to see the connection is still there
        static final byte PONG              = 6;  // Response to ping
        public static final byte E_MSG     = 7;   // An E level message with envelope etc.

    Parameter notification is a Runnable that will be enqueued using placeToRun.enqueue(notification) after message has been placed in the network output queue. If some error prevents message from being sent, notification may not be queued. This mechanism allows code outside the Data Comm system to implement a data streaming protocol with flow control for large data transfers. A suggested use is to start streaming by calling sendMsg two or three times with the initial blocks of the stream. The notification includes the next block number to send. The Runnable fetches that block, calculates the next block and calls sendMsg with the appropriate parameters to continue the stream.

    If there is a problem with the connection which prevents the message from being sent, IOException will be thrown. See Failure Notification for the fine print.
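
    The suggested streaming pattern might look something like the sketch below. It assumes a trivial in-memory Runner and connection (SketchRunner and SketchSender are hypothetical stand-ins), a window of two initial blocks, and a caller-supplied message type byte. Instead of carrying the block number in the notification object, the streamer keeps it in a field.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the vat's Runner: runs queued Runnables in order.
class SketchRunner {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    void enqueue(Runnable r) { queue.add(r); }
    void drain() {
        Runnable r;
        while ((r = queue.poll()) != null) r.run();
    }
}

// Hypothetical stand-in for VatTPConnection: "sending" appends to a list.
class SketchSender {
    final List<byte[]> wire = new ArrayList<>();
    void sendMsg(byte[] message, Runnable notification, SketchRunner placeToRun)
            throws IOException {
        wire.add(message);                 // message reaches the output queue
        placeToRun.enqueue(notification);  // then the notification is queued
    }
}

// Streams data in blocks; each notification triggers the next block.
class BlockStreamer {
    static final int WINDOW = 2;  // blocks sent before waiting (an assumption)
    private final SketchSender conn;
    private final SketchRunner runner;
    private final byte[] data;
    private final int blockSize;
    private final byte msgType;
    private int nextBlock = 0;
    boolean failed = false;

    BlockStreamer(SketchSender conn, SketchRunner runner, byte[] data,
                  int blockSize, byte msgType) {
        this.conn = conn;
        this.runner = runner;
        this.data = data;
        this.blockSize = blockSize;
        this.msgType = msgType;
    }

    // Prime the stream with the first few blocks.
    void start() {
        for (int i = 0; i < WINDOW; i++) sendNext();
    }

    // Send the next block, if any; used as the notification Runnable.
    private void sendNext() {
        int offset = nextBlock * blockSize;
        if (failed || offset >= data.length) return;
        int len = Math.min(blockSize, data.length - offset);
        byte[] msg = new byte[len + 1];
        msg[0] = msgType;                  // first byte is the message type
        System.arraycopy(data, offset, msg, 1, len);
        nextBlock++;
        try {
            conn.sendMsg(msg, this::sendNext, runner);
        } catch (IOException e) {
            failed = true;                 // stop streaming on a send failure
        }
    }
}
```

    Because a new block is sent only when an earlier one has been queued, the number of stream blocks waiting in the output queue stays near the window size, which is what bounds the delay seen by other messages.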

Receiving Data

    Each VatTPConnection implements a method: public void registerMsgHandler(int msgType, MsgHandler handler) throws IOException. The parameter msgType is the message type to be handled (from MsgTypes). An attempt to register more than one handler for a message type, or to register for an invalid message type will throw an exception. The parameter handler is an object which will handle the message data. It will be called with: void processMessage(byte[] message, VatTPConnection connection) where message is the only reference to the byte array, and connection is the VatTPConnection object which received the message. Note that one handler can process more than one message type by selecting on the first byte of the message. One handler can process more than one connection by selecting based on the VatTPConnection object passed.
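
    A minimal sketch of the registration and dispatch rules described above, assuming the valid type codes are 1 through 7 as in the table under Sending Data. HandlerTable and SketchMsgHandler are hypothetical names, and the connection is passed as a plain Object to keep the sketch self-contained.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MsgHandler.
interface SketchMsgHandler {
    void processMessage(byte[] message, Object connection);
}

// Per-connection table mapping message type to handler.
class HandlerTable {
    static final int MAX_TYPE = 7;  // highest type code in the table (assumed)
    private final Map<Integer, SketchMsgHandler> handlers = new HashMap<>();

    // Rejects invalid types and duplicate registrations.
    void registerMsgHandler(int msgType, SketchMsgHandler handler) throws IOException {
        if (msgType < 1 || msgType > MAX_TYPE) {
            throw new IOException("invalid message type " + msgType);
        }
        if (handlers.containsKey(msgType)) {
            throw new IOException("handler already registered for type " + msgType);
        }
        handlers.put(msgType, handler);
    }

    // Route a received message by its first byte; false if nobody handles it.
    boolean dispatch(byte[] message, Object connection) {
        if (message.length == 0) return false;
        SketchMsgHandler h = handlers.get((int) message[0]);
        if (h == null) return false;
        h.processMessage(message, connection);
        return true;
    }
}
```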

Failure Notification

    There are at least two queues between the VatTPConnection and the network hardware. One is maintained as part of the VatTPConnection and allows sendMsg to be non-blocking. The other is maintained as part of the JavaVM/Platform TCP implementation. Senders are only notified of problems known before the message is placed in the VatTPConnection output queue. If a problem occurs when a message is in either of the output queues, the message is silently discarded. However, any failure to deliver an outbound message will cause the connection to be terminated. This termination will notify the input message handlers.

    If the sender is notified of a connection problem, the notification will be by having the call to sendMsg throw an IOException.

    Input message handlers will be notified of connection termination by: void connectionDead(Throwable /*nullOK*/ error) being called. To avoid multiple trace entries, any necessary Trace log entries will be made by the Data Comm layer.
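
    The failure rules above could be modelled along these lines. FailingConnection and DeathListener are hypothetical names, and writeFailed stands in for whatever the send thread does when a write to the socket fails.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the handler-side death notification.
interface DeathListener {
    void connectionDead(Throwable error);  // error may be null
}

class FailingConnection {
    private final Queue<byte[]> outputQueue = new ArrayDeque<>();
    private final List<DeathListener> listeners = new ArrayList<>();
    private boolean dead = false;

    synchronized void addListener(DeathListener l) { listeners.add(l); }

    // The sender learns only about problems known before queuing.
    synchronized void sendMsg(byte[] message) throws IOException {
        if (dead) throw new IOException("connection is dead");
        outputQueue.add(message);
    }

    // Called (hypothetically) by the send thread when a write fails.
    synchronized void writeFailed(Throwable error) {
        if (dead) return;
        dead = true;
        outputQueue.clear();               // queued messages are silently dropped
        for (DeathListener l : listeners) l.connectionDead(error);
    }

    synchronized int queued() { return outputQueue.size(); }
}
```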

Off the shelf alternatives

Using SSL has been rejected. See SSL vs. E Comm for the reasons.

Other Design Objectives, Constraints and Assumptions

The bug in the current connection setup protocol, which allows a man-in-the-middle to eliminate encryption by modifying the crypto negotiation, should be fixed by verifying a hash of all the startup protocol messages after authentication has been set up.
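
One way to realize such a check is for each end to keep a running hash of every startup message in wire order and to compare the results once the link is authenticated. The sketch below is an assumption about how that could look, not the protocol's actual definition: StartupTranscript is a hypothetical class, and SHA-256 and the 4-byte length prefix are choices made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Running hash over all startup protocol messages, in wire order.
class StartupTranscript {
    private final MessageDigest digest;

    StartupTranscript() {
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Record one message; the length prefix keeps message boundaries unambiguous.
    void record(byte[] message) {
        int n = message.length;
        digest.update(new byte[] {
            (byte) (n >>> 24), (byte) (n >>> 16), (byte) (n >>> 8), (byte) n });
        digest.update(message);
    }

    // Final hash, to be exchanged over the authenticated link.
    byte[] finish() {
        return digest.digest();
    }

    // Compare the local hash with the one reported by the other end.
    static boolean matches(byte[] local, byte[] remote) {
        return MessageDigest.isEqual(local, remote);
    }
}
```

If a man-in-the-middle altered the negotiation in transit, the two ends would have recorded different messages, the hashes would differ, and the connection attempt could be abandoned.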

Current implementation

This design is a simplification of the r167 system. Basic code that will come over mostly unchanged includes the encryption, send and receive threads, message queuing, Trace log error handling, and startup protocol negotiation.

The new design will have two threads per link instead of three (the function of RawConnection being taken over elsewhere). There will be many fewer classes and objects per connection. The main bodies of code should fairly clearly follow the conceptual objects described above.

The following description written 6/22/98 - Bill

The code consists of the following major classes:

  • VatIdentity - Owns the public/private keys which define the identity of the vat.
  • VatTPMgr - Manages the connections from this vat to other vats.
  • VatTPConnection - Manages one logical connection from this vat to a single other vat.
  • DataPath - Manages a single (TCP) connection. DataPaths can come and go with suspend/resume events and crossed connections while the VatTPConnection persists.
  • SendThread - A separate thread which sends data to the TCP socket. It is a separate thread to allow sends to be non-blocking.
  • RecvThread - A separate thread to listen to the socket for incoming messages. It is a separate thread to allow vat processing of other events while waiting for an incoming message.
  • ListenThread - A separate thread to listen for new TCP connections. There is one instance of this thread for each identity.
  • StartUpProtocol - Handles the start up protocol for the connection.

Startup, Shutdown, and Steady State

The construction starts with a VatIdentity object which has either been instantiated or restored from a checkpoint. We further assume that it has been called for the instance of its VatTPMgr so the connections manager has been built. As part of building the connections manager, the ListenThread has been created and started.

The object of startup is to create the objects needed for the steady state. The object of shutdown is to clean up the steady state. Because of these objectives, I will describe the steady state first.

Steady State

There is a VatTPConnection object which is connected to the higher-level things. That VatTPConnection object is connected to a DataPath object which is connected to a SendThread and a RecvThread. The VatTPMgr has the VatTPConnection object registered in its list of running connections.

Startup Protocol

The startup protocol is handled by StartUpProtocol. It identifies the remote vat and sets up the secure connection. The startup protocol has four possible outcomes which are signaled by calling appropriate methods in its associated DataPath object.

  • abandonAllConnectionAttempts - Used when this connection is the one of a pair of crossed connections which will be closed.
  • tryNext - Used to try the next address in the search path. This is the return that is used after getting an address from the PLS or if a location fails to respond or has an error.
  • resumeConnection - Used to try to resume a connection. The suspendID presented by the remote end must match the local copy.
  • startupSuccessful - Used when the startup protocol has successfully completed and the connection is ready for higher level data.

Outbound Search Strategy

The request to create a new outbound connection, getConnection, takes a parameter which is the list of addresses to try in order to locate the remote vat. These addresses will be tried in the order presented. They can be either the expected address of the vat, or the address of a PLS the vat may register with. If the address is a PLS, it can return a new address to try in response to the start up protocol request for the registrarID. That address will be tried before the next address in the list passed to getConnection(). No address will be tried more than once.

If the address is not a PLS, either there is nothing listening at the remote address, there is a non-E system listening, there is some other vat listening, or it is the address of the desired vat. If there is nothing listening, then the TCP socket build will fail, and the next address will be tried. If there is a non-E system listening, then either it will perform an illegal start up protocol operation, or the start up will time out. If it is some other vat, it will respond NOT_ME to the start up protocol. If it is the desired vat, the start up protocol will succeed and the connection will be made, or the two ends will be unable to agree on a version of the E comm protocol/encryption technique, and the attempt will fail. In this last case, the other addresses in the list will be tried.
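
The search order described in this section can be sketched as a small loop. Everything here is illustrative: Prober, ProbeResult and OutboundSearch are hypothetical names, and the probe call collapses the whole start up protocol exchange (socket failure, NOT_ME, timeout, version mismatch) into three outcomes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Outcome of trying one address (hypothetical simplification).
class ProbeResult {
    enum Kind { CONNECTED, FAILED, REDIRECT }
    final Kind kind;
    final String redirectTo;  // set only for REDIRECT (a PLS answer)

    private ProbeResult(Kind kind, String redirectTo) {
        this.kind = kind;
        this.redirectTo = redirectTo;
    }
    static ProbeResult connected() { return new ProbeResult(Kind.CONNECTED, null); }
    static ProbeResult failed() { return new ProbeResult(Kind.FAILED, null); }
    static ProbeResult redirect(String to) { return new ProbeResult(Kind.REDIRECT, to); }
}

// Stand-in for "run the start up protocol against this address".
interface Prober {
    ProbeResult probe(String address);
}

class OutboundSearch {
    // Returns the address that connected, or null if the search is exhausted.
    // triedLog receives each address in the order it was actually tried.
    static String locate(String[] searchPath, Prober prober, List<String> triedLog) {
        Deque<String> pending = new ArrayDeque<>(Arrays.asList(searchPath));
        Set<String> tried = new HashSet<>();
        while (!pending.isEmpty()) {
            String addr = pending.pollFirst();
            if (!tried.add(addr)) continue;       // never try an address twice
            triedLog.add(addr);
            ProbeResult r = prober.probe(addr);
            if (r.kind == ProbeResult.Kind.CONNECTED) return addr;
            if (r.kind == ProbeResult.Kind.REDIRECT) {
                pending.addFirst(r.redirectTo);   // PLS answer goes to the front
            }
        }
        return null;
    }
}
```

With a search path of a PLS followed by a fallback address, an address returned by the PLS is tried before the fallback, and a repeated entry in the list is skipped.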

Outbound Startup

The VatTPMgr has been called for a connection to a remote vat. It has determined that no existing connection exists. It creates a new VatTPConnection object which in turn creates a DataPath object and a StartUpProtocol object. The DataPath object creates a SendThread which builds a TCP connection to the first address in the search path and then creates a RecvThread. The StartUpProtocol object sends messages to initiate the startup protocol. If the first address is not the requested vat, the DataPath, StartUpProtocol, SendThread, and RecvThread are closed and the process continues by building a DataPath object for the next address in the search list.

Inbound Startup

The ListenThread receives the incoming socket and passes it to the VatTPMgr. The VatTPMgr creates a DataPath object to perform the startup protocol with this socket. When the startup protocol has proceeded far enough to identify the remote vat, the VatTPMgr is used to either connect it to an existing VatTPConnection object or to create a new one. If it is connected to an existing VatTPConnection object, there may be a crossed connection to contend with. If there is a crossed connection, the two StartUpProtocol objects work through the two DataPath objects and the single VatTPConnection object to resolve the connection down to only one DataPath object.


Shutdown

When a request to shut down the connection is received by the VatTPConnection, it sends a shutdown message to the other end. It will not send any new messages after it has sent the shutdown message. When the other end receives the shutdown message, it notifies its registered MsgHandler objects that the connection has shut down and echoes the shutdown message. It then closes the socket, destroys the SendThread and RecvThread, and notifies its VatTPConnection that it is dead. When the connection that originated the shutdown receives the shutdown message, it performs the same cleanup.


Suspend

Suspend is similar to shutdown except that the VatTPConnection remains in the suspended state instead of becoming dead. Any attempt to send a message on a suspended VatTPConnection object will initiate a new connection attempt as in Outbound Startup.


Resume

Resume is very much like startup. Part way through the startup protocol, a resume message informs the other end that the operation is a resumption. The resuming vat presents its suspendID. If the ID matches the one stored locally, the connection is resumed. If it does not match, or none has been generated locally, the connection is not resumed. If there is a suspended connection with that identity, it is shut down.
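
The suspend and resume rules can be summarized as a three-state machine. The sketch below is an assumption-laden illustration: SuspendableConnection is a hypothetical class, the suspendID is taken to be a byte array, and the comparison uses MessageDigest.isEqual, none of which is specified above.

```java
import java.security.MessageDigest;

class SuspendableConnection {
    enum State { RUNNING, SUSPENDED, DEAD }

    private State state = State.RUNNING;
    private byte[] suspendID;  // null until a suspend generates one

    State state() { return state; }

    // Take down the physical connection but keep the logical one.
    void suspend(byte[] newSuspendID) {
        if (state != State.RUNNING) throw new IllegalStateException("not running");
        suspendID = newSuspendID.clone();
        state = State.SUSPENDED;
    }

    // The remote end presents its suspendID part way through startup.
    boolean tryResume(byte[] presented) {
        boolean ok = state == State.SUSPENDED
                && suspendID != null
                && presented != null
                && MessageDigest.isEqual(suspendID, presented);
        if (ok) {
            state = State.RUNNING;
        } else if (state == State.SUSPENDED) {
            state = State.DEAD;  // a failed resume shuts the suspended connection down
        }
        return ok;
    }
}
```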

Is it JavaDoc'ed?

In many cases, this section can link to JavaDoc output from actual Java classes and interfaces. This saves writing documentation twice (the designers will have to JavaDoc their interfaces anyway). The JavaDoc should be linked into the design document. Chip's JavaDoc style guidelines (XXX file missing) explain how to use JavaDoc effectively.


Examples

All of these examples assume that a VatIdentity object, called vi, has been built and a VatTPMgr object has been obtained by VatTPMgr cm = vi.getVatTPMgr(...). Furthermore, a permanent NewConnectionNoticer object, called ncn, has been registered with the VatTPMgr.

  • Building a connection to connect a proxy to a remote object
    1. VatTPConnection dc = cm.getConnection(...);
    2. MsgHandler mh = new RelayMsgHandler(...);
    3. dc.registerMsgHandler(Msg.E_MSG, mh);
    4. dc.sendMsg(first proxy protocol message);
  • Receiving a connection from a remote vat
    1. ncn is called with ncn.noticeNewConnection(dc); ncn does:
      1. MsgHandler mh = new RelayMsgHandler(...);
      2. dc.registerMsgHandler(Msg.E_MSG, mh);
    2. mh will be called with the first proxy protocol message.
  • Building a connection to register with a PLS at a specific IP:port address using E objects.
    1. Promise pr = cm.connectToVatAt(ipport, rn);
    2. Object doReg = new PLSRegistration(pr);
    3. E.whenKept(pr, doReg);
    4. E.whenBroken(pr, doReg);
    5. When the Promise pr is resolved, doReg is called with the resolution, o; doReg does:
      1. if (myPromise.state == "BROKEN") { /* bitch and moan, we didn't find the PLS */ return; }
      2. VatTPConnection dc = (VatTPConnection)o; // Kept, object is the VatTPConnection
      3. String rid = dc.getRemoteRegistrarID();
      4. rn then creates a sturdy ref for the well known PLS registration swiss number, sn, and calls the CapTPMgr to resolve it: Proxy rp = proxyManager.resolveReference(rid, null, sn);
      5. It can then engage in the registration protocol using the standard proxy.
    6. When the startup protocol has completed, ncn is called with ncn.noticeNewConnection(dc); ncn does (same as above so the connection will handle E proxy messages.):
      1. MsgHandler mh = new RelayMsgHandler(...);
      2. dc.registerMsgHandler(Msg.E_MSG, mh);
  • Building a connection to register with a PLS at a specific IP:port address using special messages:
    1. Promise pr = cm.connectToVatAt(ipport, rn);
    2. Object doReg = new PLSRegistration(pr);
    3. E.whenKept(pr, doReg);
    4. E.whenBroken(pr, doReg);
    5. When the Promise pr is resolved, doReg is called with the resolution, o; doReg does:
      1. if (myPromise.state == "BROKEN") { /* bitch and moan, we didn't find the PLS */ return; }
      2. VatTPConnection dc = (VatTPConnection)o; // Kept, object is the VatTPConnection
      3. RegistrationMsgHandler rmh = new RegistrationMsgHandler(...);
      4. dc.registerMsgHandler(Msg.PLS_PROTOCOL, rmh);
      5. dc.sendMsg(first message in PLS registration protocol)
      6. rmh will receive the responses to the first message. rmh could be the same object as doReg, to keep all the registration protocol state machine in the same object.
    6. When the startup protocol has completed, ncn is called with ncn.noticeNewConnection(dc); ncn does (same as above so the connection will handle E proxy messages.):
      1. MsgHandler mh = new RelayMsgHandler(...);
      2. dc.registerMsgHandler(Msg.E_MSG, mh);

Testing and Debugging

See DataComm Testing.

Design Issues

Resolved Issues

History of issues raised and resolved during initial design, or during design inspections. Can also include alternative designs, with the reasons why they were rejected

  • [as of 6/15/98] There is no way to connect an incoming connection to the higher levels (multi-comm, object-com, proxy-comm). This issue is resolved by the addition of the registerNewConnectionNoticer method.
  • [as of 6/15/98] There is a race condition for outgoing connections where the startup protocol can complete before the higher levels have registered their listeners. If this occurs, incoming messages may be dropped (with error spam). This issue is resolved with the introduction of the enable() method on the VatTPConnection. [7/7/98] Upon reflection, the enable method is unnecessary. The VatTPMgr is notified that the connection is RUNNING when the last start up protocol message is processed. It calls the registered NewConnectionNoticer as a result of that notification. It can register the MsgHandlers before it returns, which is before the RecvThread can introduce new messages into the DataPath/VatTPConnection (since the RecvThread is busy while the last start up message finishes being processed).

Open Issues

  • In 1.1.3, Java appears to get the IP address of the machine once at startup. In the case of someone running Microcosm who is dropped by their ISP, they get a new IP address when they re-dial. We need to deal with this problem one way or another.
  • Jeff says, "Getting the IP address on windows is no problem. I've done a bunch a work with winsock. We could just add something to <shudder> native.dll."

  • You can't get the IP address of the local machine unless it is connected (duh). Eric reported this as a problem that had to be worked around in the r167 comm system. He also reported that the work around wasn't complete.
  • In r167, you can't perform in-world operations (e.g. build a turf) without being connected. Randy says we need to fix this. The fix is probably outside the Data Comm system level.
  • I (wsf) currently believe that the data streaming protocol described above will be sufficient to solve the art downloading problem. It will limit the delay in transmitting other E messages to the length of a small number of art blocks. If that is not sufficient, we can add a second, low priority, output queue which is served only when the high priority queue is empty.
  • Sidney suggests having a way to configure the "listen" address (IP + port number) during PLS registration to handle certain proxying firewalls.
  • Does LDAP (Lightweight Directory Access Protocol) have any application to our PLS requirements?
  • [as of 6/15/98] The reconnection of suspended connections is messy. A connection which is being resumed needs to be connected to the old VatTPConnection object so that object's clients are unaware of the suspend/resume. In r167 this was handled with two objects, but it would be nice to avoid the overhead of the extra method invocations multiple objects require.
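
The second, low priority output queue mentioned in the art downloading item above could be as simple as the following sketch. TwoLevelOutputQueue is a hypothetical class; the send thread would call next() to pick the message to write.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Output queue where low priority messages wait for the high priority queue to drain.
class TwoLevelOutputQueue {
    private final Queue<byte[]> high = new ArrayDeque<>();
    private final Queue<byte[]> low = new ArrayDeque<>();

    synchronized void enqueueHigh(byte[] message) { high.add(message); }

    synchronized void enqueueLow(byte[] message) { low.add(message); }

    // Next message to send: low priority is served only when high is empty.
    // Returns null when both queues are empty.
    synchronized byte[] next() {
        byte[] m = high.poll();
        return (m != null) ? m : low.poll();
    }
}
```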

Thread Handling

Each communication connection seems to need to juggle three threads. They are:

  • The Vat thread
  • The send thread - which allows non-blocking sends
  • The receive thread - which monitors the connection for input

The send and receive threads need to communicate with objects inside the vat for several reasons. These reasons include: updating comm statistics, new messages available for processing, error reporting, progress of send operations (the data streaming), and shutdown progress.

The E vat code provides two techniques for threads outside the vat to synchronize with the vat and communicate with in-vat objects. Both of these are implemented in org.erights.e.elib.prim.Runner. They are:

public Object callNow(Object rec, String verb, Object[] args) throws Throwable; Which does a normal E style CRAPI call of rec.verb(arg...) and

public Object now(Thunk todo) throws Throwable; Which calls the "Object run()" method in todo which implements Thunk.

In the r167 version, the external threads simply grabbed the vat lock and then used their reference to the in-vat objects to call directly.

The E techniques both have performance implications: callNow does: return now(new CallThunk(rec, verb, args)); And CallThunk saves the arguments and performs the call with them in its run method.

Using now(...) directly increases class bloat with a Thunk for each method called, or has an obscure switch function in one common DataCommThunk class.

In both cases, some extra objects are created and made garbage to accomplish the call. Is tight control of the vat lock (in Runner) worth the cost in object creation and extra classes?

The version as of 6/16/98 uses a single thunk with the messy switch statement for communication from the SendThread and RecvThread to the VatTPConnection. The CRAPI interface is used to notify the VatTPMgr of newly arrived Sockets.
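
The single thunk with a switch might look roughly like the sketch below. It does not use the real org.erights.e.elib.prim.Runner: SketchVatRunner, SketchThunk, SketchStats and the operation codes are hypothetical, and now() is modelled as simply running the thunk while holding a lock.

```java
// Hypothetical stand-in for Thunk.
interface SketchThunk {
    Object run() throws Throwable;
}

// Hypothetical stand-in for Runner: runs a thunk while holding the vat lock.
class SketchVatRunner {
    private final Object vatLock = new Object();

    Object now(SketchThunk todo) throws Throwable {
        synchronized (vatLock) {
            return todo.run();
        }
    }
}

// In-vat state that the send and receive threads need to update.
class SketchStats {
    long bytesSent;
    Throwable lastError;
}

// One thunk class for every call from the comm threads, selected by an op code.
class DataCommThunk implements SketchThunk {
    static final int BYTES_SENT = 1;
    static final int SEND_FAILED = 2;

    private final SketchStats target;
    private final int op;
    private final Object arg;

    DataCommThunk(SketchStats target, int op, Object arg) {
        this.target = target;
        this.op = op;
        this.arg = arg;
    }

    @Override
    public Object run() {
        switch (op) {
            case BYTES_SENT:
                target.bytesSent += ((Long) arg).longValue();
                return null;
            case SEND_FAILED:
                target.lastError = (Throwable) arg;
                return null;
            default:
                throw new IllegalArgumentException("unknown op " + op);
        }
    }
}
```

The trade-off is visible here: there is only one extra class, but each call still allocates a thunk, and the switch hides which in-vat method is really being invoked.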


Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.