Tunneling Thru Firewalls


Last updated: [98/09/15 Bill]
[98/09/16 Bill] Added a flow control protocol for each TCP stream and a description of the server logic.
[98/09/17 Bill] Added client implementation logic. Added information on configuration for firewall HTTP proxies. Added description of weak and strong authentication for HTTP_Logon to eliminate some kinds of denial of service attack.
[98/09/18 Bill] Eliminated the unauthenticated logon messages, and defined the wire format of the messages.

Author: Bill Frantz (frantz-at-pwpconsult.com).

Introduction

This document describes some ideas for extending the DataComm system to operate through various types of firewall. There are four basic levels of problem:

Where the only problem is setting up the firewall to pass incoming sockets to the listen address on the local machine and advertising the ip:port used on the firewall machine.
Where incoming sockets cannot be accepted.
Where the only communications permitted through the firewall are via outgoing HTTP.
Where the only communications permitted through the firewall are via outgoing HTTP, and connections must be made to port 80.

Requirements

The basic requirement is that the E DataComm system be able to operate through firewalls without special configuration of the firewall. Furthermore, this operation should be possible without the cooperation or permission of the firewall operator.

Architecture

HTTP Tunneling

HTTP Tunneling works by sending POST requests to an "HTTP server" and receiving replies. If the firewall allows us to use HTTP on any port, then we just need the DataComm HTTP Server code. Otherwise, if the machine must also support a real HTTP server, we will use a CGI to redirect the request to the non-port 80 server. Note that the Java virtual machine is configured to use a firewall proxy with the Java system properties http.proxyHost and http.proxyPort. After these properties have been set, connections opened from a URL will use the firewall proxy to contact hosts outside the firewall.
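As a minimal sketch of the proxy configuration, the two properties can be set programmatically before the first URL connection is opened. The class name, host name, and port below are placeholders for illustration, not values from this design.

```java
final class ProxyConfig {
    /** Points the JVM's HTTP protocol handler at a firewall proxy. */
    static void useFirewallProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        // "proxy.example.com" and 8080 are hypothetical placeholder values.
        useFirewallProxy("proxy.example.com", 8080);
        System.out.println(System.getProperty("http.proxyHost") + ":"
                + System.getProperty("http.proxyPort"));
    }
}
```

The same properties can also be supplied on the command line, for example with -Dhttp.proxyHost=... and -Dhttp.proxyPort=..., which avoids changing any code.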

If we can use HTTP/1.1 instead of HTTP/1.0, we may be able to take advantage of the reusable TCP connections which are supported in HTTP/1.1. In 1.1, everyone along the path, including proxies and firewalls, has the option of tearing down the connection after the first round trip, but even if it only helps in some cases, it would be worthwhile.

The POST request can be sent through a URLConnection generated from a URL which specifies a protocol of "http", a host-name, port-number, and a path or cgi reference.

        conn = url.openConnection();
        conn.setDoOutput(true);
        conn.setUseCaches(false);
        conn.setRequestProperty("Content-type", "application/octet-stream");

        inNotifier.deactivate();
        in = null;

        return out = conn.getOutputStream();

When the input stream is first read, the buffered output is sent:

        outNotifier.deactivate();
        out.close();
        out = null;

        // An HTTP error will either show up as an IOException, or it
        // will show up as the error response.  If the content type is
        // not "application/octet-stream", then we are dealing with an
        // error response.
        try {
            in = conn.getInputStream();
        } catch (IOException e) {
            // Keep the underlying reason so the failure can be diagnosed.
            throw new IOException("HTTP request failed: " + e.getMessage());
        }

        String contentType = conn.getContentType();
        if (contentType == null ||
                !contentType.equals("application/octet-stream")) {
            throw new IOException("HTTP request failed: unexpected content type "
                    + contentType);
        }

        return in;

The request starts out with a fixed header of POST and must include a Content-length: header. To be compliant with HTTP (see RFC 2068), it must also include Content-type:.

POST <URI-requested> HTTP/1.0\r\n
Content-type: application/octet-stream\r\n
"Content-length: " + sizeOfData + "\r\n"

The reply starts out with a fixed 200 OK header, and also includes Content-length: and Content-type:.

HTTP/1.0 200 OK\r\n
Content-type: application/octet-stream\r\n
"Content-length: " + sizeOfData + "\r\n"
\r\n

If there is a client error, the reply is a fixed 400 Bad Request header.

HTTP/1.0 400 Bad Request - <message>\r\n
\r\n

The headers are followed by the data as a transparent byte stream. The system always sends at least one byte of data (a NUL) to support clients that object to zero bytes of data.

The receiver must discard any extra data which follows the sizeOfData bytes in the byte stream. The receiver must also skip all the other headers until it reads the blank line (a line consisting only of line terminators).
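The receiving rules above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and a client that goes through URLConnection would not normally parse the headers itself. The sketch checks for the 200 status, skips headers up to the blank line while picking out Content-length, and reads exactly sizeOfData bytes, ignoring whatever follows.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class EnvelopeReader {
    /** Reads one header line, dropping the trailing CR LF (or bare LF). */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("stream ended inside the HTTP headers");
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Returns the sizeOfData payload bytes of a "200 OK" reply. */
    static byte[] readReplyBody(InputStream in) throws IOException {
        String status = readLine(in);
        String[] parts = status.split(" ", 3);
        if (parts.length < 2 || !parts[1].equals("200")) {
            throw new IOException("HTTP request failed: " + status);
        }

        // Skip every header up to the blank line, remembering Content-length.
        int sizeOfData = -1;
        String header;
        while (!(header = readLine(in)).isEmpty()) {
            int colon = header.indexOf(':');
            if (colon > 0 && header.substring(0, colon).trim()
                    .equalsIgnoreCase("Content-length")) {
                try {
                    sizeOfData = Integer.parseInt(header.substring(colon + 1).trim());
                } catch (NumberFormatException e) {
                    throw new IOException("bad Content-length: " + header);
                }
            }
        }
        if (sizeOfData < 0) {
            throw new IOException("missing Content-length header");
        }

        // Read exactly sizeOfData bytes; anything after them is ignored.
        byte[] data = new byte[sizeOfData];
        new DataInputStream(in).readFully(data);
        return data;
    }
}
```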

DataComm HTTP Server

The DataComm HTTP server process acts as a remote proxy for the firewalled vat client. The proxy supports a listen address where the vat can be contacted, and several TCP links to other vats. The protocol between the client and the proxy server identifies the TCP link with which a set of data is associated. Note that since normal vat-to-vat authentication and privacy measures are used, the client to proxy link does not need either encryption or authentication. However, some level of authentication would help discourage denial of service attacks. Since all communication is driven by the client, the proxy needs to be able to time out the client listen address and the TCP links. See also Server Design.

Note on Timeouts: It is possible that a slow link will result in it taking longer for the client to send an HTTP_Session message than the server timeout. If the server can detect that the client has started sending a message, it can then use continued progress in receiving the message as the timeout criterion rather than just receipt of the message. This kind of timeout is straightforward when the client is connected directly to the server. I don't know if it is possible when the client messages are being redirected by a CGI.

Client - Proxy Message Formats

This protocol uses messages formatted with java.io.DataOutputStream. The protocol uses writeUTF(), writeByte(), writeShort() (read with readUnsignedShort()), and write(byte[]) in sending the data. In the descriptions below, the first three are referred to as UTF, byte, and short. The notation "byte[]" is also used. All byte[] parameters are assumed to be preceded by a short giving the length of the byte array.
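A small sketch of the byte[] convention may help. The helper class and method names here are hypothetical; only the DataOutputStream and DataInputStream calls come from the description above. Because the length is read back with readUnsignedShort(), arrays of up to 65535 bytes can be carried.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class WireFormat {
    /** Writes a byte[] parameter: an unsigned 16-bit length, then the bytes. */
    static void writeBytes(DataOutputStream out, byte[] data) throws IOException {
        if (data.length > 0xFFFF) {
            throw new IOException("byte[] too long for a short length prefix");
        }
        out.writeShort(data.length);   // only the low 16 bits are written
        out.write(data);
    }

    /** Reads a byte[] parameter written by writeBytes(). */
    static byte[] readBytes(DataInputStream in) throws IOException {
        byte[] data = new byte[in.readUnsignedShort()];
        in.readFully(data);
        return data;
    }
}
```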

The HTTP_Logon message includes a list of acceptable protocol version numbers. The versionID described in this document is "T1".

All messages between the client and the proxy are carried in HTTP envelopes as described under HTTP Tunneling. Each of the major message types (HTTP_Logon, HTTP_Session, HTTP_Shutdown, and HTTP_Error) is carried in a separate HTTP interchange.

Message Type codes

All message and response types are single bytes. The assigned values are:

HTTP_Logon = 0x01;

HTTP_Session = 0x02;

HTTP_Shutdown = 0x03;

HTTP_Error = 0x04;

HTTP_Logged_On = 0x05;

HTTP_Set_Server_Nonce = 0x06;

The subtypes of HTTP_Session are assigned values

HTTP_NewConnection = 0x10;

HTTP_Data = 0x11;

HTTP_OK_To_Send = 0x12;

HTTP_Close = 0x13;

HTTP_InvalidID = 0x14;

HTTP_ConnectionFailed = 0x15;

HTTP_ConnectionComplete = 0x16;

Client Authentication

There is a trade-off between server performance and the ability of a hostile user to cause denial of service attacks on the server and its clients. Most of these attacks can be eliminated by authenticating the logon message and using the VatID to control access to the server (for billing or to eliminate bad actors).

The server can check three levels of authentication. If the server never checks signatures, anyone who knows the vatID and the server URL can deny service to that vatID by sending an HTTP_Logon with that vatID. If the server checks the clientNonce, it protects against this attack by requiring an attacker to have a signed HTTP_Logon message. However, such a message could have been obtained by snooping the vat's communications. If the server requires that the vat sign a random number provided by the server, saves the last number it issued to the client, and makes sure the client is returning that number, then the server knows it is communicating with the client. The server should also ensure that the vatID is the hash of the public key.

The server can dynamically decide how much authentication to require. A policy of only checking authentication if the vatID is already logged on seems reasonable.

Message Descriptions

HTTP_Logon Message

<byte HTTP_Logon> <UTF VatID> <byte[]serverNonce> <byte[]clientNonce> <byte[]publicKey> <byte[]signature> - Indicates that <VatID> wants to use the server as a proxy. The serverNonce is a random number generated by the server, the clientNonce is a random number generated by the client, the public key is the client's public key, and the signature is the DSA signature over the sequence (as transmitted) <HTTP_Logon> <VatID> <serverNonce> <clientNonce>. The first time the client sends this message, it specifies a zero length serverNonce. The responses are:

<HTTP_Logged_On> <byte[]sessionID> <UTF listenAddress> - Indicates that the logon is successful and provides a sessionID for the session. The sessionID is sufficiently large (64 bits?) that a hacker who is not tapping the communications between the client and the server cannot easily guess it and interfere with the service. The listenAddress is the host:port the server is using to listen for connections to this vat.

<HTTP_Set_Server_Nonce> <byte[]serverNonce> - Indicates that the serverNonce in the logon message was missing or invalid. The client should resend the HTTP_Logon message using the serverNonce in this message.
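The following sketch shows one way the HTTP_Logon message could be built and checked. Several details are assumptions that the text above does not settle: the digest paired with DSA ("SHA256withDSA" here), the X.509 encoding used for the publicKey field, and all class and method names. The check that the VatID is the hash of the public key is left out because the hash function is not specified here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;

final class LogonMessage {
    static final int HTTP_LOGON = 0x01;
    // Assumption: the text says only "DSA signature"; the digest is our choice.
    static final String SIGNATURE_ALGORITHM = "SHA256withDSA";

    /** Encodes <HTTP_Logon> <VatID> <serverNonce> <clientNonce>, the signed part. */
    static byte[] signedPart(String vatID, byte[] serverNonce, byte[] clientNonce)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(HTTP_LOGON);
        out.writeUTF(vatID);
        writeBytes(out, serverNonce);
        writeBytes(out, clientNonce);
        out.flush();
        return buf.toByteArray();
    }

    /** Client side: builds the complete, signed HTTP_Logon message. */
    static byte[] encode(String vatID, byte[] serverNonce, byte[] clientNonce,
                         KeyPair keys) throws IOException, GeneralSecurityException {
        byte[] prefix = signedPart(vatID, serverNonce, clientNonce);
        Signature signer = Signature.getInstance(SIGNATURE_ALGORITHM);
        signer.initSign(keys.getPrivate());
        signer.update(prefix);

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.write(prefix);
        // Assumption: the public key travels in its X.509 encoding.
        writeBytes(out, keys.getPublic().getEncoded());
        writeBytes(out, signer.sign());
        out.flush();
        return buf.toByteArray();
    }

    /** Server side: parses a logon message and checks its signature. */
    static boolean verify(byte[] message) throws IOException, GeneralSecurityException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
        if (in.readByte() != HTTP_LOGON) {
            throw new IOException("not an HTTP_Logon message");
        }
        String vatID = in.readUTF();
        byte[] serverNonce = readBytes(in);
        byte[] clientNonce = readBytes(in);
        byte[] publicKey = readBytes(in);
        byte[] signature = readBytes(in);

        PublicKey key = KeyFactory.getInstance("DSA")
                .generatePublic(new X509EncodedKeySpec(publicKey));
        Signature verifier = Signature.getInstance(SIGNATURE_ALGORITHM);
        verifier.initVerify(key);
        // Re-encoding the parsed fields reproduces the bytes as transmitted.
        verifier.update(signedPart(vatID, serverNonce, clientNonce));
        return verifier.verify(signature);
    }

    private static void writeBytes(DataOutputStream out, byte[] data) throws IOException {
        out.writeShort(data.length);
        out.write(data);
    }

    private static byte[] readBytes(DataInputStream in) throws IOException {
        byte[] data = new byte[in.readUnsignedShort()];
        in.readFully(data);
        return data;
    }
}
```

On the first logon the client would pass a zero-length serverNonce, as described above, and repeat the call with the nonce from an HTTP_Set_Server_Nonce response if the server asks for one.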

HTTP_Session Message

The HTTP_Session message is used in both directions to pass data to the proxied TCP connections, open new TCP connections, respond to new TCP connections, and close TCP connections. The HTTP_Session message consists of a header and zero or more data segments. (An HTTP_Session message with zero data segments acts as a Ping/Pong message.) The client must send an HTTP_Session message every n (60?) seconds or the server will shut down the session.

Messages which describe a specific TCP connection use a <ConnectionID> parameter. This parameter is a byte, limiting the number of active proxied TCP connections to 255. Positive values are assigned by the client for outgoing connections. Negative values are assigned by the server for incoming connections. The value zero is not legal.
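A sketch of the client half of this rule follows. A signed Java byte has 127 positive and 128 negative values, which is consistent with the limit of 255 once zero is excluded. The class name and the lowest-free-ID allocation policy are assumptions made for illustration.

```java
import java.util.BitSet;

final class ConnectionIds {
    /** True if the ID is in the range the client assigns (outgoing). */
    static boolean isClientAssigned(byte id) {
        return id > 0;
    }

    /** True if the ID is in the range the server assigns (incoming). */
    static boolean isServerAssigned(byte id) {
        return id < 0;
    }

    private final BitSet inUse = new BitSet(128);

    /** Client side: returns the lowest free positive ID (1..127). */
    byte allocateOutgoing() {
        int id = inUse.nextClearBit(1);    // never hands out the illegal value 0
        if (id > Byte.MAX_VALUE) {
            throw new IllegalStateException("all 127 outgoing connection IDs are in use");
        }
        inUse.set(id);
        return (byte) id;
    }

    /** Makes a client-assigned ID available again after the connection closes. */
    void release(byte id) {
        if (id > 0) {
            inUse.clear(id);
        }
    }
}
```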

Each TCP connection has its own flow control. Both the client and server should limit the amount of data they send to a connection to the value in the last HTTP_OK_To_Send message for that connection.
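One possible reading of this rule, sketched per connection: each HTTP_OK_To_Send replaces the current allowance rather than adding to it, and bytes sent are charged against the allowance until the next HTTP_OK_To_Send arrives. Both points are assumptions, as is the class name.

```java
final class SendWindow {
    private int allowed;   // bytes the other end has said we may still send

    /** Records the bytesOfData value from an HTTP_OK_To_Send segment. */
    void okToSend(int bytesOfData) {
        allowed = bytesOfData;
    }

    /** Returns how many of the wanted bytes may be sent now, and charges them. */
    int reserve(int wanted) {
        int granted = Math.max(0, Math.min(wanted, allowed));
        allowed -= granted;
        return granted;
    }
}
```

A sender would call reserve() before building each HTTP_Data segment and hold back whatever was not granted, which avoids the "More data than permitted" error at the other end.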

The header is:

<HTTP_Session> <byte[]sessionID>

Any number of data segments may be included in the message. The legal data segments are:

<HTTP_NewConnection> <byte connectionID> <UTF HostID:port> - Client to server only. Build a TCP connection to the specified host and port, and use connectionID to refer to it in subsequent messages. Responses are not necessarily returned in the same exchange. They are:

<HTTP_InvalidID> <byte connectionID> - The connectionID passed is invalid because either there is already a connection using that ID, or because the ID has the wrong sign. All connections with that ID are closed.

<HTTP_ConnectionFailed> <UTF reason> - The connection could not be made. <reason> is a textual message describing the reason for the failure.

<HTTP_ConnectionComplete> <byte connectionID> - The connection is ready to accept data.

<HTTP_NewConnection> <byte connectionID> <UTF host:port> - Server to client only. A new TCP connection has been established to the server's listen port for this VatID. The host and port are those of the remote end of the TCP connection.

<HTTP_Data> <byte connectionID> <byte[]data> - Indicates data to be passed to/received from the TCP connection.

<HTTP_OK_To_Send> <byte connectionID> <short bytesOfData> - Indicates that the other end may send up to bytesOfData to the connection connectionID.

<HTTP_Close> <byte connectionID> - Closes/indicates the connection has been closed.
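Putting the header and segments together, an HTTP_Session message body might be assembled as in the sketch below. The builder class and its method names are hypothetical; the type codes and field order follow the definitions above. A builder with no segments added produces the Ping/Pong form.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class SessionMessage {
    static final int HTTP_SESSION = 0x02;
    static final int HTTP_NEW_CONNECTION = 0x10;
    static final int HTTP_DATA = 0x11;
    static final int HTTP_OK_TO_SEND = 0x12;
    static final int HTTP_CLOSE = 0x13;

    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buf);

    /** Starts a message with the header <HTTP_Session> <byte[]sessionID>. */
    SessionMessage(byte[] sessionID) throws IOException {
        out.writeByte(HTTP_SESSION);
        writeBytes(sessionID);
    }

    SessionMessage newConnection(byte connectionID, String hostPort) throws IOException {
        out.writeByte(HTTP_NEW_CONNECTION);
        out.writeByte(connectionID);
        out.writeUTF(hostPort);
        return this;
    }

    SessionMessage data(byte connectionID, byte[] data) throws IOException {
        out.writeByte(HTTP_DATA);
        out.writeByte(connectionID);
        writeBytes(data);
        return this;
    }

    SessionMessage okToSend(byte connectionID, int bytesOfData) throws IOException {
        if (bytesOfData < 0 || bytesOfData > 0xFFFF) {
            throw new IllegalArgumentException("bytesOfData must fit in a short");
        }
        out.writeByte(HTTP_OK_TO_SEND);
        out.writeByte(connectionID);
        out.writeShort(bytesOfData);
        return this;
    }

    SessionMessage close(byte connectionID) throws IOException {
        out.writeByte(HTTP_CLOSE);
        out.writeByte(connectionID);
        return this;
    }

    /** Returns the bytes to place in the body of the HTTP POST or reply. */
    byte[] toByteArray() throws IOException {
        out.flush();
        return buf.toByteArray();
    }

    // A byte[] parameter is a short length followed by the bytes.
    private void writeBytes(byte[] data) throws IOException {
        out.writeShort(data.length);
        out.write(data);
    }
}
```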

HTTP_Shutdown Message

<HTTP_Shutdown> <byte[]sessionID> - Ends the session between the client and the server. The server closes all the TCP connections it has open on behalf of the client and stops listening for new connections to the client. If the shutdown was initiated by the client, the response (server to client) is:

<HTTP_Shutdown> <byte[]sessionID> - Shutdown complete

HTTP_Error

<HTTP_Error> <byte[]sessionID> <UTF reason> - An error occurred on the session and it must be shut down. reason is a textual message describing the error. Possible errors are:

"Session not active" - Perhaps because it has been timed out
"Protocol error xxx" - An invalid message was received. xxx may provide more detail.
"More data than permitted for connection xxx" - The other end attempted to send more data than the last HTTP_OK_To_Send message permitted. xxx may provide more detail.

If an HTTP_Error message is received by the server, the response will be an HTTP_Shutdown message.

Off the shelf alternatives

The transport layer of RMI uses similar techniques, but it is not an exposed interface.

Other Design Objectives, Constraints and Assumptions

Current implementation

Server

This server design is a reference implementation. It is designed for clarity, not efficiency. Being written in Java, it uses threads out the yingyang.

HTTPServeMain is the class which contains the main routine. It also listens to the HTTP port.

HTTPServeClientPeer is the class which handles HTTP input and output for a particular client.

HTTPServeClientState is the class which holds the client state between HTTP messages.

None.

Design Proposal

Server

The server waits for E connections on one port and HTTP requests on another. When the server gets an HTTP_Logon message, it builds the necessary data structures to service that vatID, generates a sessionID, and sends the sessionID in the HTTP response. The data structures include:

A way to queue received TCP messages to be sent to the client via HTTP.
A way to associate the connectionIDs with the associated TCP connection.
A way to queue messages received via HTTP on the appropriate TCP connection.
A way to map from VatID to the HTTP connection for new incoming E connections.
The maximum amount of data the server can send the client for each connection.
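As a rough sketch, the data structures listed above could be held in one object per session plus a directory keyed by VatID. The class names are hypothetical and are not the HTTPServeClientState class of the current implementation; the methods are synchronized because the server is expected to touch this state from several threads.

```java
import java.net.Socket;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class TunnelSession {
    final String vatID;
    final byte[] sessionID;

    // Encoded segments waiting for the next HTTP reply, oldest first.
    private final Queue<byte[]> toClient = new ArrayDeque<>();
    // The TCP connection behind each connectionID.
    private final Map<Byte, Socket> sockets = new HashMap<>();
    // Data received via HTTP, waiting to be written to each TCP connection.
    private final Map<Byte, Queue<byte[]>> toTcp = new HashMap<>();
    // Bytes the server may still send to the client, per connection.
    private final Map<Byte, Integer> sendLimit = new HashMap<>();

    TunnelSession(String vatID, byte[] sessionID) {
        this.vatID = vatID;
        this.sessionID = sessionID;
    }

    synchronized void bind(byte connectionID, Socket socket) {
        sockets.put(connectionID, socket);
    }

    synchronized void queueForClient(byte[] segment) {
        toClient.add(segment);
    }

    /** Removes and returns everything queued for the client, in FIFO order. */
    synchronized List<byte[]> drainForClient() {
        List<byte[]> segments = new ArrayList<>(toClient);
        toClient.clear();
        return segments;
    }

    synchronized void queueForTcp(byte connectionID, byte[] data) {
        toTcp.computeIfAbsent(connectionID, id -> new ArrayDeque<>()).add(data);
    }

    /** Returns the next block to write to the TCP connection, or null if none. */
    synchronized byte[] nextForTcp(byte connectionID) {
        Queue<byte[]> queue = toTcp.get(connectionID);
        return queue == null ? null : queue.poll();
    }

    synchronized void setSendLimit(byte connectionID, int bytesOfData) {
        sendLimit.put(connectionID, bytesOfData);
    }

    synchronized int getSendLimit(byte connectionID) {
        return sendLimit.getOrDefault(connectionID, 0);
    }
}

/** Maps each VatID to its session, for routing new incoming E connections. */
final class TunnelSessions {
    private final Map<String, TunnelSession> byVatID = new HashMap<>();

    synchronized void add(TunnelSession session) {
        byVatID.put(session.vatID, session);
    }

    synchronized TunnelSession forVat(String vatID) {
        return byVatID.get(vatID);
    }
}
```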

The basic dataflow logic for various messages is:

New incoming TCP E connection

The server reads the new socket and saves the PROTOCOL_VERSION message (see Comm Connection Startup Protocol). It reads and saves the IWANT message and checks if it is proxying for the requested VatID. If the VatID is not known, it generates a NOTME response and closes the socket. Otherwise it associates the socket with the appropriate HTTP connection and generates three HTTP_Session submessages for the HTTP_NewConnection, the HTTP_OK_To_Send, and the HTTP_Data, which are queued for the HTTP connection.

Incoming data on the TCP E connection

The server reads the data and queues it on the appropriate HTTP connection as an HTTP_Data message.

Incoming close on the TCP E connection

The server queues an HTTP_Close message on the appropriate HTTP connection.

Incoming HTTP message from the client

The data portion of the HTTP POST operation is read and the embedded messages are processed. When they have been processed, the output queue for the HTTP connection is encoded and sent back in the response. Note that the output queue is a FIFO queue to preserve the ordering of events. The specific POST messages are handled as follows:

HTTP_Logon from the HTTP client

If there is already a session in progress for this VatID, the server performs the following checks:

Check that the client is using the strong authentication form of logon, that the server has sent a nonce to this vatID within the timeout interval, and that the serverNonce from the client matches the last one the server sent to the client. If not, send an HTTP_Set_Server_Nonce message requesting a new logon, and record the random number sent against the vatID.
Check that the vatID is the hash of the public key. If not, send HTTP_Error.
Check the signature. If it doesn't check, send HTTP_Error.
Shut down the old session.

Generate a new sessionID, build the necessary data structures, and queue a HTTP_Logged_On message as the response.

HTTP_Session

Each subtype is processed as follows:

HTTP_NewConnection from the HTTP client

The server checks the parameters to ensure they are valid. If they are not valid, an error response is queued for the HTTP connection. Otherwise an asynchronous operation is started to build the TCP connection. It will report its success or failure to the HTTP queue when it has finished.

HTTP_Data from the HTTP client

The data is queued for the appropriate TCP connection. When the data has been sent, a new HTTP_OK_To_Send message is queued for the HTTP client.

HTTP_OK_To_Send from the HTTP client

The server updates its send limit for the connection.

HTTP_Close from the HTTP client

The designated socket is closed synchronously.

HTTP_Shutdown from the HTTP client

All TCP connections are closed. An HTTP_Shutdown message is queued and all the queued messages are included in the response. All the data structures associated with the session are discarded.

HTTP_Error from the HTTP client

This message is handled in the same way as an HTTP_Shutdown message.

Client

The client code involves changes to the current DataComm software. There are two obvious versions of the client that can be imagined:

A client that performs all its communication through HTTP Tunnelling
A client that is able to build "classic" direct TCP connections to some vats and uses HTTP Tunnelling for others.

The client that supports both direct and tunnelled connections has a number of problems to solve:

Which path should it try to connect to a particular vat?
What address should it list with the PLS?
How can it ensure that it has only one connection to a particular vat (to preserve the E message ordering rules)?

Tunnel Only

Extend the VatIdentity class to have a getVatTPMgr(URL url) method. The url specifies the HTTP Tunnel server.

Change DataComm to use a SocketFactory to get its Sockets. For direct connections, this factory returns standard system Sockets. For Tunnel connections, a different factory returns Sockets which use the HTTP Tunnel classes for communication. For incoming connections, the HTTP Tunnel classes can call VatTPMgr.newInboundSocket directly or through a Thunk. The Tunnel classes can directly return the address the server is listening at to the VatTPMgr using the listeningAt(String) method.

The TunnelSocket will respond as follows to the standard Socket methods:

close() Closes this socket. Sends a HTTP_Close message and performs cleanup of the local resources.
getInetAddress() Returns the address to which the socket is connected. This address will either be simulated as best we can or DataComm will be changed to use instanceof to call a different method for this information.
getInputStream() Returns an input stream for this socket. The input stream will communicate with the Tunnelling classes.
getLocalAddress() Gets the local address to which the socket is bound. See getInetAddress().
getLocalPort() Returns the local port to which this socket is bound. See getInetAddress().
getOutputStream() Returns an output stream for this socket. The output stream will communicate with the Tunnelling classes.
getPort() Returns the remote port to which this socket is connected. See getInetAddress().
toString() Converts this socket to a String.

The following methods will be implemented as NOPs sufficient for DataComm's use, or will throw exceptions.

getSoLinger() Returns setting for SO_LINGER.
getSoTimeout() Returns setting for SO_TIMEOUT.
getTcpNoDelay() Tests if TCP_NODELAY is enabled.
setSoLinger(boolean, int) Enable/disable SO_LINGER with the specified linger time.
setSoTimeout(int) Enable/disable SO_TIMEOUT with the specified timeout, in milliseconds.
setTcpNoDelay(boolean) Enable/disable TCP_NODELAY (disable/enable Nagle's algorithm).

Tunnel and Direct

With the above Tunnel Only architecture and some additional changes, there are simple answers to the Tunnel and Direct questions. The changes are to allow more than one socketFactory to be active in the objects under a particular VatTPMgr. The use of multiple factories also allows the vat to listen on more than one interface:port.

Which path should it try to connect to a particular vat?

Try them all. First try all the search addresses through each direct connection interface. Then try all the URLs registered for Tunnel connections.
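The "try them all" order can be sketched as below. The Path interface is a hypothetical stand-in for whatever a socket factory would expose for one search address or one Tunnel URL; nothing here is part of the existing DataComm API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class PathSelector {
    /** Hypothetical stand-in for one way of reaching a vat (direct or tunnelled). */
    interface Path<C> {
        C connect() throws IOException;
    }

    /** Tries every direct path, then every tunnelled path; returns the first success. */
    static <C> C connectFirst(List<Path<C>> direct, List<Path<C>> tunnelled)
            throws IOException {
        List<Path<C>> ordered = new ArrayList<>(direct);
        ordered.addAll(tunnelled);

        IOException last = null;
        for (Path<C> path : ordered) {
            try {
                return path.connect();
            } catch (IOException e) {
                last = e;   // remember the failure and move on to the next path
            }
        }
        throw last != null ? last : new IOException("no paths to try");
    }
}
```

In this sketch only the most recent failure is reported when every path fails; a fuller version might collect all of them for diagnosis.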

What address should it list with the PLS?

All the addresses it is listening at. Even if they are not relevant to a particular network, trying to connect to them will fail unless there is a vat with the desired private key listening there.

How can it ensure that it has only one connection to a particular vat (to preserve the E message ordering rules)?

By running under one VatTPMgr, duplicate VatTPConnections will be prevented.

Which directories on our tree does this subsystem cover?

org/erights/e/net/data

Is it JavaDoc'ed?

In many cases, this section can link to JavaDoc output from actual Java classes and interfaces. This saves writing documentation twice (the designers will have to JavaDoc their interfaces anyway). The JavaDoc should be linked into the design document. Chip's JavaDoc style guidelines (XXX file missing) explain how to use JavaDoc effectively.

Examples

Testing and Debugging

See DataComm Testing.

Design Issues

Because it supports more than one vat, this server design will not accept incoming TCP connections to VatID == "0" (the connectToVatAt protocol).
The unauthenticated logon allows anyone who knows the VatID and the HTTP server that vat is using to disconnect all that vat's connections.
How does the firewall HTTP proxy address get set up?

The user uses the Java system properties: http.proxyHost and http.proxyPort to set the host and port of the firewall proxy.

A vat which receives a connection where the originator goes through an HTTP Tunnel may be able to make a direct connection back. This will occur if the vat cannot receive direct connections, but can make them. The architecture has no way to try for this direct connection.

Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.
