This is a draft, and subject to change. Please do not implement it
yet. Thank you!
Examples can be found at the bottom of the file.
Thank you for your feedback!
Eric Kidd
eric.kidd@pobox.com
30 January 2001
The Binmode RPC Protocol
========================
Binmode RPC is an ultra-lightweight RPC protocol designed for 100%
compatibility with XML-RPC . It emphasizes
simplicity, dynamically-typed data, and extreme ease of implementation.
Two XML-RPC implementations that support 'binmode-rpc' may negotiate away
the XML part of XML-RPC, and replace it with a simple binary protocol.
Design goals:
* The complete specification should fit in a 350-line text file. :-)
* The protocol should be easy to implement.
* The protocol should provide a high degree of compression.
* The protocol should be very fast--faster than zlib compression.
* The protocol must be implementable in portable ANSI C, with no
'./configure' checks.
* The protocol must not contain any options, variant encodings
or similar hair. If you want DCE/RPC, you know where to find it.
* All protocol operations must be performed at the byte level
(except for UTF-8 encoding and decoding).
* The protocol must be semi-readable in a hex dump or Emacs buffer.
* The protocol must efficiently encode boxcarred calls that are
implemented using 'system.multicall'.
* The protocol must support an efficient encoding for
frequently-repeated string values.
* The protocol must never be sent to clients or servers which
don't support it.
* There must be a way for clients and servers to active the protocol
if both ends of the connection support it.
The X-XML-RPC-Extensions Header
-------------------------------
(First, we'll need a mechanism for unobtrusively announcing the presence of
non-standard capabilities.)
An XML-RPC implementation MAY advertise additional, non-standard
capabilities using the 'X-XML-RPC-Extensions' header.
Rationale: The 'X-XML-RPC-Extensions' header should be available to CGI
scripts in the environment variable HTTP_X_XML_RPC_EXTENSIONS.
If present, this header MUST contain a comma-separated list of
keywords. Parameter information MAY be included, if desired, in the
standard fashion used by HTTP 1.1 'Accept-Encoding' headers.
X-XML-RPC-Extensions: binmode-rpc
X-XML-RPC-Extensions: binmode-rpc, x-telepathic-transport
X-XML-RPC-Extensions: binmode-rpc,x-telepathic-transport
X-XML-RPC-Extensions: binmode-rpc, x-telepathic-transport;speed=low
If a client sends the X-XML-RPC-Extensions header in a request, the server
MAY use any of the specified extensions in its response.
Rationale: No client may be sent non-standard data without first having
advertised the ability to accept it.
If the server includes the X-XML-RPC-Extensions header in a response, the
client MAY use any of the specified extensions in further requests to that
URL. The client MUST NOT assume that the same extensions are available for
any other URL on the same server.
Rationale: No server may be sent non-standard data without first having
advertised the ability to accept it. Furthermore, this permission is
URL-specific, since different XML-RPC implementations may be located at
different URLs on a single server.
The client SHOULD NOT cache extension information about a particular server
for an excessive length of time (typically beyond a single program
invocation). If the client does cache this information indefinitely, it
SHOULD be able to cope if an extension is disabled.
Rationale: The XML-RPC implementation used on the server may be changed by
the administrator.
The 'binmode-rpc' Extension
-----------------------
A client or server which sends the 'binmode-rpc' extension MUST accept
message bodies of type 'application/x-binmode-rpc' in addition to the
regular 'text/xml'.
All servers which accept the binmode-rpc extension MUST also support
standard XML-RPC, as described by .
The 'application/x-binmode-rpc' Format
--------------------------------------
All documents of the type 'application/x-binmode-rpc' MUST begin with the
following byte sequence (represented here as a C string):
'binmode-rpc:'
This MUST be followed by a Call or a Response, encoded as described below:
Call := 'C' String Array
A Call consists of a single octet with the ASCII value 'C', followed by a
String containing the method name and an Array containing the parameters.
Response := 'R' (Value|Fault)
A Response MUST contain either a Value or a Fault.
Fault := 'F' Struct
A Fault contains a regular Struct (with members as specified by the the
XML-RPC specification).
Trailing data at the end of an 'application/x-binmode-rpc' document MUST be
ignored.
Byte-Order of Integers
----------------------
(The following integer types don't correspond directly to XML-RPC
integers--instead, they'll be used to *build* more complicated types.)
SignedLSB := a four-octet, signed, twos'-complement integer,
least-significant byte (LSB) first
UnsignedLSB := a four-octet, unsigned integer, LSB first
Raw integer data is encoded in little-endian format.
Rationale: A fixed, mandatory byte ordering is easier to implement than
approaches which allow multiple byte orderings, and little-endian CPUs
outnumber big-endian CPUs at the time of writing.
Values
------
Value := (Integer|Boolean|Double|DateTimeISO8601Binary|Array|Struct|
String|Other)
Integer := 'I' SignedLSB
Boolean := ('t'|'f')
Double := 'D' SizeOctet AsciiChar...
DateTimeISO8601 := '8' SizeOctet AsciiChar...
These two types are encoded with an unsigned size octet followed by the
specified number of ASCII characters. The values are encoded in the fashion
described by the XML-RPC specification.
Rationale: In both these cases, we're punting. Binary floating point
formats are highly non-portable, and cannot be easily manipulated by most
programming languages. XML-RPC values lack timezone
information, and are therefore difficult to convert to a binary format.
Binary := 'B' UnsignedLSB Octet...
This corresponds to the XML-RPC type, but without any encoding.
The UnsignedLSB specifies the number of octets of data.
Array := 'A' UnsignedLSB Value...
The UnsignedLSB specifies the number of values in the array.
Struct := 'S' UnsignedLSB (String,Value)...
The UnsignedLSB specifies the number of String,Value pairs in the struct.
The strings are keys; the values may be of any type.
Other := 'O' String Binary
Future XML-RPC types (if any) may be sent a String containing the type name
and a Binary block (as above) containing type-specific data.
Implementations MUST NOT encode any of the standard types using this
construct. Implementations MAY signal an error if data of type Other is
encountered.
Rationale: This is allowed to cause an error because most applications
won't understand the contents anyway. But if new types are added, dumb
gateways will be able to manipulate them in encapsulated format (if they so
desire).
Strings
-------
String := (RegularString|RecordedString|RecalledString)
We have three types of strings.
RegularString := 'U' StringData
StringData := UnsignedLSB Utf8Octet...
Strings are encoded in UTF-8 format. The UnsignedLSB specifies the number
of UTF-8 octets. Implementations SHOULD raise an error if they encounter
invalid UTF-8 data (e.g., ISO Latin 1 characters).
Rationale: Technically speaking, XML-RPC is limited to plain ASCII
characters, and may not contain 8-bit or 16-bit characters in any coding
system. But since XML-RPC is based on XML, adding Unicode is a trivial
enhancement to the basic protocol, and *somebody* will make it sooner or
later. When that day arrives, we want to be able to encode Unicode
characters.
Implements MUST encode UTF-8 characters using the minimum number of octets.
Implementations SHOULD raise an error if they encounter any UTF-8
characters encoded using more than the minimum number of octets.
Rationale: Overlong UTF-8 encodings are sometimes used to bypass string
validation in security code. They serve no legitimate purpose, either. So
to improve the overall security of the Universe, we work hard to discourage
them.
UTF-8 & Unicode FAQ: http://www.cl.cam.ac.uk/~mgk25/unicode.html
RecordedString := '>' CodebookPosition StringData
RecalledString := '<' CodebookPosition
CodebookPosition := UnsignedOctet
The 'binmode' format supports a 256-entry "codebook" of strings. At the
start of a data stream, the codebook is empty. When the decoder
encounters a RecordedString, it MUST store it into the specified codebook
position (and then proceed to decode it as a regular string).
When the decoder encounters a RecalledString, it MUST look it up in the
specified codebook position. If that codebook position has been set, the
implementation MUST use the string value found in the codebook. If the
position has not been set, the implementation MUST stop decoding and raise
an error. It is legal to change a codebook position once it has been set;
the most recent value applies.
A RecordedString or a RecalledString may be used anywhere a RegularString
may be used.
Rationale: XML-RPC data tends to contain large numbers of identical
strings. (These are typically the names of members or the names of
methods in a multicall.) To get any kind of reasonable data compression,
it's necessary to have some way of compressing these values. The codebook
mechanism is relatively simple and uncomplicated.
Implementations MAY choose not to use this feature when encoding data, but
MUST understand it when decoding data.
Rationale: On the decoding end of things, this feature is trivial to
implement, and must be present for the sake of interoperability. On the
encoding end of things, however, making effective use of this feature is
slightly trickier, so implementations are allowed (but not encouraged) to
omit it.
Compliance
----------
Implementations MUST implement all features of this protocol correctly,
particularly on the decoding end. In the case of this protocol, a 95% correct
implementation is 100% broken. Yes, this statement is redundant. ;-)
Examples
--------
Non-ASCII octets are specified as in C strings. Continued lines are
indicated by a trailing '\'; these should be joined together as one
sequence of bytes.
binmode-rpc:CU\003\0\0\0addA\002\0\0\0I\002\0\0\0I\002\0\0\0
binmode-rpc:RI\004\0\0\0
binmode-rpc:RFS\002\0\0\0 \
U\011\0\0\0faultCodeI\001\0\0\0 \
U\013\0\0\0faultStringU\021\0\0\0An error occurred
binmode-rpc:RA\006\0\0\0 \
>\000\003\0\0\0foo \
>\001\003\0\0\0bar \
<\000 \
>\000\003\0\0\0baz \
<\000 \
<\001
(This deserializes to ['foo', 'bar', 'foo', 'baz', 'baz', 'bar'].)
binmode-rpc:RU\042\0\0\0Copyright \302\251 1995 J. Random Hacker
(This is based on an example in the Unicode/UTF-8 FAQ (see above).)
binmode-rpc:RA\010\0\0\0 \
I\006\0\0\0 \
tf \
D\0042.75 \
8\02119980717T14:08:55 \
U\003\0\0\0foo \
B\003\0\0\0abc \
S\002\0\0\0U\003\0\0\0runt
Counter-Examples
----------------
The following specimens are illegal, and SHOULD be rejected by a compliant
implementation. Please test your code.
* A different format name:
binmode-rpc2:RI\004\0\0\0
* A built-in type incorrectly encoded using 'O':
binmode-rpc:ROU\006\0\0\0stringB\003\0\0\0xyz
* A recall of an unrecorded string:
binmode-rpc:R<\002
* ISO Latin 1 data in a string. (UTF-8 required!)
binmode-rpc:RU\041\0\0\0Copyright \251 1995 J. Random Hacker
* UTF-8 character encoded with too many octets (based on an example in the
Unicode/UTF-8 FAQ):
binmode-rpc:RU\041\0\0\0Bad linefeed: \300\212 (too many bytes)
A compliant implementation MUST NOT send any of these sequences.