  1. This is a draft, and subject to change. Please do not implement it
  2. yet. Thank you!
  3. Examples can be found at the bottom of the file.
  4. Thank you for your feedback!
  5. Eric Kidd
  6. eric.kidd@pobox.com
  7. 30 January 2001
  8. The Binmode RPC Protocol
  9. ========================
  10. Binmode RPC is an ultra-lightweight RPC protocol designed for 100%
  11. compatibility with XML-RPC <http://www.xmlrpc.com/>. It emphasizes
  12. simplicity, dynamically-typed data, and extreme ease of implementation.
  13. Two XML-RPC implementations that support 'binmode-rpc' may negotiate away
  14. the XML part of XML-RPC, and replace it with a simple binary protocol.
  15. Design goals:
  16. * The complete specification should fit in a 350-line text file. :-)
  17. * The protocol should be easy to implement.
  18. * The protocol should provide a high degree of compression.
  19. * The protocol should be very fast--faster than zlib compression.
  20. * The protocol must be implementable in portable ANSI C, with no
  21. './configure' checks.
  22. * The protocol must not contain any options, variant encodings
  23. or similar hair. If you want DCE/RPC, you know where to find it.
  24. * All protocol operations must be performed at the byte level
  25. (except for UTF-8 encoding and decoding).
  26. * The protocol must be semi-readable in a hex dump or Emacs buffer.
  27. * The protocol must efficiently encode boxcarred calls that are
  28. implemented using 'system.multicall'.
  29. * The protocol must support an efficient encoding for
  30. frequently-repeated string values.
  31. * The protocol must never be sent to clients or servers which
  32. don't support it.
  33. * There must be a way for clients and servers to activate the protocol
  34. if both ends of the connection support it.
  35. The X-XML-RPC-Extensions Header
  36. -------------------------------
  37. (First, we'll need a mechanism for unobtrusively announcing the presence of
  38. non-standard capabilities.)
  39. An XML-RPC implementation MAY advertise additional, non-standard
  40. capabilities using the 'X-XML-RPC-Extensions' header.
  41. Rationale: The 'X-XML-RPC-Extensions' header should be available to CGI
  42. scripts in the environment variable HTTP_X_XML_RPC_EXTENSIONS.
  43. If present, this header MUST contain a comma-separated list of
  44. keywords. Parameter information MAY be included, if desired, in the
  45. standard fashion used by HTTP 1.1 'Accept-Encoding' headers.
  46. X-XML-RPC-Extensions: binmode-rpc
  47. X-XML-RPC-Extensions: binmode-rpc, x-telepathic-transport
  48. X-XML-RPC-Extensions: binmode-rpc,x-telepathic-transport
  49. X-XML-RPC-Extensions: binmode-rpc, x-telepathic-transport;speed=low
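As an illustration only, the header values above could be split into keywords and parameters roughly as follows. This is a minimal sketch in Python, not part of the specification. The function name parse_extensions and the choice of a dictionary as the result are assumptions made for the example, and quoting rules for parameter values are not handled.

```python
def parse_extensions(header_value):
    """Split an X-XML-RPC-Extensions value into {keyword: {param: value}}.

    Hypothetical helper: the name and the return shape are illustrative.
    """
    extensions = {}
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        keyword = parts[0]
        if not keyword:
            continue  # tolerate empty list elements such as "a,,b"
        params = {}
        for param in parts[1:]:
            if not param:
                continue  # tolerate a trailing ';'
            name, _, value = param.partition("=")
            params[name.strip()] = value.strip()
        extensions[keyword] = params
    return extensions


def supports_binmode(header_value):
    """Return True if the peer advertised the 'binmode-rpc' keyword."""
    return "binmode-rpc" in parse_extensions(header_value)
```

A client might call supports_binmode() on the header of a response before sending 'application/x-binmode-rpc' bodies to that URL.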
  50. If a client sends the X-XML-RPC-Extensions header in a request, the server
  51. MAY use any of the specified extensions in its response.
  52. Rationale: No client may be sent non-standard data without first having
  53. advertised the ability to accept it.
  54. If the server includes the X-XML-RPC-Extensions header in a response, the
  55. client MAY use any of the specified extensions in further requests to that
  56. URL. The client MUST NOT assume that the same extensions are available for
  57. any other URL on the same server.
  58. Rationale: No server may be sent non-standard data without first having
  59. advertised the ability to accept it. Furthermore, this permission is
  60. URL-specific, since different XML-RPC implementations may be located at
  61. different URLs on a single server.
  62. The client SHOULD NOT cache extension information about a particular server
  63. for an excessive length of time (typically beyond a single program
  64. invocation). If the client does cache this information indefinitely, it
  65. SHOULD be able to cope if an extension is disabled.
  66. Rationale: The XML-RPC implementation used on the server may be changed by
  67. the administrator.
  68. The 'binmode-rpc' Extension
  69. ---------------------------
  70. A client or server which sends the 'binmode-rpc' extension MUST accept
  71. message bodies of type 'application/x-binmode-rpc' in addition to the
  72. regular 'text/xml'.
  73. All servers which accept the binmode-rpc extension MUST also support
  74. standard XML-RPC, as described by <http://www.xmlrpc.org/spec>.
  75. The 'application/x-binmode-rpc' Format
  76. --------------------------------------
  77. All documents of the type 'application/x-binmode-rpc' MUST begin with the
  78. following byte sequence (represented here as a C string):
  79. 'binmode-rpc:'
  80. This MUST be followed by a Call or a Response, encoded as described below:
  81. Call := 'C' String Array
  82. A Call consists of a single octet with the ASCII value 'C', followed by a
  83. String containing the method name and an Array containing the parameters.
  84. Response := 'R' (Value|Fault)
  85. A Response MUST contain either a Value or a Fault.
  86. Fault := 'F' Struct
  87. A Fault contains a regular Struct (with members as specified by the
  88. XML-RPC specification).
  89. Trailing data at the end of an 'application/x-binmode-rpc' document MUST be
  90. ignored.
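The framing rules above can be sketched as a small check that runs before any value decoding. This is a rough Python illustration under stated assumptions: the name classify_document and the returned labels are invented for the example and do not come from the specification.

```python
MAGIC = b"binmode-rpc:"


def classify_document(body):
    """Return 'call', 'response' or 'fault' for a binmode-rpc body.

    Hypothetical helper: it inspects only the prefix and the leading
    tag octets, and does not decode the payload.
    """
    if not body.startswith(MAGIC):
        raise ValueError("missing 'binmode-rpc:' prefix")
    kind = body[len(MAGIC):len(MAGIC) + 1]
    if kind == b"C":
        return "call"
    if kind == b"R":
        # A Response holds either a Fault ('F' Struct) or a plain Value.
        if body[len(MAGIC) + 1:len(MAGIC) + 2] == b"F":
            return "fault"
        return "response"
    raise ValueError("expected 'C' or 'R' after the prefix")
```

Note that the uppercase 'F' of a Fault does not collide with the lowercase 'f' used for a false Boolean.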
  91. Byte-Order of Integers
  92. ----------------------
  93. (The following integer types don't correspond directly to XML-RPC
  94. integers--instead, they'll be used to *build* more complicated types.)
  95. SignedLSB := a four-octet, signed, two's-complement integer,
  96. least-significant byte (LSB) first
  97. UnsignedLSB := a four-octet, unsigned integer, LSB first
  98. Raw integer data is encoded in little-endian format.
  99. Rationale: A fixed, mandatory byte ordering is easier to implement than
  100. approaches which allow multiple byte orderings, and little-endian CPUs
  101. outnumber big-endian CPUs at the time of writing.
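For illustration, the two raw integer types map directly onto fixed-width little-endian packing. The sketch below uses Python's standard struct module. The function names are assumptions made for this example.

```python
import struct


def encode_signed_lsb(n):
    """Pack a SignedLSB: four octets, two's-complement, LSB first."""
    return struct.pack("<i", n)  # raises struct.error if out of range


def encode_unsigned_lsb(n):
    """Pack an UnsignedLSB: four octets, unsigned, LSB first."""
    return struct.pack("<I", n)


def decode_signed_lsb(data, offset=0):
    """Read a SignedLSB starting at the given offset."""
    return struct.unpack_from("<i", data, offset)[0]


def decode_unsigned_lsb(data, offset=0):
    """Read an UnsignedLSB starting at the given offset."""
    return struct.unpack_from("<I", data, offset)[0]
```

Because the '<' format prefix fixes the byte order, the result does not depend on the byte order of the host CPU.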
  102. Values
  103. ------
  104. Value := (Integer|Boolean|Double|DateTimeISO8601|Binary|Array|Struct|
  105. String|Other)
  106. Integer := 'I' SignedLSB
  107. Boolean := ('t'|'f')
  108. Double := 'D' SizeOctet AsciiChar...
  109. DateTimeISO8601 := '8' SizeOctet AsciiChar...
  110. These two types are encoded with an unsigned size octet followed by the
  111. specified number of ASCII characters. The values are encoded in the fashion
  112. described by the XML-RPC specification.
  113. Rationale: In both these cases, we're punting. Binary floating point
  114. formats are highly non-portable, and cannot be easily manipulated by most
  115. programming languages. XML-RPC <dateTime.iso8601> values lack timezone
  116. information, and are therefore difficult to convert to a binary format.
  117. Binary := 'B' UnsignedLSB Octet...
  118. This corresponds to the XML-RPC <base64> type, but without any encoding.
  119. The UnsignedLSB specifies the number of octets of data.
  120. Array := 'A' UnsignedLSB Value...
  121. The UnsignedLSB specifies the number of values in the array.
  122. Struct := 'S' UnsignedLSB (String,Value)...
  123. The UnsignedLSB specifies the number of String,Value pairs in the struct.
  124. The strings are keys; the values may be of any type.
  125. Other := 'O' String Binary
  126. Future XML-RPC types (if any) may be sent as a String containing the type name
  127. and a Binary block (as above) containing type-specific data.
  128. Implementations MUST NOT encode any of the standard types using this
  129. construct. Implementations MAY signal an error if data of type Other is
  130. encountered.
  131. Rationale: This is allowed to cause an error because most applications
  132. won't understand the contents anyway. But if new types are added, dumb
  133. gateways will be able to manipulate them in encapsulated format (if they so
  134. desire).
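The value grammar above might be encoded along the following lines. This is a hedged Python sketch, not a reference implementation. The mapping from Python types to binmode types is an assumption of the example, strings are always written as RegularString, DateTimeISO8601 and Other are omitted, and Double relies on repr(), which is only acceptable for values that repr() prints without an exponent.

```python
import struct


def encode_value(value):
    """Encode a Python value as a binmode-rpc Value (illustrative only)."""
    if isinstance(value, bool):  # must precede int: bool is an int subclass
        return b"t" if value else b"f"
    if isinstance(value, int):
        return b"I" + struct.pack("<i", value)
    if isinstance(value, float):
        # Assumption: repr() yields plain decimal notation for this value.
        text = repr(value).encode("ascii")
        return b"D" + bytes([len(text)]) + text
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"U" + struct.pack("<I", len(data)) + data
    if isinstance(value, bytes):
        return b"B" + struct.pack("<I", len(value)) + value
    if isinstance(value, (list, tuple)):
        items = b"".join(encode_value(v) for v in value)
        return b"A" + struct.pack("<I", len(value)) + items
    if isinstance(value, dict):
        out = b"S" + struct.pack("<I", len(value))
        for key, member in value.items():
            out += encode_value(str(key)) + encode_value(member)
        return out
    raise TypeError("no binmode-rpc encoding for %r" % type(value))


def encode_call(method, params):
    """Build a complete Call document: prefix, 'C', method name, params."""
    return b"binmode-rpc:C" + encode_value(method) + encode_value(list(params))


def encode_response(value):
    """Build a complete non-fault Response document."""
    return b"binmode-rpc:R" + encode_value(value)
```

With these definitions, encode_call("add", [2, 2]) produces the same octets as a Call to 'add' with two Integer parameters of value 2.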
  135. Strings
  136. -------
  137. String := (RegularString|RecordedString|RecalledString)
  138. We have three types of strings.
  139. RegularString := 'U' StringData
  140. StringData := UnsignedLSB Utf8Octet...
  141. Strings are encoded in UTF-8 format. The UnsignedLSB specifies the number
  142. of UTF-8 octets. Implementations SHOULD raise an error if they encounter
  143. invalid UTF-8 data (e.g., ISO Latin 1 characters).
  144. Rationale: Technically speaking, XML-RPC is limited to plain ASCII
  145. characters, and may not contain 8-bit or 16-bit characters in any coding
  146. system. But since XML-RPC is based on XML, adding Unicode is a trivial
  147. enhancement to the basic protocol, and *somebody* will make it sooner or
  148. later. When that day arrives, we want to be able to encode Unicode
  149. characters.
  150. Implementations MUST encode UTF-8 characters using the minimum number of octets.
  151. Implementations SHOULD raise an error if they encounter any UTF-8
  152. characters encoded using more than the minimum number of octets.
  153. Rationale: Overlong UTF-8 encodings are sometimes used to bypass string
  154. validation in security code. They serve no legitimate purpose, either. So
  155. to improve the overall security of the Universe, we work hard to discourage
  156. them.
  157. UTF-8 & Unicode FAQ: http://www.cl.cam.ac.uk/~mgk25/unicode.html
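As a rough illustration of the validation rules above, a strict UTF-8 decoder is usually enough. The Python sketch below assumes Python 3, whose built-in strict 'utf-8' codec rejects both stray ISO Latin 1 octets and overlong encodings. The function name decode_string_data is an assumption made for the example.

```python
def decode_string_data(octets):
    """Decode the UTF-8 octets of a StringData, rejecting invalid input.

    Hypothetical helper: it expects only the string octets, with the
    UnsignedLSB length already consumed by the caller.
    """
    try:
        return octets.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in string: %s" % exc)
```

Decoders written in languages without a strict UTF-8 codec would need to check for overlong sequences explicitly.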
  158. RecordedString := '>' CodebookPosition StringData
  159. RecalledString := '<' CodebookPosition
  160. CodebookPosition := UnsignedOctet
  161. The 'binmode' format supports a 256-entry "codebook" of strings. At the
  162. start of a data stream, the codebook is empty. When the decoder
  163. encounters a RecordedString, it MUST store it into the specified codebook
  164. position (and then proceed to decode it as a regular string).
  165. When the decoder encounters a RecalledString, it MUST look it up in the
  166. specified codebook position. If that codebook position has been set, the
  167. implementation MUST use the string value found in the codebook. If the
  168. position has not been set, the implementation MUST stop decoding and raise
  169. an error. It is legal to change a codebook position once it has been set;
  170. the most recent value applies.
  171. A RecordedString or a RecalledString may be used anywhere a RegularString
  172. may be used.
  173. Rationale: XML-RPC data tends to contain large numbers of identical
  174. strings. (These are typically the names of <struct> members or the names of
  175. methods in a multicall.) To get any kind of reasonable data compression,
  176. it's necessary to have some way of compressing these values. The codebook
  177. mechanism is relatively simple.
  178. Implementations MAY choose not to use this feature when encoding data, but
  179. MUST understand it when decoding data.
  180. Rationale: On the decoding end of things, this feature is trivial to
  181. implement, and must be present for the sake of interoperability. On the
  182. encoding end of things, however, making effective use of this feature is
  183. slightly trickier, so implementations are allowed (but not encouraged) to
  184. omit it.
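The decoding side of the codebook can be sketched in a few lines. The Python below is an illustration under stated assumptions: the class name StringDecoder, the (text, next_position) return convention, and the use of ValueError are choices made for this example. Input truncated in the middle of a length or codebook position would surface as struct.error or IndexError instead.

```python
import struct


class StringDecoder:
    """Decode RegularString, RecordedString and RecalledString values."""

    def __init__(self):
        self.codebook = {}  # empty at the start of each data stream

    def _read_string_data(self, data, pos):
        (length,) = struct.unpack_from("<I", data, pos)
        start = pos + 4
        end = start + length
        if end > len(data):
            raise ValueError("truncated string data")
        return data[start:end].decode("utf-8"), end

    def read_string(self, data, pos):
        """Return (text, next_position) for the String starting at pos."""
        tag = data[pos:pos + 1]
        if tag == b"U":
            return self._read_string_data(data, pos + 1)
        if tag == b">":
            slot = data[pos + 1]  # CodebookPosition, 0..255
            text, end = self._read_string_data(data, pos + 2)
            self.codebook[slot] = text  # the most recent value applies
            return text, end
        if tag == b"<":
            slot = data[pos + 1]
            if slot not in self.codebook:
                raise ValueError("recall of unset codebook position %d" % slot)
            return self.codebook[slot], pos + 2
        raise ValueError("not a String tag: %r" % tag)
```

A caller would create one StringDecoder per document, so that the codebook starts out empty for every data stream.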
  185. Compliance
  186. ----------
  187. Implementations MUST implement all features of this protocol correctly,
  188. particularly on the decoding end. In the case of this protocol, a 95% correct
  189. implementation is 100% broken. Yes, this statement is redundant. ;-)
  190. Examples
  191. --------
  192. Non-ASCII octets are specified as in C strings. Continued lines are
  193. indicated by a trailing '\'; these should be joined together as one
  194. sequence of bytes.
  195. binmode-rpc:CU\003\0\0\0addA\002\0\0\0I\002\0\0\0I\002\0\0\0
  196. binmode-rpc:RI\004\0\0\0
  197. binmode-rpc:RFS\002\0\0\0 \
  198. U\011\0\0\0faultCodeI\001\0\0\0 \
  199. U\013\0\0\0faultStringU\021\0\0\0An error occurred
  200. binmode-rpc:RA\006\0\0\0 \
  201. >\000\003\0\0\0foo \
  202. >\001\003\0\0\0bar \
  203. <\000 \
  204. >\000\003\0\0\0baz \
  205. <\000 \
  206. <\001
  207. (This deserializes to ['foo', 'bar', 'foo', 'baz', 'baz', 'bar'].)
  208. binmode-rpc:RU\042\0\0\0Copyright \302\251 1995 J. Random Hacker
  209. (This is based on an example in the Unicode/UTF-8 FAQ (see above).)
  210. binmode-rpc:RA\010\0\0\0 \
  211. I\006\0\0\0 \
  212. tf \
  213. D\0042.75 \
  214. 8\02119980717T14:08:55 \
  215. U\003\0\0\0foo \
  216. B\003\0\0\0abc \
  217. S\001\0\0\0U\003\0\0\0runt
  218. Counter-Examples
  219. ----------------
  220. The following specimens are illegal, and SHOULD be rejected by a compliant
  221. implementation. Please test your code.
  222. * A different format name:
  223. binmode-rpc2:RI\004\0\0\0
  224. * A built-in type incorrectly encoded using 'O':
  225. binmode-rpc:ROU\006\0\0\0stringB\003\0\0\0xyz
  226. * A recall of an unrecorded string:
  227. binmode-rpc:R<\002
  228. * ISO Latin 1 data in a string. (UTF-8 required!)
  229. binmode-rpc:RU\041\0\0\0Copyright \251 1995 J. Random Hacker
  230. * UTF-8 character encoded with too many octets (based on an example in the
  231. Unicode/UTF-8 FAQ):
  232. binmode-rpc:RU\041\0\0\0Bad linefeed: \300\212 (too many bytes)
  233. A compliant implementation MUST NOT send any of these sequences.