Binary Serializer Format
This is the original thrift binary protocol implementation for providence. The serializer serializes all numbers as big-endian, using no zig-zag encoding etc. It has a "versioning" system, but there is only one version of the original protocol; v1.
Per default the binary serializer will write the versioned data stream, but accept both versioned with v1 protocol and unversioned when deserializing. Setting the serializer to strict mode will only accept the desired protocol version.
Service calls
Service calls comes in two versions, versioned and unversioned.
CALL ::= VERSIONED | UNVERSIONED
VERSIONED ::= VERSION 0x00 CALL_TYPE [32-bit name-size] [name] [32-bit sequence] MESSAGE
UNVERSIONED ::= [32-bit name-size] [name] CALL_TYPE [32-bit sequence] MESSAGE
VERSION ::= 0x80 VERSION_NO
VERSION_NO ::= 0x01
CALL_TYPE ::= 0x01 | 0x02 | 0x03 | 0x04
The type of call is detected by reading the first 4 bytes as a signed int. If the value is less than 0, it is versioned (hence the 0x80 byte), otherwise it is unversioned. The number is split into version and call type by:
version = (0x7FFF0000 & num) >>> 16
call_type = (byte) (0x000000FF & num)
The 'middle' 0 byte is currently not used. Each call type also determines what the 'message' field can be, and if the call expects a reply or not.
CALL 0x01 ::= method request wrapper
REPLY 0x02 ::= method response wrapper
EXCEPTION 0x03 ::= application exception
ONEWAY 0x04 ::= message request wrapper (no reply expected)
The call and oneway types are both method calls, where the oneway type does not expect, or require a reply. The call require a relply eve in the return type is 'void'. This is where a 'field' type can be void.
Messages
Messages are a stream of fields
, and terminated with a 0-byte. Each field
is self-contained. The null byte is in the place of the field type-id.
MESSAGE ::= [FIELD]* STOP
Fields
Each field is encoded as follows:
STOP ::= 0x00
FIELD ::= TYPE FIELD_ID VALUE
FIELD_ID ::= 32-bit integer (the field ID)
Encoding of values
In the binary protocol, each thrift fields value type is mapped to a type ID. It is mostly a 1-to-1 mapping, but with a few exceptions.
The field types are:
stop ::= 0
void ::= 1
bool ::= 2
byte ::= 3
double ::= 4
i16 ::= 6
i32 | enum ::= 8
i64 ::= 10
string | binary ::= 11
message ::= 12
map ::= 13
set ::= 14
list ::= 15
And the values are encoded as:
VALUE ::= VOID | BOOL | BYTE | I16 | I32 | I64 | DOUBLE | ENUM |
STRING | BINARY | MESSAGE | MAP | SET | LIST
VOID ::= [empty]
BOOL ::= (8-bit: 0 for false, 1 for true)
BYTE ::= (8-bit signed)
I16 ::= (16-bit signed)
I32 ::= (32-bit signed)
ENUM ::= (32-bit signed of value)
I64 ::= (64-bit signed)
DOUBLE ::= (64-bit 1:11:52 encoded double, 64-bit float)
STRING ::= (32-bit size) (utf-8 encoded data)
BINARY ::= (32-bit size) (raw data)
MESSAGE ::= as a message
MAP ::= (8-bit key type) (8-bit item type) (32-bit size) N * ([key] [item])
LIST | SET ::= (8-bit item type) (32-bit size) N * (item)