Filename extension | .bson |
---|---|
Internet media type | none[1] |
Type of format | Data interchange |
Extended from | JSON |
Standard(s) | no RFC yet |
Website | bsonspec.org |
BSON (/ˈbiːsɒn/) is a computer data interchange format used mainly as a data storage and network transfer format in the MongoDB database. It is a binary form for representing simple data structures and associative arrays (called objects or documents in MongoDB). The name "BSON" is based on the term JSON and stands for "Binary JSON".[2]
Data types and syntax
BSON documents (objects) consist of an ordered list of elements. Each element consists of a field name, a type, and a value. Field names are strings. Types include:
- string
- integer (32- or 64-bit)
- double (64-bit IEEE 754 floating point number)
- date (integer number of milliseconds since the Unix epoch)
- byte array (binary data)
- boolean (true and false)
- null
- BSON object
- BSON array
- regular expression
- JavaScript code
BSON types are nominally a superset of JSON types (JSON does not have a date or a byte array type, for example[3]), with one exception: BSON has no universal "number" type as JSON does.
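Because there is no universal number type, an encoder has to choose a concrete numeric type for each value. The following is a minimal sketch of one possible policy; the function name `bson_number_type` and the rule "use the smallest integer type that fits" are assumptions made for illustration, not something the format mandates.

```python
def bson_number_type(value):
    """Pick a BSON numeric type name for a Python number (hypothetical policy)."""
    # bool is a subclass of int in Python, but BSON has a separate boolean type.
    if isinstance(value, bool):
        raise TypeError("booleans are encoded as the BSON boolean type")
    if isinstance(value, float):
        return "double"  # 64-bit IEEE 754
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return "int32"
        if -2**63 <= value < 2**63:
            return "int64"
        raise OverflowError("integer does not fit in 64 bits")
    raise TypeError(f"not a number: {value!r}")


print(bson_number_type(1986))   # int32
print(bson_number_type(2**40))  # int64
print(bson_number_type(5.05))   # double
```

A JSON parser, by contrast, would see all three values simply as numbers.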
Efficiency
Compared to JSON, BSON is designed to be efficient both in storage space and in scan speed. Large elements in a BSON document are prefixed with a length field to facilitate scanning. In some cases, BSON will use more space than JSON due to the length prefixes and explicit array indices.[2]
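As a rough illustration of the last point, the sketch below hand-encodes a document holding a short array of small integers and compares its size with compact JSON. The helper name `bson_int32_array_doc` is hypothetical and handles only 32-bit integer array elements; it assumes the layout from the BSON specification, in which an array is stored as a document whose keys are the indices "0", "1", and so on.

```python
import json
import struct


def bson_int32_array_doc(key, values):
    """Encode {key: [int32, ...]} as BSON (illustrative helper, int32 only)."""
    # Each array element: type byte 0x10, index as a NUL-terminated key, int32 value.
    body = b"".join(
        b"\x10" + str(i).encode("ascii") + b"\x00" + struct.pack("<i", v)
        for i, v in enumerate(values)
    )
    # A document is: int32 total length, elements, trailing NUL.
    array = struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"
    element = b"\x04" + key.encode("utf-8") + b"\x00" + array
    return struct.pack("<i", 4 + len(element) + 1) + element + b"\x00"


values = [1, 2, 3, 4, 5]
bson_bytes = bson_int32_array_doc("a", values)
json_bytes = json.dumps({"a": values}, separators=(",", ":")).encode("utf-8")

print(len(bson_bytes), len(json_bytes))  # 48 17
```

Here every one-digit integer costs seven bytes in BSON (type byte, two-byte index key, four-byte value) against at most two characters in JSON.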
See also
- JSON
- Protocol Buffers
- Action Message Format
- Apache Thrift
- MessagePack
- Document-oriented database
- Abstract Syntax Notation One (ASN.1)
- Wireless Binary XML (WBXML)
References
External links
BSON
Version 1.0
BSON is a binary format in which zero or more key/value pairs are stored as a single entity. We call this entity a document.
The following grammar specifies version 1.0 of the BSON standard. We've written the grammar using a pseudo-BNF syntax. Valid BSON data is represented by the document non-terminal.
Basic Types
The following basic types are used as terminals in the rest of the grammar. Each type must be serialized in little-endian format.
byte | 1 byte (8-bits) |
int32 | 4 bytes (32-bit signed integer) |
int64 | 8 bytes (64-bit signed integer) |
double | 8 bytes (64-bit IEEE 754 floating point) |
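As an illustration, these basic types map directly onto format codes in Python's standard `struct` module; the sample values below are arbitrary, and the sketch assumes Python 3.8 or later for `bytes.hex` with a separator.

```python
import struct

# "<" selects little-endian byte order with standard sizes and no padding.
int32_bytes = struct.pack("<i", 1986)    # int32: 4 bytes, signed
int64_bytes = struct.pack("<q", 1986)    # int64: 8 bytes, signed
double_bytes = struct.pack("<d", 5.05)   # double: 8 bytes, IEEE 754

print(int32_bytes.hex(" "))   # c2 07 00 00
print(int64_bytes.hex(" "))   # c2 07 00 00 00 00 00 00
print(double_bytes.hex(" "))  # 33 33 33 33 33 33 14 40
```

The least significant byte comes first in each case, which is what little-endian serialization means.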
Non-terminals
The following specifies the rest of the BSON grammar. Note that quoted strings represent terminals, and should be interpreted with C semantics (e.g. "\x01" represents the byte 0000 0001). Also note that we use the * operator as shorthand for repetition (e.g. ("\x01"*2) is "\x01\x01"). When used as a unary operator, * means that the repetition can occur 0 or more times.
document | ::= | int32 e_list "\x00" | BSON Document |
e_list | ::= | element e_list | Sequence of elements |
 | \| | "" | |
element | ::= | "\x01" e_name double | Floating point |
 | \| | "\x02" e_name string | UTF-8 string |
 | \| | "\x03" e_name document | Embedded document |
 | \| | "\x04" e_name document | Array |
 | \| | "\x05" e_name binary | Binary data |
 | \| | "\x06" e_name | Undefined (deprecated) |
 | \| | "\x07" e_name (byte*12) | ObjectId |
 | \| | "\x08" e_name "\x00" | Boolean "false" |
 | \| | "\x08" e_name "\x01" | Boolean "true" |
 | \| | "\x09" e_name int64 | UTC datetime |
 | \| | "\x0A" e_name | Null value |
 | \| | "\x0B" e_name cstring cstring | Regular expression |
 | \| | "\x0C" e_name string (byte*12) | DBPointer (deprecated) |
 | \| | "\x0D" e_name string | JavaScript code |
 | \| | "\x0E" e_name string | Symbol (deprecated) |
 | \| | "\x0F" e_name code_w_s | JavaScript code w/ scope |
 | \| | "\x10" e_name int32 | 32-bit Integer |
 | \| | "\x11" e_name int64 | Timestamp |
 | \| | "\x12" e_name int64 | 64-bit integer |
 | \| | "\xFF" e_name | Min key |
 | \| | "\x7F" e_name | Max key |
e_name | ::= | cstring | Key name |
string | ::= | int32 (byte*) "\x00" | String |
cstring | ::= | (byte*) "\x00" | CString |
binary | ::= | int32 subtype (byte*) | Binary |
subtype | ::= | "\x00" | Binary / Generic |
 | \| | "\x01" | Function |
 | \| | "\x02" | Binary (Old) |
 | \| | "\x03" | UUID (Old) |
 | \| | "\x04" | UUID |
 | \| | "\x05" | MD5 |
 | \| | "\x80" | User defined |
code_w_s | ::= | int32 string document | Code w/ scope |
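To make the grammar more concrete, here is a minimal encoder sketch for a subset of the element types (double, string, embedded document, array, boolean, null, and the two integer types). The function names and the Python-to-BSON type mapping are assumptions chosen for illustration. The sketch relies on two length conventions from the specification: the int32 in `document` counts the whole document including itself and the trailing "\x00", and the int32 in `string` counts the string bytes plus the trailing "\x00".

```python
import struct


def encode_cstring(name):
    """cstring ::= (byte*) "\\x00"; the bytes themselves must not contain NUL."""
    raw = name.encode("utf-8")
    if b"\x00" in raw:
        raise ValueError("cstring must not contain NUL bytes")
    return raw + b"\x00"


def encode_string(text):
    """string ::= int32 (byte*) "\\x00"; int32 counts the bytes plus the NUL."""
    raw = text.encode("utf-8") + b"\x00"
    return struct.pack("<i", len(raw)) + raw


def encode_element(name, value):
    """Encode one element: type byte, e_name, then the type-specific payload."""
    key = encode_cstring(name)
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if value is None:
        return b"\x0A" + key
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return b"\x10" + key + struct.pack("<i", value)
        return b"\x12" + key + struct.pack("<q", value)
    if isinstance(value, str):
        return b"\x02" + key + encode_string(value)
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)
    if isinstance(value, list):
        # An array is a document whose keys are the indices "0", "1", ...
        indexed = {str(i): item for i, item in enumerate(value)}
        return b"\x04" + key + encode_document(indexed)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc):
    """document ::= int32 e_list "\\x00"; int32 is the total size in bytes."""
    e_list = b"".join(encode_element(k, v) for k, v in doc.items())
    return struct.pack("<i", 4 + len(e_list) + 1) + e_list + b"\x00"


print(encode_document({"hello": "world"}))
# b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
```

Binary data, ObjectId, regular expressions, and the remaining element types are left out to keep the sketch short; each would add one more branch in `encode_element`.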
Examples
The following are some example documents (in JavaScript / Python style syntax) and their corresponding BSON representations.
{"hello": "world"} |
→ | "\x16\x00\x00\x00\x02hello\x00 |
{"BSON": ["awesome", 5.05,1986]} |
→ | "\x31\x00\x00\x00\x04BSON\x00\x26\x00 |
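Reading such a byte sequence back is the mirror image of the grammar. The sketch below decodes a subset of element types into Python values; the function names are hypothetical, and the sample input is assembled by hand from the grammar (it begins with the same bytes as the first example) rather than copied from the specification.

```python
import struct


def read_cstring(buf, pos):
    """Return (text, position just past the terminating NUL)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def decode_document(buf, pos=0):
    """Decode the document starting at buf[pos] into a dict (subset of types)."""
    (size,) = struct.unpack_from("<i", buf, pos)
    end = pos + size - 1  # index of the document's trailing NUL
    if buf[end] != 0:
        raise ValueError("document is not NUL-terminated")
    pos += 4
    result = {}
    while pos < end:
        tag = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if tag == 0x01:  # double
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        elif tag == 0x02:  # string: int32 length includes the trailing NUL
            (n,) = struct.unpack_from("<i", buf, pos)
            value = buf[pos + 4 : pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif tag in (0x03, 0x04):  # embedded document or array
            (n,) = struct.unpack_from("<i", buf, pos)
            value = decode_document(buf, pos)
            if tag == 0x04:
                value = list(value.values())  # keys are "0", "1", ...
            pos += n
        elif tag == 0x08:  # boolean
            value = buf[pos] == 1
            pos += 1
        elif tag == 0x0A:  # null
            value = None
        elif tag == 0x10:  # int32
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
        elif tag == 0x12:  # int64
            (value,) = struct.unpack_from("<q", buf, pos)
            pos += 8
        else:
            raise ValueError(f"element type 0x{tag:02x} not handled in this sketch")
        result[name] = value
    return result


sample = (
    b"\x16\x00\x00\x00"           # total document size: 22 bytes
    b"\x02" b"hello\x00"          # element type 0x02 (string), key "hello"
    b"\x06\x00\x00\x00world\x00"  # string length 6, then "world" and NUL
    b"\x00"                       # document terminator
)
print(decode_document(sample))  # {'hello': 'world'}
```

Note how the length prefix on embedded documents lets the decoder advance past them with `pos += n`, independent of how the nested content is parsed.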