Programs usually deal with data in two representations.

  1. In memory, data is kept in data structures such as objects, lists, and hash tables
  2. Data stored on disk or sent over a network must be encoded as a sequence of bytes

Program Specifics

Most programming languages have built-in support for encoding in-memory data structures as byte sequences.

  • Python has pickle
  • Ruby has Marshal, etc.
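As a rough illustration of language-specific encoding, here is a minimal sketch of a round trip through Python's pickle. The `record` dictionary and its field names are made up for this example.

```python
import pickle

# Hypothetical in-memory record, used only for illustration.
record = {
    "user": "Martin",
    "favorite_number": 1337,
    "interests": ["daydreaming", "hacking"],
}

# In-memory data structure -> byte sequence
encoded = pickle.dumps(record)

# Byte sequence -> in-memory data structure
decoded = pickle.loads(encoded)

print(type(encoded).__name__)  # the encoded form is a bytes object
print(decoded == record)       # the round trip preserves the data
```

The resulting bytes are tied to Python, so a program written in another language cannot easily read them. Unpickling can also execute arbitrary code, so `pickle.loads` should only be given data from a trusted source.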

There are more standardized encodings as well:

  • JSON
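  • XML

But these encodings have flaws: schema support is optional, some data types are ambiguous (JSON does not distinguish integers from floating-point numbers, and XML cannot tell a number from a string of digits without a schema), and the textual form is less compact than a binary encoding.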
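The ambiguity can be sketched with Python's standard json module. The record below is hypothetical, and `parse_int=float` is used here only to imitate a parser that stores every number as a 64-bit float.

```python
import json

# Hypothetical record with an integer just above 2**53.
record = {"user": "Martin", "id": 2**53 + 1}
text = json.dumps(record)

# Python's parser keeps integers exact.
exact = json.loads(text)["id"]

# A parser that treats every number as a 64-bit float silently
# loses precision; parse_int=float imitates that behaviour.
as_double = json.loads(text, parse_int=float)["id"]

print(exact == 2**53 + 1)
print(as_double == 2**53 + 1)

# JSON has no type for raw binary data, so bytes cannot be encoded directly.
try:
    json.dumps({"blob": b"\x00\x01"})
    bytes_supported = True
except TypeError:
    bytes_supported = False

print(bytes_supported)
```

The same document therefore decodes to different values depending on the reader, and binary data has to be worked around, for example by Base64-encoding it into a string.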

Good Binary Encodings

Most binary encodings follow the same general layout:

  1. Declare how many objects follow in the sequence
  2. Prefix each value with leading bits that indicate its type and length
  3. Encode data

Popular formats include: