Programs usually deal with data in two representations.
- In memory data is kept in data structures
- Data stored on disk, or sent over a network needs to be encoded into a byte format
Program Specifics
Most programming languages have built in support for in memory data structures → byte sequences.
- Python has pickle
- Ruby has Marshal Etc
There are more standardized encodings as well:
- JSON
- XML But these encodings are flawed since they don’t adhere to a schema and are in general, too ambiguous and could be more compact.
Good Binary Encodings
Most binary encodings follow the same format for encoding:
- Define how many objects are in sequence
- Define lead bit for type and length
- Encode data Popular formats include: