Encoding / Decoding JSON objects in Python

Mar 12th, 2019 in Python, Javascript by Michael Cho



A summary of how to serialize Python data structures into JSON-compatible strings and deserialize them back again, which is especially useful when working with API requests and responses.

Python's `json` module handles both directions: turning Python data structures into JSON strings (https://www.json.org/json-en.html) and parsing JSON strings back into Python objects.

This article takes a closer look at how to use the module, using Python 3.6 (most of this will still hold true for other Python versions though).

 

Serializing from Python data structures to JSON string

Let's start with the scenario where you have some Python data structure (list, dictionary, etc.) and want to turn it into a JSON string.

The main way of doing this is to use json.dumps() (more common) or json.dump() (less common, for reasons explained shortly).

Here's an example of how it works:


import json

# Start with some Python structures
jedis = [{'name': 'Mace Windu', 'ability': 8, 'alive': False}, {'name': 'Luke', 'children': None, 'alive': True}]

# Serialize the list of dicts into a single JSON string
output = json.dumps(jedis)
print(output) # => [{"name": "Mace Windu", "ability": 8, "alive": false}, {"name": "Luke", "children": null, "alive": true}]
print(type(output)) # => <class 'str'>

The output is a string in which Python values have been converted to their JSON equivalents, e.g. `False` in Python becomes `false` in JSON, `None` becomes `null`, and so on. The default conversion table should handle most use cases, although if necessary you can subclass the `json.JSONEncoder` class to serialize your own objects.
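As a rough sketch of what subclassing `json.JSONEncoder` can look like: the `JediEncoder` class and the sample data below are made up for illustration, and the choice to turn dates into ISO strings and sets into sorted lists is just one reasonable convention, not something the module prescribes.

```python
import json
from datetime import date


class JediEncoder(json.JSONEncoder):
    """Hypothetical encoder that also handles date and set objects."""

    def default(self, obj):
        # default() is only called for objects the standard table can't handle
        if isinstance(obj, date):
            return obj.isoformat()
        if isinstance(obj, set):
            return sorted(obj)
        # Let the base class raise the usual TypeError for anything else
        return super().default(obj)


jedi = {"name": "Luke", "born": date(1977, 5, 25), "skills": {"piloting", "force"}}
encoded = json.dumps(jedi, cls=JediEncoder)
print(encoded)  # => {"name": "Luke", "born": "1977-05-25", "skills": ["force", "piloting"]}
```

The encoder is passed through the `cls` parameter of json.dumps(); without it, the same call raises a TypeError because `date` and `set` are not in the default conversion table.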

json.dumps() also accepts a number of optional parameters which you may find useful - I find myself overriding defaults for `ensure_ascii` and `sort_keys` sometimes.

For example:


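profile = {"name": "Darth", "motto": "i ♥ cats"}
# By default, non-ASCII characters are escaped
print(json.dumps(profile))  # => {"name": "Darth", "motto": "i \u2665 cats"}
# ensure_ascii=False keeps them as-is, sort_keys=True orders keys alphabetically
print(json.dumps(profile, ensure_ascii=False, sort_keys=True)) # => {"motto": "i ♥ cats", "name": "Darth"}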
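Two other optional parameters that can come in handy are `indent` and `separators`. The snippet below is only a small illustration using the same sample `profile` dict: `indent` pretty-prints the output for humans, while a tighter `separators` tuple strips whitespace for a more compact payload.

```python
import json

profile = {"name": "Darth", "motto": "i ♥ cats"}

# Pretty-printed, sorted and with non-ASCII characters kept as-is
pretty = json.dumps(profile, indent=2, sort_keys=True, ensure_ascii=False)
print(pretty)

# Compact form: no spaces after "," and ":"
compact = json.dumps(profile, separators=(",", ":"))
print(compact)  # => {"name":"Darth","motto":"i \u2665 cats"}
```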

Difference between `json.dumps()` and `json.dump()`

Nearly all the time I find myself using json.dumps(). Far less commonly, json.dump() can be used - the only difference is that it takes a file-like object as its second parameter and writes the JSON to it instead of returning a string.

For example, writing to a file or IO object, like so:


weapons = ["death star", "lightsaber"]
with open("my_file.json", "w") as f:
    json.dump(weapons, f)

Note: Because json.dump() always writes str objects (just as json.dumps() always returns a str), you must ensure your file-like object can accept them. In the example above, if you had used "wb" as the mode for opening the file instead of "w" (ie binary mode, which expects a bytes object), you would get a TypeError: a bytes-like object is required, not 'str'.
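The same behaviour can be reproduced without touching the filesystem. This is a minimal sketch that uses in-memory buffers from the standard `io` module as stand-ins for files opened in "w" and "wb" mode; the variable names are arbitrary.

```python
import io
import json

weapons = ["death star", "lightsaber"]

# A text buffer accepts the str chunks that json.dump() writes
text_buffer = io.StringIO()
json.dump(weapons, text_buffer)
print(text_buffer.getvalue())  # => ["death star", "lightsaber"]

# A binary buffer rejects them, just like a file opened with "wb"
binary_buffer = io.BytesIO()
try:
    json.dump(weapons, binary_buffer)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# If bytes are needed, encode the str from json.dumps() yourself
binary_buffer.write(json.dumps(weapons).encode("utf-8"))
```

Encoding the result of json.dumps() explicitly is one simple way to write JSON to a binary stream.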

 

Deserializing from JSON string to Python data structures

The corresponding functions for deserializing from a JSON string to Python structures are json.loads() and json.load(), and once again the json.loads() variant is far more commonly used.

The "s" suffix on `dumps()` and `loads()` signifies string, ie dump string / load string, as opposed to the plain `dump()` / `load()`, which work with file-like objects.

Here's how it works in practice:


# From a JSON string
family = '{"father": "Anakin", "sister": "Leia", "brother": null}'
print(json.loads(family))  # => {'father': 'Anakin', 'sister': 'Leia', 'brother': None}

The conversion from JSON data structures to Python structures happens using the same conversion table mentioned earlier, applied in reverse.
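To make the reverse conversion concrete, here is a small illustrative snippet (the sample JSON is made up) that prints the Python type each JSON value ends up as. It also shows that a round trip is not always lossless: a tuple is written as a JSON array and therefore comes back as a list.

```python
import json

document = '{"count": 3, "ratio": 0.5, "tags": ["a", "b"], "active": true, "owner": null}'
decoded = json.loads(document)

# Map each key to the name of the Python type it was decoded into
types = {key: type(value).__name__ for key, value in decoded.items()}
print(types)
# => {'count': 'int', 'ratio': 'float', 'tags': 'list', 'active': 'bool', 'owner': 'NoneType'}

# Tuples have no JSON equivalent, so they come back as lists
round_tripped = json.loads(json.dumps((1, 2)))
print(round_tripped)  # => [1, 2]
```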

json.loads() also accepts some optional parameters of its own, mostly around how to parse numerical values. For example:


import decimal

inventory = '{"robots": 3}'
print(json.loads(inventory))  # => {'robots': 3}

# parse_int is called with the string of every JSON integer
print(json.loads(inventory, parse_int=decimal.Decimal))  # => {'robots': Decimal('3')}
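The `parse_float` parameter works the same way for JSON numbers with a fractional part, which can be useful when binary floating point rounding is a concern (prices, for instance). A brief sketch with made-up data:

```python
import decimal
import json

prices = '{"lightsaber": 1999.99, "quantity": 2}'

# Default behaviour: fractional numbers become Python floats
default = json.loads(prices)
print(type(default["lightsaber"]))  # => <class 'float'>

# With parse_float, the raw number string is handed to Decimal instead
exact = json.loads(prices, parse_float=decimal.Decimal)
print(exact)  # => {'lightsaber': Decimal('1999.99'), 'quantity': 2}
```

Note that `parse_float` only affects fractional numbers; the integer `2` is still decoded as a plain int unless `parse_int` is also given.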

Similarly, the json.load() function is used for reading from a file-like object.


with open("family.json", "r") as f:
    # json.load() reads the whole file and returns the parsed Python object
    family = json.load(f)

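However, unlike with json.dump(), I haven't had any issues with using different modes for the file object - both "r" and "rb" work just fine. That is because, as of Python 3.6, json.loads() (which json.load() uses under the hood) accepts bytes as well as str, provided the bytes are encoded as UTF-8, UTF-16 or UTF-32.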
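This can be checked without creating a file, again using in-memory buffers from the `io` module as stand-ins for files opened in "r" and "rb" mode. It is a sketch that assumes Python 3.6 or newer, and the sample payload is made up.

```python
import io
import json

payload = '{"father": "Anakin", "motto": "i ♥ cats"}'

# Text stream, equivalent to a file opened with "r"
from_text = json.load(io.StringIO(payload))

# Binary stream, equivalent to a file opened with "rb" (UTF-8 encoded)
from_bytes = json.load(io.BytesIO(payload.encode("utf-8")))

print(from_text == from_bytes)  # => True
```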