pickle and base64 — Object Serialization and Binary-to-Text Conversion

Use pickle.dumps / loads to serialize dicts and custom class instances to bytes and restore them in full, and base64.b64encode / b64decode to carry binary data over text-only channels.

We'll cover two modules used when moving data through files or across the network. pickle saves Python objects (dicts, class instances, etc.) in a form that can be fully restored later, while base64 converts binary data like images or audio into a string of "alphanumerics plus two symbols" so you can carry it in email, JSON, or (with the URL-safe variant) URLs. Both are standard library modules: pickle converts data at internal boundaries, and base64 converts it for external transports.

Sorting out what each module is for

The two modules naturally split by input/output combination. Use the diagram below to grab the big picture, then we'll dig into typical uses and APIs in each section.

Roles of the two modules
pickle: object ↔ bytes (internal / between processes)
base64: binary ↔ ASCII (transport: JSON / mail)
pickle does object ↔ bytes (internal storage, between processes); base64 does binary ↔ ASCII text (transport conversion: JSON / mail / URLs). Their use cases don't overlap, so you'll sometimes combine them.

pickle — Convert Python Objects Straight to Bytes

pickle is a standard library module that converts Python objects into a form that can be fully restored later. The result is a bytes value (a contiguous sequence of integers from 0 to 255 — a type for handling "sequences of numbers" rather than "sequences of characters" like strings). Bytes are the basic format for situations where machines, not humans, will read the data — saving to files, sending over the network, handing data to another process.
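To make the "sequence of integers" idea concrete, here is a minimal sketch. The value b"Hi" is only an illustration chosen for this example.

```python
# A bytes value is a sequence of integers in the range 0-255.
sample = b"Hi"                      # illustrative value, not from the lesson

as_ints = list(sample)              # each element is an int
first = sample[0]                   # indexing yields an int, not a 1-char string
as_text = sample.decode("ascii")    # turning bytes into str needs an explicit decode

print(as_ints)                      # [72, 105]
print(first)                        # 72
print(type(sample).__name__)        # bytes
print(as_text)                      # Hi
```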

Four typical uses for pickle
save ML model (model.pkl files) / interprocess transfer (multiprocessing) / compute cache (skip heavy work) / task queue (Celery moves args)
Saving ML models / transferring objects between processes / caching expensive computations / task queues. The common thread is that they all stay inside the Python world — for talking to the outside world, use json.

Unlike json, pickle can save not just dicts, lists, strings, and numbers but also custom class instances and module-level functions. Functions are stored by reference to their name rather than by value, so lambdas cannot be pickled. This makes pickle useful for saving ML models or passing objects between Python processes.

Symmetry of pickle.dumps / loads
write: Python object (dict / instance / etc.) → pickle.dumps → bytes (binary form)
read: bytes (binary form) → pickle.loads → Python object (dict / instance / etc.)
Top row: dumps = the Python object → bytes direction. Bottom row: loads = the bytes → Python object direction. Unlike json, custom class instances are preserved as-is, but the return value is bytes, not text (so when writing to a file, open in binary mode "wb" / "rb").
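The contrast with json can be sketched as follows. The Player class and its fields are hypothetical names invented for this example, not part of the lesson.

```python
import json
import pickle


class Player:
    """Hypothetical example class; not part of the lesson."""

    def __init__(self, name, scores):
        self.name = name
        self.scores = scores


alice = Player("Alice", [80, 90, 75])

# json only knows dicts, lists, strings, numbers, booleans and None,
# so a custom instance raises TypeError.
try:
    json.dumps(alice)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the instance: dumps gives bytes, loads rebuilds the object.
blob = pickle.dumps(alice)
restored = pickle.loads(blob)

print(json_ok)                          # False
print(type(blob).__name__)              # bytes
print(type(restored).__name__)          # Player
print(restored.name, restored.scores)   # Alice [80, 90, 75]
```

Note that pickle stores a reference to the class (its module and name) plus the instance's attributes, so the class definition has to be importable wherever loads runs.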

Never call pickle.loads on data from someone else

pickle.loads can execute arbitrary Python code, so passing in bytes from an untrusted source becomes an arbitrary-code-execution vulnerability. Limit yourself to "I made it, I read it back" scenarios and never use it on data received over the network or from someone else. For external communication, use json instead.
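A deliberately harmless sketch of why this is dangerous: an object can tell pickle which callable to run when it is loaded. The Demo class is hypothetical, and sorted stands in for whatever function an attacker might choose.

```python
import pickle


class Demo:
    """Hypothetical class used only to illustrate the risk."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        # An attacker could name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)        # loads calls sorted(...) as the data instructs

print(result)                      # [1, 2, 3]
print(isinstance(result, Demo))    # False
```

The loaded value is not a Demo instance at all. It is whatever the function named inside the pickle returned, which is why the bytes themselves must be trusted.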

Function | Role | Notes
pickle.dumps(obj) | Python object → bytes | returns bytes (not str)
pickle.loads(b) | bytes → Python object | trusted sources only
pickle.dump(obj, file) | write to a file | open in binary mode: open(..., "wb")
pickle.load(file) | read from a file | open in binary mode: open(..., "rb")
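The file-based pair can be sketched like this. The file name scores.pkl is a hypothetical choice, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import pickle
import tempfile

data = {"name": "Alice", "scores": [80, 90, 75]}

# Write into a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scores.pkl")   # hypothetical file name

    with open(path, "wb") as f:    # "wb": write, binary
        pickle.dump(data, f)

    with open(path, "rb") as f:    # "rb": read, binary
        loaded = pickle.load(f)

print(loaded == data)              # True
```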

Convert a score dictionary to bytes with pickle and check the return type. Pickle's bytes can vary in length across environments, so we'll verify it via type.

① Import the pickle module.

② Set up the dictionary {"name": "Alice", "scores": [80, 90, 75]} in a variable named data.

③ Convert data to bytes with pickle, store it in pickled, and print the type name as Type: ◯◯ (it should be bytes).

Restore the pickled you made in Practice 1 back into a Python object and verify it matches the original data exactly. The variables data and pickled from Practice 1 are still available, so you only need to write the restoration step.

④ Restore pickled with pickle.loads into a variable named restored.

⑤ Print Restored name: ◯◯ (the name field of the restored object) and Equal to original: True / False for the equality check.
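One possible solution sketch covering steps ① to ⑤ of both practices. It is written as a single script, so data and pickled are recreated here rather than carried over from an earlier cell.

```python
import pickle

# Practice 1: dict -> bytes
data = {"name": "Alice", "scores": [80, 90, 75]}
pickled = pickle.dumps(data)
print(f"Type: {type(pickled).__name__}")          # Type: bytes

# Practice 2: bytes -> dict
restored = pickle.loads(pickled)
print(f"Restored name: {restored['name']}")       # Restored name: Alice
print(f"Equal to original: {restored == data}")   # Equal to original: True
```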

base64 — Move Binary Data Through Text

base64 is a module that converts arbitrary binary data into a string of "alphanumerics plus two symbols". Since 3 bytes become 4 characters, the size grows about 1.33×, but it's used widely in situations where you need to carry binary data through text-only channels, such as email attachments, embedding images in JSON or URLs, or storing bytes in SQL.

The two basic APIs:

Function | Input | Returns
base64.b64encode(bytes) | bytes (the source data) | Base64-encoded bytes
base64.b64decode(Base64) | bytes / str (Base64 form) | the original bytes

Four typical uses for base64
image in JSON / HTML (data:image/png;base64,…) / email attachments (MIME mechanism) / JWT tokens for web auth (payload encoding) / certs in env vars (multi-line data on one line)
Embedding images in JSON / HTML (data URLs) / email attachments (MIME) / JWT tokens (JSON Web Token — a web auth token that carries user info with tamper detection) payloads / putting certificates into env vars or config files. The common thread is moving binary through text-only channels.
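As a rough illustration of the "binary inside JSON" case, the sketch below uses four arbitrary bytes as a stand-in for real image data. The field name "file" is a hypothetical choice for this example.

```python
import base64
import json

raw = bytes([0, 255, 16, 128])     # stand-in for image or audio bytes

# bytes -> Base64 bytes -> str, so the value fits in a JSON string
payload = json.dumps({"file": base64.b64encode(raw).decode("ascii")})
print(payload)                      # {"file": "AP8QgA=="}

# Receiving side: JSON -> Base64 str -> original bytes
received = json.loads(payload)
recovered = base64.b64decode(received["file"])
print(recovered == raw)             # True
```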

How base64 works
encode: binary data b"Python is fun!" → b64encode → Base64 string UHl0aG9uIGlzIGZ1biE=
decode: Base64 string UHl0aG9uIGlzIGZ1biE= → b64decode → binary data b"Python is fun!"
Top row: encode = the binary data → Base64 string direction. Bottom row: decode = the Base64 string → original binary data direction. Internally, each 3-byte group (24 bits) is split into four 6-bit chunks, and each chunk is mapped to one of 64 characters: A–Z, a–z, 0–9, "+", and "/". The output is therefore 4/3 the length of the input, and when the input length is not a multiple of 3, the final group is padded with = to a full 4 characters.
import base64

# bytes → Base64
binary = b"Python is fun!"
encoded = base64.b64encode(binary)
print(encoded)              # b'UHl0aG9uIGlzIGZ1biE='
print(encoded.decode())     # UHl0aG9uIGlzIGZ1biE= (converted to str)

# Base64 → bytes
decoded = base64.b64decode(encoded)
print(decoded)              # b'Python is fun!'
print(decoded == binary)    # True
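
The 3-bytes-to-4-characters step can also be reproduced by hand. This is only an illustrative sketch for one full 3-byte group (no padding), not how the module is implemented. ALPHABET is a name introduced here for the standard 64-character table.

```python
import base64
import string

# The standard Base64 alphabet: index 0-63 -> character
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

chunk = b"Pyt"                                  # first 3 bytes of b"Python is fun!"
n = int.from_bytes(chunk, "big")                # the 3 bytes as one 24-bit integer

# Slice the 24 bits into four 6-bit values, most significant first.
indices = [(n >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
manual = "".join(ALPHABET[i] for i in indices)

print(indices)                                  # [20, 7, 37, 52]
print(manual)                                   # UHl0
print(manual == base64.b64encode(chunk).decode("ascii"))   # True
```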

For URLs, use urlsafe_b64

Standard b64encode output can contain + and /, which have special meanings in URLs (and / in file paths), so embedding it directly in a URL requires percent-encoding. For URLs and filenames, choose base64.urlsafe_b64encode instead: it replaces + with - and / with _ to give you a URL-safe variant. Use base64.urlsafe_b64decode to restore.
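
A small sketch of the difference. The two input bytes are picked only because their standard Base64 form happens to contain both + and /.

```python
import base64

raw = b"\xfb\xff"   # chosen because its Base64 form contains both + and /

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)     # +/8=
print(url_safe)     # -_8=
print(base64.urlsafe_b64decode(url_safe) == raw)   # True
```

The = padding is kept in both variants; only the two symbol characters differ.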

Encode a bytes string with Base64, decode it again, and verify the result matches the original.

① Import the base64 module.

② Set up the bytes string b"Python is fun!".

③ Don't print the Base64-encoded bytes as bytes — convert it to an ASCII string first and print it as Base64 string: ◯◯.

④ Decode the Base64 back to the original binary and print whether it matches the original as Equal to original: True / False.

QUIZ

Knowledge Check

Answer each question one by one.

Q1. What's the danger of calling pickle.loads on untrusted pickle data?

Q2. What's the type of pickle.dumps(obj)'s return value?

Q3. Which function is best when you want to put a base64 string in a URL query parameter?