This article covers reading and writing two common structured data formats: JSON, frequently used by web APIs, and CSV, which opens cleanly in spreadsheet software.
JSON vs CSV: When to use which?
JSON can express key/value pairs and nested structures, so it shows up everywhere in web API responses and config files. CSV is a flat "one row = one record" table format that opens directly in spreadsheets like Excel, which makes it a staple of data analysis.
json — Two-Way Conversion Between Python Objects and JSON Strings
JSON (JavaScript Object Notation) is a lightweight text format that expresses structure with just "key/value pairs" and "ordered lists", used as the standard for web API responses and config files. Python's json module converts dicts, lists, strings, numbers, bools, and None to and from JSON.
Two basic functions are enough to remember. json.dumps(object) converts Python → JSON string, and json.loads(string) converts JSON string → Python. The trailing s stands for string — it sets them apart from dump / load, which work directly with files.
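A minimal round-trip sketch of the two functions. The sample dict and variable names are made up for illustration; note how True and None come out as JSON's true and null.

```python
import json

# Sample data chosen for illustration only
data = {"name": "Alice", "active": True, "score": None, "tags": ["a", "b"]}

text = json.dumps(data)        # Python -> JSON string
restored = json.loads(text)    # JSON string -> Python

print(text)                    # {"name": "Alice", "active": true, "score": null, "tags": ["a", "b"]}
print(type(text).__name__)     # str
print(restored == data)        # True
```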
Symmetry of json.dumps and json.loads
dumps ("dump string") is the Python → JSON string direction; loads ("load string") is the JSON string → Python direction. The variants without the s (dump / load) work directly with file objects.
| Function / option | Role | Notes |
|---|---|---|
| json.dumps(obj) | Python → JSON string | returns a str |
| json.loads(text) | JSON string → Python | returns a dict / list / etc. |
| json.dump(obj, file) | Python → write to file | pass the f from open() |
| json.load(file) | file → Python | pass the f from open() |
| indent=N | pretty-print with N-space indent | for human readers |
| ensure_ascii=False | emit non-ASCII as-is | the default escapes them |
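The file-based variants can be sketched as follows. The file name settings.json and the sample dict are hypothetical, and a temporary directory is used only so the example leaves nothing behind.

```python
import json
import os
import tempfile

settings = {"theme": "dark", "font_size": 14}  # illustrative data

# A temporary directory keeps the example from touching real files
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings.json")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=2)       # Python -> file, 2-space indent

    with open(path, encoding="utf-8") as f:
        raw = f.read()                         # the text as stored on disk

    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)                  # file -> Python

print(raw)
print(loaded == settings)  # True
```

Here indent=2 is an arbitrary choice; any non-negative integer gives that many spaces per nesting level.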
ensure_ascii defaults to True
By default, json.dumps({"name": "café"}) outputs '{"name": "caf\u00e9"}' — non-ASCII characters get escaped as \u. It's technically valid, but hard for humans to read and bloats file size, so for data containing non-ASCII characters (accents, emoji, CJK, etc.), make a habit of passing ensure_ascii=False.
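A short sketch comparing the two settings on the same input. Both strings parse back to the same object, so the choice affects readability and size rather than meaning.

```python
import json

data = {"name": "café"}

escaped = json.dumps(data)                       # default: ensure_ascii=True
readable = json.dumps(data, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)    # non-ASCII appears as a \u escape
print(readable)   # non-ASCII appears as-is

# Either form round-trips to the same Python object
print(json.loads(escaped) == json.loads(readable))  # True
```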
Convert user info to JSON, parse it back into a Python object, and check the type of each variable with type().
① Import the json module.
② Build a dictionary with name="Alice" and an item list ["Apple", "Banana"] (two keys: name and items).
③ Convert the dict to a pretty-printed (indented) JSON string with non-ASCII escaping disabled (in case the data contains accents, emoji, etc.) and print it, then print type(text): followed by the type of the JSON string.
④ Parse the JSON string back into a Python object and extract items from parsed, then print type(parsed): and type(items): followed by the type of each.
⑤ From the parsed object, print name: ◯◯ and First item: ◯◯ (the first element of items).
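One possible way to work through steps ① to ⑤, offered as a sketch rather than the official answer. The indent width of 2 is an assumption; the exercise only asks for indented output.

```python
# ① Import the json module
import json

# ② Build the dictionary
user = {"name": "Alice", "items": ["Apple", "Banana"]}

# ③ Python -> pretty-printed JSON string (indent=2 is an assumed width)
text = json.dumps(user, indent=2, ensure_ascii=False)
print(text)
print("type(text):", type(text))

# ④ JSON string -> Python object
parsed = json.loads(text)
items = parsed["items"]
print("type(parsed):", type(parsed))
print("type(items):", type(items))

# ⑤ Read values out of the parsed object
print("name:", parsed["name"])
print("First item:", items[0])
```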
csv Basics — Handle Rows with reader and writer
CSV (Comma-Separated Values) is a plain format where one comma-separated line equals one record, and since spreadsheet software like Excel can open it directly, it's everywhere in business workflows. Python's csv module provides functions for reading and writing this format row by row.
The basics are csv.writer(file) and csv.reader(file): the former writes a list of values as a single row, the latter reads CSV one row at a time as a list of values. Two gotchas: first, everything you read back is a string — if you need integers, convert with int() yourself. Second, always pass newline='' to open(...) so the csv module can manage newline characters itself.
What csv.writer and csv.reader do
writer.writerow(list) writes a list of values as one CSV row. reader does the reverse: it pulls a CSV row out as a list of values, and crucially, all values come out as strings.
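The "everything comes back as a string" point can be seen in a small in-memory sketch. io.StringIO stands in for a real file here, which is a simplification for the example and not part of the lesson.

```python
import csv
import io

buffer = io.StringIO(newline="")   # in-memory stand-in for a file
writer = csv.writer(buffer)
writer.writerow(["name", "age"])
writer.writerow(["Alice", 30])     # 30 is an int when written

buffer.seek(0)                     # rewind before reading
rows = list(csv.reader(buffer))
print(rows)                        # [['name', 'age'], ['Alice', '30']]

age = int(rows[1][1])              # convert back explicitly
print(age + 1)                     # 31
```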
Always pass newline='' to open
The csv module manages newline characters itself, so you need to pass newline='' like open("x.csv", "w", newline=''). Skip it and you can end up with CSVs that contain blank rows on Windows — a classic gotcha called out in the official Python docs.
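A sketch of what "manages newline characters itself" means in practice: with the default dialect, the writer emits \r\n on its own, which shows up when the file is read back as raw bytes. The temporary directory is just scaffolding for the example.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x.csv")

    # newline="" stops open() from translating line endings
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(["Alice", 30])

    # Read the raw bytes to see exactly what was written
    with open(path, "rb") as f:
        raw = f.read()

print(raw)   # b'Alice,30\r\n'
```

Without newline="", text mode on Windows would translate the \n inside that \r\n again, producing \r\r\n, and that extra \r is what spreadsheets display as blank rows.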
Write a list of users to output.csv on the in-browser virtual filesystem (VFS), then read it back row by row.
① Import the csv module.
② Open output.csv in write mode (with newline=''), build a writer, and write 3 rows: a header ["name", "age"] and two data rows ["Alice", 30] and ["Bob", 25].
③ Reopen the same file in read mode (with newline=''), build a reader, and print rows one at a time with for row in reader:.
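One possible solution sketch for steps ① to ③. It writes into a temporary directory so it runs anywhere without leaving files behind; in the lesson's environment you would pass "output.csv" to open() directly.

```python
# ① Import the csv module
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.csv")

    # ② Write a header row and two data rows
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age"])
        writer.writerow(["Alice", 30])
        writer.writerow(["Bob", 25])

    # ③ Read the rows back one at a time
    rows = []
    with open(path, "r", newline="") as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)
            rows.append(row)
```

The ages print as '30' and '25', since values read from CSV are strings.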
DictWriter and DictReader — Read and Write by Column Name
The csv.writer / reader from the previous section work by position, so adding columns or changing their order forces you to rewrite every row[0] / row[1] access. DictWriter / DictReader are versions that read and write by column name (header) — you can write a list of dicts straight to CSV and read it back as a list of dicts.
Real-world data is mostly CSVs with a header row, so in actual projects you'll reach for these far more often.
Why DictWriter / DictReader are handy
DictWriter turns a list of dicts into a CSV with a header row defined by fieldnames. DictReader does the reverse: it reads a CSV with a header row as a list of dicts, so you can access values by column name like row["name"].
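To make the column-order argument concrete, here is a sketch with two made-up CSV strings that hold the same record in different column orders. The helper function names are hypothetical.

```python
import csv
import io

# Two hypothetical exports of the same data with different column orders
csv_a = "name,age\nAlice,30\n"
csv_b = "age,name\n30,Alice\n"

def names_by_position(text):
    """Positional access: silently assumes name is column 0."""
    reader = csv.reader(io.StringIO(text))
    next(reader)                       # skip the header row
    return [row[0] for row in reader]

def names_by_header(text):
    """Header-based access: looks the column up by name."""
    return [row["name"] for row in csv.DictReader(io.StringIO(text))]

print(names_by_position(csv_a), names_by_position(csv_b))  # ['Alice'] ['30']
print(names_by_header(csv_a), names_by_header(csv_b))      # ['Alice'] ['Alice']
```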
Write a list-of-dicts of users to users.csv, then read it back and print each row formatted. Try reading by column name instead of by position.
① Import csv.
② Build a list of two users (3 columns: name / age / city).
- First: name="Alice", age=30, city="Tokyo"
- Second: name="Bob", age=25, city="Osaka"
③ Open users.csv in write mode (with newline=''), build a DictWriter with fieldnames=["name", "age", "city"], and write the header row followed by the data rows.
④ Reopen the file in read mode (with newline=''), build a DictReader, and print each row in the form {name} ({age} y/o) {city}.
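A sketch of one way to complete steps ① to ④. As an assumption for portability, it writes users.csv into a temporary directory; in the lesson's environment the bare file name would do.

```python
# ① Import csv
import csv
import os
import tempfile

# ② Two users with three columns each
users = [
    {"name": "Alice", "age": 30, "city": "Tokyo"},
    {"name": "Bob", "age": 25, "city": "Osaka"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "users.csv")

    # ③ Header row comes from fieldnames, data rows from the dicts
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
        writer.writeheader()
        writer.writerows(users)

    # ④ Each row is a dict keyed by column name
    lines = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            line = f"{row['name']} ({row['age']} y/o) {row['city']}"
            print(line)
            lines.append(line)
```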
Real-world Example: Aggregate titanic.csv
So far we've made small datasets in code and written them out. Let's finish by reading a real dataset and aggregating it. The subject is the famous Titanic dataset on Kaggle (891 rows / 12 columns), with columns like PassengerId / Survived (0 = died, 1 = survived) / Pclass (cabin class) / Name / Sex / Age / Fare.
The python_console used for this practice preloads the external CSV into the in-browser virtual filesystem (VFS) via fileUrls, so your code can just call open("titanic.csv"). We'll write the same kind of aggregation with both csv.reader (positional) and csv.DictReader (by column name).
Use the positional csv.reader to read titanic.csv and count the total number of passengers and survivors.
① Import the csv module.
② Open titanic.csv in read mode (with newline='') and build a reader.
③ Skip the header row with next(reader).
④ Loop over the remaining rows with for row in reader:, counting total (every row) and survived (rows where the Survived column, at index 1, is "1").
⑤ Print Total: 891 and Survived: 342.
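The steps above can be sketched as a small function. Because titanic.csv may not be present where you run this, the sketch first runs on a made-up three-row sample that mimics the first columns, then processes the real file only if it exists; count_survivors is a hypothetical helper name.

```python
import csv
import io
import os

def count_survivors(f):
    """Return (total, survived) for a Titanic-style CSV file object."""
    reader = csv.reader(f)
    next(reader)                 # skip the header row
    total = survived = 0
    for row in reader:
        total += 1
        if row[1] == "1":        # index 1 is the Survived column
            survived += 1
    return total, survived

# Made-up sample so the sketch runs without the dataset
sample = "PassengerId,Survived,Pclass\n1,0,3\n2,1,1\n3,1,3\n"
print(count_survivors(io.StringIO(sample)))   # (3, 2)

# With the real file available, print in the exercise's format
if os.path.exists("titanic.csv"):
    with open("titanic.csv", newline="", encoding="utf-8") as f:
        total, survived = count_survivors(f)
    print("Total:", total)
    print("Survived:", survived)
```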
Use the column-name-based csv.DictReader to read titanic.csv and compute the mean of the Age column. The Titanic data has rows where Age is blank, so we'll handle that too.
① Import csv.
② Open titanic.csv in read mode (with newline='') and build a DictReader.
③ In the loop, take row["Age"] from each row and, when it is non-empty, append float(row["Age"]) to a list.
④ Print the count of valid rows and the average age (2 decimal places) like this:
- Valid records: ◯ (expected: 714)
- Average age: ◯◯.◯◯ (expected: 29.70)
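A sketch of the blank-handling logic, again demonstrated on a made-up sample so it runs without the dataset. age_stats is a hypothetical helper, and the real file is read only if it is present.

```python
import csv
import io
import os

def age_stats(f):
    """Return (count, mean) over rows whose Age column is non-empty."""
    ages = [float(row["Age"]) for row in csv.DictReader(f) if row["Age"] != ""]
    mean = sum(ages) / len(ages) if ages else 0.0   # avoid dividing by zero
    return len(ages), mean

# Made-up sample: the second row has a blank Age and is skipped
sample = "Name,Age\nA,20\nB,\nC,31\n"
count, mean = age_stats(io.StringIO(sample))
print(f"Valid records: {count}")     # Valid records: 2
print(f"Average age: {mean:.2f}")    # Average age: 25.50

# With the real file available, print in the exercise's format
if os.path.exists("titanic.csv"):
    with open("titanic.csv", newline="", encoding="utf-8") as f:
        n, avg = age_stats(f)
    print(f"Valid records: {n}")
    print(f"Average age: {avg:.2f}")
```

Calling float("") raises ValueError, which is why the blank check has to come before the conversion.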
Knowledge Check
Q1. What does json.dumps({"name": "café"}, ensure_ascii=False) include in its output?
Q2. What's the type of values returned when reading rows back with csv.reader?
Q3. What's best suited for reading a CSV with a header row as a list of dicts?