Learn by reading through in order

Functions ② — String Functions (LENGTH / TRIM / REPLACE / UPPER / SUBSTR)

The second of three SQL function articles. Cover LENGTH for counting characters, TRIM for stripping whitespace, REPLACE for substitutions, UPPER / LOWER for case conversion, and SUBSTR — all on a CSV-loaded staff dataset.

Data used in this article — the staff table

The second function article focuses on string functions. We'll walk through the functions you'll reach for whenever you need to clean or transform text pulled from a live table: length, whitespace stripping, substitution, case conversion, and substring extraction.

The dataset is the same staff table as last time (10 rows: name / city / salary, etc.). Using the names in the name column as the star, four exercises cover length calculation, last-name substitution, uppercasing, and initial extraction.

Before diving into the exercises, let's check the column definitions and sample data of the staff table.

① Run PRAGMA table_info(staff); to see the column names, types, and primary key.

② Run SELECT * FROM staff LIMIT 5; to preview the first 5 rows.

SQL Editor

Run a query to see results

String length — `LENGTH` and `CHAR_LENGTH`

The function for measuring the length of a string behaves differently between this course's console (SQLite) and MySQL, so watch out:

- This course's console (SQLite): LENGTH(s) returns the character count ('あいう' → 3)

- MySQL: LENGTH(s) returns the byte count ('あいう' → 9). To get the character count, use CHAR_LENGTH(s)

The example here is Japanese, but the same gotcha applies to any non-ASCII data: accented Latin ('café' → 4 chars / 5 bytes), Vietnamese ('phở' → 3 chars / 5 bytes), emoji ('🎉' → 1 char / 4 bytes), and so on. This article uses LENGTH() for the exercises, but knowing the difference is essential the moment international text enters your database.

How `LENGTH` and `CHAR_LENGTH` differ
InputSQLiteLENGTHMySQLLENGTH (byte)MySQLCHAR_LENGTH'ABC'3 (chars)3 (bytes)3 (chars)'あいう'3 (chars)9 (bytes)3 B per char in UTF-83 (chars)'Alice'555
This course's console (SQLite) returns the character count with LENGTH. MySQL returns the byte count with LENGTH, and the character count with CHAR_LENGTH. The gap is most visible with multi-byte characters like Japanese.
-- 1) Count characters (in this course's console, LENGTH returns the character count)
SELECT LENGTH('Alice');         -- 5
SELECT LENGTH('あいう');         -- 3 (here), 9 (MySQL's LENGTH = byte count)

-- 2) Sort by column length
SELECT name, LENGTH(name) AS name_len FROM staff
ORDER BY name_len DESC;

-- Reference: in MySQL, character count is CHAR_LENGTH(s)
--   SELECT CHAR_LENGTH('あいう') FROM dual;  -- 3

Imagine an employee list that highlights the "TOP 3 employees by full-name length." (If you run it correctly, an explanation will appear.)

① From the staff table, pull name and `LENGTH(name)` aliased as `name_len` — 2 columns total.

② Order by name_len descending. For ties, break them by name ascending (alphabetical) — multi-level sort.

③ Limit to the top 3 rows.

④ Verify that the result is 3 rows: Iris Watanabe 13 / Alice Tanaka 12 / Carol Tanaka 12.

SQL Editor

Run a query to see results

Whitespace stripping and substitution — `TRIM` and `REPLACE`

In real-world tables, you'll run into unintended whitespace — "someone copy-pasted with extra leading/trailing spaces," "a pre-migration system left full-width spaces inside the data," and so on. `TRIM(s)` strips whitespace from both ends of a string; LTRIM(s) strips just the left side, and RTRIM(s) just the right.

`REPLACE(s, find, replace)` replaces every occurrence of find in s with replace. It's the go-to function for normalization and migrations — "abbreviate the common last name Tanaka to T. for a compact display," "swap the domain part of an email address to a new domain," etc. Combine it with UPDATE and you can rewrite values directly in the table itself.

-- 1) Whitespace stripping: TRIM / LTRIM / RTRIM
SELECT TRIM('  Hello  ');     -- 'Hello'
SELECT LTRIM('  Hello  ');    -- 'Hello  '
SELECT RTRIM('  Hello  ');    -- '  Hello'

-- TRIM on a column: detect rows with whitespace by length mismatch
SELECT name FROM staff
WHERE LENGTH(name) <> LENGTH(TRIM(name));

-- 2) Substitution: REPLACE
SELECT REPLACE('I LIKE APPLE', 'APPLE', 'BANANA');
-- 'I LIKE BANANA'

-- REPLACE on a column — abbreviate 'Tanaka' to 'T.'
SELECT name, REPLACE(name, 'Tanaka', 'T.') AS short_name
FROM staff;

Applying TRIM — combine with UPDATE to fix inconsistent text

TRIM isn't just for reading — paired with UPDATE, it can rewrite values directly in the table. For example, to bulk-normalize a name column that has spurious leading/trailing spaces, write something like UPDATE staff SET name = TRIM(name) WHERE LENGTH(name) <> LENGTH(TRIM(name)); — the safe pattern is filter with WHERE first, then UPDATE. If you drop the WHERE, every row gets TRIM applied (mostly harmless but wasteful UPDATEs). REPLACE works the same way: UPDATE table SET column = REPLACE(column, 'A', 'B') is a basic technique for bulk-fixing inconsistent text.

Imagine a request: "on a compact dashboard, abbreviate the common last name Tanaka to T.."

① From the staff table, pull name and `REPLACE(name, 'Tanaka', 'T.')` aliased as `short_name` — 2 columns total.

② Verify that the result has 10 rows. The 4 names containing Tanaka (Alice / Carol / Frank / Jack) should have short_name like 'Alice T.' / 'Carol T.'.

SQL Editor

Run a query to see results

Case conversion and substring — `UPPER` / `LOWER` / `SUBSTR`

`UPPER(s)` converts a string to all uppercase, and `LOWER(s)` to all lowercase. Use them when you want case-insensitive search conditions (WHERE UPPER(email) = 'X@Y.COM') or consistent display formatting. They have no effect on characters without a case distinction (like Japanese).

`SUBSTR(s, start, length)` extracts `length` characters starting from position `start` (you can also write it as SUBSTRING). SQL positions are 1-based, so SUBSTR('Alice Tanaka', 1, 5) returns 'Alice'. Omit the third argument and it returns everything to the end; pass a negative start like SUBSTR(s, -3) and it counts from the end.

UPPER / LOWER / SUBSTR at a glance
FunctionCall exampleResultUPPERUPPER('Alice')'ALICE'LOWERLOWER('ALICE')'alice'SUBSTRSUBSTR('Alice Tanaka', 1, 5)'Alice'
UPPER and LOWER take one argument and change the case of the whole string. SUBSTR takes 3 arguments — (target, start, length) — and the start position is 1-based.
-- 1) UPPER / LOWER
SELECT UPPER('apple'), LOWER('APPLE');
-- 'APPLE', 'apple'

-- On a column: pull names in uppercase and lowercase
SELECT name, UPPER(name) AS upper_name, LOWER(name) AS lower_name
FROM staff;

-- 2) SUBSTR: positions are 1-based
SELECT SUBSTR('Alice Tanaka', 1, 5);   -- 'Alice'
SELECT SUBSTR('Alice Tanaka', 7, 6);   -- 'Tanaka'
SELECT SUBSTR('Alice Tanaka', 7);      -- 'Tanaka'  -- omit length → to end
SELECT SUBSTR('Alice Tanaka', -6);     -- 'Tanaka'  -- negative → from end

-- First character of each name
SELECT name, SUBSTR(name, 1, 1) AS initial FROM staff;

-- Reference: REVERSE is a MySQL / Oracle function — not in this course's console
--   SELECT REVERSE('Alice Tanaka');  -- 'akanaT ecilA' (works in MySQL)

Imagine a request: "group employee files by initial, so extract each initial and sort."

① From the staff table, pull name, `UPPER(name)` as `upper_name`, and `SUBSTR(name, 1, 1)` as `initial` — 3 columns total.

② Order by initial ascending (alphabetical), with name ascending as the tie-breaker.

③ Verify that the result has 10 rows, starting with Alice Tanaka / ALICE TANAKA / A and ending with Jack Tanaka / JACK TANAKA / J.

SQL Editor

Run a query to see results

Finding the position of a substring — `INSTR`

`INSTR(target, search)` returns the position (1-based index) where `search` first appears in target. Positions start at 1, and if the search string isn't found, it returns 0. For example, INSTR('Alice Tanaka', ' ') returns 6 because the space is at the 6th position.

Combine it with SUBSTR and you can slice a string based on a separator instead of a fixed position. It's the go-to combo for transforming variable-length strings — "split an email address around the @," "extract just the last name from a full name," etc.

-- 1) INSTR alone — check the space position
SELECT name, INSTR(name, ' ') AS space_pos FROM staff;
-- 'Alice Tanaka' → 6, 'Bob Suzuki' → 4, ...

-- 2) Combined with SUBSTR to extract the last name
SELECT name,
       SUBSTR(name, INSTR(name, ' ') + 1) AS last_name
FROM staff;

-- 3) Split around '@' (email example)
SELECT INSTR('alice@example.com', '@'); -- 6
How INSTR + SUBSTR slice out the last name
StepExpressionResult① Find thespace positionINSTR('Alice Tanaka',' ')6② Compute thestart of the last name6 + 17③ Slice fromthere to the endSUBSTR('Alice Tanaka', 7)'Tanaka'
The 3 steps to extract the last name from 'Alice Tanaka'. ① INSTR finds the position of the space (6th character) → ② +1 computes the start of the last name (7th character) → ③ SUBSTR slices from there to the end, returning 'Tanaka'.

Imagine a request: "extract just the last name part of each full name and show the top 5 in alphabetical order by last name." In staff, name is in the form "Alice Tanaka" — "first space last," split by a half-width space. Since names vary in length, instead of cutting at a fixed position, use `INSTR` to find the space position and then slice after it.

① From the staff table, pull name and `SUBSTR(name, INSTR(name, ' ') + 1)` aliased as `last_name` — 2 columns total.

② Multi-level sort: `last_name` ascending, with `name` ascending as the tie-breaker. Limit to the top 5 rows.

③ Verify that the result is 5 rows (David Sato / Emi Sato / Henry Sato / Bob Suzuki / Grace Suzuki).

SQL Editor

Run a query to see results
QUIZ

Knowledge Check

Answer each question one by one.

Q1In this course's console (SQLite), what does SELECT LENGTH('あいう'); return?

Q2What does SELECT REPLACE('I LIKE APPLE', 'APPLE', 'BANANA'); return?

Q3What does SELECT SUBSTR('Alice Tanaka', 7, 6); return?