Learn by reading through in order

Functions ② — String Functions (LENGTH / TRIM / REPLACE / UPPER / SUBSTR)

This article is part of the SQL Course, where you master practical SQL skills from scratch, from the fundamentals through to complex queries and SQL tuning.
Use LENGTH for character count, TRIM to strip whitespace, REPLACE for substitution, plus UPPER / LOWER, SUBSTR and INSTR to transform strings on a CSV-loaded staff dataset.

Data used in this article — the staff table

The second function article focuses on string functions. We'll walk through the functions you'll reach for whenever you need to clean or transform text pulled from a live table: length, whitespace stripping, substitution, case conversion, and substring extraction.

The dataset is the same staff table as last time (10 rows: name / city / salary, etc.). Using the names in the name column as the star, four exercises cover length calculation, last-name substitution, uppercasing, and initial extraction.

Before diving into the exercises, let's check the column definitions and sample data of the staff table.

① Run PRAGMA table_info(staff); to see the column names, types, and primary key.

② Run SELECT * FROM staff LIMIT 5; to preview the first 5 rows.

SQL Editor

Run a query to see results

String length — `LENGTH` and `CHAR_LENGTH`

The function for measuring the length of a string behaves differently between this course's console (SQLite) and MySQL, so watch out:

- This course's console (SQLite): LENGTH(s) returns the character count ('あいう' → 3)

- MySQL: LENGTH(s) returns the byte count ('あいう' → 9). To get the character count, use CHAR_LENGTH(s)

The example here is Japanese, but the same gotcha applies to any non-ASCII data: accented Latin ('café' → 4 chars / 5 bytes), Vietnamese ('phở' → 3 chars / 5 bytes), emoji ('🎉' → 1 char / 4 bytes), and so on. This article uses LENGTH() for the exercises, but knowing the difference is essential the moment international text enters your database.

How `LENGTH` and `CHAR_LENGTH` differ
InputSQLiteLENGTHMySQLLENGTH (byte)MySQLCHAR_LENGTH'ABC'3 (chars)3 (bytes)3 (chars)'あいう'3 (chars)9 (bytes)3 B per char in UTF-83 (chars)'Alice'555
This course's console (SQLite) returns the character count with LENGTH. MySQL returns the byte count with LENGTH, and the character count with CHAR_LENGTH. The gap is most visible with multi-byte characters like Japanese.
-- 1) Count characters (in this course's console, LENGTH returns the character count)
SELECT LENGTH('Alice');         -- 5
SELECT LENGTH('あいう');         -- 3 (here), 9 (MySQL's LENGTH = byte count)

-- 2) Sort by column length
SELECT name, LENGTH(name) AS name_len FROM staff
ORDER BY name_len DESC;

-- Reference: in MySQL, character count is CHAR_LENGTH(s)
--   SELECT CHAR_LENGTH('あいう') FROM dual;  -- 3

Imagine an employee list that highlights the "TOP 3 employees by full-name length." (If you run it correctly, an explanation will appear.)

① From the staff table, pull name and LENGTH(name) aliased as name_len — 2 columns total.

② Order by name_len descending. For ties, break them by name ascending (alphabetical) — multi-level sort.

③ Limit to the top 3 rows.

④ Verify that the result is 3 rows: Iris Watanabe 13 / Alice Tanaka 12 / Carol Tanaka 12.

SQL Editor

Run a query to see results

Whitespace stripping and substitution — `TRIM` and `REPLACE`

In real-world tables, you'll run into unintended whitespace — "someone copy-pasted with extra leading/trailing spaces," "a pre-migration system left full-width spaces inside the data," and so on. TRIM(s) strips whitespace from both ends of a string; LTRIM(s) strips just the left side, and RTRIM(s) just the right.

REPLACE(s, find, replace) replaces every occurrence of find in s with replace. It's the go-to function for normalization and migrations — "abbreviate the common last name Tanaka to T. for a compact display," "swap the domain part of an email address to a new domain," etc. Combine it with UPDATE and you can rewrite values directly in the table itself.

-- 1) Whitespace stripping: TRIM / LTRIM / RTRIM
SELECT TRIM('  Hello  ');     -- 'Hello'
SELECT LTRIM('  Hello  ');    -- 'Hello  '
SELECT RTRIM('  Hello  ');    -- '  Hello'

-- TRIM on a column: detect rows with whitespace by length mismatch
SELECT name FROM staff
WHERE LENGTH(name) <> LENGTH(TRIM(name));

-- 2) Substitution: REPLACE
SELECT REPLACE('I LIKE APPLE', 'APPLE', 'BANANA');
-- 'I LIKE BANANA'

-- REPLACE on a column — abbreviate 'Tanaka' to 'T.'
SELECT name, REPLACE(name, 'Tanaka', 'T.') AS short_name
FROM staff;

Applying TRIM — combine with UPDATE to fix inconsistent text

TRIM isn't just for reading — paired with UPDATE, it can rewrite values directly in the table. For example, to bulk-normalize a name column that has spurious leading/trailing spaces, write something like UPDATE staff SET name = TRIM(name) WHERE LENGTH(name) <> LENGTH(TRIM(name)); — the safe pattern is filter with WHERE first, then UPDATE. If you drop the WHERE, every row gets TRIM applied (mostly harmless but wasteful UPDATEs). REPLACE works the same way: UPDATE table SET column = REPLACE(column, 'A', 'B') is a basic technique for bulk-fixing inconsistent text.

Imagine a request: "on a compact dashboard, abbreviate the common last name Tanaka to T.."

① From the staff table, pull name and REPLACE(name, 'Tanaka', 'T.') aliased as short_name — 2 columns total.

② Verify that the result has 10 rows. The 4 names containing Tanaka (Alice / Carol / Frank / Jack) should have short_name like 'Alice T.' / 'Carol T.'.

SQL Editor

Run a query to see results

Case conversion and substring — `UPPER` / `LOWER` / `SUBSTR`

UPPER(s) converts a string to all uppercase, and LOWER(s) to all lowercase. Use them when you want case-insensitive search conditions (WHERE UPPER(email) = 'X@Y.COM') or consistent display formatting. They have no effect on characters without a case distinction (like Japanese).

SUBSTR(s, start, length) extracts length characters starting from position start (you can also write it as SUBSTRING). SQL positions are 1-based, so SUBSTR('Alice Tanaka', 1, 5) returns 'Alice'. Omit the third argument and it returns everything to the end; pass a negative start like SUBSTR(s, -3) and it counts from the end.

UPPER / LOWER / SUBSTR at a glance
FunctionCall exampleResultUPPERUPPER('Alice')'ALICE'LOWERLOWER('ALICE')'alice'SUBSTRSUBSTR('Alice Tanaka', 1, 5)'Alice'
UPPER and LOWER take one argument and change the case of the whole string. SUBSTR takes 3 arguments — (target, start, length) — and the start position is 1-based.
-- 1) UPPER / LOWER
SELECT UPPER('apple'), LOWER('APPLE');
-- 'APPLE', 'apple'

-- On a column: pull names in uppercase and lowercase
SELECT name, UPPER(name) AS upper_name, LOWER(name) AS lower_name
FROM staff;

-- 2) SUBSTR: positions are 1-based
SELECT SUBSTR('Alice Tanaka', 1, 5);   -- 'Alice'
SELECT SUBSTR('Alice Tanaka', 7, 6);   -- 'Tanaka'
SELECT SUBSTR('Alice Tanaka', 7);      -- 'Tanaka'  -- omit length → to end
SELECT SUBSTR('Alice Tanaka', -6);     -- 'Tanaka'  -- negative → from end

-- First character of each name
SELECT name, SUBSTR(name, 1, 1) AS initial FROM staff;

-- Reference: REVERSE is a MySQL / Oracle function — not in this course's console
--   SELECT REVERSE('Alice Tanaka');  -- 'akanaT ecilA' (works in MySQL)

Imagine a request: "group employee files by initial, so extract each initial and sort."

① From the staff table, pull name, UPPER(name) as upper_name, and SUBSTR(name, 1, 1) as initial — 3 columns total.

② Order by initial ascending (alphabetical), with name ascending as the tie-breaker.

③ Verify that the result has 10 rows, starting with Alice Tanaka / ALICE TANAKA / A and ending with Jack Tanaka / JACK TANAKA / J.

SQL Editor

Run a query to see results

Finding the position of a substring — `INSTR`

INSTR(target, search) returns the position (1-based index) where search first appears in target. Positions start at 1, and if the search string isn't found, it returns 0. For example, INSTR('Alice Tanaka', ' ') returns 6 because the space is at the 6th position.

Combine it with SUBSTR and you can slice a string based on a separator instead of a fixed position. It's the go-to combo for transforming variable-length strings — "split an email address around the @," "extract just the last name from a full name," etc.

-- 1) INSTR alone — check the space position
SELECT name, INSTR(name, ' ') AS space_pos FROM staff;
-- 'Alice Tanaka' → 6, 'Bob Suzuki' → 4, ...

-- 2) Combined with SUBSTR to extract the last name
SELECT name,
       SUBSTR(name, INSTR(name, ' ') + 1) AS last_name
FROM staff;

-- 3) Split around '@' (email example)
SELECT INSTR('alice@example.com', '@'); -- 6
How INSTR + SUBSTR slice out the last name
StepExpressionResult① Find thespace positionINSTR('Alice Tanaka',' ')6② Compute thestart of the last name6 + 17③ Slice fromthere to the endSUBSTR('Alice Tanaka', 7)'Tanaka'
The 3 steps to extract the last name from 'Alice Tanaka'. ① INSTR finds the position of the space (6th character) → ② +1 computes the start of the last name (7th character) → ③ SUBSTR slices from there to the end, returning 'Tanaka'.

Imagine a request: "extract just the last name part of each full name and show the top 5 in alphabetical order by last name." In staff, name is in the form "Alice Tanaka" — "first space last," split by a half-width space. Since names vary in length, instead of cutting at a fixed position, use INSTR to find the space position and then slice after it.

① From the staff table, pull name and SUBSTR(name, INSTR(name, ' ') + 1) aliased as last_name — 2 columns total.

② Multi-level sort: last_name ascending, with name ascending as the tie-breaker. Limit to the top 5 rows.

③ Verify that the result is 5 rows (David Sato / Emi Sato / Henry Sato / Bob Suzuki / Grace Suzuki).

SQL Editor

Run a query to see results
QUIZ

Knowledge Check

Answer each question one by one.

Q1In this course's console (SQLite), what does SELECT LENGTH('あいう'); return?

Q2What does SELECT REPLACE('I LIKE APPLE', 'APPLE', 'BANANA'); return?

Q3What does SELECT SUBSTR('Alice Tanaka', 7, 6); return?