
URL Encoding Explained

URLs can only contain a limited set of characters. URL encoding — also called percent-encoding — converts everything else into a safe format using % followed by a two-digit hex code.

Why URL encoding exists

A URL can only contain characters from a restricted set: letters, digits, and a handful of punctuation marks. Every other character — spaces, ampersands, quotes, non-ASCII text — needs to be encoded before it can appear in a URL without breaking it.

The encoding is simple: take each byte of the character, write it as two hexadecimal digits, and prefix it with %. A space (byte 0x20) becomes %20. The at sign (0x40) becomes %40.

Example

Before encoding

https://example.com/search?q=hello world&lang=en

After encoding

https://example.com/search?q=hello%20world&lang=en

How percent-encoding works

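Each unsafe character is replaced by one %XX sequence per byte of its UTF-8 encoding, where XX is the byte value in hexadecimal (uppercase by convention). ASCII characters take one sequence; everything else takes two or more.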

Character               Encoded      UTF-8 bytes       Note
Space                   %20          0x20              Most common encoding you'll encounter
€ (euro sign)           %E2%82%AC    0xE2 0x82 0xAC    Multi-byte UTF-8: each byte gets its own % prefix
中 (Chinese character)  %E4%B8%AD    0xE4 0xB8 0xAD    Non-ASCII always encodes to multiple % sequences
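
The byte-by-byte rule can be sketched by hand. This is only an illustration: percentEncode is a hypothetical helper name, and it assumes a runtime with a global TextEncoder (modern browsers and Node.js). In practice the built-in encodeURIComponent() does this work.

```javascript
// Percent-encode a string one UTF-8 byte at a time.
// Only the unreserved characters (A-Z a-z 0-9 - _ . ~) are left as-is.
function percentEncode(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (/[A-Za-z0-9\-._~]/.test(ch)) {
      out += ch; // unreserved ASCII passes through unchanged
    } else {
      // % followed by the byte as two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode(" "));  // %20
console.log(percentEncode("€"));  // %E2%82%AC
console.log(percentEncode("中")); // %E4%B8%AD
```

Each of the three UTF-8 bytes of € and 中 gets its own % prefix, matching the table above.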

Characters that must be encoded

These characters have special meaning in URLs. When they appear as data (inside a parameter value, for example), they must be encoded — otherwise the URL parser misreads them as structural separators.

Char     Encoded  Why it matters
(space)  %20      Separates words; never valid in a URL
&        %26      Separates query parameters
=        %3D      Separates key from value in query string
?        %3F      Starts the query string
#        %23      Starts a fragment identifier
+        %2B      Means "space" in form-encoded data
/        %2F      Path separator; encode when used inside a param value
@        %40      Separates credentials from host
:        %3A      Separates protocol, host, and port
%        %25      The escape character itself; must be encoded when literal
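
As a quick check of the table, the sketch below runs each character through JavaScript's built-in encodeURIComponent() and then encodes a made-up parameter value that contains several separators. The variable names and the sample value are illustrative assumptions.

```javascript
// The characters from the table above, treated as data rather than structure.
const reservedChars = [" ", "&", "=", "?", "#", "+", "/", "@", ":", "%"];
for (const ch of reservedChars) {
  console.log(JSON.stringify(ch), "->", encodeURIComponent(ch));
}

// A value that contains separators must be encoded before it joins a query string.
const rawValue = "rock & roll = 100%";
const queryString = "q=" + encodeURIComponent(rawValue);
console.log(queryString); // q=rock%20%26%20roll%20%3D%20100%25

// Decoding the value part restores the original text.
console.log(decodeURIComponent(queryString.slice(2)) === rawValue); // true
```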

Characters that are always safe

A–Z a–z 0–9    Unreserved: always safe, never encoded
- _ . ~        Unreserved punctuation: safe in any URL part

These 66 characters never need encoding in any part of a URL.
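
The count is easy to verify. The sketch below spells out the set and confirms that encodeURIComponent() leaves it untouched; isUnreservedOnly is a hypothetical helper added for illustration.

```javascript
// The unreserved set: 26 + 26 + 10 + 4 characters.
const unreservedChars =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
  "abcdefghijklmnopqrstuvwxyz" +
  "0123456789" +
  "-_.~";

console.log(unreservedChars.length); // 66
console.log(encodeURIComponent(unreservedChars) === unreservedChars); // true

// Hypothetical helper: true when a string needs no encoding anywhere in a URL.
function isUnreservedOnly(text) {
  return /^[A-Za-z0-9\-._~]*$/.test(text);
}

console.log(isUnreservedOnly("report-2024_v1.2~draft")); // true
console.log(isUnreservedOnly("hello world"));            // false
```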

encodeURI vs encodeURIComponent

JavaScript has two built-in functions for URL encoding. They encode different sets of characters and are not interchangeable.

encodeURI()

Encodes a complete URL. Leaves structural characters untouched: : / ? # @ ! $ & ' ( ) * + , ; = (square brackets [ ] are not in this set and get encoded).

encodeURI("https://x.com/path?q=hello world")
→ "https://x.com/path?q=hello%20world"

encodeURIComponent()

Encodes a value inside a URL. Encodes everything except unreserved characters and ! ' ( ) *, so structural characters such as & = ? / are all encoded.

encodeURIComponent("hello & goodbye")
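→ "hello%20%26%20goodbye"

Rule of thumb: use encodeURIComponent() when encoding a query parameter value. Use encodeURI() only when you have a complete URL whose structure must stay intact and you only need to escape spaces and other characters that are not allowed in a URL.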
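
To see why the two functions are not interchangeable, here is a small sketch that puts the same value into a query string both ways. The search term is a made-up example, and the check at the end assumes a runtime with the standard URL class (browsers and Node.js).

```javascript
const searchTerm = "cats & dogs";

// encodeURI() keeps & as a separator, so the value is split in two.
const brokenUrl = encodeURI("https://x.com/path?q=" + searchTerm);
console.log(brokenUrl); // https://x.com/path?q=cats%20&%20dogs

// encodeURIComponent() on the value alone keeps it in one piece.
const workingUrl = "https://x.com/path?q=" + encodeURIComponent(searchTerm);
console.log(workingUrl); // https://x.com/path?q=cats%20%26%20dogs

// What a URL parser reads back for q in each case.
const brokenValue = new URL(brokenUrl).searchParams.get("q");
const workingValue = new URL(workingUrl).searchParams.get("q");
console.log(JSON.stringify(brokenValue));  // "cats "
console.log(JSON.stringify(workingValue)); // "cats & dogs"
```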

Common mistakes

Passing a URL as a query parameter value without encoding it

Wrong: redirect=https://example.com/path?q=1
Right: redirect=https%3A%2F%2Fexample.com%2Fpath%3Fq%3D1

If you pass a URL as a value inside another URL, the inner URL must be fully encoded — every /, ?, = and & in it.
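
A minimal sketch of nesting one URL inside another. The outer host app.example and its /login path are hypothetical, and the round-trip check assumes the standard URL class is available.

```javascript
const targetUrl = "https://example.com/path?q=1";

// Encode the inner URL as a single value before attaching it.
const loginUrl =
  "https://app.example/login?redirect=" + encodeURIComponent(targetUrl);
console.log(loginUrl);
// https://app.example/login?redirect=https%3A%2F%2Fexample.com%2Fpath%3Fq%3D1

// The receiver decodes the parameter and gets the inner URL back intact.
const recoveredUrl = new URL(loginUrl).searchParams.get("redirect");
console.log(recoveredUrl === targetUrl); // true
```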

Double-encoding

Wrong: %2520 (encoding a % that is already encoded)
Right: %20 (encode once and stop)

If you encode an already-encoded string, % becomes %25 and the recipient gets literal %20 in the value instead of a space.
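
The effect is easy to reproduce. This sketch encodes the same string twice and shows what a receiver sees after decoding once; the variable names are illustrative.

```javascript
const rawText = "hello world";

const encodedOnce = encodeURIComponent(rawText);
const encodedTwice = encodeURIComponent(encodedOnce); // the mistake

console.log(encodedOnce);  // hello%20world
console.log(encodedTwice); // hello%2520world

// A receiver decodes once. The double-encoded value still contains %20.
console.log(decodeURIComponent(encodedTwice)); // hello%20world
console.log(decodeURIComponent(encodedOnce));  // hello world
```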

Using + for spaces outside form data

Wrong: q=hello+world (in a path segment)
Right: q=hello%20world

+ means space only in application/x-www-form-urlencoded (HTML form submissions). In a path or other context, + is a literal plus sign.
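
The difference shows up in JavaScript's own APIs. In this sketch, decodeURIComponent() stands in for generic percent-decoding and URLSearchParams for form-style handling; both are standard, and the variable names are illustrative.

```javascript
// Generic percent-decoding leaves + alone.
const plainDecoded = decodeURIComponent("hello+world");
console.log(plainDecoded); // hello+world

// Form-style parsing (application/x-www-form-urlencoded) reads + as a space.
const formDecoded = new URLSearchParams("q=hello+world").get("q");
console.log(formDecoded); // hello world

// Form-style serialization writes spaces as + and a literal plus as %2B.
const formSpace = new URLSearchParams({ q: "hello world" }).toString();
const formPlus = new URLSearchParams({ q: "1+1" }).toString();
console.log(formSpace); // q=hello+world
console.log(formPlus);  // q=1%2B1
```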

URL encoding vs base64 vs HTML entities

Three different encoding schemes, three different contexts:

Scheme               Used for                                                    Example
URL encoding (%XX)   Characters inside URLs and query strings                    hello world → hello%20world
Base64               Binary data in text-only systems (emails, JSON, data URIs)  Hello → SGVsbG8=
HTML entities        Special characters inside HTML markup                       <div> → &lt;div&gt;
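
The three examples can be reproduced side by side. This sketch assumes a runtime with a global btoa() (browsers and recent Node.js versions); escapeHtml is a hypothetical minimal helper, not a built-in, and covers only the most common entities.

```javascript
// Hypothetical minimal HTML escaper for the characters that matter in markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const urlEncoded = encodeURIComponent("hello world");
const base64Encoded = btoa("Hello");
const htmlEscaped = escapeHtml("<div>");

console.log(urlEncoded);    // hello%20world
console.log(base64Encoded); // SGVsbG8=
console.log(htmlEscaped);   // &lt;div&gt;
```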
