How URL normalization works
URL normalization modifies separators, encoded elements, and literal bytes in incoming URLs so that they conform to a consistent formatting standard.
For example, consider a firewall rule that blocks requests whose URLs match www.example.com/hello. The rule would not block a request containing an encoded element, such as www.example.com/%68ello. Normalizing incoming URLs at the edge helps simplify Cloudflare firewall rules expressions that use URLs.
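The mismatch described above can be reproduced with a minimal Python sketch. The `literal_match` helper is hypothetical and only stands in for a rule that compares URLs byte for byte; it is not how Cloudflare evaluates rules. Note that `urllib.parse.unquote` decodes every percent-encoded triplet, which is broader than the normalization described in this section.

```python
from urllib.parse import unquote

# URL that the hypothetical rule is meant to block.
BLOCKED = "www.example.com/hello"


def literal_match(url: str) -> bool:
    """Compare the URL to the blocked value exactly, with no normalization."""
    return url == BLOCKED


encoded = "www.example.com/%68ello"

print(literal_match(encoded))           # False: "%68" is not the same bytes as "h"
print(literal_match(unquote(encoded)))  # True: decoding "%68" yields "h"
```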
The URL normalization performed according to RFC-3986 is as follows:
- The following unreserved characters are percent decoded:
  - Alphabetical characters: a-z, A-Z (decoded from %41-%5A and %61-%7A)
  - Digit characters: 0-9 (decoded from %30-%39)
  - Hyphen - (%2D), period . (%2E), underscore _ (%5F), and tilde ~ (%7E)
- These reserved characters are not encoded or decoded: : / ? # [ ] @ ! $ & ' ( ) * + , ; =
- Other characters, for example literal byte values, are percent encoded.
- Percent encoded representations are converted to upper case.
- URL paths are normalized according to the Remove Dot Segments algorithm (RFC 3986, section 5.2.4).
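The RFC 3986 steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not Cloudflare's implementation: the function names are hypothetical, the input is treated as a URL path only, characters outside the reserved and unreserved sets are encoded as UTF-8 bytes, a stray `%` that is not followed by two hex digits is left untouched, and percent decoding runs before dot-segment removal.

```python
import re
import string

# Character sets taken from the list above.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@!$&'()*+,;=")
PCT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(path: str) -> str:
    """Decode unreserved triplets, upper-case the rest, encode other characters."""

    def fix_triplet(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char                        # e.g. "%68" -> "h"
        return "%" + match.group(1).upper()    # e.g. "%2f" -> "%2F"

    path = PCT_TRIPLET.sub(fix_triplet, path)
    out = []
    for char in path:
        if char in UNRESERVED or char in RESERVED or char == "%":
            out.append(char)
        else:
            # Assumption: other characters are encoded as their UTF-8 bytes.
            out.extend("%{:02X}".format(b) for b in char.encode("utf-8"))
    return "".join(out)


def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments, following RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]                    # "/./x" -> "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]                    # "/../x" -> "/x"
            if output:
                output.pop()                   # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            end = path.find("/", 1)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)


def normalize_path(path: str) -> str:
    """Apply percent-encoding normalization, then dot-segment removal."""
    return remove_dot_segments(normalize_percent_encoding(path))


print(normalize_path("/en/./%68ello"))  # /en/hello
```

Decoding before removing dot segments means that an encoded `%2E%2E` segment is treated like a literal `..`; this ordering is a design choice of the sketch.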
In addition to the rules defined in RFC-3986, Cloudflare can apply the following extra normalization techniques:
- Normalize back slashes (\) into forward slashes (/).
- Merge successive forward slashes (for example, // will be normalized to /).
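The two extra techniques above can be sketched in a few lines of Python. The `normalize_slashes` name is hypothetical, and the sketch assumes the input has no scheme prefix (merging slashes in a full URL would also collapse the `//` after `https:`).

```python
import re


def normalize_slashes(url: str) -> str:
    """Turn back slashes into forward slashes, then merge runs of slashes."""
    url = url.replace("\\", "/")
    return re.sub(r"/{2,}", "/", url)


print(normalize_slashes("example.com\\hello"))      # example.com/hello
print(normalize_slashes("example.com/en//hello"))   # example.com/en/hello
```

Replacing back slashes first matters: a sequence such as `\\` becomes `//` and is then merged into a single `/`.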
The normalization that is actually performed depends on the configured settings. For more information, refer to URL normalization settings.
URL normalization examples
The following table shows some examples of URL normalization when using the Cloudflare normalization type:
URL | Normalized URL |
---|---|
example.com/en/hello/ | example.com/en/hello/ |
example.com/en//%68ello\path | example.com/en/hello/path |
example.com\hello | example.com/hello |
example.com/./en//hello./ | example.com/en/hello./ |