Name |
URL Encoding |
|
Likelihood of attack |
Typical severity |
High |
High |
|
Summary |
This attack targets the encoding of the URL. An adversary can take advantage of the multiple ways of encoding a URL and abuse how the URL is interpreted. |
Prerequisites |
The application accepts and decodes URL input. The application performs insufficient filtering/canonicalization on the URLs. |
Execution Flow |
Step |
Phase |
Description |
Techniques |
1 |
Explore |
[Survey web application for URLs with parameters] Using a browser or an automated tool, or by inspecting the application, an adversary records all URLs that contain parameters. |
- Use a spidering tool to follow and record all links and analyze the web pages to find entry points. Make special note of any links that include parameters in the URL.
|
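The survey step can be sketched from a tester's point of view as a small filter over a list of already-collected links. This is only an illustration: the function name `urls_with_parameters` and the `example.test` URLs are hypothetical, and a real spidering tool would also handle forms, fragments, and relative links.

```python
from urllib.parse import urlsplit, parse_qsl


def urls_with_parameters(urls):
    """Map each URL that carries query parameters to its parameter names."""
    found = {}
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so that "?debug=" is still reported
        names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
        if names:
            found[url] = names
    return found


if __name__ == "__main__":
    links = [
        "https://example.test/about",
        "https://example.test/item?id=7&sort=asc",
    ]
    for url, names in urls_with_parameters(links).items():
        print(url, names)
```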
2 |
Experiment |
[Probe URLs to locate vulnerabilities] The adversary uses the URLs gathered in the "Explore" phase as a target list and tests parameters with different encodings of special characters to see how the web application will handle them. |
- Use URL encodings of special characters such as semi-colons, backslashes, or question marks that might be filtered out normally.
- Combine the use of URL encodings with other encoding techniques such as the triple dot and escape slashes.
|
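To make the probing techniques above concrete, the following sketch lists a few encodings of the same special character that a filter might treat differently. The helper name `encoding_variants` is an assumption introduced for illustration; it uses only the standard `urllib.parse` module.

```python
from urllib.parse import quote, unquote


def encoding_variants(ch: str) -> dict:
    """Return several URL encodings of the same character."""
    single = quote(ch, safe="")      # percent-encode everything that can be encoded
    double = quote(single, safe="")  # encode the "%" again (double encoding)
    return {
        "raw": ch,
        "single": single,
        "lowercase": single.lower(),  # hex digits are case-insensitive when decoded
        "double": double,
    }


if __name__ == "__main__":
    for ch in [";", "\\", "?"]:
        print(encoding_variants(ch))
```

All of these variants decode back to the original character (the double-encoded form needs two decoding passes), which is why a filter that matches only one spelling can be bypassed.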
3 |
Exploit |
[Inject special characters into URL parameters] Using the information gathered in the "Experiment" phase, the adversary injects special characters into the URL using URL encoding. This can lead to path traversal, cross-site scripting, SQL injection, etc. |
|
|
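The exploit step works when a filter inspects the parameter before it is decoded. A minimal sketch of that flaw, assuming a hypothetical handler (`naive_filter` and `handle_request_flawed` are illustrative names, not part of any real framework):

```python
from urllib.parse import unquote


def naive_filter(param: str) -> bool:
    """Flawed check: looks for dangerous sequences in the still-encoded value."""
    return "../" not in param and "<" not in param


def handle_request_flawed(param: str):
    """Return the value the application would use, or None if it was rejected."""
    if not naive_filter(param):
        return None
    # Decoding happens AFTER the check, so encoded characters slip through.
    return unquote(param)


if __name__ == "__main__":
    print(handle_request_flawed("../../etc/passwd"))        # rejected
    print(handle_request_flawed("..%2F..%2Fetc%2Fpasswd"))  # passes the filter
```

The plain traversal string is rejected, while the encoded form passes the check and is then decoded into the same traversal sequence.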
Solutions | Refer to the RFCs to decode URLs safely. Regular expressions can be used to match safe URL patterns, although an overly restrictive expression may reject valid URL requests. Tools such as URLScan from Microsoft (http://www.microsoft.com/technet/security/tools/urlscan.mspx) can scan HTTP requests to the server for valid URLs. Perform all security checks after the data has been decoded and validated as being in the correct format. Do not repeat the decoding process: if bad characters remain after decoding, treat the data as suspicious and fail validation. Assume all input is malicious. Create an allowlist that defines all valid input to the software system based on the requirements specifications, and do not permit input that fails to match the allowlist to enter the system. Test your decoding process against malicious input. Be aware of alternative data encoding and obfuscation techniques, such as IP address encoding. (See related guideline section) When client input is required from web-based forms, avoid using the "GET" method to submit data, because it appends the form data to the URL, where it is easily manipulated. Instead, use the "POST" method whenever possible. |
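The decode-once and allowlist advice above might look like the following sketch. The function name `validate_param` and the allowlist pattern are assumptions chosen for illustration; a real allowlist should come from the application's requirements specification.

```python
import re
from urllib.parse import unquote

# Hypothetical allowlist: short tokens of letters, digits, underscore, hyphen.
ALLOWED = re.compile(r"[A-Za-z0-9_-]{1,64}")
# Any percent-escape still present after one decoding pass is suspicious.
RESIDUAL_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def validate_param(raw: str) -> str:
    """Decode exactly once, then validate; raise ValueError on anything else."""
    # errors="strict" raises UnicodeDecodeError (a ValueError) on invalid UTF-8
    decoded = unquote(raw, errors="strict")
    if RESIDUAL_ESCAPE.search(decoded):
        # Do not decode again: treat double-encoded input as hostile.
        raise ValueError("residual percent-escape after decoding")
    if not ALLOWED.fullmatch(decoded):
        raise ValueError("value does not match the allowlist")
    return decoded


if __name__ == "__main__":
    for sample in ["report_2024", "..%2F..%2Fetc", "%252e%252e%252f"]:
        try:
            print(sample, "->", validate_param(sample))
        except ValueError as exc:
            print(sample, "-> rejected:", exc)
```

The check runs on the decoded value, and double-encoded input is rejected rather than decoded a second time.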
Related Weaknesses |
CWE ID
|
Description
|
CWE-20 |
Improper Input Validation |
CWE-73 |
External Control of File Name or Path |
CWE-74 |
Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection') |
CWE-172 |
Encoding Error |
CWE-173 |
Improper Handling of Alternate Encoding |
CWE-177 |
Improper Handling of URL Encoding (Hex Encoding) |
|
Related CAPECs |
CAPEC ID
|
Description
|
CAPEC-267 |
An adversary leverages the possibility to encode potentially harmful input or content used by applications such that the applications are ineffective at validating this encoding standard. |
|