Name |
Using Unicode Encoding to Bypass Validation Logic |
|
Likelihood of attack |
Typical severity |
Medium |
High |
|
Summary |
An attacker may provide a Unicode string to a system component that is not Unicode aware and use it to circumvent a filter or cause the classifying mechanism to misinterpret the request. This may allow the attacker to slip malicious data past the content filter and/or cause the application to route the request incorrectly. |
Prerequisites |
Filtering is performed on data that has not been properly canonicalized. |
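The prerequisite can be illustrated with a minimal sketch. It assumes a hypothetical denylist check (`naive_filter`) that runs on the raw input, and a downstream component that applies NFKC compatibility normalization afterwards; neither is taken from a specific product.

```python
import unicodedata


def naive_filter(value: str) -> bool:
    """Hypothetical denylist check: accepts anything without ASCII angle brackets."""
    return "<" not in value and ">" not in value


# U+FF1C and U+FF1E are the fullwidth forms of '<' and '>'.
# They are distinct code points, so the ASCII-only check does not see them.
payload = "\uff1cscript\uff1e"

passed_filter = naive_filter(payload)

# Assumed downstream step: NFKC normalization folds the fullwidth
# characters back to their ASCII equivalents after the filter has run.
downstream = unicodedata.normalize("NFKC", payload)

print(passed_filter)             # the raw payload is accepted
print(downstream)                # the normalized form contains ASCII brackets
print(naive_filter(downstream))  # the same check fails on canonical data
```

The filter and the consumer disagree about what the data means because validation happened before canonicalization.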
Execution Flow |
Step |
Phase |
Description |
Techniques |
1 |
Explore |
[Survey the application for user-controllable inputs] Using a browser or an automated tool, an attacker follows all public links and actions on a web site. They record all the links, the forms, the resources accessed and all other potential entry-points for the web application. |
- Use a spidering tool to follow and record all links and analyze the web pages to find entry points. Make special note of any links that include parameters in the URL.
- Use a proxy tool to record all user input entry points visited during a manual traversal of the web application.
- Use a browser to manually explore the website and analyze how it is constructed. Many browser plugins are available to facilitate the analysis or automate the discovery.
|
2 |
Experiment |
[Probe entry points to locate vulnerabilities] The attacker uses the entry points gathered in the "Explore" phase as a target list and injects various Unicode-encoded payloads to determine whether an entry point has insufficient validation logic and to characterize the extent to which the vulnerability can be exploited. |
- Try Unicode encoding of content in scripts to bypass validation routines.
- Try Unicode encoding of content in HTML to bypass validation routines.
- Try Unicode encoding of content in CSS to bypass validation routines.
|
|
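When testing validation logic that you are authorized to assess, the techniques listed in the "Experiment" step amount to sending the same probe string in several encoded forms. The sketch below is illustrative only: the helper name `encoded_variants` and the choice of forms are assumptions, and it handles only characters in the Basic Multilingual Plane.

```python
from typing import Dict
from urllib.parse import quote


def encoded_variants(text: str) -> Dict[str, str]:
    """Return several encodings of `text` for filter testing.

    Assumes every character is in the Basic Multilingual Plane
    (code point <= U+FFFF); astral characters would need surrogate
    pairs for the %u and \\u forms and are not handled here.
    """
    return {
        # Standard percent-encoding of the UTF-8 bytes.
        "percent_utf8": quote(text, safe=""),
        # Non-standard %uXXXX form accepted by some legacy decoders.
        "percent_u": "".join(f"%u{ord(c):04X}" for c in text),
        # JavaScript string escape, relevant to content in scripts.
        "js_escape": "".join(f"\\u{ord(c):04x}" for c in text),
        # HTML numeric character reference, relevant to content in HTML.
        "html_ncr": "".join(f"&#x{ord(c):x};" for c in text),
        # CSS escape: backslash, hex digits, terminating space.
        "css_escape": "".join(f"\\{ord(c):x} " for c in text),
    }


variants = encoded_variants("<")
for name, value in variants.items():
    print(f"{name}: {value!r}")
```

If a filter rejects the literal probe but accepts one of these forms, and a later component decodes that form, the entry point is validating data before canonicalizing it.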
Solutions | Ensure that the system is Unicode aware and can properly process Unicode data. Do not assume that data will be in ASCII. Ensure that filtering or input validation is applied to canonical data. Assume all input is malicious. Create an allowlist that defines all valid input to the software system based on the requirements specifications. Input that does not match the allowlist should not be permitted to enter the system. |
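A minimal sketch of the "canonicalize, then validate against an allowlist" ordering follows. The allowlist pattern, the `validate` helper, and the choice of percent-decoding plus NFKC as the canonical form are assumptions for illustration; a real system should derive them from its own requirements.

```python
import re
import unicodedata
from typing import Optional
from urllib.parse import unquote

# Hypothetical allowlist: short ASCII identifiers such as file names.
ALLOWED = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def canonicalize(raw: str) -> str:
    """Percent-decode once (strict UTF-8), then apply NFKC normalization."""
    # errors="strict" makes malformed UTF-8, including overlong
    # sequences such as %C0%BC, raise UnicodeDecodeError.
    decoded = unquote(raw, errors="strict")
    # Reject input that is still encoded after one decoding pass.
    if unquote(decoded, errors="strict") != decoded:
        raise ValueError("input is encoded more than once")
    return unicodedata.normalize("NFKC", decoded)


def validate(raw: str) -> Optional[str]:
    """Return the canonical value if it matches the allowlist, else None."""
    try:
        canonical = canonicalize(raw)
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return None
    return canonical if ALLOWED.fullmatch(canonical) else None


print(validate("report_2024.txt"))           # accepted unchanged
print(validate("%EF%BC%9Cscript%EF%BC%9E"))  # fullwidth brackets: rejected
print(validate("%C0%BCscript"))              # overlong UTF-8: rejected
print(validate("%252e%252e"))                # double encoding: rejected
```

Returning the canonical string, rather than a boolean, lets downstream code consume exactly the value that was validated instead of decoding the raw input a second time.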
Related Weaknesses |
CWE ID
|
Description
|
CWE-20 |
Improper Input Validation |
CWE-74 |
Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection') |
CWE-172 |
Encoding Error |
CWE-173 |
Improper Handling of Alternate Encoding |
CWE-176 |
Improper Handling of Unicode Encoding |
CWE-179 |
Incorrect Behavior Order: Early Validation |
CWE-180 |
Incorrect Behavior Order: Validate Before Canonicalize |
CWE-183 |
Permissive List of Allowed Inputs |
CWE-184 |
Incomplete List of Disallowed Inputs |
CWE-692 |
Incomplete Denylist to Cross-Site Scripting |
CWE-697 |
Incorrect Comparison |
|
Related CAPECs |
CAPEC ID
|
Description
|
CAPEC-267 |
An adversary encodes potentially harmful input or content in a form that the application's validation does not handle effectively. |
|
Taxonomy: OWASP Attacks |
Entry ID
|
Entry Name
|
Link |
Unicode Encoding |
|