Name |
Exploiting Multiple Input Interpretation Layers |
|
Likelihood of attack |
Typical severity |
Medium |
High |
|
Summary |
An attacker supplies the target software with input data that contains sequences of special characters designed to bypass input validation logic. This exploit relies on the target making multiple passes over the input data and processing a "layer" of special characters with each pass. In this manner, the attacker can disguise input that would otherwise be rejected as invalid by concealing it with layers of special/escape characters that are stripped off by subsequent processing steps. The attacker's goal is first to discover cases where the input validation layer executes before one or more parsing layers. That is, user input may go through the following logic in an application: <parser1> --> <input validator> --> <parser2>. In such cases, the attacker needs to provide input that passes the input validator but that parser2 then converts into something the input validator was supposed to stop. |
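The <parser1> --> <input validator> --> <parser2> ordering can be illustrated with a minimal sketch. It assumes, purely for illustration, that both parsing layers perform URL decoding and that the validator is a naive check for ".."; the function names `parser1`, `input_validator`, `parser2`, and `handle` are hypothetical and do not come from any real application.

```python
from urllib.parse import unquote


def parser1(raw: str) -> str:
    # First interpretation layer (assumed): one pass of URL decoding,
    # e.g. performed by a web server before the application sees the value.
    return unquote(raw)


def input_validator(value: str) -> bool:
    # Naive validation (assumed): reject any parent-directory sequence.
    return ".." not in value


def parser2(value: str) -> str:
    # Second interpretation layer (assumed): the application decodes again.
    return unquote(value)


def handle(raw: str) -> str:
    stage1 = parser1(raw)
    if not input_validator(stage1):
        raise ValueError("rejected by validator")
    # The validator never sees the output of this second pass.
    return parser2(stage1)


# Single-encoded input is decoded to "../secret.txt" by parser1 and rejected.
# Double-encoded input is still "%2e%2e%2fsecret.txt" when validated, so it
# passes, and parser2 then turns it into the string the validator was
# supposed to stop.
double_encoded = "%252e%252e%252fsecret.txt"
result = handle(double_encoded)
print(result)
```

In this sketch the weakness is the ordering rather than the check itself: the same validator would catch the input if it ran after the last decoding pass.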
Prerequisites |
User input is used to construct a command to be executed on the target system or as part of a file name. Multiple parser passes are performed on the data supplied by the user. |
Execution Flow |
Step |
Phase |
Description |
Techniques |
1 |
Explore |
[Determine application/system inputs where bypassing input validation is desired] The attacker first needs to determine all of the application's/system's inputs where input validation is being performed and where they want to bypass it. |
- While using an application/system, the attacker discovers an input where validation is stopping them from performing some malicious or unauthorized actions.
|
2 |
Experiment |
[Determine which character encodings are accepted by the application/system] The attacker then needs to provide various character encodings to the application/system and determine which ones are accepted. The attacker will need to observe the application's/system's response to the encoded data to determine whether the data was interpreted properly. |
- Determine which escape characters are accepted by the application/system. A common escape character is the backslash ('\').
- Determine whether URL encoding is accepted by the application/system.
- Determine whether UTF-8 encoding is accepted by the application/system.
- Determine whether UTF-16 encoding is accepted by the application/system.
- Determine if any other encodings are accepted by the application/system.
|
3 |
Experiment |
[Combine multiple encodings accepted by the application.] The attacker now combines encodings accepted by the application. The attacker may combine different encodings or apply the same encoding multiple times. |
- Combine the same encoding multiple times and observe its effects. For example, if special characters are encoded with a leading backslash, then the following encoding may be accepted by the application/system: "\\\.". With two parsing layers, this may get converted to "\." after the first parsing layer, and then to "." after the second. If the input validation layer sits between the two parsing layers, then "\\\.\\\." becomes "\.\." after the first layer, which might pass a test for "..", but still gets converted to ".." afterwards. This may enable directory traversal attacks.
- Combine multiple encodings and observe the effects. For example, the attacker might encode "." as "\.", then encode "\." as "&#92;&#46;" (HTML character references), and then encode that using URL encoding to "%26%2392%3B%26%2346%3B".
|
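The layer-by-layer conversions described in step 3 can be traced with a short sketch. It assumes that a parsing layer removes one level of backslash escaping per pass; `strip_escape_layer` is a hypothetical helper written for this illustration, while `unquote` and `html.unescape` are standard-library functions. The URL-encoded string from the step decodes to the HTML character references "&#92;&#46;", which in turn decode to "\.".

```python
import html
import re
from urllib.parse import unquote


def strip_escape_layer(value: str) -> str:
    # One parsing pass (assumed behaviour): every backslash-escaped
    # character is replaced by the character itself.
    return re.sub(r"\\(.)", lambda m: m.group(1), value)


# Same encoding applied multiple times: \\\.\\\.
layered = r"\\\.\\\."
after_first = strip_escape_layer(layered)       # one layer removed
passes_check = ".." not in after_first          # validator between the layers
after_second = strip_escape_layer(after_first)  # second layer removed

# Different encodings combined: URL encoding around HTML character
# references around a backslash escape.
combined = "%26%2392%3B%26%2346%3B"
url_decoded = unquote(combined)
entity_decoded = html.unescape(url_decoded)
final = strip_escape_layer(entity_decoded)

print(after_first, passes_check, after_second)
print(url_decoded, entity_decoded, final)
```

A validator placed after any one of these passes only sees that pass's output, so each remaining layer is a further chance for the input to change form.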
4 |
Exploit |
[Leverage ability to bypass input validation] The attacker leverages their ability to bypass input validation to gain unauthorized access to the system. Many attacks are possible; a few examples are listed here. |
- Gain access to sensitive files.
- Perform command injection.
- Perform SQL injection.
- Perform XSS attacks.
|
|
Solutions | An iterative approach to input validation may be required to ensure that no dangerous characters are present. It may be necessary to implement redundant checking across different input validation layers. Ensure that invalid data is rejected as soon as possible, and do not continue to work with it. Make sure to perform input validation on canonicalized data (i.e., data in its most standard form). This helps prevent tricky encodings from getting past the filters. Assume all input is malicious. Create an allowlist that defines all valid input to the software system based on the requirements specifications. Input that does not match the allowlist should not be permitted to enter the system. |
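One way to read the advice above is "decode until the value stops changing, then check the result against an allowlist". The sketch below assumes URL encoding is the only encoding in play and that valid input is a simple file name; `MAX_PASSES`, `ALLOWED`, `canonicalize`, and `validate_filename` are hypothetical names chosen for this illustration, not part of any particular framework.

```python
import re
from urllib.parse import unquote

# Assumed upper bound on decoding passes, to avoid unbounded work.
MAX_PASSES = 5

# Assumed allowlist: a base name of letters, digits, "_" or "-",
# optionally followed by a single alphanumeric extension.
ALLOWED = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def canonicalize(raw: str) -> str:
    # Iteratively decode until a pass no longer changes the value.
    current = raw
    for _ in range(MAX_PASSES):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    # Reject input that is still changing instead of working with it.
    raise ValueError("input still changing after MAX_PASSES decoding passes")


def validate_filename(raw: str) -> str:
    # Validate the canonical form, never the raw input.
    canonical = canonicalize(raw)
    if not ALLOWED.fullmatch(canonical):
        raise ValueError("input does not match the allowlist")
    return canonical


print(validate_filename("report%2Etxt"))
```

For this to hold, later code would need to use the returned canonical value and apply no further decoding to it; a stricter variant could reject any input that required more than one decoding pass.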
Related Weaknesses |
CWE ID
|
Description
|
CWE-20 |
Improper Input Validation |
CWE-74 |
Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection') |
CWE-77 |
Improper Neutralization of Special Elements used in a Command ('Command Injection') |
CWE-78 |
Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') |
CWE-179 |
Incorrect Behavior Order: Early Validation |
CWE-181 |
Incorrect Behavior Order: Validate Before Filter |
CWE-183 |
Permissive List of Allowed Inputs |
CWE-184 |
Incomplete List of Disallowed Inputs |
CWE-697 |
Incorrect Comparison |
CWE-707 |
Improper Neutralization |
|
Related CAPECs |
CAPEC ID
|
Description
|
CAPEC-267 |
An adversary leverages the possibility to encode potentially harmful input or content used by applications such that the applications are ineffective at validating this encoding standard. |
|