Purpose
Enable users to compare systems based on the impact realistic human errors have on the security of a system.
Methodology:
The process contains 3 steps:
- Use existing configuration files
- Inject realistic human errors in these configuration files
- Run security-assessing tools against the system using the new configuration files and try to find security vulnerabilities
A realistic human error in a configuration file can create security vulnerabilities in multiple ways:
- Expose new functionality to attackers of the system. For instance, wrongfully including the log as part of the configuration gives attackers the possibility of executing arbitrary code on the server.
- Expose new documents to attackers. For instance, some password protected URLs become publicly accessible.
We start with existing configuration files because they represent the goal of the system administrator.
The configuration file
Generally, a system’s configuration is organized as a collection of one or more configuration files, each being composed of sections that contain directives. Directives can further be split into their constituent tokens.
ConfErr converts a system’s configuration to an abstract representation that is program independent. This representation reflects none of that configuration’s peculiarities, such as configuration file format, but maintains a configuration’s vital properties: (1) a configuration component’s type, e.g., section; (2) that component’s containment relations to other components; and (3) the order of siblings.
The abstract representation is a directed acyclic graph (DAG), in which nodes correspond to that configuration’s components and edges between nodes reflect that configuration’s structure. A DAG is a suitable representation for the configurations of all the systems we tested: line-oriented configurations (MySQL, PostgreSQL), hierarchical configurations (LDAP), as well as custom formats (Apache httpd).
A system’s configuration is converted to an abstract representation using parsers, and an abstract representation is transformed back to a configuration using serializers. Parsers and serializers can be either system-agnostic, as for line-oriented configurations such as MySQL’s, or system-specific, as is the case for Apache. Below is a conceptual depiction of the configuration file.
Configuration Example
[section "NS"]
N1 = V1 //Directive D1
N2 = V2 //Directive D2
N3 = V3 //Directive D3
N4 = V4 //Directive D4
Square nodes maintain the structure of the configuration file, while round nodes are the "strings" that appear in the configuration file.
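The parse/serialize round trip described above can be sketched as follows. This is a minimal illustration for a line-oriented section only; the `Node` class and the `parse_section` and `serialize_section` functions are hypothetical names, not ConfErr's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One component of the abstract representation (hypothetical class)."""
    kind: str                      # "section", "directive", "name", or "value"
    text: str = ""                 # section label, or the string of a name/value leaf
    children: List["Node"] = field(default_factory=list)


def parse_section(name: str, lines: List[str]) -> Node:
    """Parse 'N = V' lines into section -> directive -> (name, value)."""
    section = Node("section", name)
    for line in lines:
        key, value = (part.strip() for part in line.split("=", 1))
        directive = Node("directive", children=[Node("name", key), Node("value", value)])
        section.children.append(directive)   # sibling order is preserved
    return section


def serialize_section(section: Node) -> List[str]:
    """Turn the abstract representation back into 'N = V' lines."""
    return [f"{d.children[0].text} = {d.children[1].text}" for d in section.children]


lines = ["N1 = V1", "N2 = V2", "N3 = V3", "N4 = V4"]
tree = parse_section("NS", lines)
print(serialize_section(tree) == lines)
```

The sketch keeps the three properties listed earlier: each node has a type, containment is expressed through `children`, and sibling order is the list order.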
Error models
HEX is a "human error constraint solver" that injects realistic human errors into strings. Given a grammar or a constraint on the value of a string, HEX can generate new strings that either violate or still satisfy that grammar or constraint.
There are 4 types of errors:
- Swap adjacent elements in a list
- Replace an element in a list with another one
- Remove elements in a list
- Insert elements in a list
All error models work on lists. A list is an ordered collection of configuration components that have common properties, e.g., the characters of a configuration directive's value (i.e., a string) or the names of the directives in a section. Therefore, error models can be applied both to strings and to higher-level configuration components, by abstracting the higher-level components to a string.
Defining the error models formally
| Error model | Input | Output |
| Swap | <a, b, c> | {<b, a, c>, <a, c, b>} |
| Replace | <a, b, c>, alphabet(x) | {<alphabet(a), b, c>, <a, alphabet(b), c>, <a, b, alphabet(c)>} |
| Remove | <a, b, c> | {<b, c>, <a, c>, <a, b>} |
| Insert | <a, b>, alphabet(x) | {<alphabet(a), a, b>, <a, alphabet(a), b>, <a, alphabet(b), b>, <a, b, alphabet(b)>} |
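The four error models in the table can be sketched over Python tuples. This is an illustrative reading of the table, not ConfErr's implementation; the function names are hypothetical, and `alphabet` is assumed to be any function that maps an element to a list of candidate elements.

```python
def swap(items):
    """Swap every pair of adjacent elements."""
    items = tuple(items)
    out = []
    for i in range(len(items) - 1):
        mutated = list(items)
        mutated[i], mutated[i + 1] = mutated[i + 1], mutated[i]
        out.append(tuple(mutated))
    return out


def replace(items, alphabet):
    """Replace each element x with every element of alphabet(x)."""
    items = tuple(items)
    return [items[:i] + (y,) + items[i + 1:]
            for i, x in enumerate(items) for y in alphabet(x)]


def remove(items):
    """Remove one element at a time."""
    items = tuple(items)
    return [items[:i] + items[i + 1:] for i in range(len(items))]


def insert(items, alphabet):
    """Insert every element of alphabet(x) right before and right after x."""
    items = tuple(items)
    out = []
    for i, x in enumerate(items):
        for y in alphabet(x):
            out.append(items[:i] + (y,) + items[i:])          # before x
            out.append(items[:i + 1] + (y,) + items[i + 1:])  # after x
    return out


upper = lambda x: [x.upper()]   # stand-in alphabet, for illustration only
print(swap("abc"))
print(insert("ab", upper))
```

Because the functions only rely on indexing, the same code applies to a string's characters and to a list of directive names.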
Definitions of alphabet
For operations on strings, with x being the input.
| Name | Definition | Example | Replace | Insert |
| KeyboardLayout | x -> {keysNear(x)} | 'g' -> {'t', 'v', 'f', 'h'} | Yes | Yes |
| ChangeCase | x -> {shift(x), alt(x)} | 'q' -> {'Q', '@'} | Yes | No |
| Phoneme | x -> {phoneme(x)} | 'f' -> {"ph"} | Yes | No |
| AllCharacters | x -> {everyLatinCharacter} | 'a' -> {'a', 'b', 'c', 'd', ...} | Yes | Yes |
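A few of these string alphabets can be sketched as plain functions. The neighbor table below is deliberately partial (only the 'g' entry comes from the table above), `alt(x)` is omitted because it depends on the keyboard layout, and treating "every Latin character" as the 26 lowercase ASCII letters is an assumption. All names are hypothetical.

```python
import string

# Partial, illustrative neighbor table; only the 'g' entry is taken from the text.
KEYS_NEAR = {"g": ["t", "v", "f", "h"]}


def keyboard_layout(x):
    """KeyboardLayout: keys physically near x (empty if x is not in the table)."""
    return KEYS_NEAR.get(x, [])


def change_case(x):
    """ChangeCase, shift(x) only; alt(x) is layout-dependent and left out."""
    return [x.swapcase()] if x.swapcase() != x else []


def all_characters(x):
    """AllCharacters: assumed here to mean the lowercase ASCII letters."""
    return list(string.ascii_lowercase)


def replace_chars(word, alphabet):
    """Apply the Replace error model to a string with the given alphabet."""
    return [word[:i] + sub + word[i + 1:]
            for i, ch in enumerate(word) for sub in alphabet(ch)]


print(replace_chars("go", keyboard_layout))
```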
For operations on configuration components (i.e., sections and directives), with x being the input. The examples refer to the DAG defined above.
| Name | Definition | Example | Replace | Insert |
| Siblings | x -> {y : parent(x) == parent(y) && x.type == y.type} | D1 -> {D2, D3, D4} | Yes | Yes |
| Related | x -> {y : depth(x) == depth(y) && x.type == y.type} | N1 -> {N2, N3, N4} | Yes | Yes |
| Vicinity | x -> {y : preorder[i] == x && preorder[j] == y && abs(i - j) < 2 && x.type == y.type} | V2 -> {V1, N2, N3, V3} | Yes | Yes |
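The Siblings and Related alphabets can be sketched on a small tree that mirrors the configuration example (section NS with directives D1..D4, each holding a name Ni and a value Vi). This is a sketch under assumptions: the `Node` class and helper names are hypothetical, and x itself is excluded from its own alphabet because the examples in the table exclude it. Vicinity is not sketched here.

```python
class Node:
    """Hypothetical tree node: a label, a component type, and a parent link."""
    def __init__(self, label, kind, parent=None):
        self.label, self.kind, self.parent = label, kind, parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def depth(node):
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d


def all_nodes(root):
    """Pre-order traversal of the tree."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def siblings(x):
    """Siblings: same parent and same type as x (x itself excluded)."""
    if x.parent is None:
        return []
    return [y for y in x.parent.children if y is not x and y.kind == x.kind]


def related(x, root):
    """Related: same depth and same type as x (x itself excluded)."""
    return [y for y in all_nodes(root)
            if y is not x and depth(y) == depth(x) and y.kind == x.kind]


def build_example():
    """Build the NS / D1..D4 / Ni, Vi tree and index its nodes by label."""
    root = Node("NS", "section")
    nodes = {"NS": root}
    for i in range(1, 5):
        d = Node(f"D{i}", "directive", root)
        nodes[f"D{i}"] = d
        nodes[f"N{i}"] = Node(f"N{i}", "name", d)
        nodes[f"V{i}"] = Node(f"V{i}", "value", d)
    return nodes


nodes = build_example()
print([y.label for y in siblings(nodes["D1"])])
print([y.label for y in related(nodes["N1"], nodes["NS"])])
```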
Error scores for error models specific to strings
| Error Model | Score |
| Phoneme-Based Replace Characters | 2.3 |
| Swap Adjacent Characters | 1.7 |
| Replace Characters With Keyboard Neighbors | 0.6 |
| Delete Characters | 5.1 |
| Change Case Modifier | 0.04 |
The scores were computed using the most frequently misspelled words from Wikipedia (more than 4000 of them). The current version of the system injects at most 2 errors into each list; the score of a mutated list is the sum of the scores of the errors that affected it.
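The summation rule can be written down directly. The dictionary keys below are hypothetical short names for the error models in the table; the numbers are the scores listed there.

```python
# Scores from the table above, keyed by hypothetical short names.
SCORES = {
    "phoneme_replace": 2.3,
    "swap_adjacent": 1.7,
    "keyboard_replace": 0.6,
    "delete": 5.1,
    "change_case": 0.04,
}
MAX_ERRORS = 2  # the current version injects at most 2 errors per list


def score(errors):
    """Score of a mutated list: the sum of the scores of its injected errors."""
    if len(errors) > MAX_ERRORS:
        raise ValueError("at most 2 errors are injected into each list")
    return sum(SCORES[e] for e in errors)


# A deletion followed by a swap of adjacent characters: 5.1 + 1.7
print(score(["delete", "swap_adjacent"]))
```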
Smart error injectors
A smart error injector understands the values of directives, converts these values into a tree that can be fed to the error models, and provides new alphabet functions to the insert and replace error models.
There are 2 smart error injectors in ConfErr:
- AbstractToList: interprets the value of a directive as a list of elements, and constructs an alphabet function that always returns this list of elements. For example, one can configure the modules loaded by the IIS StaticFile handler by specifying the value of the modules directive. If this value is StaticFileModule,DefaultDocumentModule,DirectoryListingModule, then AbstractToList generates a list with three elements, creates an alphabet function f that returns this list, and uses f for the previously defined error models.
- Alternatives: uses a grammar, a set of constraints, or a user-defined dictionary to generate an alphabet function for the value of a directive. For example, in IIS, an enabled directive can have the value true or false, and if the value of the directive is true, then Alternatives will generate the alphabet function f : {true} -> {false}.
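Both injectors boil down to building an alphabet function from a directive's value. A minimal sketch, with hypothetical function names and with a plain list of allowed values standing in for the grammar, constraints, or dictionary:

```python
def abstract_to_list(value, separator=","):
    """Split a directive value into elements; the alphabet always returns them all."""
    elements = value.split(separator)

    def alphabet(_x):
        return elements

    return elements, alphabet


def alternatives(allowed):
    """Alphabet that maps a value to the other allowed values."""
    def alphabet(x):
        return [v for v in allowed if v != x]

    return alphabet


elements, f = abstract_to_list(
    "StaticFileModule,DefaultDocumentModule,DirectoryListingModule")
print(len(elements))                          # three list elements
print(alternatives(["true", "false"])("true"))
```

The returned `elements` and alphabet can then be handed to the swap, replace, remove, and insert error models unchanged.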
Policy Breaker
ConfErr can use the above error models to generate valid configuration files that break user-defined policies using the PolicyBreaker. Such a policy consists of constraints over the values of configuration directives. For example, one can specify that the maximum number of clients served by Apache HTTPD must be greater than 100 but less than 600. If the value of this directive is 200, then PolicyBreaker will use the error models to generate a value that is outside of the (100, 600) range; a possible value is 2009 (the user pressed both 0 and 9 when typing the last 0).
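The 200 -> 2009 example can be reproduced with an Insert error model restricted to keyboard neighbors, followed by a policy check. This is only a sketch of the idea: the neighbor table is a hypothetical fragment covering the digits the example needs, and the function names are not PolicyBreaker's.

```python
# Hypothetical fragment of a digit-row neighbor table.
DIGIT_NEIGHBORS = {"0": ["9"], "2": ["1", "3"]}


def insert_neighbors(value):
    """Insert a neighboring key before or after each character of value."""
    out = []
    for i, ch in enumerate(value):
        for n in DIGIT_NEIGHBORS.get(ch, []):
            out.append(value[:i] + n + value[i:])          # neighbor typed first
            out.append(value[:i + 1] + n + value[i + 1:])  # neighbor typed second
    return out


def break_policy(value, policy):
    """Keep mutated values that are still numbers but violate the policy."""
    return sorted({v for v in insert_neighbors(value)
                   if v.isdigit() and not policy(int(v))})


max_clients_policy = lambda n: 100 < n < 600
print(break_policy("200", max_clients_policy))
```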
Security vulnerability scanners
How security vulnerability scanners work:
- Send HTTP requests to a web site / web application. These requests are generated in one of three ways:
  - Use a database of HTTP requests targeted at a specific web server or specific web application. Examples include site/sites/default/settings.php, targeting Drupal installations.
  - Request an ordinary page and look at some of the headers, e.g., that the server is Apache and that its version is 1.3
  - Look at links, cookies, and forms to find injection points
- Analyze the response for the presence of particular strings. If such a string appears, then a vulnerability is flagged
- Use internal or external tools to exploit the vulnerabilities
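The request-then-match loop of such a scanner can be sketched without any network code. Here `fetch` is a hypothetical callable standing in for an HTTP GET, and the signature string is made up for illustration; neither is taken from any of the tools discussed.

```python
def scan(fetch, checks):
    """Request each path and flag it if the signature appears in the response.

    fetch(path) returns the response body as a string, or None if unavailable.
    checks is a list of (path, signature, description) tuples.
    """
    findings = []
    for path, signature, description in checks:
        body = fetch(path)
        if body is not None and signature in body:
            findings.append((path, description))
    return findings


# Fake server used only for this example: a dict from path to response body.
fake_site = {"site/sites/default/settings.php": "<?php $db_password = 'secret';"}
checks = [
    ("site/sites/default/settings.php", "db_password", "settings file is readable"),
    ("admin/", "Index of", "directory listing enabled"),
]
print(scan(fake_site.get, checks))
```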
I tested multiple tools that perform HTTP requests against web servers and web applications in search of security vulnerabilities:
| Tool | Observations |
| Nikto | |
| Skipfish | |
| Webshag | |
| W3AF | Includes internal exploit tool. Includes Nikto |
| OpenVAS | Can be used together with Metasploit (an exploit framework) |
| Arachni | Too expensive. Takes ~1h to run a scan |
| HP WebInspect | Commercial. Can run only against a demo web site |
| IBM AppScan | Commercial. Can run only against a demo web site |
The overarching question is how to tell whether an error will cause a security vulnerability.
The naive answer is to try out all possible errors and run the security-assessing tools against the newly obtained configurations.
Challenges:
- For each configuration parameter, the number of errors that can be injected into it can be very large. Therefore, it is impractical to generate all possible configuration parameter values that are affected by a realistic error. Solutions:
  - Use a grammar that specifies, for each configuration parameter, the legal values of that parameter, and then inject only errors that do not violate the grammar. The rationale is that a value that breaks the grammar will not be accepted by the system, the system will not start, and no security vulnerability will be exploitable. The grammar can be received as input or learned from example configuration files. Challenges with this solution:
    - When generating new configuration parameters that respect the grammar, we cannot distinguish between them in terms of their effects. Therefore, to be conservative, we cannot exclude any of them, and the number of generated configuration files might still be very high.
  - Use concolic execution. See exactly how the behavior of a system is influenced by a configuration parameter and generate new parameters that contain realistic errors but explore new execution paths. Challenges with this solution:
    - Symbolic execution is not able to distinguish between the effects on the security of a system. The same execution path might or might not enable an exploit. For example, if a system executes commands from a file at startup, as part of its configuration process, then it matters which file is referenced by the configuration file. If the file is publicly available, then the system can be hacked; otherwise, it cannot. However, symbolic execution cannot distinguish between the names of the two files or their contents. Therefore, for some configuration parameters, multiple values that take the program along the same path must be accepted. Again, this might keep the number of possible configurations to test at an impractical level.
  - Use AFEX to prioritize testing. This is orthogonal to the previous 2 solutions, in that AFEX will guide the selection of tests in a quest to optimize the value of a metric that quantifies the security of a system.
- How to tell exactly the effect of a configuration error on the security of the system? The question here is whether the fact that we have access to the entire system can be leveraged to improve testing tools.
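The grammar-based solution above amounts to filtering mutated values through the parameter's grammar. A minimal sketch, assuming the grammar is given as a regular expression and using only the Remove and Swap error models; the function names and the example grammars are hypothetical.

```python
import re


def mutations(value):
    """Remove one character or swap two adjacent characters of value."""
    out = set()
    for i in range(len(value)):
        out.add(value[:i] + value[i + 1:])                             # remove
    for i in range(len(value) - 1):
        out.add(value[:i] + value[i + 1] + value[i] + value[i + 2:])   # swap
    out.discard(value)   # mutations that leave the value unchanged are not errors
    return out


def grammar_preserving(value, grammar):
    """Keep only the mutated values that the grammar still accepts."""
    pattern = re.compile(grammar)
    return sorted(v for v in mutations(value) if pattern.fullmatch(v))


# A numeric parameter: every mutation of "8080" is still a legal number.
print(grammar_preserving("8080", r"[0-9]+"))
# An enumerated parameter: no single mutation of "On" is legal.
print(grammar_preserving("On", r"On|Off"))
```

The first call illustrates the challenge noted in the list: for permissive grammars almost every mutation survives the filter, so the number of configurations to test stays high.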
Using symbolic execution
- Concolic:
  - For each configuration parameter
    - Store the configuration parameter's value as ConfigValue
    - Mark the configuration parameter's value as being a symbolic value
    - Whenever the state forks, keep only the state for which ConfigValue is a solution. This will transform symbolic execution into concolic execution.
    - Save the constraints generated at each fork
    - For each constraint prefix P(param) = C1(param) ^ C2(param) ^ ... ^ Cn(param), generate param' = inject_errors(param) such that P'(param') = C1(param') ^ C2(param') ^ ... ^ not(Cn(param')) evaluates to true. A new path is explored due to the error.
    - For each constraint prefix P(param) = C1(param) ^ C2(param) ^ ... ^ Cn(param), generate param' = inject_errors(param) such that P(param') = C1(param') ^ C2(param') ^ ... ^ Cn(param') evaluates to true. The same path is explored, but now with an "erroneous" value.
  - Challenge: might have to do a lot of injections to get P(param') or P'(param') to evaluate to true. Solution:
    - Use HAMPI + STP (Johannes' input)
      - encode all possible errors on param as a HAMPI constraint
      - get the STP expression of the HAMPI constraint
      - generate a new STP query by combining the STP expression of HAMPI with the STP constraints obtained by concolically running the system
      - pass the solution to this query to HAMPI
      - obtain a new string that satisfies P or P'
  - Challenge: how many values to generate for P(param') and how many for P'(param')?
  - Challenge: should the symbolic exploration be concolic, symbolic, or something in between?
    - Something in between: concolic while in the config parser and symbolic afterward?
      - What's the difference? The system forks every time after reading the configuration. It will explore more code in the same amount of time, because time will not be spent running security-assessing tools.
      - What's the downside? We are interested in discovering security issues, and we can't find these without running the tools against the server.
- I wonder if running Daikon against the server running with the initial configuration can discover some invariants, and then use symbolic execution to see if those invariants can be violated.
- Or, I could read the source code of the security-assessing tools and see exactly what they do. W3AF has the option to exploit vulnerabilities. It might be interesting to see what those exploits need from the system. Basically, the idea is to distill what the security-assessing tools are doing into constraints on the state of the program.
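The split between P(param') and P'(param') can be illustrated with a brute-force filter over already-mutated values. This stands in for the solver-based approach (HAMPI + STP) and is exactly the "many injections" strategy the notes call a challenge; the constraints are plain Python predicates invented for the example, not constraints produced by S2E.

```python
def classify(candidates, constraints):
    """Sort mutated values by which path condition they satisfy.

    constraints is the prefix [C1, ..., Cn] collected along the original path.
    Returns (same_path, new_path):
      same_path: values satisfying C1 ^ ... ^ Cn          (P)
      new_path:  values satisfying C1 ^ ... ^ not(Cn)     (P')
    Values that break an earlier constraint belong to a shorter prefix
    and are dropped here.
    """
    same_path, new_path = [], []
    for value in candidates:
        if not all(c(value) for c in constraints[:-1]):
            continue
        if constraints[-1](value):
            same_path.append(value)
        else:
            new_path.append(value)
    return same_path, new_path


# Hypothetical path condition for a numeric parameter whose original value is "200".
constraints = [str.isdigit, lambda v: int(v) < 256]
print(classify(["20", "2009", "2o0"], constraints))
```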
Implementation
The ConfErr WebService is available on yoda3.
| System | Link | Observations |
| IIS 7 | IIS 7 | Can mark symbolically only the values of directives |
| Apache 2.0.53 | Apache 2.0.53 | Can mark symbolically names and values of directives and sections |
| Lighttpd 1.4.29 | Lighttpd 1.4.29 | Can mark symbolically names and values of directives and sections |
S2E is able to run IIS concolically. I found where Apache and Lighttpd read their configuration files, so I expect to make those systems run concolically with little additional effort.
Currently, there is a bug when the symbolic value propagates to other variables. - FIXED!
HEX is implemented.
Converting an automaton to a regex is implemented.