2 \defgroup goto-symex goto-symex
6 \author Kareem Khazem, Martin Brain
8 This directory contains a symbolic evaluation system for goto-programs.
9 This takes a goto-program and translates it to an equation system by
10 traversing the program, branching and merging and unwinding loops as
11 needed. Each reverse goto has a separate counter (the actual counting is
12 handled by `cbmc`, see the `–unwind` and `–unwind-set` options). When a
13 counter limit is reach, an assertion can be added to explicitly show
14 when analysis is incomplete. The symbolic execution includes constant
15 folding so loops that have a constant number of iterations will be
16 handled completely (assuming the unwinding limit is sufficient).
18 The output of the symbolic execution is a system of equations; an object
19 containing a list of `symex_target_elements`, each of which are
20 equalities between `expr` expressions. See `symex_target_equation.h`.
21 The output is in static, single assignment (SSA) form, which is *not*
22 the case for goto-programs.
24 \section symbolic-execution Symbolic Execution
26 In the \ref goto-symex directory.
36 1 [shape=none, label=""];
37 2 [label="goto conversion"];
38 3 [shape=none, label=""];
39 1 -> 2 [label="goto-programs, goto-functions, symbol table"];
40 2 -> 3 [label="equations"];
44 \subsection symex-overview Overview
46 The \ref bmct class gets a reference to an \ref symex_target_equationt
47 "equation" (initially, an empty list of \ref symex_target_equationt::SSA_stept
48 "single-static assignment steps") and a goto-program from the frontend.
49 The \ref bmct creates a goto_symext to symbolically execute the
50 goto-program, thereby filling the equation, which can then be passed
51 along to the SAT solver.
53 The symbolic execution state is maintained by goto_symex_statet. This is
54 a memory-heavy data structure, so goto_symext creates it on-demand and
55 lets it go out-of-scope as soon as possible. However, the process of
56 symbolic execution fills out an additional \ref symbol_tablet
57 "symbol table" of dynamically-created objects; this symbol table is
58 needed when solving the equation. This symbol table must thus be
59 exported out of the state before it is torn down; this is done through
60 the parameter "`new_symbol_table`" passed as a reference to the various
61 functions in goto_symext.
63 The main symbolic execution loop code is goto_symext::symex_step(). This
64 function case-switches over the type of the instruction that we're
65 currently executing, and calls various other functions appropriate to
66 the instruction type, i.e. goto_symext::symex_function_call() if the
67 current instruction is a function call, goto_symext::symex_goto() if the
68 current instruction is a goto, etc.
70 \subsection symex-path Path exploration
72 By default, CBMC symbolically executes the entire program, building up
73 an equation representing all instructions, which it then passes to the
74 solver. Notably, it _merges_ paths that diverge due to a goto
75 instruction, forming a constraint that represents having taken either of
76 the two paths; the code for doing this is goto_symext::merge_goto().
78 CBMC can operate in a different mode when the `--paths` flag is passed
79 on the command line. This disables path merging; instead, CBMC executes
80 a single path at a time, calling the solver with the equation
81 representing that path, then continues to execute other paths.
82 The 'other paths' that would have been taken when the program branches
83 are saved onto a workqueue so that CBMC can continue to execute the
84 current path, and then later retrieve saved paths from the workqueue.
85 Implementation-wise, \ref bmct passes a worklist to goto_symext in
86 bmct::do_language_agnostic_bmc(). If path exploration is enabled,
87 goto_symext will fill up the worklist whenever it encounters a branch,
88 instead of merging the paths on the branch. After the initial symbolic
89 execution (i.e. the first call to bmct::run() in
90 bmct::do_language_agnostic_bmc()), \ref bmct continues popping the
91 worklist and executing untaken paths until the worklist empties. Note
92 that this means that the default model-checking behaviour is a special
93 case of path exploration: when model-checking, the initial symbolic
94 execution run does not add any paths to the workqueue but rather merges
95 all the paths together, so the additional path-exploration loop is
98 \subsection ssa-renaming SSA renaming levels
100 In goto-programs, variable names get a prefix to indicate their scope
101 (like `main::1::%foo` or whatever). At symbolic execution level, variables
102 also get a _suffix_ because we’re doing single-static assignment. There
103 are three “levels” of renaming. At Level 2, variables are renamed every
104 time they are encountered in a function. At Level 1, variables are
105 renamed every time the functions that contain those variables are
106 called. At Level 0, variables are renamed every time a new thread
107 containing those variables are spawned. We can inspect the renamed
108 variable names with the –show-vcc flag, which prints a string with the
109 following format: `%%s!%%d@%%d#%%d`. The string field is the variable name,
110 and the numbers after the !, @, and # are the L0, L1, and L2 suffixes
111 respectively. The following examples illustrate Level 1 and 2 renaming:
119 $ cbmc --show-vcc l1.c
124 That is, the L0 names for both xs are 0; the L1 names for both xs are 1;
125 but each occurrence of x within main() gets a fresh L2 suffix (2 and 3,
139 $ cbmc --show-vcc l2.c
148 That is, every time we enter function foo, the L1 counter of x is
149 incremented (from 1 to 2) and the L0 counter is reset (back to 2, after
150 having been incremented up to 4). The L0 counter then increases every
151 time we access x as we walk through the function.
154 \section counter-example-production Counter Example Production
156 In the \ref goto-symex directory.
159 * symex_target_equationt
162 * fault_localizationt
163 * counterexample_beautificationt
169 1 [shape=none, label=""];
170 2 [label="goto conversion"];
171 3 [shape=none, label=""];
172 1 -> 2 [label="solutions"];
173 2 -> 3 [label="counter-examples"];