Symbooglix ICST2016 Appendix

This page contains extra data that could not fit into the ICST2016 page limit that some may find useful and is noted as being available as "on the project website ".

Run of Symbooglix snapshots on all benchmarks

To optimise Symbooglix we ran it on a randomly selected subset (10%) of each benchmark suite (SV-COMP and GPU) which we refer to as the "training set". We then optimised Symbooglix using only this "training set" taking periodic snapshots of the tool and running it on those bencharks.

In the paper we provided "quantile plots" for each benchmark suite where we ran each Symbooglix snapshot on the "training set". We now show these plots along side quanitle plots where those snapshots have been run on all the benchmarks. This was done after we had performed optimisation so we did not have access to this data during our optimisation process to guide our optimisations.



Result type counts for GPU

Here are the classification counts when running on the training set.

Snapshot Verified Bug found False alarm Unknown
Baseline 33 3 0 21
GlobalDDE 33 3 0 21
GotoAssumeLA 33 3 0 21
ExprSimpl 33 3 0 21
ConstrIndep 33 3 0 21
RemSomeRecur 37 3 0 17
RemSomeDbg 37 3 0 17
MapConstIdx 37 3 0 17
MapSymIdx 37 3 0 17
EffcntClone 37 3 0 17
SimplSolv 37 3 0 17

Here are the classification counts when running on the whole GPU benchmark suite.

Snapshot Verified Bug found False alarm Unknown
Baseline 260 29 0 290
GlobalDDE 260 29 0 290
GotoAssumeLA 261 31 0 287
ExprSimpl 268 31 0 280
ConstrIndep 266 31 0 282
RemSomeRecur 301 35 0 243
RemSomeDbg 301 35 0 243
MapConstIdx 303 35 0 241
MapSymIdx 302 35 0 242
EffcntClone 300 35 0 244
SimplSolv 301 35 0 243

Looking at the table above a few interesting things can be observed.

Result type counts for SVCOMP

Here are the classification counts when running on the training set.

Snapshot Verified Bug found False alarm Unknown
Baseline 0 0 0 374
GlobalDDE 9 35 0 330
GotoAssumeLA 9 36 0 329
ExprSimpl 14 37 0 323
ConstrIndep 15 37 0 322
RemSomeRecur 15 39 0 320
RemSomeDbg 15 39 0 320
MapConstIdx 15 38 0 321
MapSymIdx 17 40 0 317
EffcntClone 17 40 0 317
SimplSolv 17 40 0 317

One observation about the above table is that between MapConstIdx and RemSomeDbg one less bug was found. This was the array-examples/standard_seq_init_true-unreach-call_ground.i_.bpl benchmark. In RemSombDbg the bug was found very close to the timeout and so the either result inconsistency (due to running on our compute cluster) or slight overhead added by MapConstIdx lead to this benchmark timing out for MapConstIdx.

Here are the classification counts when running on the whole SVCOMP benchmark suite.

Snapshot Verified Bug found False alarm Unknown
Baseline 0 0 0 3749
GlobalDDE 141 339 0 3269
GotoAssumeLA 141 363 0 3245
ExprSimpl 200 390 0 3159
ConstrIndep 204 392 0 3153
RemSomeRecur 204 393 0 3152
RemSomeDbg 204 393 0 3152
MapConstIdx 204 393 0 3152
MapSymIdx 236 395 0 3118
EffcntClone 236 395 0 3118
SimplSolv 236 394 0 3119

A few observations about the above table:

Scatter plot of Symbooglix vs Duality

Below is a scatter plot of bug finding times. Each point is a single benchmark in the SV-COMP suite and its position shows the execution time of Symbooglix and Duality for that benchmark. For 236 benchmarks Symbooglix was faster than Duality and for 350 benchmarks Duality was faster than Symbooglix.