Benchmark Scenarios
NASim ships with a number of existing scenarios. They cover a range of complexities and sizes and are intended for benchmarking algorithms. The scenarios come in two flavours: static and generated.
Note
For the full list of benchmark scenarios, see All benchmark scenarios.
Static scenarios are predefined and will be exactly the same every time they are loaded. They are defined in .yaml files in the nasim/scenarios/benchmark/ directory.
Generated scenarios are produced by the Scenario Generator from a set of parameters. While certain features of each scenario remain constant between generations (e.g. number of hosts, services, exploits), other features may change (e.g. specific host configurations, firewall settings, exploit probabilities) depending on the random seed.
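As a rough sketch of how the two flavours might be loaded, the snippet below assumes NASim's `nasim.make_benchmark(name, seed=...)` entry point. The helper names `load_benchmark` and `load_twice` are hypothetical and exist only for illustration. The import is done lazily so the helpers can be defined without NASim installed.

```python
# One static and one generated scenario name, taken from the benchmark table.
BENCHMARKS = {"static": "tiny", "generated": "tiny-gen"}


def load_benchmark(name, seed=None):
    """Load a benchmark scenario by name (assumes the nasim package is installed)."""
    import nasim  # lazy import: only needed when a scenario is actually loaded

    # Assumption: make_benchmark accepts the scenario name and an optional seed.
    return nasim.make_benchmark(name, seed=seed)


def load_twice(name, seed):
    """Load the same scenario twice with the same seed, e.g. to compare the results."""
    return load_benchmark(name, seed=seed), load_benchmark(name, seed=seed)
```

Calling `load_benchmark(BENCHMARKS["static"])` should give the same network on every load, whereas `load_benchmark(BENCHMARKS["generated"], seed=...)` may give different host configurations for different seeds.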
All benchmark scenarios
The following table provides details of each benchmark scenario currently available in NASim.
Name | Type | Subnets | Hosts | OS | Services | Processes | Exploits | PrivEscs | Actions | Observation Dims | States | Step Limit |
---|---|---|---|---|---|---|---|---|---|---|---|---|
tiny | static | 4 | 3 | 1 | 1 | 1 | 1 | 1 | 18 | 4X14 | 576 | 1000 |
tiny-hard | static | 4 | 3 | 2 | 3 | 2 | 3 | 2 | 27 | 4X18 | 9216 | 1000 |
tiny-small | static | 5 | 5 | 2 | 3 | 2 | 3 | 2 | 45 | 6X20 | 15360 | 1000 |
small | static | 5 | 8 | 2 | 3 | 2 | 3 | 2 | 72 | 9X23 | 24576 | 1000 |
small-honeypot | static | 5 | 8 | 2 | 3 | 2 | 3 | 2 | 72 | 9X23 | 24576 | 1000 |
small-linear | static | 7 | 8 | 2 | 3 | 2 | 3 | 2 | 72 | 9X22 | 24576 | 1000 |
medium | static | 6 | 16 | 2 | 5 | 3 | 5 | 3 | 192 | 17X27 | 393216 | 2000 |
medium-single-site | static | 2 | 16 | 2 | 5 | 3 | 5 | 3 | 192 | 17X34 | 393216 | 2000 |
medium-multi-site | static | 7 | 16 | 2 | 5 | 3 | 5 | 3 | 192 | 17X29 | 393216 | 2000 |
tiny-gen | generated | 4 | 3 | 1 | 1 | 1 | 1 | 1 | 18 | 4X14 | 576 | 1000 |
tiny-gen-rgoal | generated | 4 | 3 | 1 | 1 | 1 | 1 | 1 | 18 | 4X14 | 576 | 1000 |
small-gen | generated | 5 | 8 | 2 | 3 | 2 | 3 | 2 | 72 | 9X23 | 24576 | 1000 |
small-gen-rgoal | generated | 5 | 8 | 2 | 3 | 2 | 3 | 2 | 72 | 9X23 | 24576 | 1000 |
medium-gen | generated | 6 | 16 | 2 | 5 | 2 | 5 | 2 | 176 | 17X26 | 196608 | 2000 |
large-gen | generated | 8 | 23 | 3 | 7 | 3 | 7 | 3 | 322 | 24X32 | 4521984 | 5000 |
huge-gen | generated | 11 | 38 | 4 | 10 | 4 | 10 | 4 | 684 | 39X40 | 2.39E+08 | 10000 |
pocp-1-gen | generated | 10 | 35 | 2 | 50 | 2 | 60 | 2 | 2310 | 36X75 | 1.51E+19 | 30000 |
pocp-2-gen | generated | 21 | 95 | 3 | 10 | 3 | 30 | 3 | 3515 | 96X48 | 1.49E+08 | 30000 |
The number of actions is calculated as Hosts X (Exploits + PrivEscs + 4). The +4 is for the 4 scans available for each host (OSScan, ServiceScan, ProcessScan, and SubnetScan).
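The action formula can be checked against the table with a few lines of Python. This is a minimal sketch: the function name `num_actions` is hypothetical, and the input values are copied from the Hosts, Exploits, and PrivEscs columns.

```python
def num_actions(hosts, exploits, priv_escs, scans_per_host=4):
    """Action count: Hosts X (Exploits + PrivEscs + 4 scans)."""
    return hosts * (exploits + priv_escs + scans_per_host)


# (hosts, exploits, priv_escs) for a few rows of the benchmark table
scenarios = {
    "tiny": (3, 1, 1),
    "small": (8, 3, 2),
    "medium": (16, 5, 3),
    "huge-gen": (38, 10, 4),
}

action_counts = {name: num_actions(*args) for name, args in scenarios.items()}
for name, count in action_counts.items():
    print(f"{name}: {count} actions")
```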
The number of states is calculated as Hosts X 2^(3 + OS + Services + Processes) X 3. Here the first 3 comes from the compromised, reachable and discovered features of the state, and the base of 2 is due to all state features being boolean (present/absent). The second 3 comes from the number of access levels possible on a host.
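The States column can be reproduced in the same way. In this sketch the function name `num_states` is hypothetical, and one boolean feature per process is counted in the exponent alongside the OS and service features, since that is what matches the values in the table.

```python
BASE_FEATURES = 3   # compromised, reachable, discovered
ACCESS_LEVELS = 3   # number of possible access levels on a host


def num_states(hosts, num_os, services, processes):
    """State count: Hosts X 2^(boolean features per host) X access levels."""
    boolean_features = BASE_FEATURES + num_os + services + processes
    return hosts * 2 ** boolean_features * ACCESS_LEVELS


# (hosts, OS, services, processes) for a few rows of the benchmark table
scenarios = {
    "tiny": (3, 1, 1, 1),
    "tiny-hard": (3, 2, 3, 2),
    "medium": (16, 2, 5, 3),
    "large-gen": (23, 3, 7, 3),
}

state_counts = {name: num_states(*args) for name, args in scenarios.items()}
for name, count in state_counts.items():
    print(f"{name}: {count} states")
```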
The table below reports the mean number of steps to reach the goal and the mean total reward (+/- standard deviation) for a uniform random agent, averaged over 100 runs.
Scenario Name | Steps | Total Reward |
---|---|---|
tiny | 108.02 +/- 43.82 | 91.98 +/- 43.82 |
tiny-hard | 135.31 +/- 65.56 | 21.05 +/- 85.45 |
tiny-small | 319.56 +/- 124.26 | -225.86 +/- 167.14 |
small | 501.94 +/- 181.40 | -469.80 +/- 241.99 |
small-honeypot | 448.72 +/- 151.62 | -476.08 +/- 222.41 |
small-linear | 566.00 +/- 177.08 | -555.08 +/- 241.06 |
medium | 1371.45 +/- 420.41 | -1875.29 +/- 660.62 |
medium-single-site | 654.89 +/- 385.76 | -782.17 +/- 581.14 |
medium-multi-site | 1060.94 +/- 389.86 | -1394.71 +/- 590.89 |
tiny-gen | 86.56 +/- 40.16 | 116.43 +/- 40.15 |
tiny-gen-rgoal | 98.94 +/- 47.83 | 104.02 +/- 47.80 |
small-gen | 435.73 +/- 205.61 | -228.53 +/- 214.34 |
small-gen-rgoal | 423.52 +/- 226.68 | -218.62 +/- 240.20 |
medium-gen | 1002.94 +/- 468.10 | -788.64 +/- 481.86 |
large-gen | 2548.62 +/- 1224.08 | -2327.34 +/- 1241.92 |
huge-gen | 6303.86 +/- 2403.40 | -6075.69 +/- 2434.77 |
pocp-1-gen | 15189.46 +/- 6879.75 | -14947.80 +/- 6887.43 |
pocp-2-gen | 17211.38 +/- 5855.83 | -16871.05 +/- 5864.58 |
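The procedure behind such a table can be sketched as follows. This is not the script used to produce the numbers above: `ToyEnv` is a hypothetical stand-in environment with made-up rewards, and the use of the sample standard deviation is an assumption. A real run would replace `ToyEnv` with a NASim environment.

```python
import random
import statistics


class ToyEnv:
    """Hypothetical stand-in environment: the goal is reached when action 0 is chosen."""

    def __init__(self, num_actions=4, step_limit=1000, seed=None):
        self.num_actions = num_actions
        self.step_limit = step_limit
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0

    def sample_action(self):
        # Uniform random choice over all actions
        return self.rng.randrange(self.num_actions)

    def step(self, action):
        self.steps += 1
        done = action == 0
        reward = 100.0 if done else -1.0  # made-up goal reward and step cost
        truncated = self.steps >= self.step_limit
        return reward, done, truncated


def run_random_episode(env):
    """Run one episode with uniformly random actions; return (steps, total reward)."""
    env.reset()
    steps, total = 0, 0.0
    while True:
        reward, done, truncated = env.step(env.sample_action())
        steps += 1
        total += reward
        if done or truncated:
            return steps, total


def summarise(env, runs=100):
    """Return ((mean steps, stdev steps), (mean reward, stdev reward)) over the runs."""
    results = [run_random_episode(env) for _ in range(runs)]
    steps = [s for s, _ in results]
    rewards = [r for _, r in results]
    return (
        (statistics.mean(steps), statistics.stdev(steps)),
        (statistics.mean(rewards), statistics.stdev(rewards)),
    )


(step_mean, step_std), (reward_mean, reward_std) = summarise(ToyEnv(seed=0))
print(f"{step_mean:.2f} +/- {step_std:.2f} | {reward_mean:.2f} +/- {reward_std:.2f}")
```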
Notes on the scenarios
The tiny, small, medium, large, and huge scenarios (and their generated versions) are all based on the network scenarios first used by:
- Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. “POMDPs make better hackers: Accounting for uncertainty in penetration testing.” Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012.
- Speicher, Patrick, et al. “Towards Automated Network Mitigation Analysis (extended).” arXiv preprint arXiv:1705.05088 (2017).
The pocp-1-gen and pocp-2-gen scenarios are based on the work by:
The other scenarios were created by the author after looking at Google Images results for network layouts and experimenting with different interesting network topologies.