Interacting with a NASim Environment
Assuming you are comfortable loading an environment from a scenario (see Starting a NASim Environment or Starting NASim using OpenAI gym), interacting with a NASim Environment is straightforward: it follows the same interface as gymnasium.
Starting the environment
The first step is to load the environment:
import nasim
# load my environment in the desired way (make_benchmark, load, generate)
env = nasim.make_benchmark("tiny")
# or using gym
import gymnasium as gym
env = gym.make("nasim:Tiny-PO-v0")
Here we are using the default environment parameters: fully_obs=False, flat_actions=True, and flat_obs=True.
The number of actions can be retrieved from the environment's action_space attribute as follows:
# When flat_actions=True
num_actions = env.action_space.n
# When flat_actions=False
nvec_actions = env.action_space.nvec
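To illustrate how the two action-space layouts differ in practice, here is a minimal sketch of a helper that picks a random action in either mode. It assumes only that the space exposes n (flat) or nvec (non-flat) as shown above. The helper name sample_action and the stand-in spaces with their sizes are hypothetical, used so the sketch runs without NASim installed.

```python
import random
from types import SimpleNamespace


def sample_action(action_space, rng):
    """Pick a uniformly random action for either action-space layout."""
    if hasattr(action_space, "n"):
        # flat_actions=True: an action is a single index in [0, n)
        return rng.randrange(int(action_space.n))
    # flat_actions=False: one index per component, each in [0, nvec[i])
    return [rng.randrange(int(k)) for k in action_space.nvec]


# Stand-in spaces with made-up sizes (assumption, not NASim values);
# with a real environment, pass env.action_space instead.
flat_space = SimpleNamespace(n=18)
param_space = SimpleNamespace(nvec=[4, 2, 3])

rng = random.Random(0)
flat_action = sample_action(flat_space, rng)
param_action = sample_action(param_space, rng)
```

Gymnasium spaces also provide their own sample() method, so with a real environment env.action_space.sample() is usually the simpler choice.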
The shape of the observations can be retrieved from the environment's observation_space attribute as follows:
obs_shape = env.observation_space.shape
Getting the initial observation and resetting the environment
To reset the environment and get the initial observation, use the reset() function:
o, info = env.reset()
The info return value contains optional auxiliary information.
Performing a single step
A step in the environment can be taken using the step(action) function. The form of action depends on whether the environment uses flat_actions=True or flat_actions=False. For our example, which uses flat actions, we can simply pass an integer with 0 <= action < N, where N is env.action_space.n; this integer is the index of the action in the action space. The step function then returns an (Observation, float, bool, bool, dict) tuple corresponding to the observation, reward, done flag, step-limit-reached flag, and auxiliary info, respectively:
action = 0  # any integer in the range [0, env.action_space.n)
o, r, done, step_limit_reached, info = env.step(action)
If done=True then the goal has been reached and the episode is over. Alternatively, if the current scenario has a step limit and step_limit_reached=True, then the step limit has been reached. In both cases it is recommended to stop or reset the environment; otherwise there is no guarantee of what will happen (especially in the first case).
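The two ways an episode can end can be handled in one small loop. The following is a sketch rather than part of NASim: run_episode and StubEnv are hypothetical names, and StubEnv is a stand-in that only mimics the reset() and step() return shapes described above (with made-up rewards) so the example runs without NASim installed.

```python
def run_episode(env, choose_action, max_steps=1000):
    """Run one episode; return (total_reward, steps, done, step_limit_reached)."""
    o, info = env.reset()
    total_reward, steps = 0.0, 0
    done = step_limit_reached = False
    # Stop as soon as either flag is set; max_steps is an extra safety cap.
    while not (done or step_limit_reached) and steps < max_steps:
        o, r, done, step_limit_reached, info = env.step(choose_action(o))
        total_reward += r
        steps += 1
    return total_reward, steps, done, step_limit_reached


class StubEnv:
    """Hypothetical stand-in for a NASim environment (not part of NASim).

    The goal is reached after `goal_after` steps; the step limit is hit
    after `step_limit` steps. Rewards are invented for illustration.
    """

    def __init__(self, goal_after, step_limit):
        self.goal_after = goal_after
        self.step_limit = step_limit
        self.t = 0

    def reset(self):
        self.t = 0
        return [0], {}

    def step(self, action):
        self.t += 1
        done = self.t >= self.goal_after
        step_limit_reached = not done and self.t >= self.step_limit
        reward = 100.0 if done else -1.0
        return [self.t], reward, done, step_limit_reached, {}


# Goal reached on the 3rd step: two -1 step costs plus the +100 goal reward.
print(run_episode(StubEnv(goal_after=3, step_limit=10), lambda o: 0))
# → (98.0, 3, True, False)

# Step limit (5) reached long before the goal (step 50).
print(run_episode(StubEnv(goal_after=50, step_limit=5), lambda o: 0))
# → (-5.0, 5, False, True)
```

With a real environment, the same function could be called as run_episode(env, agent.choose_action), and env.reset() called again before starting the next episode.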
Visualizing the environment
You can use the render() function to get a human-readable visualization of the state of the environment. For render() to work correctly, make sure to pass render_mode="human" to the environment initialization function:
import nasim
# load my environment in the desired way (make_benchmark, load, generate)
env = nasim.make_benchmark("tiny", render_mode="human")
# or using gym
import gymnasium as gym
env = gym.make("nasim:Tiny-PO-v0", render_mode="human")
env.reset()
# render the environment
# (if render_mode="human" is not passed during initialization this will do nothing)
env.render()
An example agent
Some example agents are provided in the nasim/agents directory. Here is a quick example of a hypothetical agent interacting with the environment:
import nasim
env = nasim.make_benchmark("tiny")
agent = AnAgent(...)  # AnAgent is a placeholder for your own agent class
o, info = env.reset()
total_reward = 0
done = False
step_limit_reached = False
while not done and not step_limit_reached:
    a = agent.choose_action(o)
    o, r, done, step_limit_reached, info = env.step(a)
    total_reward += r
print("Done")
print("Total reward =", total_reward)
It’s as simple as that.
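For a concrete picture of what could stand in for AnAgent, here is a minimal sketch of an agent exposing the choose_action(o) method used in the loop above. CyclingAgent is a hypothetical class written for this example; it assumes flat_actions=True and simply tries each action index in turn, ignoring the observation.

```python
class CyclingAgent:
    """Hypothetical baseline agent: tries each flat action index in turn."""

    def __init__(self, num_actions):
        if num_actions <= 0:
            raise ValueError("num_actions must be positive")
        self.num_actions = num_actions
        self._next = 0

    def choose_action(self, o):
        # Ignores the observation and returns an index in [0, num_actions),
        # wrapping back to 0 after the last action.
        a = self._next
        self._next = (self._next + 1) % self.num_actions
        return a


agent = CyclingAgent(num_actions=3)
chosen = [agent.choose_action(None) for _ in range(5)]
print(chosen)  # [0, 1, 2, 0, 1]
```

With a real environment, the agent would be constructed as CyclingAgent(env.action_space.n) and dropped into the loop in place of AnAgent(...).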