pantheonrl.algos.adap.agent.AdapAgent
- class AdapAgent(model, log_interval=None, working_timesteps=1000, callback=None, tb_log_name='AdapAgent', latent_syncer=None)[source]
Bases: OnPolicyAgent

Agent representing an ADAP learning algorithm.

The get_action and update functions are based on the learn function from OnPolicyAlgorithm.

- Parameters:
model (ADAP) – Model representing the agent’s learning algorithm
log_interval – Optional log interval for policy logging
working_timesteps – Estimate of the number of timesteps to train for.
callback – Optional callback fed into the OnPolicyAlgorithm
tb_log_name – Name for tensorboard log
latent_syncer (AdapPolicy | None) –
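A minimal construction sketch, assuming the ADAP model and policy classes live in pantheonrl.algos.adap.adap_learn and pantheonrl.algos.adap.policies; the module paths, constructor arguments, and environment choice below are assumptions, not taken from this page.

```python
# Sketch only: import paths and ADAP constructor arguments are assumptions.
import gym

from pantheonrl.algos.adap.agent import AdapAgent
from pantheonrl.algos.adap.adap_learn import ADAP      # assumed module path
from pantheonrl.algos.adap.policies import AdapPolicy  # assumed module path

env = gym.make("CartPole-v1")

# Underlying ADAP learning algorithm (hypothetical arguments).
model = ADAP(AdapPolicy, env, verbose=1)

# Wrap the model in an AdapAgent using the parameters documented above.
agent = AdapAgent(
    model,
    working_timesteps=1000,
    tb_log_name="AdapAgent",
)
```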
Methods

get_action(obs) – Return an action given an observation.

learn(**kwargs) – Call the model's learn function with the given parameters.

update(reward, done) – Add new rewards and done information.
- get_action(obs)[source]
Return an action given an observation.
The agent saves the last transition into its buffer. It also updates the model if the buffer is full.
- Parameters:
obs (Observation) – The observation to use
- Returns:
The action to take
- Return type:
ndarray
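A hedged sketch of driving get_action together with update by hand; in practice a PantheonRL multi-agent environment normally makes these calls for you, and the update(reward, done) signature is assumed from the method summary above.

```python
# Hedged sketch: manual interaction loop with an AdapAgent.
# Depending on your version, obs may need wrapping into PantheonRL's
# Observation type before being passed to get_action.
obs = env.reset()
for _ in range(1000):
    action = agent.get_action(obs)       # stores the transition in the buffer
    obs, reward, done, info = env.step(action)
    agent.update(reward, done)           # feed back the reward/done signal
    if done:
        obs = env.reset()
```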
- learn(**kwargs)
Call the model’s learn function with the given parameters.
- Return type:
None
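Since learn forwards its keyword arguments to the wrapped model, the call below is a sketch that assumes ADAP follows the stable-baselines3 OnPolicyAlgorithm interface, where total_timesteps is the usual argument.

```python
# Assumption: ADAP.learn accepts the standard stable-baselines3 keyword
# total_timesteps; adjust to whatever the wrapped model actually expects.
agent.learn(total_timesteps=10_000)
```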