pantheonrl.algos.bc
Behavioural Cloning (BC). Trains a policy by applying supervised learning to a fixed dataset of (observation, action) pairs generated by an expert demonstrator.
https://github.com/HumanCompatibleAI/imitation/blob/master/src/imitation/algorithms/bc.py
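The idea of fitting a policy to expert (observation, action) pairs can be illustrated with a deliberately tiny sketch. It is not the code in this module: it swaps the trained policy network for a tabular majority-vote classifier over discrete observations, and the function name `fit_tabular_bc` and the demonstration data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def fit_tabular_bc(
    demos: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Hashable]:
    """Map each observation to the action the expert chose most often.

    Hypothetical helper for illustration only; a tabular stand-in for
    supervised training of a policy on a fixed demonstration dataset.
    """
    counts = defaultdict(Counter)
    for obs, action in demos:
        counts[obs][action] += 1
    # For every observation seen, pick the expert's most frequent action.
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}


# Hypothetical expert demonstrations: (observation, action) pairs.
demos = [
    ("left_wall", "right"),
    ("left_wall", "right"),
    ("left_wall", "up"),
    ("goal_ahead", "forward"),
]
policy = fit_tabular_bc(demos)
```

The dataset is fixed before training and the learner never queries the expert again, which is the defining property of BC described above.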
Functions
Reconstruct a saved policy.
Args:
  policy_path: path where .save_policy() has been run.
  device: device on which to load the policy.
Returns:
  policy: policy with reloaded weights.
Classes
Behavioral cloning (BC).
Shell class for BC policy.
A callable that returns a constant learning rate.
Wraps a DataLoader so that all BC batches can be processed in a single for-loop. Also uses tqdm to show progress in stdout.
Args:
  data_loader: An iterable over data dicts, as used in BC.
  n_epochs: The number of epochs to iterate through in one call to __iter__. Exactly one of n_epochs and n_batches should be provided.
  n_batches: The number of batches to iterate through in one call to __iter__. Exactly one of n_epochs and n_batches should be provided.
  on_epoch_end: A callback function without parameters to be called at the end of every epoch.
  on_batch_end: A callback function without parameters to be called at the end of every batch.
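
The wrapper described above can be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not the class shipped in this module: the class name `EpochOrBatchIterator` is hypothetical, the tqdm progress bar is omitted, and the empty-loader check is an added assumption. Only the five documented arguments are taken from the docstring.

```python
from typing import Callable, Iterable, Iterator, Optional


class EpochOrBatchIterator:
    """Hypothetical sketch: repeat a data loader for n_epochs or n_batches."""

    def __init__(
        self,
        data_loader: Iterable[dict],
        n_epochs: Optional[int] = None,
        n_batches: Optional[int] = None,
        on_epoch_end: Optional[Callable[[], None]] = None,
        on_batch_end: Optional[Callable[[], None]] = None,
    ):
        # Exactly one stopping criterion must be provided.
        if (n_epochs is None) == (n_batches is None):
            raise ValueError("Provide exactly one of n_epochs and n_batches.")
        self.data_loader = data_loader
        self.n_epochs = n_epochs
        self.n_batches = n_batches
        self.on_epoch_end = on_epoch_end
        self.on_batch_end = on_batch_end

    def __iter__(self) -> Iterator[dict]:
        epochs_done = 0
        batches_done = 0
        while True:
            got_batch = False
            for batch in self.data_loader:
                got_batch = True
                yield batch
                batches_done += 1
                if self.on_batch_end is not None:
                    self.on_batch_end()
                # Stop mid-epoch once the batch budget is used up.
                if self.n_batches is not None and batches_done >= self.n_batches:
                    return
            if not got_batch:
                # Assumption: an empty loader is an error, to avoid looping forever.
                raise ValueError("data_loader yielded no batches.")
            epochs_done += 1
            if self.on_epoch_end is not None:
                self.on_epoch_end()
            if self.n_epochs is not None and epochs_done >= self.n_epochs:
                return


# Usage: a plain list stands in for a DataLoader of data dicts.
loader = [{"obs": 0}, {"obs": 1}, {"obs": 2}]
seen = [batch["obs"] for batch in EpochOrBatchIterator(loader, n_batches=4)]
# seen is [0, 1, 2, 0]: the loader is restarted until four batches are drawn.
```

Folding both stopping rules into one iterator lets the training loop stay a single `for batch in ...:` statement regardless of whether the budget is expressed in epochs or in batches.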