algorithms.RL_Algorithm package¶
Subpackages¶
- algorithms.RL_Algorithm.GAIL package
- algorithms.RL_Algorithm.optimizers package
  - Subpackages
    - algorithms.RL_Algorithm.optimizers.utils package
      - Submodules
        - algorithms.RL_Algorithm.optimizers.utils.math module
        - algorithms.RL_Algorithm.optimizers.utils.replay_memory module
        - algorithms.RL_Algorithm.optimizers.utils.tools module
        - algorithms.RL_Algorithm.optimizers.utils.torch module
        - algorithms.RL_Algorithm.optimizers.utils.zfilter module
      - Module contents
  - Submodules
    - algorithms.RL_Algorithm.optimizers.trpo module
  - Module contents
Submodules¶
algorithms.RL_Algorithm.utils module¶
- class algorithms.RL_Algorithm.utils.RewardHandler(use_env_rewards=True, critic_clip_low=-inf, critic_clip_high=inf, critic_initial_scale=1.0, critic_final_scale=1.0, recognition_initial_scale=1, recognition_final_scale=1.0, augmentation_scale=1.0, normalize_rewards=False, alpha=0.01, max_epochs=10000, summary_writer=None)[source]¶

  Bases: object
  - merge(paths, critic_rewards=None, recognition_rewards=None)[source]¶

    Add critic and recognition rewards to path rewards based on settings.

    Args:
      paths: list of dictionaries as described in process_samples
      critic_rewards: list of numpy arrays of equal shape as the corresponding path['rewards']
      recognition_rewards: same as critic_rewards
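
The listing above gives only the signatures. As a rough illustration, the sketch below shows how a RewardHandler might be constructed and its merge method called. The layout of the path dictionaries, the reward shapes, and the assumption that merge returns the updated paths are illustrative guesses, not taken from the package itself.

```python
import numpy as np
from algorithms.RL_Algorithm.utils import RewardHandler

# Construct a handler; keyword values mirror the defaults shown in the
# class signature above, with the scales written out for clarity.
handler = RewardHandler(
    use_env_rewards=True,         # keep the environment reward in the merged signal
    critic_initial_scale=1.0,     # weight applied to critic rewards
    recognition_initial_scale=1.0,
    normalize_rewards=False,
)

# Hypothetical sampled paths: each dict carries a per-timestep 'rewards' array,
# as described for process_samples (exact layout assumed here for illustration).
paths = [{'rewards': np.zeros(5)}, {'rewards': np.zeros(3)}]

# Critic and recognition rewards must match the shape of each path['rewards'].
critic_rewards = [np.random.randn(5), np.random.randn(3)]
recognition_rewards = [np.random.randn(5), np.random.randn(3)]

# Merge the auxiliary rewards into the path rewards; whether merge returns the
# paths or updates them in place is not stated above, so a return is assumed.
paths = handler.merge(paths, critic_rewards, recognition_rewards)
```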