algorithms.policy package
Submodules
algorithms.policy.GRUCell module
class algorithms.policy.GRUCell.GRUCell(input_size, hidden_size)
Bases: torch.nn.modules.module.Module
forward(x, h=None)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, call the Module instance rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() silently ignores them.
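The difference between calling the instance and calling forward() can be seen with a forward hook. The sketch below uses torch.nn.GRUCell as a stand-in, on the assumption that algorithms.policy.GRUCell.GRUCell takes the same (input_size, hidden_size) arguments; the hook name and tensor shapes are illustrative only.

```python
import torch
from torch import nn

# Stand-in for algorithms.policy.GRUCell.GRUCell (assumption: same constructor arguments).
cell = nn.GRUCell(input_size=4, hidden_size=8)

hook_calls = []

def record_output_shape(module, inputs, output):
    # Forward hooks receive the module, its positional inputs, and its output.
    hook_calls.append(tuple(output.shape))

handle = cell.register_forward_hook(record_output_shape)

x = torch.zeros(3, 4)  # a batch of 3 inputs with 4 features each

h_via_call = cell(x)             # runs forward() and then the registered hook
h_via_forward = cell.forward(x)  # same computation, but the hook is skipped

handle.remove()  # detach the hook once it is no longer needed
```

After both calls, hook_calls holds a single entry, because only the call through the instance triggered the hook.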
algorithms.policy.GRUNetwork module
class algorithms.policy.GRUNetwork.GRUNetwork(input_dim, output_dim, hidden_dim, gru_layer=<class 'algorithms.policy.GRUCell.GRUCell'>, output_nonlinearity=None)
Bases: torch.nn.modules.module.Module
forward(x, h=None)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, call the Module instance rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() silently ignores them.
algorithms.policy.GaussianGRUPolicy module
class algorithms.policy.GaussianGRUPolicy.GaussianGRUPolicy(env_spec, hidden_dim=32, feature_network=None, state_include_action=True, gru_layer=<class 'algorithms.policy.GRUCell.GRUCell'>, output_nonlinearity=None, mode: int = 0, log_std=0, cuda_enable=True)
Bases: torch.nn.modules.module.Module
property action_space
property distribution
forward(x, h=None)
Parameters
x – input feature
h – hidden state from the previous step (optional, defaults to None)
Returns
the action mean, the action log standard deviation, and the hidden state for the next step
get_action(observation)
Parameters
observation – input observation
Returns
the action for the given observation
get_actions(observations)
Parameters
observations – a batch of observations
Returns
the corresponding batch of actions
get_actions_with_prev(observations, prev_actions, prev_hiddens)
Parameters
observations – input batch of observations
prev_actions – previous batch of actions
prev_hiddens – previous batch of hidden states
Returns
actions for the current batch of observations
get_fim(x, actions)
Parameters
x – input observation feature
actions – input actions
Returns
the Fisher information matrix
get_kl(x, actions, h=None)
Parameters
x – input feature
actions – actions
h – hidden state (optional, defaults to None)
Returns
the KL divergence between the updated policy and the old one
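For reference, the KL divergence between two diagonal Gaussians has a closed form. The sketch below is not the body of get_kl; it assumes both policies are described by a per-dimension mean and log standard deviation (the quantities forward is documented to return), and it assumes the direction KL(old || new), which the documentation does not specify. The function name diag_gaussian_kl is hypothetical.

```python
import math

def diag_gaussian_kl(old_mean, old_log_std, new_mean, new_log_std):
    """KL(old || new) for two diagonal Gaussians, summed over action dimensions."""
    total = 0.0
    for mu_o, ls_o, mu_n, ls_n in zip(old_mean, old_log_std, new_mean, new_log_std):
        var_o = math.exp(2.0 * ls_o)  # old variance
        var_n = math.exp(2.0 * ls_n)  # new variance
        # Per-dimension closed form:
        # log(sigma_n / sigma_o) + (var_o + (mu_o - mu_n)^2) / (2 * var_n) - 1/2
        total += ls_n - ls_o + (var_o + (mu_o - mu_n) ** 2) / (2.0 * var_n) - 0.5
    return total

# Identical distributions have zero divergence.
same = diag_gaussian_kl([0.3, -1.0], [0.1, 0.2], [0.3, -1.0], [0.1, 0.2])

# Unit-variance Gaussians whose means differ by 1 in a single dimension.
shifted = diag_gaussian_kl([0.0], [0.0], [1.0], [0.0])
```

The value is zero when the two policies coincide and grows as the new policy moves away from the old one.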
get_log_prob(x, actions)
Parameters
x – input observation feature
actions – input actions
Returns
the log-likelihood of the actions under the distribution output by the network
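As an illustration of the quantity involved, the log-likelihood of an action under a diagonal Gaussian can be computed from a mean and a log standard deviation. This is a sketch under that assumption, not the library's implementation; diag_gaussian_log_prob is a hypothetical helper name.

```python
import math

def diag_gaussian_log_prob(actions, mean, log_std):
    """Log-density of `actions` under a Gaussian with per-dimension mean and log std."""
    total = 0.0
    for a, mu, ls in zip(actions, mean, log_std):
        z = (a - mu) / math.exp(ls)  # standardised residual
        # log N(a; mu, sigma) = -z^2 / 2 - log(sigma) - log(2 * pi) / 2
        total += -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
    return total

# Two-dimensional standard normal evaluated at its mean.
log_p = diag_gaussian_log_prob([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

Summing over dimensions corresponds to treating the action components as independent given the network output.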
load_param(param_path: str)
Parameters
param_path – path to the saved parameter file
Returns
None; loads the parameters into the current model
property observation_space
property recurrent
reset(dones=None)
Parameters
dones – indicators of whether each agent has finished its episode
Returns
None; updates internal state according to the given list of dones
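One common pattern for a recurrent policy is to clear the stored hidden state of every agent whose episode has ended. The sketch below illustrates that pattern with plain lists; it is an assumption about what reset might do, not a description of this class, and reset_hidden is a hypothetical helper.

```python
def reset_hidden(prev_hiddens, dones=None):
    """Return a copy of prev_hiddens with finished agents' hidden states zeroed."""
    if dones is None:
        # Assumption: no mask means every agent is reset.
        dones = [True] * len(prev_hiddens)
    return [
        [0.0] * len(hidden) if done else list(hidden)
        for hidden, done in zip(prev_hiddens, dones)
    ]

# Agent 0 has finished its episode, agent 1 has not.
hiddens = reset_hidden([[1.0, 2.0], [3.0, 4.0]], dones=[True, False])
```

Agents that are still running keep their hidden state, so their next action can still depend on their history.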
property state_info_specs
property vectorized
algorithms.policy.GaussianMLPBaseline module
class algorithms.policy.GaussianMLPBaseline.GaussianMLP(input_dim, output_dim, mean_network=None, optimizer=None, hidden_size=(32, 32), step_size=0.01, init_std=1.0, normalize_inputs=True, normalize_outputs=True, subsample_factor=1.0, max_itr=20)
Bases: torch.nn.modules.module.Module
fit(xs, ys)
Parameters
xs – input features
ys – ground-truth targets
Returns
None; fits the model to the given data
forward(x)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, call the Module instance rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() silently ignores them.
algorithms.policy.MLP module
class algorithms.policy.MLP.MLP(input_size, hidden_size, output_size)
Bases: torch.nn.modules.module.Module
forward(x)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, call the Module instance rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() silently ignores them.