algorithms.RL_Algorithm.GAIL package
Submodules
algorithms.RL_Algorithm.GAIL.gail module
class algorithms.RL_Algorithm.GAIL.gail.GAIL(env, policy, baseline, critic=None, recognition=None, step_size=0.01, reward_handler=<algorithms.RL_Algorithm.utils.RewardHandler object>, saver=None, saver_filepath=None, validator=None, snapshot_env=True, scope=None, n_itr=500, start_itr=0, batch_size=5000, max_path_length=500, discount=0.99, gae_lambda=1, plot=False, pause_for_plot=False, center_adv=True, positive_adv=False, store_paths=False, whole_paths=True, fixed_horizon=False, sampler_cls=None, sampler_args=None, force_batch_sampler=False, max_kl=None, damping=None, l2_reg=None, policy_filepath=None, critic_filepath=None, env_filepath=None, cuda_enable=True, args=None)
    Bases: object
    load()
        Load parameters from a filepath. Symmetric to _save. This is not ideal, but it's easier than keeping track of everything separately.
    optimize_policy(itr, samples_data)
        Update the critic and recognition model in addition to the policy.

        Args:
            itr: iteration counter
            samples_data: dictionary resulting from process_samples
                - keys: 'rewards', 'observations', 'agent_infos', 'env_infos', 'returns', 'actions', 'advantages', 'paths'
                - the values in the infos dicts can be accessed, for example, as samples_data['agent_infos']['prob'], and the returned value will be an array of shape (batch_size, prob_dim)
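
The access pattern described for samples_data can be sketched as follows. This is a minimal illustration, not the output of process_samples: the dictionary is a hand-built stand-in, nested lists replace arrays so the sketch has no dependencies, and batch_size, prob_dim, and the helper shape_of are assumptions made for the example.

```python
# Hypothetical stand-in for the dictionary produced by process_samples.
# Nested lists are used in place of arrays so the sketch runs without NumPy or PyTorch.
batch_size, prob_dim = 4, 3

samples_data = {
    "observations": [[0.0, 0.0] for _ in range(batch_size)],
    "actions": [[0] for _ in range(batch_size)],
    "rewards": [0.0] * batch_size,
    "returns": [0.0] * batch_size,
    "advantages": [0.0] * batch_size,
    # Values in the infos dicts are keyed by name, one row per sample.
    "agent_infos": {"prob": [[1.0 / prob_dim] * prob_dim for _ in range(batch_size)]},
    "env_infos": {},
    "paths": [],
}


def shape_of(rows):
    """Return (n_rows, n_cols) for a rectangular list of lists (illustrative helper)."""
    return (len(rows), len(rows[0]))


prob = samples_data["agent_infos"]["prob"]
print(shape_of(prob))  # (4, 3), i.e. (batch_size, prob_dim)
```

With real arrays the same lookup would return an object whose shape attribute reports (batch_size, prob_dim) directly.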

algorithms.RL_Algorithm.GAIL.gail.ones(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
    Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument sizes.

    Args:
        sizes (int…): a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
        out (Tensor, optional): the output tensor
        dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
        layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
        device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
        requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.

    Example:
        >>> torch.ones(2, 3)
        tensor([[ 1., 1., 1.],
                [ 1., 1., 1.]])
        >>> torch.ones(5)
        tensor([ 1., 1., 1., 1., 1.])

algorithms.RL_Algorithm.GAIL.gail.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor
    Constructs a tensor with data.

    Warning: torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().

    Warning: When data is a tensor x, torch.tensor() reads out 'the data' from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.

    Args:
        data (array_like): Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
        dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, infers data type from data.
        device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
        requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
        pin_memory (bool, optional): If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

    Example:
        >>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
        tensor([[ 0.1000, 1.2000],
                [ 2.2000, 3.1000],
                [ 4.9000, 5.2000]])
        >>> torch.tensor([0, 1])  # Type inference on data
        tensor([ 0, 1])
        >>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
        ...              dtype=torch.float64,
        ...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
        tensor([[ 0.1111, 0.2222, 0.3333]], dtype=torch.float64, device='cuda:0')
        >>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
        tensor(3.1416)
        >>> torch.tensor([])  # Create an empty tensor (of size (0,))
        tensor([])
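
The copy behaviour described in the two warnings for tensor can be checked with a short sketch. It assumes PyTorch and NumPy are installed and runs on the CPU; the variable names are illustrative only.

```python
import numpy as np
import torch

# torch.tensor() copies a NumPy array; torch.as_tensor() reuses its memory
# when the dtype and device already match (CPU, float64 here).
arr = np.array([1.0, 2.0, 3.0])
copied = torch.tensor(arr)
shared = torch.as_tensor(arr)

arr[0] = 100.0                 # mutate the source array in place
print(copied[0].item())        # unchanged, because the data was copied
print(shared[0].item())        # reflects the change, because memory is shared

# For an existing tensor, the recommended spelling of a copy is clone().detach().
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x.clone().detach()         # new leaf tensor with its own storage

y[0] = 5.0                     # in-place edit of the copy
print(y.requires_grad, y.is_leaf)
print(x[0].item())             # the original tensor is untouched
```

Using clone().detach() rather than torch.tensor(x) also avoids the UserWarning that PyTorch emits when a tensor is passed to torch.tensor().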

algorithms.RL_Algorithm.GAIL.gail.zeros(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
    Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument sizes.

    Args:
        sizes (int…): a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
        out (Tensor, optional): the output tensor
        dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
        layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
        device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
        requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.

    Example:
        >>> torch.zeros(2, 3)
        tensor([[ 0., 0., 0.],
                [ 0., 0., 0.]])
        >>> torch.zeros(5)
        tensor([ 0., 0., 0., 0., 0.])
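
The sizes, dtype, and requires_grad arguments shared by ones and zeros can be explored with the sketch below. It assumes PyTorch is installed and uses the CPU only; the variable names are illustrative.

```python
import torch

# The shape may be passed as separate integers or as one tuple.
a = torch.ones(2, 3)
b = torch.ones((2, 3))
print(a.shape == b.shape)

# With dtype=None, the global default floating-point dtype is used.
print(a.dtype == torch.get_default_dtype())

# An explicit dtype overrides the global default.
z = torch.zeros(4, dtype=torch.int64)
print(z.dtype)

# requires_grad=True asks autograd to record operations on the result.
w = torch.zeros(3, requires_grad=True)
print(w.requires_grad)
```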