algorithms.RL_Algorithm.optimizers.utils package

Submodules

algorithms.RL_Algorithm.optimizers.utils.math module

algorithms.RL_Algorithm.optimizers.utils.math.normal_entropy(std)[source]
Parameters

std – standard deviation of the normal distribution

Returns

the entropy of the normal distribution
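
As a rough illustration of the quantity involved: the differential entropy of a univariate normal distribution with standard deviation σ is 0.5·log(2πeσ²). The sketch below computes it for plain Python floats using only the standard library. The names `normal_entropy_scalar` and `diagonal_normal_entropy` are hypothetical; the documented function presumably operates on tensors, and how it reduces over dimensions is not stated here.

```python
import math


def normal_entropy_scalar(std: float) -> float:
    """Differential entropy (in nats) of N(mean, std**2); hypothetical helper."""
    if std <= 0:
        raise ValueError("std must be positive")
    # H = 0.5 * log(2 * pi * e * std^2) = 0.5 + 0.5 * log(2 * pi) + log(std)
    return 0.5 + 0.5 * math.log(2 * math.pi) + math.log(std)


def diagonal_normal_entropy(stds) -> float:
    """Entropy of independent normal components is the sum of their entropies."""
    return sum(normal_entropy_scalar(s) for s in stds)


unit_entropy = normal_entropy_scalar(1.0)
two_dim_entropy = diagonal_normal_entropy([1.0, 1.0])
```

The entropy depends only on the standard deviation, not on the mean, which is consistent with the function taking `std` as its only argument.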

algorithms.RL_Algorithm.optimizers.utils.math.normal_log_density(x, mean, log_std, std)[source]
Parameters
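  • x – value at which the log density is evaluated

  • mean – mean of the normal distribution

  • log_std – logarithm of the standard deviation

  • std – standard deviation of the normal distribution

Returns

log density of x under the given normal distribution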
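
For reference, the log density of a normal distribution is log p(x) = -(x - mean)² / (2·std²) - 0.5·log(2π) - log(std). The following is a minimal scalar sketch of that formula using the standard library; `normal_log_density_scalar` is a hypothetical name, and the documented function presumably works on tensors rather than floats.

```python
import math


def normal_log_density_scalar(x: float, mean: float, log_std: float, std: float) -> float:
    """log N(x; mean, std**2) for scalars; hypothetical stand-in, not the library code."""
    var = std ** 2
    # log p(x) = -(x - mean)^2 / (2 * var) - 0.5 * log(2 * pi) - log(std)
    return -((x - mean) ** 2) / (2 * var) - 0.5 * math.log(2 * math.pi) - log_std


std = 2.0
value = normal_log_density_scalar(x=1.0, mean=0.0, log_std=math.log(std), std=std)
```

Taking both `log_std` and `std` mirrors the documented signature; a likely reason is that callers already hold both values, so neither has to be recomputed, but that is an assumption.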

algorithms.RL_Algorithm.optimizers.utils.replay_memory module

class algorithms.RL_Algorithm.optimizers.utils.replay_memory.Memory[source]

Bases: object

append(new_memory)[source]
push(*args)[source]

Saves a transition.

sample(batch_size=None)[source]
class algorithms.RL_Algorithm.optimizers.utils.replay_memory.Transition(state, action, mask, next_state, reward)

Bases: tuple

property action

Alias for field number 1

property mask

Alias for field number 2

property next_state

Alias for field number 3

property reward

Alias for field number 4

property state

Alias for field number 0
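
The relationship between Memory and Transition can be sketched as follows. This is a standard-library illustration only: `MemorySketch` is a hypothetical stand-in, and the behavior of `append` and of `sample` with `batch_size=None` is assumed from the method names, not confirmed by this documentation. The field order of `Transition` follows the signature documented above.

```python
import random
from collections import namedtuple

# Field order matches the documented Transition signature.
Transition = namedtuple(
    "Transition", ("state", "action", "mask", "next_state", "reward")
)


class MemorySketch:
    """Hypothetical stand-in for Memory; behavior is assumed, not confirmed."""

    def __init__(self):
        self.memory = []

    def push(self, *args):
        """Save one transition built from the positional fields."""
        self.memory.append(Transition(*args))

    def append(self, new_memory):
        """Assumed: merge the transitions of another memory into this one."""
        self.memory += new_memory.memory

    def sample(self, batch_size=None):
        """Return one Transition whose fields are tuples across the batch."""
        if batch_size is None:
            chosen = self.memory  # assumed: no batch size means everything
        else:
            chosen = random.sample(self.memory, batch_size)
        return Transition(*zip(*chosen))


mem = MemorySketch()
mem.push([0.0], 1, 1, [0.1], 0.5)
mem.push([0.1], 0, 0, [0.2], 1.0)
batch = mem.sample()
```

Transposing with `zip(*chosen)` turns a list of per-step transitions into one record of per-field sequences, so `batch.reward` holds all rewards and `batch[4]` refers to the same field by index.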

algorithms.RL_Algorithm.optimizers.utils.tools module

algorithms.RL_Algorithm.optimizers.utils.tools.assets_dir()[source]

algorithms.RL_Algorithm.optimizers.utils.torch module

algorithms.RL_Algorithm.optimizers.utils.torch.compute_flat_grad(output, inputs, filter_input_ids={}, retain_graph=False, create_graph=False)[source]
algorithms.RL_Algorithm.optimizers.utils.torch.get_flat_grad_from(inputs, grad_grad=False)[source]
algorithms.RL_Algorithm.optimizers.utils.torch.get_flat_params_from(model)[source]
Parameters

model – model whose parameters are extracted

Returns

the parameters of the model, flattened into a single vector

algorithms.RL_Algorithm.optimizers.utils.torch.ones(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument sizes.

Parameters
  • sizes (int…) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.

  • out (Tensor, optional) – the output tensor.

  • dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).

  • layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.

  • device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

  • requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.

Example:

>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])

>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])
algorithms.RL_Algorithm.optimizers.utils.torch.set_flat_params_to(model, flat_params)[source]
Parameters
  • model – model that receives the parameters

  • flat_params – flattened parameters to load into the model

Returns

None; the given parameters are copied into the model
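
A minimal sketch of the flatten and restore round trip described by get_flat_params_from and set_flat_params_to, assuming PyTorch is installed. The helper names ending in `_sketch` are hypothetical and the bodies are one plausible way to implement the documented behavior, not the package's actual source.

```python
import torch
from torch import nn


def get_flat_params_sketch(model: nn.Module) -> torch.Tensor:
    """Concatenate every parameter of the model into one 1-D tensor."""
    return torch.cat([p.data.reshape(-1) for p in model.parameters()])


def set_flat_params_sketch(model: nn.Module, flat_params: torch.Tensor) -> None:
    """Copy consecutive slices of flat_params back into each parameter."""
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat_params[offset:offset + n].view_as(p))
        offset += n


model = nn.Linear(3, 2)  # 3 * 2 weights + 2 biases = 8 parameters
flat = get_flat_params_sketch(model)

# Overwrite with zeros, then restore the original values.
set_flat_params_sketch(model, torch.zeros_like(flat))
all_zero = all(float(p.abs().sum()) == 0.0 for p in model.parameters())
set_flat_params_sketch(model, flat)
restored = get_flat_params_sketch(model)
```

Both helpers iterate over `model.parameters()` in the same order, which is what makes the slices line up when the flat vector is written back.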

algorithms.RL_Algorithm.optimizers.utils.torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor

Constructs a tensor with data.

Warning

torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().

Warning

When data is a tensor x, torch.tensor() reads out ‘the data’ from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.

Parameters
  • data (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.

  • dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, infers data type from data.

  • device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

  • requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.

  • pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

Example:

>>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
tensor([[ 0.1000,  1.2000],
        [ 2.2000,  3.1000],
        [ 4.9000,  5.2000]])

>>> torch.tensor([0, 1])  # Type inference on data
tensor([ 0,  1])

>>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
                 dtype=torch.float64,
                 device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')

>>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
tensor(3.1416)

>>> torch.tensor([])  # Create an empty tensor (of size (0,))
tensor([])
algorithms.RL_Algorithm.optimizers.utils.torch.to_device(device, *args)[source]
algorithms.RL_Algorithm.optimizers.utils.torch.zeros(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument sizes.

Parameters
  • sizes (int…) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.

  • out (Tensor, optional) – the output tensor.

  • dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).

  • layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.

  • device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

  • requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.

Example:

>>> torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])

>>> torch.zeros(5)
tensor([ 0.,  0.,  0.,  0.,  0.])

algorithms.RL_Algorithm.optimizers.utils.zfilter module

class algorithms.RL_Algorithm.optimizers.utils.zfilter.RunningStat(shape)[source]

Bases: object

property mean
property n
push(x)[source]
property shape
property std
property var
class algorithms.RL_Algorithm.optimizers.utils.zfilter.ZFilter(shape, demean=True, destd=True, clip=10.0)[source]

Bases: object

Normalizes input as y = (x - mean) / std, using running estimates of the mean and standard deviation.
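
The running normalization can be sketched for scalar inputs with the standard library. Everything below is an assumption made for illustration: the class names `RunningStatSketch` and `ZFilterSketch`, the use of Welford's update, the sample-variance convention, the `update` flag, and the idea that the filter is called like a function. The documented classes take a `shape` argument and presumably work on arrays, which this sketch omits.

```python
import math


class RunningStatSketch:
    """Scalar running mean and variance via Welford's update (hypothetical)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._s = 0.0  # running sum of squared deviations from the mean

    def push(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._s += delta * (x - self.mean)

    @property
    def var(self) -> float:
        # Sample variance; returns 0.0 until two values are seen (assumed).
        return self._s / (self.n - 1) if self.n > 1 else 0.0

    @property
    def std(self) -> float:
        return math.sqrt(self.var)


class ZFilterSketch:
    """y = (x - mean) / std with optional clipping (hypothetical)."""

    def __init__(self, demean=True, destd=True, clip=10.0):
        self.demean, self.destd, self.clip = demean, destd, clip
        self.rs = RunningStatSketch()

    def __call__(self, x: float, update: bool = True) -> float:
        if update:
            self.rs.push(x)
        if self.demean:
            x = x - self.rs.mean
        if self.destd:
            x = x / (self.rs.std + 1e-8)  # epsilon avoids division by zero
        if self.clip:
            x = max(-self.clip, min(self.clip, x))
        return x


zf = ZFilterSketch()
outputs = [zf(v) for v in (1.0, 2.0, 3.0)]
```

Updating the statistics incrementally means no history of past inputs has to be stored, and clipping bounds the effect of early inputs seen while the variance estimate is still unreliable.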

Module contents

algorithms.RL_Algorithm.optimizers.utils.ones(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument sizes.

algorithms.RL_Algorithm.optimizers.utils.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor

Constructs a tensor with data.

algorithms.RL_Algorithm.optimizers.utils.zeros(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument sizes.
