algorithms.RL_Algorithm.optimizers.utils package

Submodules

algorithms.RL_Algorithm.optimizers.utils.math module

algorithms.RL_Algorithm.optimizers.utils.math.normal_entropy(std)

Parameters: std – standard deviation of the normal distribution
Returns: the entropy of the normal distribution
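
As a rough illustration of the quantity described above, the sketch below computes the entropy of a diagonal normal distribution in plain Python. It assumes std holds one standard deviation per dimension and that the per-dimension entropies are summed; the name diagonal_normal_entropy and the list-based interface are hypothetical, and the library function presumably works on tensors instead.

```python
import math


def diagonal_normal_entropy(std):
    """Entropy of a diagonal normal distribution (hypothetical helper).

    std: iterable of per-dimension standard deviations (all > 0).
    Each dimension contributes 0.5 * log(2 * pi * e * sigma^2),
    which equals 0.5 + 0.5 * log(2 * pi) + log(sigma).
    """
    return sum(0.5 + 0.5 * math.log(2 * math.pi) + math.log(s) for s in std)


# Entropy of a one-dimensional standard normal distribution.
unit_entropy = diagonal_normal_entropy([1.0])
```

Summing over dimensions is valid because the dimensions of a diagonal normal distribution are independent.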

algorithms.RL_Algorithm.optimizers.utils.math.normal_log_density(x, mean, log_std, std)

Parameters

x – the value at which the density is evaluated
mean – mean of the normal distribution
log_std – log of the standard deviation
std – standard deviation

Returns

the log density of x under the given normal distribution
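
For reference, a minimal plain-Python sketch of this computation follows. It assumes a diagonal normal distribution whose per-dimension log densities are summed, and it takes both log_std and std only to mirror the documented signature; the name diagonal_normal_log_density and the list-based interface are hypothetical stand-ins for the tensor-based library function.

```python
import math


def diagonal_normal_log_density(x, mean, log_std, std):
    """Log density of x under a diagonal normal distribution (hypothetical helper).

    All arguments are equal-length sequences of floats. Per dimension:
    log p(x) = -(x - mean)^2 / (2 * std^2) - 0.5 * log(2 * pi) - log_std
    """
    total = 0.0
    for xi, mi, li, si in zip(x, mean, log_std, std):
        total += -((xi - mi) ** 2) / (2 * si ** 2) - 0.5 * math.log(2 * math.pi) - li
    return total


# Log density of a standard normal distribution at its mean.
peak = diagonal_normal_log_density([0.0], [0.0], [0.0], [1.0])
```

Passing log_std alongside std avoids recomputing a logarithm when the caller already stores the log standard deviation, which is common for policy networks.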

algorithms.RL_Algorithm.optimizers.utils.replay_memory module

class algorithms.RL_Algorithm.optimizers.utils.replay_memory.Memory

Bases: object

append(new_memory)

push(*args): Saves a transition.

sample(batch_size=None)

class algorithms.RL_Algorithm.optimizers.utils.replay_memory.Transition(state, action, mask, next_state, reward)

Bases: tuple

property action: Alias for field number 1

property mask: Alias for field number 2

property next_state: Alias for field number 3

property reward: Alias for field number 4

property state: Alias for field number 0
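
The two classes above can be pictured with the following sketch. The Transition field order is taken from the documented signature; everything else is an assumption made for illustration, in particular that sample() returns the whole memory when batch_size is None, that sampled transitions are regrouped field by field, and that append() merges another memory into this one. SketchMemory is a hypothetical name, not the library class.

```python
import random
from collections import namedtuple

# Field order follows the documented signature (state is field 0, reward is field 4).
Transition = namedtuple("Transition", ("state", "action", "mask", "next_state", "reward"))


class SketchMemory:
    """Hypothetical stand-in for Memory; behaviour beyond push() is assumed."""

    def __init__(self):
        self.memory = []

    def push(self, *args):
        """Save a transition built from the five positional fields."""
        self.memory.append(Transition(*args))

    def sample(self, batch_size=None):
        """Return a Transition of tuples, one tuple per field.

        Assumes at least one stored transition.
        """
        if batch_size is None:
            batch = self.memory
        else:
            batch = random.sample(self.memory, batch_size)
        return Transition(*zip(*batch))

    def append(self, new_memory):
        """Merge the transitions of another memory into this one."""
        self.memory += new_memory.memory

    def __len__(self):
        return len(self.memory)


memory = SketchMemory()
memory.push(0, 1, 1, 1, 0.5)  # state, action, mask, next_state, reward
memory.push(1, 0, 0, 2, 1.0)
batch = memory.sample()
```

Regrouping by field means batch.state and batch.reward are each a tuple spanning every sampled transition, which is convenient for stacking into arrays.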

algorithms.RL_Algorithm.optimizers.utils.tools module

algorithms.RL_Algorithm.optimizers.utils.tools.assets_dir()

algorithms.RL_Algorithm.optimizers.utils.torch module

algorithms.RL_Algorithm.optimizers.utils.torch.compute_flat_grad(output, inputs, filter_input_ids={}, retain_graph=False, create_graph=False)

algorithms.RL_Algorithm.optimizers.utils.torch.get_flat_grad_from(inputs, grad_grad=False)

algorithms.RL_Algorithm.optimizers.utils.torch.get_flat_params_from(model)

Parameters: model – the model to read parameters from
Returns: the parameters of the model, flattened

algorithms.RL_Algorithm.optimizers.utils.torch.ones(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument sizes.

Args:

sizes (int…): a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.

Example:

>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])

>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])

algorithms.RL_Algorithm.optimizers.utils.torch.set_flat_params_to(model, flat_params)

Parameters

model – the model to load the parameters into
flat_params – the flattened parameters to load

Returns

None; the given parameters are loaded into the model in place
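
The round trip between get_flat_params_from and set_flat_params_to can be pictured with the plain-Python sketch below. It is only an analogy: nested lists stand in for a model's parameter tensors, row-major ordering is assumed, and the names flatten_params and load_flat_params are hypothetical rather than part of this package.

```python
def flatten_params(params):
    """Return every scalar in a nested list, in row-major order."""
    flat = []
    for value in params:
        if isinstance(value, list):
            flat.extend(flatten_params(value))
        else:
            flat.append(value)
    return flat


def _fill(params, values):
    """Overwrite the scalars of a nested list in place from an iterator."""
    for index, value in enumerate(params):
        if isinstance(value, list):
            _fill(value, values)
        else:
            params[index] = next(values)


def load_flat_params(params, flat_params):
    """Write a flat list back into a nested list of the same total size."""
    flat_params = list(flat_params)
    if len(flat_params) != len(flatten_params(params)):
        raise ValueError("flat_params does not match the number of parameters")
    _fill(params, iter(flat_params))


# Stand-in "model": a 2x2 weight matrix followed by a bias vector of length 2.
params = [[[1.0, 2.0], [3.0, 4.0]], [5.0, 6.0]]
flat = flatten_params(params)
load_flat_params(params, [2 * value for value in flat])
```

Working on one flat vector is what lets an optimizer treat all parameters as a single variable and then write the updated values back in the original layout.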

algorithms.RL_Algorithm.optimizers.utils.torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor

Constructs a tensor with data.

Warning

torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().

Warning

When data is a tensor x, torch.tensor() reads out ‘the data’ from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.

Args:

data (array_like): Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, infers data type from data.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
pin_memory (bool, optional): If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

Example:

>>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
tensor([[ 0.1000,  1.2000],
        [ 2.2000,  3.1000],
        [ 4.9000,  5.2000]])

>>> torch.tensor([0, 1])  # Type inference on data
tensor([ 0,  1])

>>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
...              dtype=torch.float64,
...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')

>>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
tensor(3.1416)

>>> torch.tensor([])  # Create an empty tensor (of size (0,))
tensor([])

algorithms.RL_Algorithm.optimizers.utils.torch.to_device(device, *args)

algorithms.RL_Algorithm.optimizers.utils.torch.zeros(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument sizes.

Args:

sizes (int…): a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
out (Tensor, optional): the output tensor.
dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.

Example:

>>> torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])

>>> torch.zeros(5)
tensor([ 0.,  0.,  0.,  0.,  0.])

algorithms.RL_Algorithm.optimizers.utils.zfilter module

class algorithms.RL_Algorithm.optimizers.utils.zfilter.RunningStat(shape)

Bases: object

property mean

property n

push(x)

property shape

property std

property var

class algorithms.RL_Algorithm.optimizers.utils.zfilter.ZFilter(shape, demean=True, destd=True, clip=10.0)

Bases: object

Normalizes inputs as y = (x - mean) / std, using running estimates of the mean and standard deviation.
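
A scalar sketch of this kind of running normalization is given below. It is an illustration under stated assumptions, not the package's implementation: the running statistics use Welford's online algorithm with the sample variance, the shape argument is dropped because only scalars are handled, the 1e-8 term guarding against division by zero is an arbitrary choice, and the names ScalarRunningStat and ScalarZFilter are hypothetical.

```python
import math


class ScalarRunningStat:
    """Running mean and variance of scalars via Welford's algorithm."""

    def __init__(self):
        self._n = 0
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        self._n += 1
        delta = x - self._mean
        self._mean += delta / self._n
        self._m2 += delta * (x - self._mean)

    @property
    def n(self):
        return self._n

    @property
    def mean(self):
        return self._mean

    @property
    def var(self):
        # Sample variance; defined as 0.0 until two values have been seen.
        return self._m2 / (self._n - 1) if self._n > 1 else 0.0

    @property
    def std(self):
        return math.sqrt(self.var)


class ScalarZFilter:
    """y = (x - mean) / std with running estimates, then clipped to [-clip, clip]."""

    def __init__(self, demean=True, destd=True, clip=10.0):
        self.demean = demean
        self.destd = destd
        self.clip = clip
        self.rs = ScalarRunningStat()

    def __call__(self, x, update=True):
        if update:
            self.rs.push(x)
        if self.demean:
            x = x - self.rs.mean
        if self.destd:
            x = x / (self.rs.std + 1e-8)  # small constant avoids division by zero
        if self.clip is not None:
            x = max(-self.clip, min(self.clip, x))
        return x


zfilter = ScalarZFilter()
normalized = [zfilter(value) for value in (1.0, 2.0, 3.0)]
```

Because the statistics are updated one value at a time, no history of past inputs needs to be stored, and clipping bounds the effect of outliers seen while the estimates are still noisy.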

Module contents

algorithms.RL_Algorithm.optimizers.utils.ones(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Documented identically under algorithms.RL_Algorithm.optimizers.utils.torch.ones above.

algorithms.RL_Algorithm.optimizers.utils.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor

Documented identically under algorithms.RL_Algorithm.optimizers.utils.torch.tensor above.

algorithms.RL_Algorithm.optimizers.utils.zeros(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Documented identically under algorithms.RL_Algorithm.optimizers.utils.torch.zeros above.
