SeqActor
Sequential Actor class with LSTM layers for the RDPG agent.
SeqActor
SeqActor (state_dim:int=0, action_dim:int=0, hidden_dim:int=0, n_layers:int=0, batch_size:int=0, padding_value:float=0.0, tau:float=0.0, lr:float=0.0, ckpt_dir:pathlib.Path=Path('.'), ckpt_interval:int=0, logger:Optional[logging.Logger]=None, dict_logger:Optional[dict]=None)
Sequential Actor network for the RDPG algorithm.
Attributes:
- state_dim (int): Dimension of the state space.
- action_dim (int): Dimension of the action space.
- hidden_dim (int): Dimension of the hidden layer.
- lr (float): Learning rate for the network.
- ckpt_dir (str): Directory to restore the checkpoint from.
| | Type | Default | Details |
|---|---|---|---|
| state_dim | int | 0 | dimension of the state space |
| action_dim | int | 0 | dimension of the action space |
| hidden_dim | int | 0 | dimension of the hidden layer |
| n_layers | int | 0 | number of LSTM layers |
| batch_size | int | 0 | batch size |
| padding_value | float | 0.0 | padding value for masking |
| tau | float | 0.0 | soft update parameter |
| lr | float | 0.0 | learning rate |
| ckpt_dir | Path | . | checkpoint directory |
| ckpt_interval | int | 0 | checkpoint interval |
| logger | Optional | None | logger |
| dict_logger | Optional | None | logger dict |
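A minimal construction sketch using the keyword arguments from the signature above; the argument values are illustrative placeholders, not library defaults:

```python
from pathlib import Path

# Illustrative values only; pick dimensions to match your
# observation and action spaces.
actor = SeqActor(
    state_dim=90,
    action_dim=68,
    hidden_dim=256,
    n_layers=2,
    batch_size=4,
    padding_value=-10000.0,
    tau=0.005,
    lr=1e-4,
    ckpt_dir=Path("./ckpt/actor"),
    ckpt_interval=5,
)
```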
SeqActor.clone_weights
SeqActor.clone_weights (moving_net)
Clone weights from one model to another. Used only for the target network.
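A sketch of the conventional hard-copy pattern such a method implements, assuming the wrapped Keras model is reachable as `eager_model` (an assumed attribute name, not confirmed by this page):

```python
def clone_weights(self, moving_net):
    # Hard copy: overwrite every target variable with the moving
    # network's current value. `eager_model` is an assumed attribute.
    self.eager_model.set_weights(moving_net.eager_model.get_weights())
```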
SeqActor.soft_update
SeqActor.soft_update (moving_net)
Soft-update the target weights toward those of `moving_net`.
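The soft update is conventionally Polyak averaging, `theta_target <- tau * theta_moving + (1 - tau) * theta_target`, with `tau` taken from the constructor. A sketch under the same `eager_model` assumption as above:

```python
def soft_update(self, moving_net):
    # Polyak averaging: nudge each target variable toward the
    # moving network by a factor of tau per call.
    for target_var, moving_var in zip(
        self.eager_model.trainable_variables,
        moving_net.eager_model.trainable_variables,
    ):
        target_var.assign(self.tau * moving_var + (1.0 - self.tau) * target_var)
```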
SeqActor.save_ckpt
SeqActor.save_ckpt ()
Save the checkpoint.
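One common TF2 checkpointing pattern for such a method, using `ckpt_dir` and `ckpt_interval` from the constructor; this is a sketch, and the attribute names (`ckpt`, `manager`) are assumptions:

```python
import tensorflow as tf

# Set up once, e.g. in __init__ (sketch):
# self.ckpt = tf.train.Checkpoint(step=tf.Variable(1), net=self.eager_model)
# self.manager = tf.train.CheckpointManager(self.ckpt, self.ckpt_dir, max_to_keep=10)

def save_ckpt(self):
    # Save only every `ckpt_interval` calls.
    self.ckpt.step.assign_add(1)
    if int(self.ckpt.step) % self.ckpt_interval == 0:
        self.manager.save()
```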
SeqActor.reset_noise
SeqActor.reset_noise ()
Reset the `ou_noise` (Ornstein-Uhlenbeck exploration noise) process.
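For reference, a minimal Ornstein-Uhlenbeck process with the reset semantics this method implies; the class and parameter names are illustrative, not the library's:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, mean, std_dev, theta=0.15, dt=1e-2):
        self.mean, self.std_dev, self.theta, self.dt = mean, std_dev, theta, dt
        self.reset()

    def reset(self):
        # Restart the process at zero; typically called between episodes.
        self.x_prev = np.zeros_like(self.mean)

    def __call__(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.x_prev = (
            self.x_prev
            + self.theta * (self.mean - self.x_prev) * self.dt
            + self.std_dev * np.sqrt(self.dt)
            * np.random.standard_normal(self.mean.shape)
        )
        return self.x_prev
```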
SeqActor.predict
SeqActor.predict (states:tensorflow.python.framework.tensor.Tensor, last_actions:tensorflow.python.framework.tensor.Tensor)
Predict the action given the state. The batch dimension needs to be one.
Args:
states: State; batch dimension needs to be one.
last_actions: Last action; batch dimension needs to be one.
Returns: Action
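A usage sketch with the `actor` constructed above, assuming the conventional `(batch, time, feature)` LSTM layout; only the batch-of-one requirement is stated by the docstring, the rest is an assumption:

```python
import tensorflow as tf

state_dim, action_dim, seq_len = 90, 68, 10  # illustrative sizes

states = tf.random.normal((1, seq_len, state_dim))         # batch dim is 1
last_actions = tf.random.normal((1, seq_len, action_dim))  # batch dim is 1
action = actor.predict(states, last_actions)
```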
SeqActor.predict_step
SeqActor.predict_step (states, last_actions)
Predict the action given the state.
For inference.
Args:
states (tf.Tensor): State; batch dimension needs to be one.
last_actions (tf.Tensor): Last action; batch dimension needs to be one.
Returns:
np.array: Action, with the batch dimension removed.
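A sketch of the inference step consistent with the docstring: run the policy without training and drop the leading batch dimension, again assuming the `eager_model` attribute:

```python
import tensorflow as tf

def predict_step(self, states, last_actions):
    # Inference only; return a NumPy array with the batch dimension removed.
    actions = self.eager_model([states, last_actions], training=False)
    return tf.squeeze(actions, axis=0).numpy()
```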
SeqActor.evaluate_actions
SeqActor.evaluate_actions (states, last_actions)
Evaluate the action given the state.
For training.
Args:
states (tf.Tensor): State batch.
last_actions (tf.Tensor): Last action batch.
Returns: np.array: Action, keeping the batch dimension.
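A sketch of the training path, which keeps the batch dimension so gradients can flow over the whole minibatch; whether masking with `padding_value` happens inside the model is an assumption, as is `eager_model`:

```python
def evaluate_actions(self, states, last_actions):
    # Training path: keep the batch dimension; padded timesteps are
    # expected to be masked inside the model via `padding_value`.
    return self.eager_model([states, last_actions], training=True)
```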