SeqActor

Sequential Actor class with LSTM layers for RDPG agent.

source

SeqActor

 SeqActor (state_dim:int=0, action_dim:int=0, hidden_dim:int=0,
           n_layers:int=0, batch_size:int=0, padding_value:float=0.0,
           tau:float=0.0, lr:float=0.0, ckpt_dir:pathlib.Path=Path('.'),
           ckpt_interval:int=0, logger:Optional[logging.Logger]=None,
           dict_logger:Optional[dict]=None)

*Sequential Actor network for the RDPG algorithm.

Attributes:

- state_dim (int): Dimension of the state space.
- action_dim (int): Dimension of the action space.
- hidden_dim (int): Dimension of the hidden layer.
- lr (float): Learning rate for the network.
- ckpt_dir (Path): Directory to restore the checkpoint from.*
|  | Type | Default | Details |
|---|---|---|---|
| state_dim | int | 0 | dimension of the state space |
| action_dim | int | 0 | dimension of the action space |
| hidden_dim | int | 0 | dimension of the hidden layer |
| n_layers | int | 0 | number of LSTM layers |
| batch_size | int | 0 | batch size |
| padding_value | float | 0.0 | padding value for masking |
| tau | float | 0.0 | soft update parameter |
| lr | float | 0.0 | learning rate |
| ckpt_dir | Path | Path('.') | checkpoint directory |
| ckpt_interval | int | 0 | checkpoint interval |
| logger | Optional[logging.Logger] | None | logger |
| dict_logger | Optional[dict] | None | logger dict |
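The `padding_value` lets batches of variable-length episodes be stacked into one tensor, with padded timesteps marked for masking. A minimal numpy sketch of that convention (the sentinel value and shapes here are illustrative assumptions, not the class defaults):

```python
import numpy as np

# Hypothetical illustration: episodes of different lengths are right-padded
# with padding_value so they stack into one (batch, time, state_dim) array.
padding_value = -10000.0            # assumed sentinel; the real value is set at init
ep1 = np.ones((3, 2))               # 3 timesteps, state_dim = 2
ep2 = np.ones((5, 2))               # 5 timesteps
T = max(len(ep1), len(ep2))
batch = np.full((2, T, 2), padding_value)
batch[0, : len(ep1)] = ep1
batch[1, : len(ep2)] = ep2

# A masking layer would skip timesteps where every feature equals
# padding_value; here the same boolean mask is derived by hand.
mask = ~np.all(batch == padding_value, axis=-1)
print(mask.sum(axis=1))             # valid timesteps per episode
```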

source

SeqActor.clone_weights

 SeqActor.clone_weights (moving_net)

Clone weights from one model to another. Used only for the target network.
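A hard clone copies the moving network's weights into the target network verbatim. A numpy sketch, under the assumption that weights are held as a list of arrays (as in Keras `get_weights`/`set_weights`):

```python
import numpy as np

# Hypothetical sketch of a hard weight clone: the target network's weights
# are overwritten with exact copies of the moving network's weights.
moving_weights = [np.array([1.0, 2.0]), np.array([[3.0]])]
target_weights = [np.zeros(2), np.zeros((1, 1))]

# Copy (not alias) each weight array from the moving net to the target net.
target_weights = [w.copy() for w in moving_weights]
```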


source

SeqActor.soft_update

 SeqActor.soft_update (moving_net)

Update the target weights.
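Given the `tau` parameter above, the update is presumably the DDPG-style Polyak average, target ← τ·moving + (1−τ)·target. A minimal numpy sketch of that rule (not the class's actual implementation):

```python
import numpy as np

tau = 0.005                        # assumed small mixing coefficient

moving = [np.array([1.0, 1.0])]    # weights of the online (moving) network
target = [np.array([0.0, 0.0])]    # weights of the target network

# Polyak averaging: target <- tau * moving + (1 - tau) * target
target = [tau * m + (1.0 - tau) * t for m, t in zip(moving, target)]
print(target[0])                   # each entry moves tau of the way toward moving
```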


source

SeqActor.save_ckpt

 SeqActor.save_ckpt ()

Save the checkpoint.


source

SeqActor.reset_noise

 SeqActor.reset_noise ()

Reset the ou_noise (Ornstein-Uhlenbeck exploration noise).


source

SeqActor.predict

 SeqActor.predict (states:tensorflow.python.framework.tensor.Tensor,
                   last_actions:tensorflow.python.framework.tensor.Tensor)

*Predict the action given the state. The batch dimension must be one.

Args:

states: State; batch dimension must be one.
last_actions: Last action; batch dimension must be one.

Returns: Action*


source

SeqActor.predict_step

 SeqActor.predict_step (states, last_actions)

*Predict the action given the state.

For inference.

Args:

states (tf.Tensor): State; batch dimension must be one.
last_actions (tf.Tensor): Last action; batch dimension must be one.

Returns:

np.ndarray: Action, with the batch dimension removed*
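Both inference methods expect inputs carrying a batch dimension of one, and predict_step drops that dimension from the returned action. A numpy sketch of the shape convention (all sizes here are illustrative assumptions):

```python
import numpy as np

state_dim, action_dim, T = 4, 2, 1   # assumed sizes for illustration

# Inference runs on a single episode, so inputs carry a batch dim of one:
state = np.random.rand(T, state_dim)
states = np.expand_dims(state, axis=0)         # (1, T, state_dim)
last_actions = np.zeros((1, T, action_dim))    # (1, T, action_dim)

# A predict_step-style call would return (1, T, action_dim); the batch
# dimension is then removed before handing the action to the environment:
batched_action = np.zeros((1, T, action_dim))  # stand-in for the net output
action = np.squeeze(batched_action, axis=0)    # (T, action_dim)
print(states.shape, action.shape)
```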

source

SeqActor.evaluate_actions

 SeqActor.evaluate_actions (states, last_actions)

*Evaluate the action given the state.

For training.

Args:

states (tf.Tensor): State.
last_actions (tf.Tensor): Last action.

Returns: np.ndarray: Action, with the batch dimension kept*