SeqActor
Sequential Actor class with LSTM layers for the RDPG agent.
SeqActor
SeqActor (state_dim:int=0, action_dim:int=0, hidden_dim:int=0, n_layers:int=0, batch_size:int=0, padding_value:float=0.0, tau:float=0.0, lr:float=0.0, ckpt_dir:pathlib.Path=Path('.'), ckpt_interval:int=0, logger:Optional[logging.Logger]=None, dict_logger:Optional[dict]=None)
Sequential Actor network for the RDPG algorithm.
Attributes:
- state_dim (int): Dimension of the state space.
- action_dim (int): Dimension of the action space.
- hidden_dim (int): Dimension of the hidden layer.
- lr (float): Learning rate for the network.
- ckpt_dir (str): Directory to restore the checkpoint from.
| | Type | Default | Details |
|---|---|---|---|
| state_dim | int | 0 | dimension of the state space |
| action_dim | int | 0 | dimension of the action space |
| hidden_dim | int | 0 | dimension of the hidden layer |
| n_layers | int | 0 | number of LSTM layers |
| batch_size | int | 0 | batch size |
| padding_value | float | 0.0 | padding value for masking |
| tau | float | 0.0 | soft update parameter |
| lr | float | 0.0 | learning rate |
| ckpt_dir | Path | . | checkpoint directory |
| ckpt_interval | int | 0 | checkpoint interval |
| logger | Optional | None | logger |
| dict_logger | Optional | None | logger dict |
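A minimal construction sketch using the keyword arguments from the signature above; the argument values are illustrative placeholders, not library defaults:

```python
from pathlib import Path

# Illustrative values only; pick dimensions to match your
# observation and action spaces.
actor = SeqActor(
    state_dim=90,
    action_dim=68,
    hidden_dim=256,
    n_layers=2,
    batch_size=4,
    padding_value=-10000.0,
    tau=0.005,
    lr=1e-4,
    ckpt_dir=Path("./ckpt/actor"),
    ckpt_interval=5,
)
```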
SeqActor.clone_weights
SeqActor.clone_weights (moving_net)
Clone weights from one model to another. Used only for the target network.
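A sketch of the conventional hard-copy pattern such a method implements, assuming the wrapped Keras model is reachable as `eager_model` (an assumed attribute name, not confirmed by this page):

```python
def clone_weights(self, moving_net):
    # Hard copy: overwrite every target variable with the moving
    # network's current value. `eager_model` is an assumed attribute.
    self.eager_model.set_weights(moving_net.eager_model.get_weights())
```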
SeqActor.soft_update
SeqActor.soft_update (moving_net)
Soft-update the target weights toward those of `moving_net`.
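The soft update is conventionally Polyak averaging, `theta_target <- tau * theta_moving + (1 - tau) * theta_target`, with `tau` taken from the constructor. A sketch under the same `eager_model` assumption as above:

```python
def soft_update(self, moving_net):
    # Polyak averaging: nudge each target variable toward the
    # moving network by a factor of tau per call.
    for target_var, moving_var in zip(
        self.eager_model.trainable_variables,
        moving_net.eager_model.trainable_variables,
    ):
        target_var.assign(self.tau * moving_var + (1.0 - self.tau) * target_var)
```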
SeqActor.save_ckpt
SeqActor.save_ckpt ()
Save the checkpoint.
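One common TF2 checkpointing pattern for such a method, using `ckpt_dir` and `ckpt_interval` from the constructor; this is a sketch, and the attribute names (`ckpt`, `manager`) are assumptions:

```python
import tensorflow as tf

# Set up once, e.g. in __init__ (sketch):
# self.ckpt = tf.train.Checkpoint(step=tf.Variable(1), net=self.eager_model)
# self.manager = tf.train.CheckpointManager(self.ckpt, self.ckpt_dir, max_to_keep=10)

def save_ckpt(self):
    # Save only every `ckpt_interval` calls.
    self.ckpt.step.assign_add(1)
    if int(self.ckpt.step) % self.ckpt_interval == 0:
        self.manager.save()
```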
SeqActor.reset_noise
SeqActor.reset_noise ()
Reset the `ou_noise` (Ornstein-Uhlenbeck exploration noise) process.
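For reference, a minimal Ornstein-Uhlenbeck process with the reset semantics this method implies; the class and parameter names are illustrative, not the library's:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, mean, std_dev, theta=0.15, dt=1e-2):
        self.mean, self.std_dev, self.theta, self.dt = mean, std_dev, theta, dt
        self.reset()

    def reset(self):
        # Restart the process at zero; typically called between episodes.
        self.x_prev = np.zeros_like(self.mean)

    def __call__(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.x_prev = (
            self.x_prev
            + self.theta * (self.mean - self.x_prev) * self.dt
            + self.std_dev * np.sqrt(self.dt)
            * np.random.standard_normal(self.mean.shape)
        )
        return self.x_prev
```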
SeqActor.predict
SeqActor.predict (states:tensorflow.python.framework.tensor.Tensor, last_actions:tensorflow.python.framework.tensor.Tensor)
Predict the action given the state. The batch dimension needs to be one.
Args:
states: State; batch dimension needs to be one.
last_actions: Last action; batch dimension needs to be one.
Returns: Action
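A usage sketch with the `actor` constructed above, assuming the conventional `(batch, time, feature)` LSTM layout; only the batch-of-one requirement is stated by the docstring, the rest is an assumption:

```python
import tensorflow as tf

state_dim, action_dim, seq_len = 90, 68, 10  # illustrative sizes

states = tf.random.normal((1, seq_len, state_dim))         # batch dim is 1
last_actions = tf.random.normal((1, seq_len, action_dim))  # batch dim is 1
action = actor.predict(states, last_actions)
```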
SeqActor.predict_step
SeqActor.predict_step (states, last_actions)
Predict the action given the state.
For inference.
Args:
states (tf.Tensor): State; batch dimension needs to be one.
last_actions (tf.Tensor): Last action; batch dimension needs to be one.
Returns:
np.array: Action, with the batch dimension removed.
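A sketch of the inference step consistent with the docstring: run the policy without training and drop the leading batch dimension, again assuming the `eager_model` attribute:

```python
import tensorflow as tf

def predict_step(self, states, last_actions):
    # Inference only; return a NumPy array with the batch dimension removed.
    actions = self.eager_model([states, last_actions], training=False)
    return tf.squeeze(actions, axis=0).numpy()
```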
SeqActor.evaluate_actions
SeqActor.evaluate_actions (states, last_actions)
Evaluate the action given the state.
For training.
Args:
states (tf.Tensor): State batch.
last_actions (tf.Tensor): Last action batch.
Returns: np.array: Action, keeping the batch dimension.
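A sketch of the training path, which keeps the batch dimension so gradients can flow over the whole minibatch; whether masking with `padding_value` happens inside the model is an assumption, as is `eager_model`:

```python
def evaluate_actions(self, states, last_actions):
    # Training path: keep the batch dimension; padded timesteps are
    # expected to be masked inside the model via `padding_value`.
    return self.eager_model([states, last_actions], training=True)
```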