SeqActor
    Sequential Actor class with LSTM layers for the RDPG agent.
  
SeqActor
SeqActor (state_dim:int=0, action_dim:int=0, hidden_dim:int=0, n_layers:int=0, batch_size:int=0, padding_value:float=0.0, tau:float=0.0, lr:float=0.0, ckpt_dir:pathlib.Path=Path('.'), ckpt_interval:int=0, logger:Optional[logging.Logger]=None, dict_logger:Optional[dict]=None)
*Sequential Actor network for the RDPG algorithm.

Attributes:

- state_dim (int): Dimension of the state space.
- action_dim (int): Dimension of the action space.
- hidden_dim (int): Dimension of the hidden layer.
- lr (float): Learning rate for the network.
- ckpt_dir (str): Directory to restore the checkpoint from.*

|  | Type | Default | Details |
|---|---|---|---|
| state_dim | int | 0 | dimension of the state space | 
| action_dim | int | 0 | dimension of the action space | 
| hidden_dim | int | 0 | dimension of the hidden layer | 
| n_layers | int | 0 | number of LSTM layers |
| batch_size | int | 0 | batch size | 
| padding_value | float | 0.0 | padding value for masking | 
| tau | float | 0.0 | soft update parameter | 
| lr | float | 0.0 | learning rate | 
| ckpt_dir | Path | Path('.') | checkpoint directory |
| ckpt_interval | int | 0 | checkpoint interval |
| logger | Optional[Logger] | None | logger |
| dict_logger | Optional[dict] | None | logger dict |
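A minimal construction sketch. The import path is a placeholder and every argument value below is illustrative, not a library default (the real defaults are the zeros shown in the table above):

```python
from pathlib import Path
import logging

# Placeholder import; use the package's actual module path for SeqActor.
# from <package>.seq_actor import SeqActor

logger = logging.getLogger("rdpg.seq_actor")

actor = SeqActor(
    state_dim=90,            # illustrative, not a library default
    action_dim=68,           # illustrative
    hidden_dim=256,          # width of each LSTM layer
    n_layers=2,              # number of stacked LSTM layers
    batch_size=4,            # training batch size
    padding_value=0.0,       # value masked out in padded sequences
    tau=0.005,               # soft-update coefficient for the target network
    lr=1e-4,                 # learning rate
    ckpt_dir=Path("./ckpt/actor"),
    ckpt_interval=5,         # save a checkpoint every 5 updates
    logger=logger,
    dict_logger={"run": "demo"},
)
```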
SeqActor.clone_weights
SeqActor.clone_weights (moving_net)
Clone weights from one model to another; used only for the target network.
SeqActor.soft_update
SeqActor.soft_update (moving_net)
Update the target weights.
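Both methods follow the standard target-network pattern: clone_weights makes a hard copy (typically at initialization), while soft_update applies Polyak averaging, θ_target ← τ·θ_moving + (1 − τ)·θ_target. A generic Keras sketch of the two operations, not SeqActor's exact code:

```python
import tensorflow as tf

def clone_weights(target: tf.keras.Model, moving: tf.keras.Model) -> None:
    # Hard copy: the target becomes an exact replica of the moving network.
    target.set_weights(moving.get_weights())

def soft_update(target: tf.keras.Model, moving: tf.keras.Model, tau: float) -> None:
    # Polyak averaging: theta_target <- tau * theta_moving + (1 - tau) * theta_target
    blended = [
        tau * w_moving + (1.0 - tau) * w_target
        for w_moving, w_target in zip(moving.get_weights(), target.get_weights())
    ]
    target.set_weights(blended)
```

A small tau (e.g. 0.005) keeps the target network changing slowly, which stabilizes the bootstrapped training targets.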
SeqActor.save_ckpt
SeqActor.save_ckpt ()
Save the checkpoint.
SeqActor.reset_noise
SeqActor.reset_noise ()
Reset the ou_noise (Ornstein-Uhlenbeck exploration noise).
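DDPG-family agents such as RDPG conventionally use an Ornstein-Uhlenbeck process for temporally correlated exploration noise. A generic sketch of such a process, with illustrative parameters; not necessarily this class's exact implementation:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: dx = theta * (mu - x) dt + sigma * dW."""

    def __init__(self, action_dim: int, mu: float = 0.0,
                 theta: float = 0.15, sigma: float = 0.2):
        self.mu = mu * np.ones(action_dim)
        self.theta = theta
        self.sigma = sigma
        self.reset()

    def reset(self) -> None:
        # Restart at the mean; typically called at episode boundaries,
        # which is what a method like SeqActor.reset_noise is for.
        self.x = self.mu.copy()

    def sample(self) -> np.ndarray:
        # Mean-reverting step plus Gaussian perturbation.
        self.x = self.x + self.theta * (self.mu - self.x) \
                 + self.sigma * np.random.standard_normal(self.x.shape)
        return self.x
```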
SeqActor.predict
SeqActor.predict (states:tensorflow.python.framework.tensor.Tensor, last_actions:tensorflow.python.framework.tensor.Tensor)
*Predict the action given the state. Batch dimension needs to be one.

Args:

- states: State; batch dimension needs to be one.
- last_actions: Last action; batch dimension needs to be one.

Returns: Action*
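A usage sketch, continuing the construction example above. The (batch=1, timesteps, features) input layout is an assumption based on the LSTM layers; check the class's actual input convention:

```python
import tensorflow as tf

# Batch dimension must be one for predict; sequence length here is illustrative.
states = tf.random.normal((1, 1, 90))   # (batch=1, timesteps, state_dim)
last_actions = tf.zeros((1, 1, 68))     # (batch=1, timesteps, action_dim)

action = actor.predict(states, last_actions)
```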
SeqActor.predict_step
SeqActor.predict_step (states, last_actions)
*Predict the action given the state.

For inference.

Args:

- states (tf.Tensor): State; batch dimension needs to be one.
- last_actions (tf.Tensor): Last action; batch dimension needs to be one.

Returns:

- np.array: Action, with the batch dimension removed.*

SeqActor.evaluate_actions
SeqActor.evaluate_actions (states, last_actions)
*Evaluate the action given the state.

For training.

Args:

- states (tf.Tensor): State; batch dimension needs to be one.
- last_actions (tf.Tensor): Last action; batch dimension needs to be one.

Returns:

- np.array: Action, with the batch dimension kept.*
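A training-side sketch mirroring the predict example above; shapes are again assumptions. Unlike predict_step, the returned action keeps its batch dimension so it can feed the critic during training:

```python
import tensorflow as tf

# Per the docstring above, the batch dimension is one; the output keeps it.
states = tf.random.normal((1, 8, 90))        # (batch=1, timesteps, state_dim)
last_actions = tf.random.normal((1, 8, 68))  # (batch=1, timesteps, action_dim)

actions = actor.evaluate_actions(states, last_actions)  # batch dimension retained
```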