mlx.nn.TransformerDecoderLayer
class TransformerDecoderLayer(dims: int, num_heads: int, mlp_dims: int | None = None, dropout: float = 0.0, activation: Callable[[Any], Any] = <mlx.gc_func object>, norm_first: bool = True)
Methods