block_zoo.transformer package
Submodules
block_zoo.transformer.MLP module
class block_zoo.transformer.MLP.MLP(layer_conf)
Bases: torch.nn.modules.module.Module
MLP layer
Parameters: layer_conf (MLPConf) – configuration of a layer
class block_zoo.transformer.MLP.MLPConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration of MLP layer
Parameters: dropout (float) – the dropout rate of the MLP layer
declare()
Define properties such as "input_ranks" and "num_of_inputs", which are fixed for your layer.
If num_of_inputs is N (N > 0), this layer accepts exactly N inputs.
If num_of_inputs is -1, this layer accepts any number of inputs.
The rank here is not the same as matrix rank:
For a scalar, its rank is 0;
For a vector, its rank is 1;
For a matrix, its rank is 2;
For a cube of numbers, its rank is 3.
…
For instance, the rank of (batch size, sequence length, hidden_dim) is 3.
If num_of_inputs > 0: len(input_ranks) should be equal to num_of_inputs.
If num_of_inputs == -1: input_ranks should be a list with only one element, and the rank of every input should be equal to that element.
NOTE: when the model is built, if num_of_inputs is -1, it is replaced with the actual number of inputs, and input_ranks is replaced with a list of the actual input ranks.
Returns: None
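The notion of rank and the length rule above can be illustrated with a small sketch. The helpers rank_of and is_consistent are hypothetical names introduced only for this illustration and are not part of block_zoo; shapes are written as plain tuples so the example runs without PyTorch.

```python
# Hypothetical helpers for illustration only; they are not part of block_zoo.

def rank_of(shape):
    """Rank in the sense used above: the number of dimensions of a shape."""
    return len(shape)


def is_consistent(num_of_inputs, input_ranks):
    """Check the length rule that ties input_ranks to num_of_inputs."""
    if num_of_inputs > 0:
        # One declared rank per input.
        return len(input_ranks) == num_of_inputs
    if num_of_inputs == -1:
        # A single rank shared by every input.
        return len(input_ranks) == 1
    return False


scalar_shape = ()              # rank 0
vector_shape = (128,)          # rank 1
matrix_shape = (32, 128)       # rank 2
batch_shape = (16, 50, 256)    # (batch size, sequence length, hidden_dim): rank 3

ranks = [rank_of(s) for s in (scalar_shape, vector_shape, matrix_shape, batch_shape)]
```

Under these assumptions, a layer that takes one rank-3 input would declare num_of_inputs = 1 and input_ranks = [3].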
default()
Define the default hyperparameters here. You can also set these hyperparameters in your configuration file.
Returns: None
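As a rough sketch of how defaults and configuration-file values might interact, the class below sets a default in default() and then lets keyword arguments override it. SketchConf, the value 0.1, and the override order are assumptions made for illustration; this page does not show how block_zoo.BaseLayer.BaseConf actually applies its keyword arguments.

```python
class SketchConf:
    """Hypothetical stand-in for a layer configuration (not block_zoo's BaseConf)."""

    def __init__(self, **kwargs):
        # Assumed order: fill in defaults first, then apply explicit settings.
        self.default()
        for name, value in kwargs.items():
            setattr(self, name, value)

    def default(self):
        # Assumed default value, chosen only for this example.
        self.dropout = 0.1


# A dict standing in for the layer section of a parsed configuration file.
conf_from_file = {"dropout": 0.3}

with_defaults = SketchConf()
with_overrides = SketchConf(**conf_from_file)
```

Here with_defaults.dropout keeps the value set in default(), while with_overrides.dropout takes the value from the configuration.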
block_zoo.transformer.MultiHeadAttention module
class block_zoo.transformer.MultiHeadAttention.MultiHeadAttention(layer_conf)
Bases: torch.nn.modules.module.Module
MultiHeadAttention Layer
Parameters: layer_conf (MultiHeadAttentionConf) – configuration of a layer
class block_zoo.transformer.MultiHeadAttention.MultiHeadAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration of MultiHeadAttention Layer
Parameters:
declare()
Define properties such as "input_ranks" and "num_of_inputs", which are fixed for your layer.
If num_of_inputs is N (N > 0), this layer accepts exactly N inputs.
If num_of_inputs is -1, this layer accepts any number of inputs.
The rank here is not the same as matrix rank:
For a scalar, its rank is 0;
For a vector, its rank is 1;
For a matrix, its rank is 2;
For a cube of numbers, its rank is 3.
…
For instance, the rank of (batch size, sequence length, hidden_dim) is 3.
If num_of_inputs > 0: len(input_ranks) should be equal to num_of_inputs.
If num_of_inputs == -1: input_ranks should be a list with only one element, and the rank of every input should be equal to that element.
NOTE: when the model is built, if num_of_inputs is -1, it is replaced with the actual number of inputs, and input_ranks is replaced with a list of the actual input ranks.
Returns: None
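The build-time replacement described in the NOTE can be sketched as follows. The function resolve_declaration is a hypothetical name used only for illustration; the actual model-building code is not shown on this page and may differ.

```python
def resolve_declaration(num_of_inputs, input_ranks, actual_input_shapes):
    """Hypothetical sketch: expand a variable-input declaration at build time.

    Returns the (num_of_inputs, input_ranks) pair to use once the real
    inputs are known. Shapes are plain tuples; rank is len(shape).
    """
    actual_ranks = [len(shape) for shape in actual_input_shapes]

    if num_of_inputs == -1:
        if len(input_ranks) != 1:
            raise ValueError("input_ranks must hold exactly one element")
        expected_rank = input_ranks[0]
        if any(rank != expected_rank for rank in actual_ranks):
            raise ValueError("every input must have rank %d" % expected_rank)
        # Replace -1 with the real input count and expand the rank list.
        return len(actual_ranks), actual_ranks

    # Fixed number of inputs: the declaration is kept unchanged.
    return num_of_inputs, list(input_ranks)


# Three inputs of shape (batch size, sequence length, hidden_dim).
shapes = [(16, 50, 256), (16, 50, 256), (16, 50, 256)]
resolved = resolve_declaration(-1, [3], shapes)
```

With three rank-3 inputs, the declaration (-1, [3]) resolves to (3, [3, 3, 3]) in this sketch.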
default()
Define the default hyperparameters here. You can also set these hyperparameters in your configuration file.
Returns: None