block_zoo.attentions package
Submodules
block_zoo.attentions.Attention module
class block_zoo.attentions.Attention.Attention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
Attention layer.
Given sequences X and Y, match sequence Y to each element in X:
    o_i = sum_j(alpha_ij * y_j) for each x_i in X
    alpha_ij = softmax_j(x_i * y_j)
Parameters: layer_conf (AttentionConf) – configuration of a layer
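A minimal PyTorch sketch of this scheme (function and argument names are illustrative, not the layer's actual implementation):

    import torch
    import torch.nn.functional as F

    def attend(x, y):
        # x: [batch, len_x, dim], y: [batch, len_y, dim]
        scores = torch.bmm(x, y.transpose(1, 2))   # x_i . y_j -> [batch, len_x, len_y]
        alpha = F.softmax(scores, dim=-1)          # softmax over the elements of Y
        return torch.bmm(alpha, y)                 # o_i = sum_j alpha_ij * y_j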
class block_zoo.attentions.Attention.AttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the Attention layer
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer (see the sketch below).
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
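For instance, two hypothetical conf classes illustrating this convention (the class names are made up for illustration; only the convention itself comes from the docstring above):

    from block_zoo.BaseLayer import BaseConf

    class PairLayerConf(BaseConf):          # hypothetical two-input layer
        def declare(self):
            self.num_of_inputs = 2          # accepts exactly two inputs
            self.input_ranks = [3, 3]       # each input is (batch, seq_len, dim)

    class VariadicLayerConf(BaseConf):      # hypothetical variadic layer
        def declare(self):
            self.num_of_inputs = -1         # accepts any number of inputs
            self.input_ranks = [3]          # every input must be rank 3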
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
block_zoo.attentions.BiAttFlow module
class block_zoo.attentions.BiAttFlow.BiAttFlow(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
Implements the attention flow layer for BiDAF. [paper] https://arxiv.org/pdf/1611.01603.pdf
Parameters: layer_conf (BiAttFlowConf) – configuration of a layer
forward(content, content_len, query, query_len=None)
Implements the attention flow layer of the BiDAF model.
Parameters:
- content (Tensor) – [batch_size, content_seq_len, dim]
- content_len (Tensor) – [batch_size]
- query (Tensor) – [batch_size, query_seq_len, dim]
- query_len (Tensor) – [batch_size]
Returns: a tensor with the same shape as content.
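Roughly, BiDAF scores every (content, query) pair with a trilinear similarity, then attends in both directions. A hedged sketch of the mechanism (the similarity function and the final merge are illustrative; per the docstring, the layer returns a content-shaped tensor rather than the paper's 4*dim concatenation):

    import torch
    import torch.nn.functional as F

    def attention_flow(content, query, w_sim):
        # content: [batch, T, dim], query: [batch, J, dim], w_sim: learned [3*dim]
        T, J = content.size(1), query.size(1)
        c = content.unsqueeze(2).expand(-1, -1, J, -1)        # [batch, T, J, dim]
        q = query.unsqueeze(1).expand(-1, T, -1, -1)          # [batch, T, J, dim]
        sim = torch.cat([c, q, c * q], dim=-1).matmul(w_sim)  # [batch, T, J]
        c2q = torch.bmm(F.softmax(sim, dim=2), query)         # content-to-query attention
        b = F.softmax(sim.max(dim=2).values, dim=1)           # [batch, T]
        q2c = torch.bmm(b.unsqueeze(1), content)              # [batch, 1, dim]
        return c2q + q2c                                      # illustrative content-shaped merge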
class block_zoo.attentions.BiAttFlow.BiAttFlowConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the BiAttFlow (attention flow) layer
Parameters: attention_dropout (float) – dropout rate applied to the attention matrix
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
block_zoo.attentions.BilinearAttention module
class block_zoo.attentions.BilinearAttention.BilinearAttention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
BilinearAttention layer for DrQA. [paper] https://arxiv.org/abs/1704.00051 [GitHub] https://github.com/facebookresearch/DrQA
Parameters: layer_conf (BilinearAttentionConf) – configuration of a layer
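In DrQA, bilinear attention scores each position of a sequence x against a single vector y, alpha_i = softmax_i(x_i^T W y). A hedged sketch of that idea (module and argument names are illustrative, not this layer's API):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BilinearSeqAttn(nn.Module):
        def __init__(self, x_dim, y_dim):
            super().__init__()
            self.linear = nn.Linear(y_dim, x_dim)   # the bilinear weight W

        def forward(self, x, y):
            # x: [batch, seq_len, x_dim], y: [batch, y_dim]
            Wy = self.linear(y)                          # [batch, x_dim]
            scores = x.bmm(Wy.unsqueeze(2)).squeeze(2)   # x_i^T W y -> [batch, seq_len]
            return F.softmax(scores, dim=-1)             # attention over positions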
class block_zoo.attentions.BilinearAttention.BilinearAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the BilinearAttention layer
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
block_zoo.attentions.FullAttention module
class block_zoo.attentions.FullAttention.FullAttention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
Fully-aware attention from FusionNet: Huang, H.-Y., Zhu, C., Shen, Y., & Chen, W. (2018). FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension. [paper] https://arxiv.org/abs/1711.07341
forward(string1, string1_len, string2, string2_len, string1_HoW, string1_HoW_len, string2_HoW, string2_HoW_len)
To get a representation of string1, we use string1 and string2 to obtain attention weights, then use string2 to represent string1.
Note: the semantic content of string1 is not actually used; we only need string1's seq_len information.
Parameters:
- string1 – [batch_size, seq_len, input_dim1]
- string1_len – [batch_size]
- string2 – [batch_size, seq_len, input_dim2]
- string2_len – [batch_size]
- string1_HoW – [batch_size, seq_len, att_dim1]
- string1_HoW_len – [batch_size]
- string2_HoW – [batch_size, seq_len, att_dim2]
- string2_HoW_len – [batch_size]
Returns: string1's representation and string1_len.
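In outline: attention weights come from the history-of-word (HoW) vectors, and the weighted sum is taken over string2, which is why string1's content goes unused. A simplified sketch (the projection and scoring are illustrative, not the exact FusionNet formulation):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FullAttentionSketch(nn.Module):
        def __init__(self, att_dim1, att_dim2, hidden_dim):
            super().__init__()
            self.proj1 = nn.Linear(att_dim1, hidden_dim)
            self.proj2 = nn.Linear(att_dim2, hidden_dim)

        def forward(self, string2, string1_HoW, string2_HoW):
            # string2: [batch, len2, input_dim2]; *_HoW: history-of-word vectors
            k1 = F.relu(self.proj1(string1_HoW))         # [batch, len1, hidden_dim]
            k2 = F.relu(self.proj2(string2_HoW))         # [batch, len2, hidden_dim]
            scores = torch.bmm(k1, k2.transpose(1, 2))   # [batch, len1, len2]
            alpha = F.softmax(scores, dim=-1)            # attend over string2 positions
            return torch.bmm(alpha, string2)             # string1 represented via string2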
class block_zoo.attentions.FullAttention.FullAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the FullAttention layer
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
inference()
Infer things like output_dim, which may rely on defined hyper parameters such as hidden_dim and input_dim.
Returns: None
verify()
Define any necessary verification for your layer when we define the model.
If you define your own layer and override this function, please call "super(YourLayerConf, self).verify()" at the beginning.
Returns: None
verify_before_inference()
Some conditions must be fulfilled, otherwise there will be errors when calling inference().
The difference between verify_before_inference() and verify() is that verify_before_inference() is called before inference(), while verify() is called after inference().
Returns: None
block_zoo.attentions.LinearAttention module
class block_zoo.attentions.LinearAttention.LinearAttention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
Linear attention. Combines the original sequence along the sequence_length dimension.
Parameters: layer_conf (LinearAttentionConf) – configuration of a layer
forward(string, string_len=None)
Process inputs.
Parameters:
- string (Variable) – (batch_size, sequence_length, dim)
- string_len (ndarray or None) – [batch_size]
Returns:
- if keep_dim == False: output dimension is (batch_size, dim);
- else: the sequence is just reweighted along the sequence_length dimension, giving (batch_size, sequence_length, dim).
Return type: Variable
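A hedged sketch of the idea (masking by string_len is omitted, and the shape of the learned weight is an assumption; only the keep_dim behavior is taken from the docs):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LinearAttentionSketch(nn.Module):
        def __init__(self, seq_len, keep_dim=False):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(seq_len))  # one learned score per position
            self.keep_dim = keep_dim

        def forward(self, string):
            # string: [batch, seq_len, dim]
            alpha = F.softmax(self.weight, dim=0)             # [seq_len]
            weighted = string * alpha.view(1, -1, 1)          # reweight each position
            # keep_dim=True: [batch, seq_len, dim]; keep_dim=False: [batch, dim]
            return weighted if self.keep_dim else weighted.sum(dim=1)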
class block_zoo.attentions.LinearAttention.LinearAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the LinearAttention layer
Parameters: keep_dim (bool) – whether to keep the sequence axis. If False, the layer sums the sequence representation along the sequence axis and returns (batch_size, dim); if True, it keeps the same shape as the input and returns (batch_size, sequence_length, dim).
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
inference()
Infer things like output_dim, which may rely on defined hyper parameters such as hidden_dim and input_dim.
Returns: None
verify()
Define any necessary verification for your layer when we define the model.
If you define your own layer and override this function, please call "super(YourLayerConf, self).verify()" at the beginning.
Returns: None
verify_before_inference()
Some conditions must be fulfilled, otherwise there will be errors when calling inference().
The difference between verify_before_inference() and verify() is that verify_before_inference() is called before inference(), while verify() is called after inference().
Returns: None
block_zoo.attentions.MatchAttention module
class block_zoo.attentions.MatchAttention.MatchAttention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
MatchAttention layer for DrQA. [paper] https://arxiv.org/abs/1704.00051 [GitHub] https://github.com/facebookresearch/DrQA
Given sequences X and Y, match sequence Y to each element in X:
    o_i = sum_j(alpha_ij * y_j) for each x_i in X
    alpha_ij = softmax_j(x_i * y_j)
Parameters: layer_conf (MatchAttentionConf) – configuration of a layer
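In DrQA this pattern appears as SeqAttnMatch: both sequences pass through a shared ReLU projection before the dot-product scoring. A hedged sketch of that idea (simplified; not copied from this layer's implementation):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MatchAttentionSketch(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim)   # shared projection for both sequences

        def forward(self, x, y):
            # x: [batch, len_x, dim], y: [batch, len_y, dim]
            x_p = F.relu(self.proj(x))
            y_p = F.relu(self.proj(y))
            scores = x_p.bmm(y_p.transpose(1, 2))   # [batch, len_x, len_y]
            alpha = F.softmax(scores, dim=-1)       # softmax over Y for each x_i
            return alpha.bmm(y)                     # o_i = sum_j alpha_ij * y_j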
class block_zoo.attentions.MatchAttention.MatchAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the MatchAttention layer
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None
block_zoo.attentions.Seq2SeqAttention module
class block_zoo.attentions.Seq2SeqAttention.Seq2SeqAttention(layer_conf)
Bases: block_zoo.BaseLayer.BaseLayer
Seq2SeqAttention layer
Parameters: layer_conf (Seq2SeqAttentionConf) – configuration of a layer
forward(string, string_len, string2, string2_len=None)
Utilizes both string2 and string itself to generate attention weights with which to represent string. There are three steps (see the sketch below):
1. get a string2-to-string attention to represent string;
2. get a string-to-string attention to represent string itself;
3. merge the two representations above.
Parameters:
- string (Variable) – [batch_size, seq_len, dim]
- string_len – [batch_size]
- string2 (Variable) – [batch_size, seq_len2, dim]
- string2_len – [batch_size]
Returns: a tensor with the same shape as string.
Return type: Variable
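A hedged sketch of the three steps (the merge in step 3 is illustrative; the real layer may combine the two representations differently):

    import torch
    import torch.nn.functional as F

    def seq2seq_attention(string, string2):
        # string: [batch, len1, dim], string2: [batch, len2, dim]
        # Step 1: string2-to-string attention represents string via string2.
        cross = F.softmax(string.bmm(string2.transpose(1, 2)), dim=-1).bmm(string2)
        # Step 2: string-to-string (self) attention represents string via itself.
        self_rep = F.softmax(string.bmm(string.transpose(1, 2)), dim=-1).bmm(string)
        # Step 3: merge the two representations (sum chosen for illustration).
        return cross + self_rep   # same shape as string: [batch, len1, dim]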
class block_zoo.attentions.Seq2SeqAttention.Seq2SeqAttentionConf(**kwargs)
Bases: block_zoo.BaseLayer.BaseConf
Configuration for the Seq2SeqAttention layer
declare()
Define things like "input_ranks" and "num_of_inputs", which are fixed for your layer.
num_of_inputs = N (N > 0) means this layer accepts exactly N inputs; num_of_inputs = -1 means it accepts any number of inputs.
The rank here is not the same as matrix rank: a scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a cube of numbers has rank 3. For instance, the rank of (batch_size, sequence_length, hidden_dim) is 3.
If num_of_inputs > 0, len(input_ranks) should equal num_of_inputs; if num_of_inputs == -1, input_ranks should be a list with a single element, and every input's rank should equal that element. NOTE: when we build the model, if num_of_inputs is -1, we replace it with the real number of inputs and replace input_ranks with a list of the real input ranks.
Returns: None
default()
Define the default hyper parameters here. You can define these hyper parameters in your configuration file as well.
Returns: None