Supporting Modules#
- class recwizard.modules.kgsf.graph_utils.GraphConvolution(in_features, out_features, bias=True)[source]#
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
- __init__(in_features, out_features, bias=True)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, adj)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
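A minimal usage sketch of this layer, assuming the standard GCN formulation over a (normalized) adjacency matrix; the sizes and the identity adjacency below are hypothetical illustrations, not part of the documented API:

import torch
from recwizard.modules.kgsf.graph_utils import GraphConvolution

num_nodes, in_features, out_features = 5, 16, 32   # hypothetical sizes
gcn_layer = GraphConvolution(in_features, out_features)

x = torch.randn(num_nodes, in_features)            # node features
adj = torch.eye(num_nodes).to_sparse()             # (normalized) adjacency; identity used here only for illustration
out = gcn_layer(x, adj)                            # expected shape: (num_nodes, out_features)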
- class recwizard.modules.kgsf.graph_utils.GCN(ninp, nhid, dropout=0.5)[source]#
- class recwizard.modules.kgsf.graph_utils.GraphAttentionLayer(in_features, out_features, dropout, alpha, concat=True)[source]#
Simple GAT layer, similar to https://arxiv.org/abs/1710.10903
- __init__(in_features, out_features, dropout, alpha, concat=True)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, adj)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer(dim, da, alpha=0.2, dropout=0.5)[source]#
- __init__(dim, da, alpha=0.2, dropout=0.5)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(h)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
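A hedged usage sketch: the input is assumed to be a sequence of dim-sized embeddings that the layer pools into a single vector with learned attention (shapes and the pooling behaviour are assumptions):

import torch
from recwizard.modules.kgsf.graph_utils import SelfAttentionLayer

dim, da, seq_len = 32, 16, 10          # hypothetical sizes; da is the attention hidden size
attn = SelfAttentionLayer(dim, da)

h = torch.randn(seq_len, dim)          # e.g. a set of entity embeddings
pooled = attn(h)                       # assumed to return a single dim-sized summary vector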
- class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer_batch(dim, da, alpha=0.2, dropout=0.5)[source]#
- __init__(dim, da, alpha=0.2, dropout=0.5)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(h, mask)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer2(dim, da)[source]#
- __init__(dim, da)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(h)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.graph_utils.BiAttention(input_size, dropout)[source]#
- __init__(input_size, dropout)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(*input: Any) → None#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.graph_utils.GAT(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#
- forward(x, adj)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
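A usage sketch for the multi-head GAT wrapper, assuming it follows the reference GAT implementation (concatenated attention heads followed by an output head over nclass); all sizes below are hypothetical:

import torch
from recwizard.modules.kgsf.graph_utils import GAT

num_nodes = 6                                      # hypothetical graph size
gat = GAT(nfeat=16, nhid=8, nclass=4, dropout=0.1, alpha=0.2, nheads=4)

x = torch.randn(num_nodes, 16)                     # node features
adj = torch.ones(num_nodes, num_nodes)             # dense adjacency (nonzero = edge)
out = gat(x, adj)                                  # assumed shape: (num_nodes, nclass)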
- class recwizard.modules.kgsf.graph_utils.SpecialSpmmFunction(*args, **kwargs)[source]#
Special function for backpropagation through only the sparse region.
- static forward(ctx, indices, values, shape, b)[source]#
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (combined forward and ctx):
@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
See combining-forward-context for more details.
Usage 2 (separate forward and setup_context):
@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass
@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass
The forward no longer accepts a ctx argument. Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, and inputs are a tuple of inputs to the forward. See extending-autograd for more details.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp), or with ctx.save_for_forward() if they are intended to be used in jvp.
- static backward(ctx, grad_output)[source]#
Defines a formula for differentiating the operation with backward mode automatic differentiation (alias to the vjp function).
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
- class recwizard.modules.kgsf.graph_utils.SpecialSpmm(*args, **kwargs)[source]#
- forward(indices, values, shape, b)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
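A sketch of how this sparse-matmul wrapper is assumed to be used: indices, values, and shape describe a sparse COO matrix that is multiplied by the dense matrix b, and gradients flow only to the stored non-zero values:

import torch
from recwizard.modules.kgsf.graph_utils import SpecialSpmm

spmm = SpecialSpmm()

# A 3x3 sparse matrix with two non-zero entries, multiplied by a dense 3x4 matrix.
indices = torch.tensor([[0, 2], [1, 2]])           # first row: row coords, second row: column coords
values = torch.tensor([1.0, 2.0], requires_grad=True)
shape = torch.Size([3, 3])
b = torch.randn(3, 4)

out = spmm(indices, values, shape, b)              # assumed to behave like sparse(A) @ b
out.sum().backward()                               # gradients reach only the stored values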
- class recwizard.modules.kgsf.graph_utils.SpGraphAttentionLayer(in_features, out_features, dropout, alpha, concat=True)[source]#
Sparse version of the GAT layer, similar to https://arxiv.org/abs/1710.10903
- __init__(in_features, out_features, dropout, alpha, concat=True)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, adj)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.graph_utils.SpGAT(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#
- forward(x, adj)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- recwizard.modules.kgsf.transformer_utils._normalize(tensor, norm_layer)[source]#
Broadcast layer norm
- recwizard.modules.kgsf.transformer_utils._build_encoder(opt, dictionary, embedding=None, padding_idx=None, reduction=True, n_positions=1024)[source]#
- recwizard.modules.kgsf.transformer_utils._build_encoder4kg(opt, padding_idx=None, reduction=True, n_positions=1024)[source]#
- recwizard.modules.kgsf.transformer_utils._build_encoder_mask(opt, dictionary, embedding=None, padding_idx=None, reduction=True, n_positions=1024)[source]#
- recwizard.modules.kgsf.transformer_utils._build_decoder(opt, dictionary, embedding=None, padding_idx=None, n_positions=1024)[source]#
- recwizard.modules.kgsf.transformer_utils._build_decoder4kg(opt, dictionary, embedding=None, padding_idx=None, n_positions=1024)[source]#
- class recwizard.modules.kgsf.transformer_utils.BasicAttention(dim=1, attn='cosine')[source]#
- __init__(dim=1, attn='cosine')[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(xs, ys)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.MultiHeadAttention(n_heads, dim, dropout=0)[source]#
- __init__(n_heads, dim, dropout=0)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(query, key=None, value=None, mask=None)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
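A self-attention usage sketch; key and value default to the query when omitted, and the mask convention (1 = attend) and output shape are assumptions:

import torch
from recwizard.modules.kgsf.transformer_utils import MultiHeadAttention

batch, seq_len, dim = 2, 7, 64                     # hypothetical sizes; dim must be divisible by n_heads
mha = MultiHeadAttention(n_heads=4, dim=dim)

x = torch.randn(batch, seq_len, dim)
mask = torch.ones(batch, seq_len, dtype=torch.bool)
out = mha(x, mask=mask)                            # assumed shape: (batch, seq_len, dim)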
- class recwizard.modules.kgsf.transformer_utils.TransformerFFN(dim, dim_hidden, relu_dropout=0)[source]#
- __init__(dim, dim_hidden, relu_dropout=0)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
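A small sketch, assuming the feed-forward block maps dim -> dim_hidden -> dim and therefore preserves the input shape:

import torch
from recwizard.modules.kgsf.transformer_utils import TransformerFFN

ffn = TransformerFFN(dim=64, dim_hidden=256)       # hypothetical sizes
x = torch.randn(2, 7, 64)
out = ffn(x)                                       # same shape as the input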
- class recwizard.modules.kgsf.transformer_utils.TransformerResponseWrapper(transformer, hdim)[source]#
Transformer response wrapper. Pushes input through the transformer and an MLP.
- __init__(transformer, hdim)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(*args)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.TransformerEncoder4kg(n_heads, n_layers, embedding_size, ffn_size, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Transformer encoder module.
- Parameters:
n_heads (int) – the number of multihead attention heads.
n_layers (int) – number of transformer layers.
embedding_size (int) – the embedding size. Must be a multiple of n_heads.
ffn_size (int) – the size of the hidden layer in the FFN.
embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.
dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.
attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.
relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.
padding_idx (int) – Reserved padding index in the embeddings matrix.
learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.
embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.
reduction (bool) – If true, returns the mean vector for the entire encoding sequence.
n_positions (int) – Size of the position embeddings matrix.
- __init__(n_heads, n_layers, embedding_size, ffn_size, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- class recwizard.modules.kgsf.transformer_utils.TransformerEncoderLayer(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
- __init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(tensor, mask)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.TransformerEncoder(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Transformer encoder module.
- Parameters:
n_heads (int) – the number of multihead attention heads.
n_layers (int) – number of transformer layers.
embedding_size (int) – the embedding size. Must be a multiple of n_heads.
ffn_size (int) – the size of the hidden layer in the FFN.
embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.
dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.
attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.
relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.
padding_idx (int) – Reserved padding index in the embeddings matrix.
learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.
embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.
reduction (bool) – If true, returns the mean vector for the entire encoding sequence.
n_positions (int) – Size of the position embeddings matrix.
- __init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
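A construction sketch with hypothetical hyperparameters; with reduction=False the encoder is assumed to return per-token hidden states together with a padding mask (with reduction=True, a single mean vector per sequence):

import torch
from recwizard.modules.kgsf.transformer_utils import TransformerEncoder

encoder = TransformerEncoder(
    n_heads=2, n_layers=2, embedding_size=64, ffn_size=128,
    vocabulary_size=1000, padding_idx=0, reduction=False,
)

tokens = torch.randint(1, 1000, (2, 12))           # a batch of token-id sequences
states, mask = encoder(tokens)                     # assumed: (batch, seq, emb) states and a (batch, seq) mask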
- class recwizard.modules.kgsf.transformer_utils.TransformerEncoder_mask(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Transformer encoder module.
- Parameters:
n_heads (int) – the number of multihead attention heads.
n_layers (int) – number of transformer layers.
embedding_size (int) – the embedding size. Must be a multiple of n_heads.
ffn_size (int) – the size of the hidden layer in the FFN.
embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.
dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.
attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.
relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.
padding_idx (int) – Reserved padding index in the embeddings matrix.
learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.
embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.
reduction (bool) – If true, returns the mean vector for the entire encoding sequence.
n_positions (int) – Size of the position embeddings matrix.
- __init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- class recwizard.modules.kgsf.transformer_utils.TransformerDecoderLayer(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
- __init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, encoder_output, encoder_mask)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.TransformerDecoder(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, embeddings_scale=True, learn_positional_embeddings=False, padding_idx=None, n_positions=1024)[source]#
Transformer decoder module.
- Parameters:
n_heads (int) – the number of multihead attention heads.
n_layers (int) – number of transformer layers.
embedding_size (int) – the embedding size. Must be a multiple of n_heads.
ffn_size (int) – the size of the hidden layer in the FFN.
embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this decoder.
dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.
attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.
relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.
padding_idx (int) – Reserved padding index in the embeddings matrix.
learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.
embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.
n_positions (int) – Size of the position embeddings matrix.
- __init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, embeddings_scale=True, learn_positional_embeddings=False, padding_idx=None, n_positions=1024)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, encoder_state, incr_state=None)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
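A sketch pairing the decoder with the encoder above (hypothetical sizes); the decoder is assumed to consume the encoder's (states, mask) output and to return hidden states plus an incremental state:

import torch
from recwizard.modules.kgsf.transformer_utils import TransformerDecoder, TransformerEncoder

encoder = TransformerEncoder(n_heads=2, n_layers=2, embedding_size=64, ffn_size=128,
                             vocabulary_size=1000, padding_idx=0, reduction=False)
decoder = TransformerDecoder(n_heads=2, n_layers=2, embedding_size=64, ffn_size=128,
                             vocabulary_size=1000, padding_idx=0)

src = torch.randint(1, 1000, (2, 12))              # source token ids
tgt = torch.randint(1, 1000, (2, 8))               # target token ids (teacher forcing)
encoder_state = encoder(src)                       # (states, mask) because reduction=False
hidden, incr_state = decoder(tgt, encoder_state)   # assumed: (batch, tgt_len, emb) hidden states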
- class recwizard.modules.kgsf.transformer_utils.TransformerDecoderLayerKG(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
- __init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, encoder_output, encoder_mask, kg_encoder_output, kg_encoder_mask, db_encoder_output, db_encoder_mask)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.TransformerMemNetModel(opt, dictionary)[source]#
Model which takes context, memories, and candidates and encodes them.
- __init__(opt, dictionary)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(xs, mems, cands)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class recwizard.modules.kgsf.transformer_utils.TorchGeneratorModel(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)[source]#
This interface expects you to implement a model with the following requirements (a minimal subclass sketch is given at the end of this entry):
- Attribute model.encoder:
takes input and returns a tuple (enc_out, enc_hidden, attn_mask)
- Attribute model.decoder:
takes decoder params and returns decoder outputs after attention
- Attribute model.output:
takes decoder outputs and returns a distribution over the dictionary
- __init__(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- decode_greedy(encoder_states, bsz, maxlen)[source]#
Greedy search
- Parameters:
bsz (int) – Batch size. Because encoder_states is model-specific, it cannot infer this automatically.
encoder_states (Model specific) – Output of the encoder model.
maxlen (int) – Maximum decoding length
- Returns:
pair (logits, choices) of the greedy decode
- Return type:
(FloatTensor[bsz, maxlen, vocab], LongTensor[bsz, maxlen])
- decode_forced(encoder_states, ys)[source]#
Decode with a fixed, true sequence, computing loss. Useful for training, or ranking fixed candidates.
- Parameters:
ys (LongTensor[bsz, time]) – the prediction targets. Contains both the start and end tokens.
encoder_states (model specific) – Output of the encoder. Model specific types.
- Returns:
pair (logits, choices) containing the logits and MLE predictions
- Return type:
(FloatTensor[bsz, ys, vocab], LongTensor[bsz, ys])
- reorder_encoder_states(encoder_states, indices)[source]#
Reorder encoder states according to a new set of indices.
This is an abstract method and must be implemented by the user.
Its purpose is to provide a model-agnostic interface for beam search. For example, this method is used to sort hypotheses, expand beams, etc.
For example, assume that encoder_states is a bsz x 1 tensor of values:
indices = [0, 2, 2]
encoder_states = [[0.1], [0.2], [0.3]]
then the output will be:
output = [[0.1], [0.3], [0.3]]
- Parameters:
encoder_states (model specific) – output from encoder. type is model specific.
indices (list[int]) – the indices to select over. The user must support non-tensor inputs.
- Returns:
The re-ordered encoder states. It should be of the same type as encoder states, and it must be a valid input to the decoder.
- Return type:
model specific
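A minimal sketch of an implementation for the simple case where encoder_states is a single (bsz, ...) tensor; real models often need to reorder each element of a tuple of states:

import torch

def reorder_encoder_states(encoder_states, indices):
    # Hypothetical implementation for a single-tensor encoder state.
    if not torch.is_tensor(indices):
        indices = torch.as_tensor(indices, dtype=torch.long, device=encoder_states.device)
    return encoder_states.index_select(0, indices)

encoder_states = torch.tensor([[0.1], [0.2], [0.3]])
reorder_encoder_states(encoder_states, [0, 2, 2])  # -> [[0.1], [0.3], [0.3]]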
- reorder_decoder_incremental_state(incremental_state, inds)[source]#
Reorder incremental state for the decoder.
Used to expand selected beams in beam_search. Unlike reorder_encoder_states, implementing this method is optional. However, without incremental decoding, decoding a single beam becomes O(n^2) instead of O(n), which can make beam search impractically slow.
In order to fall back to non-incremental decoding, just return None from this method.
- Parameters:
incremental_state (model specific) – second output of model.decoder
inds (LongTensor[n]) – indices to select and reorder over.
- Returns:
The re-ordered decoder incremental states. It should be the same type as incremental_state, and usable as an input to the decoder. This method should return None if the model does not support incremental decoding.
- Return type:
model specific
- forward(*xs, ys=None, cand_params=None, prev_enc=None, maxlen=None, bsz=None)[source]#
Get output predictions from the model.
- Parameters:
xs (LongTensor[bsz, seqlen]) – input to the encoder
ys (LongTensor[bsz, outlen]) – Expected output from the decoder. Used for teacher forcing to calculate loss.
prev_enc – if you know you’ll pass in the same xs multiple times, you can pass in the encoder output from the last forward pass to skip recalculating the same encoder output.
maxlen – max number of tokens to decode. If not set, the length of the longest label this model has seen is used. Ignored when ys is not None.
bsz – if ys is not provided, then you must specify the bsz for greedy decoding.
- Returns:
(scores, candidate_scores, encoder_states) tuple
scores contains the model’s predicted token scores. (FloatTensor[bsz, seqlen, num_features])
candidate_scores are the scores the model assigned to each candidate. (FloatTensor[bsz, num_cands])
encoder_states are the output of model.encoder. Model specific types. Feed this back in to skip encoding on the next call.
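A minimal, hypothetical subclass sketch showing how the required encoder/decoder/output attributes plug into this interface; the placeholder modules and reorder methods below are illustrations, not part of the library:

import torch.nn as nn
from recwizard.modules.kgsf.transformer_utils import TorchGeneratorModel

class MySeq2Seq(TorchGeneratorModel):
    # Hypothetical skeleton: the three attributes below form the interface this base class expects.
    def __init__(self, encoder, decoder, vocab_size, hidden_size):
        super().__init__()
        self.encoder = encoder                             # input -> encoder states
        self.decoder = decoder                             # (tokens, encoder states, incr_state) -> decoder states
        self.output = nn.Linear(hidden_size, vocab_size)   # decoder states -> scores over the dictionary

    def reorder_encoder_states(self, encoder_states, indices):
        return encoder_states  # placeholder; see the reorder_encoder_states sketch above

    def reorder_decoder_incremental_state(self, incremental_state, inds):
        return None            # fall back to non-incremental decoding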
- recwizard.modules.kgsf.utils.neginf(dtype)[source]#
Returns a representable finite number near -inf for a dtype.
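A typical masking sketch: fill padded positions with neginf(dtype) before a softmax so they receive (near-)zero attention weight; the mask convention below is an illustration:

import torch
from recwizard.modules.kgsf.utils import neginf

scores = torch.randn(2, 5)
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 0, 0, 0]], dtype=torch.bool)

scores = scores.masked_fill(~mask, neginf(scores.dtype))
weights = scores.softmax(dim=-1)       # masked positions get ~0 weight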
- recwizard.modules.kgsf.utils._create_embeddings(dictionary, embedding_size, padding_idx)[source]#
Create and initialize word embeddings.