Supporting Modules#

recwizard.modules.kgsf.graph_utils.kaiming_reset_parameters(linear_module)[source]#
class recwizard.modules.kgsf.graph_utils.GraphConvolution(in_features, out_features, bias=True)[source]#

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
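
A minimal usage sketch (not taken from the library's own docs): the sizes and the identity adjacency below are placeholders for illustration.

import torch
from recwizard.modules.kgsf.graph_utils import GraphConvolution

num_nodes, in_features, out_features = 5, 16, 8   # hypothetical sizes
layer = GraphConvolution(in_features, out_features)

x = torch.randn(num_nodes, in_features)           # node feature matrix
adj = torch.eye(num_nodes).to_sparse()            # placeholder normalized adjacency

out = layer(x, adj)                               # expected shape: (num_nodes, out_features)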

__init__(in_features, out_features, bias=True)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, adj)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

__repr__()[source]#

Return repr(self).

class recwizard.modules.kgsf.graph_utils.GCN(ninp, nhid, dropout=0.5)[source]#
__init__(ninp, nhid, dropout=0.5)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, adj)[source]#

x: node feature matrix of shape (|V|, |D|); adj: adjacency matrix of shape (|V|, |V|)
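
A hedged toy call matching the shapes above; the identity adjacency is a placeholder and the exact output size depends on the module internals.

import torch
from recwizard.modules.kgsf.graph_utils import GCN

num_nodes, ninp, nhid = 10, 32, 64            # hypothetical |V| and |D|
gcn = GCN(ninp, nhid, dropout=0.5)

x = torch.randn(num_nodes, ninp)              # (|V|, |D|) node features
adj = torch.eye(num_nodes).to_sparse()        # (|V|, |V|) placeholder adjacency

h = gcn(x, adj)                               # per-node representations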

class recwizard.modules.kgsf.graph_utils.GraphAttentionLayer(in_features, out_features, dropout, alpha, concat=True)[source]#

Simple GAT layer, similar to https://arxiv.org/abs/1710.10903
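
A short, hedged sketch of calling this dense GAT layer; the fully connected adjacency and the sizes are made up for illustration.

import torch
from recwizard.modules.kgsf.graph_utils import GraphAttentionLayer

layer = GraphAttentionLayer(in_features=16, out_features=8,
                            dropout=0.6, alpha=0.2, concat=True)

num_nodes = 6
x = torch.randn(num_nodes, 16)                # node features
adj = torch.ones(num_nodes, num_nodes)        # dense adjacency (fully connected placeholder)

h = layer(x, adj)                             # expected shape: (num_nodes, 8)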

__init__(in_features, out_features, dropout, alpha, concat=True)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, adj)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

__repr__()[source]#

Return repr(self).

class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer(dim, da, alpha=0.2, dropout=0.5)[source]#
__init__(dim, da, alpha=0.2, dropout=0.5)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(h)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer_batch(dim, da, alpha=0.2, dropout=0.5)[source]#
__init__(dim, da, alpha=0.2, dropout=0.5)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(h, mask)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.graph_utils.SelfAttentionLayer2(dim, da)[source]#
__init__(dim, da)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(h)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.graph_utils.BiAttention(input_size, dropout)[source]#
__init__(input_size, dropout)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(*input: Any) → None#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.graph_utils.GAT(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#
__init__(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#

Dense version of GAT.

forward(x, adj)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
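
Putting the dense pieces together, a hedged usage sketch for GAT; the parameter values are illustrative, and the per-node output of size nclass is an assumption based on typical GAT implementations.

import torch
from recwizard.modules.kgsf.graph_utils import GAT

model = GAT(nfeat=16, nhid=8, nclass=4, dropout=0.6, alpha=0.2, nheads=2)

num_nodes = 6
x = torch.randn(num_nodes, 16)
adj = torch.ones(num_nodes, num_nodes)        # dense adjacency placeholder

out = model(x, adj)                           # typically (num_nodes, nclass)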

class recwizard.modules.kgsf.graph_utils.SpecialSpmmFunction(*args, **kwargs)[source]#

Special function for sparse-region backpropagation only.

static forward(ctx, indices, values, shape, b)[source]#

This function is to be overridden by all subclasses. There are two ways to define forward:

Usage 1 (Combined forward and ctx):

@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass

  • It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

  • See the PyTorch documentation on combining forward and ctx for more details.

Usage 2 (Separate forward and ctx):

@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass

@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass

  • The forward no longer accepts a ctx argument.

  • Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, and inputs are a tuple of inputs to the forward.

  • See the PyTorch documentation on extending autograd for more details.

The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used in jvp.

static backward(ctx, grad_output)[source]#

Defines a formula for differentiating the operation with backward mode automatic differentiation (alias to the vjp function).

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.
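
To make the forward/backward contract above concrete, here is a small, self-contained toy Function in the Usage 1 style (combined forward and ctx). It is unrelated to SpecialSpmmFunction itself and only illustrates saving tensors on ctx and returning one gradient per forward input.

import torch

class ScaleByWeight(torch.autograd.Function):
    # Toy example: y = x * w, written as a custom autograd Function (Usage 1).

    @staticmethod
    def forward(ctx, x, w):
        # ctx comes first; tensors needed in backward are saved via save_for_backward.
        ctx.save_for_backward(x, w)
        return x * w

    @staticmethod
    def backward(ctx, grad_output):
        x, w = ctx.saved_tensors
        # One return value per forward input, each the gradient w.r.t. that input.
        grad_x = grad_output * w if ctx.needs_input_grad[0] else None
        grad_w = grad_output * x if ctx.needs_input_grad[1] else None
        return grad_x, grad_w

x = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)
ScaleByWeight.apply(x, w).sum().backward()    # populates x.grad and w.grad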

class recwizard.modules.kgsf.graph_utils.SpecialSpmm(*args, **kwargs)[source]#
forward(indices, values, shape, b)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
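
A hedged sketch of calling the wrapper; the COO-style indices/values/shape layout is an assumption based on common sparse-GAT implementations.

import torch
from recwizard.modules.kgsf.graph_utils import SpecialSpmm

spmm = SpecialSpmm()

# Assumed layout: a 3x3 sparse matrix in COO form, multiplied by a dense 3x2 matrix.
indices = torch.tensor([[0, 2],
                        [1, 2]])              # (row, col) coordinates, shape (2, nnz)
values = torch.tensor([1.0, 2.0])
shape = torch.Size([3, 3])
b = torch.randn(3, 2)

out = spmm(indices, values, shape, b)         # expected shape: (3, 2)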

class recwizard.modules.kgsf.graph_utils.SpGraphAttentionLayer(in_features, out_features, dropout, alpha, concat=True)[source]#

Sparse version GAT layer, similar to https://arxiv.org/abs/1710.10903

__init__(in_features, out_features, dropout, alpha, concat=True)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, adj)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

__repr__()[source]#

Return repr(self).

class recwizard.modules.kgsf.graph_utils.SpGAT(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#
__init__(nfeat, nhid, nclass, dropout, alpha, nheads)[source]#

Sparse version of GAT.

forward(x, adj)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

recwizard.modules.kgsf.graph_utils._add_neighbors(kg, g, seed_set, hop)[source]#
recwizard.modules.kgsf.transformer_utils._normalize(tensor, norm_layer)[source]#

Broadcast layer norm

recwizard.modules.kgsf.transformer_utils._build_encoder(opt, dictionary, embedding=None, padding_idx=None, reduction=True, n_positions=1024)[source]#
recwizard.modules.kgsf.transformer_utils._build_encoder4kg(opt, padding_idx=None, reduction=True, n_positions=1024)[source]#
recwizard.modules.kgsf.transformer_utils._build_encoder_mask(opt, dictionary, embedding=None, padding_idx=None, reduction=True, n_positions=1024)[source]#
recwizard.modules.kgsf.transformer_utils._build_decoder(opt, dictionary, embedding=None, padding_idx=None, n_positions=1024)[source]#
recwizard.modules.kgsf.transformer_utils._build_decoder4kg(opt, dictionary, embedding=None, padding_idx=None, n_positions=1024)[source]#
recwizard.modules.kgsf.transformer_utils.create_position_codes(n_pos, dim, out)[source]#
class recwizard.modules.kgsf.transformer_utils.BasicAttention(dim=1, attn='cosine')[source]#
__init__(dim=1, attn='cosine')[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(xs, ys)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.MultiHeadAttention(n_heads, dim, dropout=0)[source]#
__init__(n_heads, dim, dropout=0)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(query, key=None, value=None, mask=None)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerFFN(dim, dim_hidden, relu_dropout=0)[source]#
__init__(dim, dim_hidden, relu_dropout=0)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerResponseWrapper(transformer, hdim)[source]#

Transformer response wrapper. Pushes input through the transformer and an MLP.

__init__(transformer, hdim)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(*args)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerEncoder4kg(n_heads, n_layers, embedding_size, ffn_size, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Transformer encoder module.

Parameters:
  • n_heads (int) – the number of multihead attention heads.

  • n_layers (int) – number of transformer layers.

  • embedding_size (int) – the embedding size. Must be a multiple of n_heads.

  • ffn_size (int) – the size of the hidden layer in the FFN.

  • embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.

  • dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.

  • attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.

  • relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.

  • padding_idx (int) – Reserved padding index in the embeddings matrix.

  • learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.

  • embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.

  • reduction (bool) – If true, returns the mean vector for the entire encoding sequence.

  • n_positions (int) – Size of the position embeddings matrix.

__init__(n_heads, n_layers, embedding_size, ffn_size, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, mask)[source]#

input data is a FloatTensor of shape [batch, seq_len, dim]; mask is a ByteTensor of shape [batch, seq_len], filled with 1 when inside the sequence and 0 outside.
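
A hedged usage sketch based on the shapes stated above; the configuration values are placeholders, with embedding_size kept a multiple of n_heads.

import torch
from recwizard.modules.kgsf.transformer_utils import TransformerEncoder4kg

encoder = TransformerEncoder4kg(n_heads=2, n_layers=2,
                                embedding_size=64, ffn_size=128)

batch, seq_len, dim = 4, 10, 64
inputs = torch.randn(batch, seq_len, dim)                # already-embedded features
mask = torch.ones(batch, seq_len, dtype=torch.uint8)     # 1 inside the sequence, 0 outside

encoded = encoder(inputs, mask)   # with reduction=True (default), a per-sequence vector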

class recwizard.modules.kgsf.transformer_utils.TransformerEncoderLayer(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
__init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(tensor, mask)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerEncoder(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Transformer encoder module.

Parameters:
  • n_heads (int) – the number of multihead attention heads.

  • n_layers (int) – number of transformer layers.

  • embedding_size (int) – the embedding size. Must be a multiple of n_heads.

  • ffn_size (int) – the size of the hidden layer in the FFN.

  • embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.

  • dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.

  • attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.

  • relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.

  • padding_idx (int) – Reserved padding index in the embeddings matrix.

  • learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.

  • embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.

  • reduction (bool) – If true, returns the mean vector for the entire encoding sequence.

  • n_positions (int) – Size of the position embeddings matrix.

__init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]#

input data is a FloatTensor of shape [batch, seq_len, dim]; mask is a ByteTensor of shape [batch, seq_len], filled with 1 when inside the sequence and 0 outside.
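
A hedged sketch; note the assumption (not stated above) that the input is a LongTensor of token indices, which is suggested by vocabulary_size and the internal embedding table.

import torch
from recwizard.modules.kgsf.transformer_utils import TransformerEncoder

encoder = TransformerEncoder(n_heads=2, n_layers=2, embedding_size=64, ffn_size=128,
                             vocabulary_size=1000, padding_idx=0)

# Assumption: token ids of shape [batch, seq_len]; positions equal to padding_idx
# are treated as padding.
tokens = torch.randint(1, 1000, (4, 12))
encoded = encoder(tokens)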

class recwizard.modules.kgsf.transformer_utils.TransformerEncoder_mask(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Transformer encoder module.

Parameters:
  • n_heads (int) – the number of multihead attention heads.

  • n_layers (int) – number of transformer layers.

  • embedding_size (int) – the embedding size. Must be a multiple of n_heads.

  • ffn_size (int) – the size of the hidden layer in the FFN.

  • embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this encoder.

  • dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.

  • attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.

  • relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.

  • padding_idx (int) – Reserved padding index in the embeddings matrix.

  • learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.

  • embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.

  • reduction (bool) – If true, returns the mean vector for the entire encoding sequence.

  • n_positions (int) – Size of the position embeddings matrix.

__init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, padding_idx=0, learn_positional_embeddings=False, embeddings_scale=False, reduction=True, n_positions=1024)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, m_emb)[source]#

input data is a FloatTensor of shape [batch, seq_len, dim]; mask is a ByteTensor of shape [batch, seq_len], filled with 1 when inside the sequence and 0 outside.

class recwizard.modules.kgsf.transformer_utils.TransformerDecoderLayer(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
__init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, encoder_output, encoder_mask)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerDecoder(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, embeddings_scale=True, learn_positional_embeddings=False, padding_idx=None, n_positions=1024)[source]#

Transformer decoder module.

Parameters:
  • n_heads (int) – the number of multihead attention heads.

  • n_layers (int) – number of transformer layers.

  • embedding_size (int) – the embedding size. Must be a multiple of n_heads.

  • ffn_size (int) – the size of the hidden layer in the FFN.

  • embedding – an embedding matrix for the bottom layer of the transformer. If none, one is created for this decoder.

  • dropout (float) – Dropout used around embeddings and before layer normalizations. This is used in Vaswani 2017 and works well on large datasets.

  • attention_dropout (float) – Dropout performed after the multihead attention softmax. This is not used in Vaswani 2017.

  • relu_dropout (float) – Dropout used after the ReLU in the FFN. Not used in Vaswani 2017, but used in Tensor2Tensor.

  • padding_idx (int) – Reserved padding index in the embeddings matrix.

  • learn_positional_embeddings (bool) – If off, sinusoidal embeddings are used. If on, position embeddings are learned from scratch.

  • embeddings_scale (bool) – Scale embeddings relative to their dimensionality. Found useful in fairseq.

  • n_positions (int) – Size of the position embeddings matrix.

__init__(n_heads, n_layers, embedding_size, ffn_size, vocabulary_size, embedding=None, dropout=0.0, attention_dropout=0.0, relu_dropout=0.0, embeddings_scale=True, learn_positional_embeddings=False, padding_idx=None, n_positions=1024)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, encoder_state, incr_state=None)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerDecoderLayerKG(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#
__init__(n_heads, embedding_size, ffn_size, attention_dropout=0.0, relu_dropout=0.0, dropout=0.0)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, encoder_output, encoder_mask, kg_encoder_output, kg_encoder_mask, db_encoder_output, db_encoder_mask)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TransformerMemNetModel(opt, dictionary)[source]#

Model which takes context, memories, and candidates and encodes them.

__init__(opt, dictionary)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(xs, mems, cands)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class recwizard.modules.kgsf.transformer_utils.TorchGeneratorModel(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)[source]#

This interface expects you to implement a model with the following requirements:

Attribute model.encoder:

takes input and returns a tuple (enc_out, enc_hidden, attn_mask)

Attribute model.decoder:

takes decoder inputs and returns decoder outputs after attention

Attribute model.output:

takes decoder outputs and returns a distribution over the dictionary

__init__(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)[source]#

Initializes internal Module state, shared by both nn.Module and ScriptModule.

_starts(bsz)[source]#

Return bsz start tokens.

decode_greedy(encoder_states, bsz, maxlen)[source]#

Greedy search

Parameters:
  • bsz (int) – Batch size. Because encoder_states is model-specific, the batch size cannot be inferred automatically.

  • encoder_states (Model specific) – Output of the encoder model.

  • maxlen (int) – Maximum decoding length

Returns:

pair (logits, choices) of the greedy decode

Return type:

(FloatTensor[bsz, maxlen, vocab], LongTensor[bsz, maxlen])

decode_forced(encoder_states, ys)[source]#

Decode with a fixed, true sequence, computing loss. Useful for training, or ranking fixed candidates.

Parameters:
  • ys (LongTensor[bsz, time]) – the prediction targets. Contains both the start and end tokens.

  • encoder_states (model specific) – Output of the encoder. Model specific types.

Returns:

pair (logits, choices) containing the logits and MLE predictions

Return type:

(FloatTensor[bsz, ys, vocab], LongTensor[bsz, ys])

reorder_encoder_states(encoder_states, indices)[source]#

Reorder encoder states according to a new set of indices.

This is an abstract method, and must be implemented by the user.

Its purpose is to provide a model-agnostic interface for beam search. For example, this method is used to sort hypotheses, expand beams, etc.

For example, assume that encoder_states is a bsz x 1 tensor of values

indices = [0, 2, 2]
encoder_states = [[0.1]
                  [0.2]
                  [0.3]]

then the output will be

output = [[0.1]
          [0.3]
          [0.3]]

Parameters:
  • encoder_states (model specific) – output from encoder. type is model specific.

  • indices (list[int]) – the indices to select over. The user must support non-tensor inputs.

Returns:

The re-ordered encoder states. It should be of the same type as encoder states, and it must be a valid input to the decoder.

Return type:

model specific
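
For the simple case sketched above, where encoder_states is a single tensor with a leading batch dimension, an implementation could look like the following. This is a sketch under that assumption; models with tuple-valued states would index each element the same way.

import torch

def reorder_encoder_states(self, encoder_states, indices):
    # Sketch only: assumes encoder_states is one tensor whose first dimension
    # is the batch dimension, as in the bsz x 1 example above.
    if not torch.is_tensor(indices):
        indices = torch.as_tensor(indices, dtype=torch.long,
                                  device=encoder_states.device)
    return encoder_states.index_select(0, indices)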

reorder_decoder_incremental_state(incremental_state, inds)[source]#

Reorder incremental state for the decoder.

Used to expand selected beams in beam_search. Unlike reorder_encoder_states, implementing this method is optional. However, without incremental decoding, decoding a single beam becomes O(n^2) instead of O(n), which can make beam search impractically slow.

In order to fall back to non-incremental decoding, just return None from this method.

Parameters:
  • incremental_state (model specific) – second output of model.decoder

  • inds (LongTensor[n]) – indices to select and reorder over.

Returns:

The re-ordered decoder incremental states. It should be the same type as incremental_state, and usable as an input to the decoder. This method should return None if the model does not support incremental decoding.

Return type:

model specific

forward(*xs, ys=None, cand_params=None, prev_enc=None, maxlen=None, bsz=None)[source]#

Get output predictions from the model.

Parameters:
  • xs (LongTensor[bsz, seqlen]) – input to the encoder

  • ys (LongTensor[bsz, outlen]) – Expected output from the decoder. Used for teacher forcing to calculate loss.

  • prev_enc – if you know you’ll pass in the same xs multiple times, you can pass in the encoder output from the last forward pass to skip recalculating the same encoder output.

  • maxlen – max number of tokens to decode. If not set, the length of the longest label this model has seen is used. Ignored when ys is not None.

  • bsz – if ys is not provided, then you must specify the bsz for greedy decoding.

Returns:

(scores, candidate_scores, encoder_states) tuple

  • scores contains the model’s predicted token scores. (FloatTensor[bsz, seqlen, num_features])

  • candidate_scores are the score the model assigned to each candidate. (FloatTensor[bsz, num_cands])

  • encoder_states are the output of model.encoder. Model specific types. Feed this back in to skip encoding on the next call.

recwizard.modules.kgsf.utils.neginf(dtype)[source]#

Returns a representable finite number near -inf for a dtype.
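
A hedged usage sketch: masking attention scores with a value that stays representable in the score dtype (the usual reason to prefer this over float('-inf') for fp16 tensors).

import torch
from recwizard.modules.kgsf.utils import neginf

scores = torch.randn(2, 4)
mask = torch.tensor([[True, True, False, False],
                     [True, True, True, False]])

# Fill masked-out positions with a large negative value that is still
# representable in scores.dtype, then normalize.
scores = scores.masked_fill(~mask, neginf(scores.dtype))
probs = scores.softmax(dim=-1)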

recwizard.modules.kgsf.utils._create_embeddings(dictionary, embedding_size, padding_idx)[source]#

Create and initialize word embeddings.

recwizard.modules.kgsf.utils._create_entity_embeddings(entity_num, embedding_size, padding_idx)[source]#

Create and initialize entity embeddings.

recwizard.modules.kgsf.utils._edge_list(kg, n_entity, hop)[source]#
recwizard.modules.kgsf.utils._concept_edge_list4GCN()[source]#
recwizard.modules.kgsf.utils.seed_everything(seed: int)[source]#