Supporting Modules#

class recwizard.modules.redial.autorec.AutoRec(n_movies, layer_sizes, g, f)[source]#

User-based Autoencoder for Collaborative Filtering

__init__(n_movies, layer_sizes, g, f)[source]#
Parameters:
  • n_movies – number of movies (dimension of the input and output ratings vectors)

  • layer_sizes – list giving the size of each encoder layer

  • g – activation applied to the decoder output

  • f – activation used in the encoder layers

load_checkpoint(checkpoint, verbose=True, strict=True, LOAD_PREFIX='')[source]#

Load a checkpoint from a file.

Parameters:
  • checkpoint (str) – the path to the checkpoint file.

  • verbose (bool) – whether to print a message when loading the checkpoint.

  • strict (bool) – the strict argument passed to load_state_dict.

  • LOAD_PREFIX (str)

forward(input, additional_context=None, range01=True)[source]#
Parameters:
  • input – (batch, n_movies)

  • additional_context – potential information to add to user representation (batch, user_rep_size)

  • range01 – If True, apply a sigmoid to the output

Returns:

output recommendations (batch, n_movies)
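
A minimal usage sketch. The sizes are made up, and treating g and f as activation names (e.g. 'sigmoid') follows the original ReDial implementation, so it is an assumption here:

    import torch
    from recwizard.modules.redial.autorec import AutoRec

    # Hypothetical sizes; g and f assumed to be activation names as in ReDial.
    model = AutoRec(n_movies=1000, layer_sizes=[512], g='sigmoid', f='sigmoid')
    # model.load_checkpoint("autorec.ckpt")  # hypothetical checkpoint path

    ratings = torch.zeros(4, 1000)   # (batch, n_movies) observed ratings
    ratings[:, 7] = 1.0              # e.g. every user in the batch liked movie 7
    scores = model(ratings, range01=True)  # (batch, n_movies), values in [0, 1]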

class recwizard.modules.redial.autorec.UserEncoder(layer_sizes, n_movies, f)[source]#
__init__(layer_sizes, n_movies, f)[source]#
Parameters:
  • layer_sizes – list giving the size of each layer

  • n_movies – number of movies (dimension of the input ratings vector)

  • f – activation function applied after each layer

forward(input, raw_last_layer=False)[source]#

Encode a batch of user rating vectors into user representations. If raw_last_layer is True, the output of the last layer is returned without applying the activation f.

class recwizard.modules.redial.autorec.ReconstructionLoss[source]#
__init__()[source]#

forward(input, target)[source]#

Accumulate the summed reconstruction loss over the observed entries of target, ignoring unobserved entries and keeping count of the number of observed targets for normalize_loss_reset.

normalize_loss_reset(loss)[source]#

Return the loss divided by the number of observed targets, and reset nb_observed_targets to 0.

Parameters:
  • loss – total summed loss

Returns:

mean loss
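
A usage sketch of the sum-then-normalize pattern. The convention that unobserved targets are marked with -1 is an assumption carried over from the original ReDial code:

    import torch
    from recwizard.modules.redial.autorec import ReconstructionLoss

    criterion = ReconstructionLoss()
    predictions = torch.rand(4, 1000)        # model outputs
    targets = torch.full((4, 1000), -1.0)    # assumption: -1 marks unobserved entries
    targets[:, 7] = 1.0                      # one observed rating per user

    total = criterion(predictions, targets)            # summed loss over observed entries
    mean_loss = criterion.normalize_loss_reset(total)  # mean loss; counter reset to 0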

recwizard.modules.redial.beam_search.get_best_beam(beams, normalization_alpha=0)[source]#

Return the beam with the best score, as given by Beam.normalized_score(normalization_alpha).
recwizard.modules.redial.beam_search.n_gram_repeats(sequence, n)[source]#

Return True if the sequence contains the same n-gram twice.

Parameters:
  • sequence – sequence of tokens

  • n – the n-gram length

class recwizard.modules.redial.beam_search.Beam(sequence, likelihood, mentioned_movies=None)[source]#
__init__(sequence, likelihood, mentioned_movies=None)[source]#
__str__()[source]#

Return str(self).

normalized_score(alpha)[source]#

Get the beam score with a length penalty, following Wu et al., “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”.

Parameters:
  • alpha – strength of the length penalty (0 disables it)
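
For reference, a sketch of the GNMT-style length normalization from that paper; that normalized_score uses exactly this form is an assumption:

    # GNMT length penalty (Wu et al., 2016): scores of longer sequences are
    # divided by a larger factor, controlled by alpha (alpha = 0 disables it).
    def gnmt_normalized_score(log_likelihood: float, length: int, alpha: float) -> float:
        return log_likelihood / (((5 + length) / 6) ** alpha)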

class recwizard.modules.redial.beam_search.BeamSearch[source]#
static update_beams(beams, beam_size, probabilities, n_gram_block=None)[source]#

One step of beam search.

Parameters:
  • beams – the current list of beams

  • beam_size – number of beams to keep

  • probabilities – list of beam_size probability tensors (one for each beam)

  • n_gram_block – if set, candidates that repeat an n-gram of this size are blocked

Returns:

list of the new beams
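
A schematic decoding loop. The uniform step_probabilities stub stands in for a real model call, and the start-token id, sequence length, and likelihood initialization are made up:

    import torch
    from recwizard.modules.redial.beam_search import Beam, BeamSearch, get_best_beam

    vocab_size, beam_size = 100, 10

    def step_probabilities(beams):
        # Stub for a model forward pass: one probability tensor per beam.
        return [torch.full((vocab_size,), 1.0 / vocab_size) for _ in beams]

    beams = [Beam(sequence=[0], likelihood=1.0)]  # 0: made-up start token id
    for _ in range(20):
        beams = BeamSearch.update_beams(beams, beam_size,
                                        step_probabilities(beams), n_gram_block=3)
    best = get_best_beam(beams, normalization_alpha=0)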

class recwizard.modules.redial.hrnn.HRNN(sentence_encoder_model, sentence_encoder_hidden_size, sentence_encoder_num_layers, conversation_encoder_hidden_size, conversation_encoder_num_layers, use_movie_occurrences, conv_bidirectional=False, return_all=True, return_sentence_representations=False, use_dropout=False)[source]#

Hierarchical Recurrent Neural Network

Expected keys in the params dictionary: 'use_gensen', 'use_movie_occurrences', 'sentence_encoder_hidden_size', 'conversation_encoder_hidden_size', 'sentence_encoder_num_layers', 'conversation_encoder_num_layers', 'use_dropout', and optionally 'embedding_dimension'.

Input:
  • Input["dialogue"] – (batch, max_conv_length, max_utterance_length) Long Tensor

  • Input["senders"] – (batch, max_conv_length) Float Tensor

  • Input["lengths"] – (batch, max_conv_length) list

  • Input["movie_occurrences"] (optional) – Float Tensor; (batch, max_conv_length, max_utterance_length) for word occurrence, or (batch, max_conv_length) for sentence occurrence

__init__(sentence_encoder_model, sentence_encoder_hidden_size, sentence_encoder_num_layers, conversation_encoder_hidden_size, conversation_encoder_num_layers, use_movie_occurrences, conv_bidirectional=False, return_all=True, return_sentence_representations=False, use_dropout=False)[source]#

forward(input_ids, attention_mask, senders, movie_occurrences, conversation_lengths, **kwargs)[source]#

Encode the dialogue hierarchically: each utterance is encoded with the sentence encoder, and the sequence of utterance representations is then encoded with the conversation encoder. Depending on return_all and return_sentence_representations, returns the conversation representations for all utterances, optionally together with the sentence representations.
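
The expected input shapes, sketched with dummy tensors. The call itself is commented out because it assumes an HRNN instance (hrnn) constructed elsewhere, and the sentence-level movie_occurrences shape is an assumption about the use_movie_occurrences setting:

    import torch

    batch, conv_len, utt_len = 2, 5, 20
    input_ids = torch.zeros(batch, conv_len, utt_len, dtype=torch.long)
    attention_mask = torch.ones(batch, conv_len, utt_len)
    senders = torch.ones(batch, conv_len)             # speaker indicators
    movie_occurrences = torch.zeros(batch, conv_len)  # sentence-level occurrences
    conversation_lengths = torch.tensor([5, 3])

    # out = hrnn(input_ids, attention_mask, senders,
    #            movie_occurrences, conversation_lengths)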

class recwizard.modules.redial.hrnn_for_classification.HRNNForClassification(hrnn_params, output_classes, return_liked_probability=True, multiple_items_per_example=True)[source]#
__init__(hrnn_params, output_classes, return_liked_probability=True, multiple_items_per_example=True)[source]#
Parameters:
  • hrnn_params – parameters used to build the underlying HRNN encoder

  • output_classes

  • return_liked_probability

  • multiple_items_per_example – should be set to True when each conversation corresponds to an example (e.g. when generating output), and to False in training, where each item corresponds to an example.

forward(input_ids, attention_mask, senders, movie_occurrences, conversation_lengths)[source]#
Parameters:
  • input_ids

  • attention_mask

  • senders

  • movie_occurrences

  • conversation_lengths

Returns:

class recwizard.modules.redial.hrnn_for_classification.RedialSentimentAnalysisLoss(class_weight, use_targets)[source]#
__init__(class_weight, use_targets)[source]#

forward(output, target)[source]#

Compute the sentiment-analysis loss between the model outputs and the targets.

class recwizard.modules.redial.modeling_redial_gen.DecoderGRU(hidden_size, context_size, num_layers, word_embedding, peephole)[source]#

Conditioned GRU. The context vector is used as an initial hidden state at each layer of the GRU

__init__(hidden_size, context_size, num_layers, word_embedding, peephole)[source]#

set_pretrained_embeddings(embedding_matrix)[source]#

Set embedding weights.

forward(input_sequence, lengths, context=None, state=None)[source]#

If peephole is False, the context vector is used as the initial hidden state at each layer. If peephole is True, the context is instead concatenated to the word embeddings at each time step. If context is not provided, a state must be given (for generation).

Parameters:
  • input_sequence – (batch_size, seq_len)

  • lengths – (batch_size)

  • context – (batch, hidden_size) vector on which to condition

  • state – (batch, num_layers, hidden_size) gru state

Returns:

output predictions (batch_size, seq_len, hidden_size) [, h_n (batch, num_layers, hidden_size)]
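
A construction sketch. Treating word_embedding as an nn.Embedding whose embedding dimension matches hidden_size is an assumption, and all sizes are made up:

    import torch
    import torch.nn as nn
    from recwizard.modules.redial.modeling_redial_gen import DecoderGRU

    vocab_size, hidden_size = 5000, 256
    word_embedding = nn.Embedding(vocab_size, hidden_size)  # assumed type and shape
    decoder = DecoderGRU(hidden_size=hidden_size, context_size=hidden_size,
                         num_layers=1, word_embedding=word_embedding, peephole=False)

    tokens = torch.randint(0, vocab_size, (4, 12))  # (batch_size, seq_len)
    lengths = torch.tensor([12, 12, 12, 12])        # (batch_size)
    context = torch.zeros(4, hidden_size)           # conditioning vector
    predictions = decoder(tokens, lengths, context=context)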

class recwizard.modules.redial.modeling_redial_gen.SwitchingDecoder(hidden_size, context_size, num_layers, peephole, word_embedding=None)[source]#

Decoder that takes the recommendations into account. A switch chooses whether to output a movie or a word.

__init__(hidden_size, context_size, num_layers, peephole, word_embedding=None)[source]#

set_pretrained_embeddings(embedding_matrix)[source]#

Set embedding weights.

forward(input, lengths, context, movie_recommendations, log_probabilities, sample_movies, forbid_movies=None, temperature=1)[source]#
Parameters:
  • input – (batch, max_utterance_length)

  • lengths

  • context – (batch, hidden_size)

  • movie_recommendations – (batch, n_movies) the movie recommendations that condition the utterances. Not necessarily in the [0, 1] range.

  • log_probabilities

  • sample_movies – (for generation) If True, sample a movie for each utterance, returning one-hot vectors

  • forbid_movies – (for generation) If provided, specifies movies that cannot be sampled

  • temperature

Returns:

[log] probabilities (batch, max_utterance_length, vocab + n_movies)

replace_movie_with_words(tokens, tokenizer)[source]#

If a token ID corresponds to a movie, replace it with the sequence of tokens for that movie’s name.

Parameters:
  • tokens – list of token ids

  • tokenizer – tokenizer used to encode the movie names

Returns:

modified sequence

generate(initial_sequence=None, tokenizer=None, beam_size=10, max_seq_length=50, temperature=1, forbid_movies=None, **kwargs)[source]#

Beam search sentence generation.

Parameters:
  • initial_sequence – list giving the initial sequence of tokens

  • tokenizer

  • beam_size – number of beams to maintain

  • max_seq_length – maximum length of the generated sequence

  • temperature

  • forbid_movies – movies that cannot be sampled (see forward)

  • kwargs – additional parameters to pass to the model forward pass (e.g. a conditioning context)

Returns:

The best beam
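
A hedged sketch of calling generate. Every name here (the generate_reply helper, bos_id, the context keyword) is illustrative rather than part of the API; extra keyword arguments such as context are forwarded to the model’s forward pass:

    from recwizard.modules.redial.modeling_redial_gen import SwitchingDecoder

    def generate_reply(decoder: SwitchingDecoder, tokenizer, context, bos_id: int):
        # Hypothetical helper showing the call shape, not a tested recipe.
        return decoder.generate(
            initial_sequence=[bos_id],
            tokenizer=tokenizer,
            beam_size=10,
            max_seq_length=50,
            forbid_movies=set(),  # movies that must not be generated
            context=context,      # forwarded to the model forward pass
        )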

class recwizard.modules.redial.tokenizer_rnn.NLTKTokenizer(language='english')[source]#
__init__(language='english')[source]#
recwizard.modules.redial.tokenizer_rnn.get_tokenizer(name='redial')[source]#
recwizard.modules.redial.tokenizer_rnn.RnnTokenizer(vocab, name='redial')[source]#

Return a tokenizer for RNN models from the given vocabulary.

Parameters:
  • vocab (List[str]) – list of words

  • name (str) – name of the tokenizer, used to cache the tokenizer

Returns:

PreTrainedTokenizerFast
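
A usage sketch. The toy vocabulary and special tokens are made up; since the result is a PreTrainedTokenizerFast, it can be called like any Hugging Face tokenizer:

    from recwizard.modules.redial.tokenizer_rnn import RnnTokenizer

    vocab = ["<pad>", "<unk>", "hello", "world"]  # hypothetical vocabulary
    tokenizer = RnnTokenizer(vocab, name="redial-demo")
    ids = tokenizer("hello world")["input_ids"]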