Llama#

class recwizard.modules.llm.modeling_llama_gen.LlamaGen(config: LLMConfig, prompt=None, model_name=None, debug=False, **kwargs)[source]#

The generator implemented based on Llama models.

__init__(config: LLMConfig, prompt=None, model_name=None, debug=False, **kwargs)[source]#

Initializes the instance based on the config file.

Parameters:
  • config (LLMConfig) – The config file.

  • prompt (str, optional) – A prompt to override the prompt from the config file.

  • model_name (str, optional) – The name of the Llama model to use.
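
Example (a minimal construction sketch; the LLMConfig import path, its default constructor, and the model name below are assumptions, not part of this reference):

    from recwizard.modules.llm import LLMConfig  # assumed import path
    from recwizard.modules.llm.modeling_llama_gen import LlamaGen

    config = LLMConfig()  # assumed to be constructible with defaults
    gen = LlamaGen(
        config,
        prompt="You are a helpful movie recommender.",  # overrides the config prompt
        model_name="meta-llama/Llama-2-7b-chat-hf",     # illustrative model name
    )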

classmethod from_pretrained(pretrained_model_name_or_path, config=None, prompt=None, model_name=None)[source]#

Get an instance of this class from a pretrained checkpoint.

Parameters:
  • pretrained_model_name_or_path (str or os.PathLike) – The name or path of the pretrained checkpoint to load.

  • config (LLMConfig, optional) – A config to override the one loaded from the checkpoint.

  • prompt (str, optional) – A prompt to override the prompt from the config file.

  • model_name (str, optional) – The name of the Llama model to use.

Returns:

the instance.
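
Example (a loading sketch; the checkpoint id below is hypothetical):

    from recwizard.modules.llm.modeling_llama_gen import LlamaGen

    gen = LlamaGen.from_pretrained(
        "recwizard/llama-gen",                                 # hypothetical checkpoint id
        prompt="Recommend movies based on the conversation.",  # optional prompt override
    )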

save_pretrained(save_directory: str | PathLike, push_to_hub: bool = False, **kwargs)[source]#

Save a model and its configuration file to a directory, so that it can be re-loaded using the [~PreTrainedModel.from_pretrained] class method.

Parameters:
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful in distributed training (e.g., on TPUs) when this function needs to be called on all processes. In this case, set is_main_process=True only on the main process to avoid race conditions.

  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only save parts of the model or if special precautions need to be taken when recovering the state dictionary of a model (like when using model parallelism).

  • save_function (Callable) – The function to use to save the state dictionary. Useful in distributed training (e.g., on TPUs) when one needs to replace torch.save with another method.

  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).

  • max_shard_size (int or str, optional, defaults to “10GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”).

    Warning: if a single weight of the model is bigger than max_shard_size, it will be placed in its own checkpoint shard, which will be bigger than max_shard_size.

  • safe_serialization (bool, optional, defaults to False) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.
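
Example (a save-and-reload sketch; the directory and repo id are illustrative):

    gen.save_pretrained("./llama-gen-checkpoint")  # the directory is created if missing
    gen = LlamaGen.from_pretrained("./llama-gen-checkpoint")

    # Or push to the Hugging Face Hub while saving (repo_id is illustrative):
    gen.save_pretrained("./llama-gen-checkpoint", push_to_hub=True, repo_id="your-name/llama-gen")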

classmethod get_tokenizer(**kwargs)[source]#

Get a tokenizer.

Returns:

the tokenizer.

Return type:

(LlamaTokenizer)
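
Example:

    tokenizer = LlamaGen.get_tokenizer()  # returns a LlamaTokenizer instance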

response(**kwargs)#

Generate a response to the processed user’s input.

Parameters:
  • raw_input (str) – The user’s raw input.

  • tokenizer (BaseTokenizer, optional) – A tokenizer to process the raw input.

  • recs (list, optional) – The recommended movies.

  • max_tokens (int) – The maximum number of tokens to generate.

  • temperature (float) – The sampling temperature used for generation.

  • model_name (str, optional) – The name of the Llama model to use.

  • return_dict (bool) – Whether to return a dict instead of the plain response string.

Returns:

The response to the processed user’s input.

Return type:

str
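
Example (a generation sketch, continuing from the loading example above; the input string, recommendations, and sampling values are illustrative):

    tokenizer = LlamaGen.get_tokenizer()
    reply = gen.response(
        raw_input="User: Can you recommend a good sci-fi movie?",
        tokenizer=tokenizer,
        recs=["Interstellar (2014)"],  # optional recommended movies
        max_tokens=128,
        temperature=0.7,
    )
    print(reply)  # a str; pass return_dict=True for a dict instead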

class recwizard.modules.llm.tokenizer_llama.LlamaTokenizer(**kwargs)[source]#

The tokenizer for the Llama-based generator.

__init__(**kwargs)[source]#

Initializes the instance of this tokenizer.

preprocess(raw_input, **kwargs)[source]#

Process the raw input by extracting the plain text.

Parameters:

raw_input (str) – The raw input.

Returns:

The plain text extracted from the raw input.
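
Example (a preprocessing sketch; the entity markup in the input and the stripping behavior are assumptions):

    from recwizard.modules.llm.tokenizer_llama import LlamaTokenizer

    tokenizer = LlamaTokenizer()
    text = tokenizer.preprocess("User: I really liked <entity>Inception</entity>!")
    print(text)  # the plain text with the markup removed (assumed behavior)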