LLM Predictors
Init params.
- class llama_index.llm_predictor.HuggingFaceLLMPredictor(max_input_size: int = 4096, max_new_tokens: int = 256, temperature: float = 0.7, do_sample: bool = False, system_prompt: str = '', query_wrapper_prompt: SimpleInputPrompt = <llama_index.prompts.prompts.SimpleInputPrompt object>, tokenizer_name: str = 'StabilityAI/stablelm-tuned-alpha-3b', model_name: str = 'StabilityAI/stablelm-tuned-alpha-3b', model: Optional[Any] = None, tokenizer: Optional[Any] = None, device_map: str = 'auto', stopping_ids: Optional[List[int]] = None, tokenizer_kwargs: Optional[dict] = None, model_kwargs: Optional[dict] = None, callback_manager: Optional[CallbackManager] = None)
HuggingFace-specific LLM predictor class.
Wrapper around an LLMPredictor to provide streamlined access to HuggingFace models.
- Parameters
llm (Optional[langchain.llms.base.LLM]) -- LLM from LangChain to use for predictions. Defaults to OpenAI's text-davinci-003 model. Please see LangChain's LLM Page for more details.
retry_on_throttling (bool) -- Whether to retry on rate limit errors. Defaults to True.
- async apredict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Asynchronously predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
- get_llm_metadata() -> LLMMetadata
Get LLM metadata.
- property last_token_usage: int
Get the last token usage.
- predict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
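The return contract of predict can be illustrated without loading a model. The following is a minimal sketch, not the library's implementation: EchoPredictor is a hypothetical stand-in, and it takes a plain str template where the real method takes a Prompt object. It only shows how a caller unpacks the documented (answer, formatted prompt) tuple.

```python
from typing import Any, Tuple


class EchoPredictor:
    """Hypothetical stand-in that mimics the documented predict() return shape."""

    def predict(self, prompt: str, **prompt_args: Any) -> Tuple[str, str]:
        # Assumption: the template is a plain str; the real API takes a Prompt.
        formatted_prompt = prompt.format(**prompt_args)
        # A real predictor would call the model here; this stub just echoes.
        answer = f"(stub answer to) {formatted_prompt}"
        return answer, formatted_prompt


predictor = EchoPredictor()
answer, formatted_prompt = predictor.predict(
    "Summarize: {text}", text="LLM predictors wrap a language model."
)
print(formatted_prompt)
print(answer)
```

Returning the formatted prompt alongside the answer lets a caller log or inspect exactly what was sent to the model.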
- stream(prompt: Prompt, **prompt_args: Any) -> Tuple[Generator, str]
Stream the answer to a query.
NOTE: This is a beta feature. Better abstractions for response handling are planned.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of a generator that yields the predicted answer and the formatted prompt.
- Return type
Tuple[Generator, str]
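Because the signature of stream declares Tuple[Generator, str], a caller has to unpack the result and iterate the generator to obtain the text. The sketch below assumes a hypothetical fake_stream function with the same return shape; the hard-coded token list and the plain str template are illustrative only and are not taken from the library.

```python
from typing import Any, Generator, Tuple


def fake_stream(
    prompt: str, **prompt_args: Any
) -> Tuple[Generator[str, None, None], str]:
    """Hypothetical stand-in with the same return shape as stream()."""
    formatted_prompt = prompt.format(**prompt_args)

    def token_gen() -> Generator[str, None, None]:
        # A real predictor would yield tokens as the model produces them.
        for token in ["Paris", " is", " the", " capital", "."]:
            yield token

    return token_gen(), formatted_prompt


stream_gen, formatted_prompt = fake_stream(
    "Q: {question}", question="Capital of France?"
)
chunks = []
for token in stream_gen:
    chunks.append(token)  # e.g. forward each token to a UI as it arrives
full_answer = "".join(chunks)
print(full_answer)
```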
- property total_tokens_used: int
Get the total tokens used so far.
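The difference between last_token_usage and total_tokens_used can be sketched as two counters, one overwritten per call and one accumulated. TokenCounter and its record method are hypothetical names, and counting prompt plus completion tokens is an assumption rather than a statement about how the library counts.

```python
class TokenCounter:
    """Hypothetical illustration of a per-call counter and a running total."""

    def __init__(self) -> None:
        self.last_token_usage = 0
        self.total_tokens_used = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Assumption: usage for one call is prompt tokens plus completion tokens.
        self.last_token_usage = prompt_tokens + completion_tokens
        # The running total only ever grows.
        self.total_tokens_used += self.last_token_usage


counter = TokenCounter()
counter.record(prompt_tokens=10, completion_tokens=5)
counter.record(prompt_tokens=20, completion_tokens=8)
print(counter.last_token_usage, counter.total_tokens_used)
```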
- class llama_index.llm_predictor.LLMPredictor(llm: Optional[BaseLanguageModel] = None, retry_on_throttling: bool = True, cache: Optional[BaseCache] = None, callback_manager: Optional[CallbackManager] = None)
LLM predictor class.
Wrapper around an LLMChain from LangChain.
- Parameters
llm (Optional[langchain.llms.base.LLM]) -- LLM from LangChain to use for predictions. Defaults to OpenAI's text-davinci-003 model. Please see LangChain's LLM Page for more details.
retry_on_throttling (bool) -- Whether to retry on rate limit errors. Defaults to True.
cache (Optional[langchain.cache.BaseCache]) -- Cache to use for LLM results.
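Retrying on rate limit errors is commonly done with exponential backoff. The sketch below shows that general technique only; it is not how LLMPredictor implements retry_on_throttling. RateLimitError, call_with_retry, and flaky_call are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error standing in for a provider's throttling error."""


def call_with_retry(
    fn: Callable[[], T], max_retries: int = 3, base_delay: float = 0.01
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)  # wait longer each time
    raise RuntimeError("unreachable")


attempts = {"count": 0}


def flaky_call() -> str:
    # Fails twice with a throttling error, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("throttled")
    return "ok"


result = call_with_retry(flaky_call)
print(result, attempts["count"])
```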
- async apredict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Asynchronously predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
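Since apredict is a coroutine, several predictions can be awaited concurrently. The following sketch uses a hypothetical AsyncEchoPredictor stub in place of a real predictor and a plain str template in place of a Prompt object; only asyncio.gather and asyncio.run are real library calls.

```python
import asyncio
from typing import Any, List, Tuple


class AsyncEchoPredictor:
    """Hypothetical stand-in that mimics the documented apredict() shape."""

    async def apredict(self, prompt: str, **prompt_args: Any) -> Tuple[str, str]:
        formatted_prompt = prompt.format(**prompt_args)
        await asyncio.sleep(0)  # placeholder for the awaited model call
        return f"(stub answer to) {formatted_prompt}", formatted_prompt


async def run_batch(questions: List[str]) -> List[Tuple[str, str]]:
    predictor = AsyncEchoPredictor()
    tasks = [predictor.apredict("Q: {question}", question=q) for q in questions]
    # gather preserves the order of the inputs in its results.
    return list(await asyncio.gather(*tasks))


results = asyncio.run(run_batch(["What is an index?", "What is a prompt?"]))
for answer, formatted_prompt in results:
    print(formatted_prompt, "->", answer)
```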
- get_llm_metadata() -> LLMMetadata
Get LLM metadata.
- property last_token_usage: int
Get the last token usage.
- property llm: BaseLanguageModel
Get LLM.
- predict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
- stream(prompt: Prompt, **prompt_args: Any) -> Tuple[Generator, str]
Stream the answer to a query.
NOTE: This is a beta feature. Better abstractions for response handling are planned.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of a generator that yields the predicted answer and the formatted prompt.
- Return type
Tuple[Generator, str]
- property total_tokens_used: int
Get the total tokens used so far.
- class llama_index.llm_predictor.StructuredLLMPredictor(llm: Optional[BaseLanguageModel] = None, retry_on_throttling: bool = True, cache: Optional[BaseCache] = None, callback_manager: Optional[CallbackManager] = None)
Structured LLM predictor class.
- Parameters
llm_predictor (BaseLLMPredictor) -- LLM Predictor to use.
- async apredict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Asynchronously predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
- get_llm_metadata() -> LLMMetadata
Get LLM metadata.
- property last_token_usage: int
Get the last token usage.
- property llm: BaseLanguageModel
Get LLM.
- predict(prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]
Predict the answer to a query.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of the predicted answer and the formatted prompt.
- Return type
Tuple[str, str]
- stream(prompt: Prompt, **prompt_args: Any) -> Tuple[Generator, str]
Stream the answer to a query.
NOTE: This is a beta feature. Better abstractions for response handling are planned.
- Parameters
prompt (Prompt) -- Prompt to use for prediction.
- Returns
Tuple of a generator that yields the predicted answer and the formatted prompt.
- Return type
Tuple[Generator, str]
- property total_tokens_used: int
Get the total tokens used so far.