LLM Predictors

Init params.

class llama_index.llm_predictor.HuggingFaceLLMPredictor(max_input_size: int = 4096, max_new_tokens: int = 256, temperature: float = 0.7, do_sample: bool = False, system_prompt: str = '', query_wrapper_prompt: SimpleInputPrompt = <llama_index.prompts.prompts.SimpleInputPrompt object>, tokenizer_name: str = 'StabilityAI/stablelm-tuned-alpha-3b', model_name: str = 'StabilityAI/stablelm-tuned-alpha-3b', model: Optional[Any] = None, tokenizer: Optional[Any] = None, device_map: str = 'auto', stopping_ids: Optional[List[int]] = None, tokenizer_kwargs: Optional[dict] = None, model_kwargs: Optional[dict] = None, callback_manager: Optional[CallbackManager] = None)

HuggingFace-specific LLM predictor class.

Wrapper around an LLMPredictor to provide streamlined access to HuggingFace models.

Parameters

llm (Optional[langchain.llms.base.LLM]) -- LLM from Langchain to use for predictions. Defaults to OpenAI's text-davinci-003 model. Please see Langchain's LLM Page for more details.
retry_on_throttling (bool) -- Whether to retry on rate limit errors. Defaults to True.

async apredict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Async predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]
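
The calling pattern for apredict can be sketched as follows. This is a minimal illustration, not the library's implementation: EchoPredictor is a hypothetical stand-in, and it accepts a plain format string where the real method takes a Prompt object. It only shows how the (answer, formatted prompt) tuple is awaited and unpacked.

```python
import asyncio
from typing import Any, Tuple


class EchoPredictor:
    """Hypothetical stand-in that mimics the documented apredict signature."""

    async def apredict(self, prompt: str, **prompt_args: Any) -> Tuple[str, str]:
        # Fill the template with the keyword arguments (assumption: the real
        # Prompt object performs an equivalent formatting step).
        formatted_prompt = prompt.format(**prompt_args)
        # A real predictor would await the model here; this stub just echoes.
        answer = f"echo: {formatted_prompt}"
        return answer, formatted_prompt


async def main() -> Tuple[str, str]:
    predictor = EchoPredictor()
    return await predictor.apredict("What is {topic}?", topic="an index")


# Unpack the tuple: first the predicted answer, then the formatted prompt.
answer, formatted_prompt = asyncio.run(main())
print(answer)
print(formatted_prompt)
```

The synchronous predict method returns the same tuple shape, so the unpacking step is identical without the await.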

get_llm_metadata() → LLMMetadata: Get LLM metadata.

property last_token_usage: int: Get the last token usage.

predict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]

stream(prompt: Prompt, **prompt_args: Any) → Tuple[Generator, str]

Stream the answer to a query.

NOTE: This is a beta feature. We plan to build or adopt better abstractions for response handling.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of a generator that yields the predicted answer and the formatted prompt.
Return type: Tuple[Generator, str]
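
Per its signature, stream returns a generator together with the formatted prompt. The sketch below shows one way to consume such a pair. EchoStreamer is a hypothetical stand-in that yields the words of the prompt back; a real predictor would yield model output as it arrives, and would take a Prompt object rather than a format string.

```python
from typing import Any, Generator, Tuple


class EchoStreamer:
    """Hypothetical stand-in that mimics the documented stream signature."""

    def stream(
        self, prompt: str, **prompt_args: Any
    ) -> Tuple[Generator[str, None, None], str]:
        formatted_prompt = prompt.format(**prompt_args)

        def token_gen() -> Generator[str, None, None]:
            # Stub behavior: yield each word of the prompt as one chunk.
            for word in formatted_prompt.split():
                yield word + " "

        return token_gen(), formatted_prompt


gen, formatted_prompt = EchoStreamer().stream("Summarize {doc}", doc="chapter one")

# Consume the generator chunk by chunk, then join the pieces.
chunks = list(gen)
full_answer = "".join(chunks).strip()
print(full_answer)
```

Iterating with a for loop instead of list() lets the caller display each chunk as soon as it is produced.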

property total_tokens_used: int: Get the total tokens used so far.

class llama_index.llm_predictor.LLMPredictor(llm: Optional[BaseLanguageModel] = None, retry_on_throttling: bool = True, cache: Optional[BaseCache] = None, callback_manager: Optional[CallbackManager] = None)

LLM predictor class.

Wrapper around an LLMChain from Langchain.

Parameters

llm (Optional[langchain.llms.base.LLM]) -- LLM from Langchain to use for predictions. Defaults to OpenAI's text-davinci-003 model. Please see Langchain's LLM Page for more details.
retry_on_throttling (bool) -- Whether to retry on rate limit errors. Defaults to True.
cache (Optional[langchain.cache.BaseCache]) -- Cache used to store and reuse LLM results.
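
To illustrate what retrying on rate limit errors means in practice, here is a generic retry loop with exponential backoff. It is a sketch under stated assumptions and does not reproduce how LLMPredictor implements retry_on_throttling: RateLimitError, call_with_retry, max_attempts, and base_delay are all hypothetical names introduced for this example.

```python
import time
from typing import Callable


class RateLimitError(Exception):
    """Hypothetical error standing in for a provider's rate-limit exception."""


def call_with_retry(
    fn: Callable[[], str],
    retry_on_throttling: bool = True,
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> str:
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except RateLimitError:
            # Re-raise if retries are disabled or the attempts are exhausted.
            if not retry_on_throttling or attempt >= max_attempts:
                raise
            # Exponential backoff: base_delay, then 2 * base_delay, and so on.
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_llm() -> str:
    # Simulated LLM call that is throttled on its first two invocations.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("throttled")
    return "answer"


result = call_with_retry(flaky_llm)
print(result, calls["count"])
```

With retry_on_throttling=False the first RateLimitError propagates to the caller, which is the behavior to expect when retries are switched off.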

async apredict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Async predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]

get_llm_metadata() → LLMMetadata: Get LLM metadata.

property last_token_usage: int: Get the last token usage.

property llm: BaseLanguageModel: Get LLM.

predict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]

stream(prompt: Prompt, **prompt_args: Any) → Tuple[Generator, str]

Stream the answer to a query.

NOTE: This is a beta feature. We plan to build or adopt better abstractions for response handling.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of a generator that yields the predicted answer and the formatted prompt.
Return type: Tuple[Generator, str]

property total_tokens_used: int: Get the total tokens used so far.
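
The relationship between last_token_usage and total_tokens_used can be pictured as a per-call value next to a running sum. The sketch below is an illustration only: CountingPredictor is a hypothetical stand-in, and counting whitespace-separated words is an assumption made for the example, not the tokenization the library uses.

```python
from typing import Any, Tuple


class CountingPredictor:
    """Hypothetical stand-in that tracks token usage across predict calls."""

    def __init__(self) -> None:
        self._total_tokens_used = 0
        self._last_token_usage = 0

    @property
    def total_tokens_used(self) -> int:
        return self._total_tokens_used

    @property
    def last_token_usage(self) -> int:
        return self._last_token_usage

    def predict(self, prompt: str, **prompt_args: Any) -> Tuple[str, str]:
        formatted_prompt = prompt.format(**prompt_args)
        answer = "stub answer"
        # Assumption: one "token" per whitespace-separated word,
        # counted over the formatted prompt plus the answer.
        used = len(formatted_prompt.split()) + len(answer.split())
        self._last_token_usage = used
        self._total_tokens_used += used
        return answer, formatted_prompt


predictor = CountingPredictor()
predictor.predict("Define {term}", term="embedding")
predictor.predict("Compare {a} and {b}", a="x", b="y")

print(predictor.last_token_usage)
print(predictor.total_tokens_used)
```

In this stub the per-call value is overwritten on every call while the total only grows, which is the distinction the two property names suggest.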

class llama_index.llm_predictor.StructuredLLMPredictor(llm: Optional[BaseLanguageModel] = None, retry_on_throttling: bool = True, cache: Optional[BaseCache] = None, callback_manager: Optional[CallbackManager] = None)

Structured LLM predictor class.

Parameters: llm_predictor (BaseLLMPredictor) -- LLM Predictor to use.

async apredict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Async predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]

get_llm_metadata() → LLMMetadata: Get LLM metadata.

property last_token_usage: int: Get the last token usage.

property llm: BaseLanguageModel: Get LLM.

predict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Predict the answer to a query.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of the predicted answer and the formatted prompt.
Return type: Tuple[str, str]

stream(prompt: Prompt, **prompt_args: Any) → Tuple[Generator, str]

Stream the answer to a query.

NOTE: This is a beta feature. We plan to build or adopt better abstractions for response handling.

Parameters: prompt (Prompt) -- Prompt to use for prediction.
Returns: Tuple of a generator that yields the predicted answer and the formatted prompt.
Return type: Tuple[Generator, str]

property total_tokens_used: int: Get the total tokens used so far.