Response Synthesizer

class llama_index.indices.query.response_synthesis.ResponseSynthesizer(response_builder: Optional[BaseResponseBuilder], response_mode: ResponseMode, response_kwargs: Optional[Dict] = None, optimizer: Optional[BaseTokenUsageOptimizer] = None, node_postprocessors: Optional[List[BaseNodePostprocessor]] = None, callback_manager: Optional[CallbackManager] = None, verbose: bool = False)

Response synthesizer class.

This class is responsible for synthesizing a response given a list of nodes. The way in which the response is synthesized depends on the response mode.

Parameters
  • response_builder (Optional[BaseResponseBuilder]) -- A response builder object.

  • response_mode (ResponseMode) -- A response mode.

  • response_kwargs (Optional[Dict]) -- A dictionary of response kwargs.

  • optimizer (Optional[BaseTokenUsageOptimizer]) -- A token usage optimizer.

  • node_postprocessors (Optional[List[BaseNodePostprocessor]]) -- A list of node postprocessors.

  • callback_manager (Optional[CallbackManager]) -- A callback manager.

  • verbose (bool) -- Whether to print debug statements.

classmethod from_args(service_context: Optional[ServiceContext] = None, streaming: bool = False, use_async: bool = False, text_qa_template: Optional[Prompt] = None, refine_template: Optional[Prompt] = None, simple_template: Optional[Prompt] = None, response_mode: ResponseMode = ResponseMode.COMPACT, response_kwargs: Optional[Dict] = None, node_postprocessors: Optional[List[BaseNodePostprocessor]] = None, callback_manager: Optional[CallbackManager] = None, optimizer: Optional[BaseTokenUsageOptimizer] = None, verbose: bool = False) → ResponseSynthesizer

Initialize response synthesizer from args.

Parameters
  • service_context (Optional[ServiceContext]) -- A service context.

  • streaming (bool) -- Whether to stream the response.

  • use_async (bool) -- Whether to use async.

  • text_qa_template (Optional[QuestionAnswerPrompt]) -- A text QA template.

  • refine_template (Optional[RefinePrompt]) -- A refine template.

  • simple_template (Optional[SimpleInputPrompt]) -- A simple template.

  • response_mode (ResponseMode) -- A response mode.

  • response_kwargs (Optional[Dict]) -- A dictionary of response kwargs.

  • node_postprocessors (Optional[List[BaseNodePostprocessor]]) -- A list of node postprocessors.

  • callback_manager (Optional[CallbackManager]) -- A callback manager.

  • optimizer (Optional[BaseTokenUsageOptimizer]) -- A token usage optimizer.

  • verbose (bool) -- Whether to print debug statements.

class llama_index.indices.response.type.ResponseMode(value)

Response modes of the response builder (and synthesizer).

ACCUMULATE = 'accumulate'

Synthesize a response for each text chunk, and then return the concatenation.

COMPACT = 'compact'

Compact-and-refine mode first combines text chunks into larger consolidated chunks that more fully utilize the available context window, then refines answers across them. This mode is faster than refine since it makes fewer calls to the LLM.
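The compaction step can be illustrated with a minimal, runnable sketch. This is not the library's implementation: `max_chars` is a stand-in for the real token budget, and the greedy packing below only shows the idea of merging chunks to reduce LLM calls.

```python
# Illustrative sketch (not llama_index's implementation): "compact" packs
# retrieved text chunks into fewer consolidated chunks that fit a
# hypothetical context-window budget, before any refine calls are made.
def compact_chunks(chunks, max_chars=100):
    """Greedily merge chunks so each consolidated chunk stays under max_chars."""
    consolidated = []
    current = ""
    for chunk in chunks:
        candidate = (current + "\n" + chunk) if current else chunk
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                consolidated.append(current)
            current = chunk
    if current:
        consolidated.append(current)
    return consolidated

chunks = ["alpha " * 5, "beta " * 5, "gamma " * 5]
packed = compact_chunks(chunks, max_chars=70)
# Fewer consolidated chunks means fewer LLM calls than plain refine.
```

Fewer consolidated chunks directly translate into fewer refine iterations, which is where the speedup over plain refine comes from.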

COMPACT_ACCUMULATE = 'compact_accumulate'

Compact-and-accumulate mode first combines text chunks into larger consolidated chunks that more fully utilize the available context window, then accumulates an answer for each of them and finally returns the concatenation. This mode is faster than accumulate since it makes fewer calls to the LLM.

GENERATION = 'generation'

Ignore the retrieved context; just use the LLM to generate a response.

NO_TEXT = 'no_text'

Return the retrieved context nodes, without synthesizing a final response.

REFINE = 'refine'

Refine is an iterative way of generating a response. We first use the context in the first node, along with the query, to generate an initial answer. We then pass this answer, the query, and the context of the second node as input into a “refine prompt” to generate a refined answer. We refine through N-1 nodes, where N is the total number of nodes.
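The iterative control flow described above can be sketched in plain Python. Here `fake_llm` is a stand-in for a real model call so the loop is runnable; the prompt strings only gesture at the real text QA and refine prompt templates.

```python
# Illustrative sketch of the refine loop: the first node seeds an initial
# answer, and each subsequent node refines it (N-1 refine steps for N nodes).
CALLS = {"count": 0}

def fake_llm(prompt):
    # Stand-in for a real LLM call; returns a deterministic placeholder.
    CALLS["count"] += 1
    return f"answer({len(prompt)} chars)"

def refine(query, node_texts):
    """Generate an initial answer from node 1, then refine with each later node."""
    answer = fake_llm(f"Context: {node_texts[0]}\nQuery: {query}")
    for text in node_texts[1:]:
        # Existing answer + new context + query feed the "refine prompt".
        answer = fake_llm(
            f"Existing answer: {answer}\nNew context: {text}\nQuery: {query}"
        )
    return answer

result = refine("what is X?", ["chunk one", "chunk two", "chunk three"])
# Three nodes -> one initial call plus two refine calls.
```

Note that the total is always N LLM calls for N nodes (one seed call plus N-1 refine calls), which is why compact mode packs chunks first.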

SIMPLE_SUMMARIZE = 'simple_summarize'

Merge all text chunks into one, and make an LLM call. This will fail if the merged text chunk exceeds the context window size.

TREE_SUMMARIZE = 'tree_summarize'

Build a tree index over the set of candidate nodes, with a summary prompt seeded with the query. The tree is built in a bottom-up fashion, and in the end the root node is returned as the response.
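The bottom-up construction can be sketched as follows. Again this is an illustration, not the library's code: `fake_summarize` stands in for the query-seeded summary prompt sent to an LLM, and `fanout` is a hypothetical branching factor.

```python
# Illustrative sketch of tree_summarize: summarize groups of chunks level by
# level, bottom-up, until a single root summary remains.
def fake_summarize(query, texts):
    # Stand-in for an LLM call with a query-seeded summary prompt.
    return f"summary[{'+'.join(texts)}]"

def tree_summarize(query, chunks, fanout=2):
    """Repeatedly summarize groups of `fanout` texts until one root remains."""
    level = list(chunks)
    while len(level) > 1:
        level = [
            fake_summarize(query, level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
    return level[0]

root = tree_summarize("topic?", ["a", "b", "c", "d"])
# Four leaves collapse into two summaries, then into a single root summary.
```

The root summary is what the synthesizer returns as the response in this mode.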