_Ref-Indices-StructStore:

Structured Store Index

Structured store indices.

class llama_index.indices.struct_store.GPTNLPandasQueryEngine(index: GPTPandasIndex, instruction_str: Optional[str] = None, output_processor: Optional[Callable] = None, pandas_prompt: Optional[Prompt] = None, output_kwargs: Optional[dict] = None, head: int = 5, verbose: bool = False, **kwargs: Any)

GPT Pandas query.

Convert natural language to Pandas python code.

Parameters
  • index (GPTPandasIndex) -- Index wrapping the Pandas dataframe to use.

  • instruction_str (Optional[str]) -- Instruction string to use.

  • output_processor (Optional[Callable[[str], str]]) -- Output processor. A callable that takes in the output string, pandas DataFrame, and any output kwargs and returns a string.

  • pandas_prompt (Optional[PandasPrompt]) -- Pandas prompt to use.

  • head (int) -- Number of rows to show in the table context.
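
To make the output_processor contract concrete, here is a minimal sketch, not the library's default processor. It assumes the callable receives the generated code string, the dataframe, and any output kwargs, as described above. The name `eval_output_processor` and the `max_chars` kwarg are hypothetical, and a plain list of row dicts stands in for the DataFrame so the snippet has no dependencies.

```python
from typing import Any


def eval_output_processor(output: str, df: Any, **output_kwargs: Any) -> str:
    """Evaluate a generated expression against `df` and return it as text.

    Hypothetical processor: `max_chars` is an illustrative kwarg, not a
    documented option.
    """
    max_chars = output_kwargs.get("max_chars", 200)
    # Expose only `df` and two harmless builtins to the expression.
    # Calling eval() on model output is unsafe outside of a sandbox.
    namespace = {"__builtins__": {}, "df": df, "len": len, "sum": sum}
    result = eval(output, namespace)
    return str(result)[:max_chars]


# Stand-in for a dataframe: a list of row dicts.
rows = [
    {"city": "Toronto", "population": 2930000},
    {"city": "Tokyo", "population": 13960000},
]
answer = eval_output_processor("sum(r['population'] for r in df)", rows)
print(answer)
```

With a real DataFrame the generated string would typically be a Pandas expression over `df`; the processor's job is only to turn that string into a printable result.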

class llama_index.indices.struct_store.GPTNLStructStoreQueryEngine(index: GPTSQLStructStoreIndex, text_to_sql_prompt: Optional[Prompt] = None, context_query_kwargs: Optional[dict] = None, synthesize_response: bool = True, response_synthesis_prompt: Optional[Prompt] = None, **kwargs: Any)

GPT natural language query engine over a structured database.

Given a natural language query, the engine first translates it to SQL and then runs the resulting raw SQL over a GPTSQLStructStoreIndex. No LLM calls are made during the SQL execution. NOTE: this query engine cannot work with composed indices - if the index contains subindices, those subindices will not be queried.

Parameters
  • index (GPTSQLStructStoreIndex) -- A GPT SQL Struct Store Index

  • text_to_sql_prompt (Optional[Prompt]) -- A Text to SQL Prompt to use for the query. Defaults to DEFAULT_TEXT_TO_SQL_PROMPT.

  • context_query_kwargs (Optional[dict]) -- Keyword arguments for the context query. Defaults to {}.

  • synthesize_response (bool) -- Whether to synthesize a response from the query results. Defaults to True.

  • response_synthesis_prompt (Optional[Prompt]) -- A Response Synthesis Prompt to use for the query. Defaults to DEFAULT_RESPONSE_SYNTHESIS_PROMPT.
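
The flow described above (natural language to SQL, raw SQL execution, optional response synthesis) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's implementation: `fake_text_to_sql` is a hypothetical stand-in for the LLM call driven by text_to_sql_prompt, `run_query` is a hypothetical helper, and `sqlite3` takes the place of the SQL database wrapper.

```python
import sqlite3


def fake_text_to_sql(question: str) -> str:
    """Hypothetical stand-in for the LLM text-to-SQL step."""
    if "largest" in question.lower():
        return (
            "SELECT city_name FROM city_stats "
            "ORDER BY population DESC LIMIT 1"
        )
    raise ValueError("question not covered by this toy translator")


def run_query(conn: sqlite3.Connection, question: str,
              synthesize_response: bool = True) -> str:
    sql = fake_text_to_sql(question)      # step 1: natural language -> SQL
    rows = conn.execute(sql).fetchall()   # step 2: raw SQL, no LLM involved
    if not synthesize_response:
        return str(rows)                  # hand back the raw result
    # Step 3: a real engine would ask the LLM to phrase the answer;
    # here a format string plays that role.
    return f"Answer to {question!r}: {rows[0][0]}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city_stats (city_name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO city_stats VALUES (?, ?)",
    [("Toronto", 2930000), ("Tokyo", 13960000), ("Berlin", 3645000)],
)

question = "Which city has the largest population?"
raw = run_query(conn, question, synthesize_response=False)
text = run_query(conn, question, synthesize_response=True)
print(raw)
print(text)
```

Setting synthesize_response to False is useful when the caller wants the rows themselves rather than a phrased answer.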

property service_context: ServiceContext

Get service context.

class llama_index.indices.struct_store.GPTPandasIndex(df: DataFrame, nodes: Optional[Sequence[Node]] = None, index_struct: Optional[PandasStructTable] = None, **kwargs: Any)

Base GPT Pandas Index.

The GPTPandasIndex is an index that stores a Pandas dataframe under the hood. Index "construction" is currently not supported.

During query time, the user can either specify a raw SQL query or a natural language query to retrieve their data.

Parameters

df (pd.DataFrame) -- Pandas dataframe to use. See Ref-Struct-Store for more details.

delete_nodes(doc_ids: List[str], delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a list of nodes from the index.

Parameters

doc_ids (List[str]) -- A list of doc_ids from the nodes to delete

delete_ref_doc(ref_doc_id: str, delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a document and its nodes by using ref_doc_id.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) -- List of documents to build the index from.

property index_id: str

Get the index id.

insert(document: Document, **insert_kwargs: Any) None

Insert a document.

property ref_doc_info: Dict[str, RefDocInfo]

Retrieve a dict mapping ingested documents to their nodes and metadata.

refresh(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

refresh_ref_docs(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.
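
The refresh behaviour can be pictured with a toy store. This is a sketch under assumptions, not the library's code: it assumes changes are detected by hashing text and extra_info, and that the returned list holds one flag per input document (True if it was inserted or updated). `ToyDoc`, `doc_hash`, and the dict-based store are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyDoc:
    """Hypothetical document: an id, text, and optional extra_info."""
    doc_id: str
    text: str
    extra_info: Optional[Dict[str, str]] = None


def doc_hash(doc: ToyDoc) -> str:
    # Hash both text and extra_info so a change in either is detected.
    payload = doc.text + repr(sorted((doc.extra_info or {}).items()))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refresh(store: Dict[str, str], documents: List[ToyDoc]) -> List[bool]:
    """Return one flag per document: True if it was inserted or updated."""
    refreshed = []
    for doc in documents:
        new_hash = doc_hash(doc)
        changed = store.get(doc.doc_id) != new_hash
        if changed:
            # Stand-in for the expensive insert/update (LLM, embeddings).
            store[doc.doc_id] = new_hash
        refreshed.append(changed)
    return refreshed


store: Dict[str, str] = {}
first = refresh(store, [ToyDoc("a", "alpha"), ToyDoc("b", "beta")])
second = refresh(
    store,
    [ToyDoc("a", "alpha"), ToyDoc("b", "beta v2"), ToyDoc("c", "gamma")],
)
print(first, second)
```

Unchanged documents are skipped on the second call, which is where the savings in LLM and embedding calls come from.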

set_index_id(index_id: str) None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) -- Index id to set.

update(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete

update_ref_doc(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete
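
The "delete, then insert again" equivalence can be sketched as follows. `ToyIndex` is a hypothetical in-memory class, not the library's index; how insert_kwargs and delete_kwargs are unpacked from update_kwargs is an assumption based on the parameter list above, and `chunk_size` is an illustrative insert option.

```python
from typing import Any, Dict, List


class ToyIndex:
    """Hypothetical in-memory index: ref_doc_id -> list of node texts."""

    def __init__(self) -> None:
        self.nodes: Dict[str, List[str]] = {}

    def insert(self, doc_id: str, text: str, chunk_size: int = 10) -> None:
        # Split the document into fixed-size "nodes".
        self.nodes[doc_id] = [
            text[i:i + chunk_size] for i in range(0, len(text), chunk_size)
        ]

    def delete_ref_doc(self, doc_id: str, **delete_kwargs: Any) -> None:
        # Remove the document together with all of its nodes.
        self.nodes.pop(doc_id, None)

    def update_ref_doc(self, doc_id: str, text: str,
                       **update_kwargs: Any) -> None:
        # Update is modelled as delete followed by insert.
        delete_kwargs = update_kwargs.pop("delete_kwargs", {})
        insert_kwargs = update_kwargs.pop("insert_kwargs", {})
        self.delete_ref_doc(doc_id, **delete_kwargs)
        self.insert(doc_id, text, **insert_kwargs)


index = ToyIndex()
index.insert("doc-1", "an older and much longer text")
index.update_ref_doc("doc-1", "new text", insert_kwargs={"chunk_size": 4})
print(index.nodes["doc-1"])
```

Because the old nodes are dropped first, no stale node from the previous version of the document survives the update.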

class llama_index.indices.struct_store.GPTSQLStructStoreIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[SQLStructTable] = None, service_context: Optional[ServiceContext] = None, sql_database: Optional[SQLDatabase] = None, table_name: Optional[str] = None, table: Optional[Table] = None, ref_doc_id_column: Optional[str] = None, sql_context_container: Optional[SQLContextContainer] = None, **kwargs: Any)

Base GPT SQL Struct Store Index.

The GPTSQLStructStoreIndex is an index that uses a SQL database under the hood. During index construction, the data can be inferred from unstructured documents given a schema extract prompt, or it can be pre-loaded in the database.

During query time, the user can either specify a raw SQL query or a natural language query to retrieve their data.

Parameters
  • documents (Optional[Sequence[DOCUMENTS_INPUT]]) -- Documents to index. NOTE: in the SQL index, this is an optional field.

  • sql_database (Optional[SQLDatabase]) -- SQL database to use, including table names to specify. See Ref-Struct-Store for more details.

  • table_name (Optional[str]) -- Name of the table to use for extracting data. Either table_name or table must be specified.

  • table (Optional[Table]) -- SQLAlchemy Table object to use. Specifying the Table object explicitly, instead of the table name, allows you to pass in a view. Either table_name or table must be specified.

  • sql_context_container (Optional[SQLContextContainer]) -- SQL context container. Can be generated from a SQLContextContainerBuilder. See Ref-Struct-Store for more details.
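
As background for the table parameter, the sketch below shows what a view is at the database level, using only `sqlite3`. It does not show how to reflect the view into a SQLAlchemy Table object or pass it to the index; the table and view names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats "
    "(city_name TEXT, population INTEGER, country TEXT)"
)
conn.executemany(
    "INSERT INTO city_stats VALUES (?, ?, ?)",
    [
        ("Toronto", 2930000, "Canada"),
        ("Tokyo", 13960000, "Japan"),
        ("Berlin", 3645000, "Germany"),
    ],
)

# A view is a stored SELECT that can be read like a table.
conn.execute(
    "CREATE VIEW large_cities AS "
    "SELECT city_name, population FROM city_stats "
    "WHERE population > 3000000"
)

# SQLite's catalog distinguishes the two kinds of object.
kinds = dict(
    conn.execute("SELECT name, type FROM sqlite_master ORDER BY name")
)
view_rows = conn.execute(
    "SELECT city_name FROM large_cities ORDER BY city_name"
).fetchall()
print(kinds)
print(view_rows)
```

A view like this exposes a narrower or pre-filtered schema, which is one reason to hand the index a Table object rather than a bare table name.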

delete_nodes(doc_ids: List[str], delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a list of nodes from the index.

Parameters

doc_ids (List[str]) -- A list of doc_ids from the nodes to delete

delete_ref_doc(ref_doc_id: str, delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a document and its nodes by using ref_doc_id.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) -- List of documents to build the index from.

property index_id: str

Get the index id.

insert(document: Document, **insert_kwargs: Any) None

Insert a document.

property ref_doc_info: Dict[str, RefDocInfo]

Retrieve a dict mapping ingested documents to their nodes and metadata.

refresh(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

refresh_ref_docs(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

set_index_id(index_id: str) None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) -- Index id to set.

update(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete

update_ref_doc(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete

class llama_index.indices.struct_store.GPTSQLStructStoreQueryEngine(index: GPTSQLStructStoreIndex, sql_context_container: Optional[SQLContextContainerBuilder] = None, **kwargs: Any)

GPT SQL query engine over a structured database.

Runs raw SQL over a GPTSQLStructStoreIndex. No LLM calls are made here. NOTE: this query engine cannot work with composed indices - if the index contains subindices, those subindices will not be queried.

class llama_index.indices.struct_store.SQLContextContainerBuilder(sql_database: SQLDatabase, context_dict: Optional[Dict[str, str]] = None, context_str: Optional[str] = None)

SQLContextContainerBuilder.

Build a SQLContextContainer that can be passed to the SQL index during index construction or during query-time.

NOTE: if context_str is specified, it is used as the context instead of context_dict.

Parameters
  • sql_database (SQLDatabase) -- SQL database

  • context_dict (Optional[Dict[str, str]]) -- context dict
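
The precedence rule in the note above can be sketched as a small function. Only the rule "context_str wins over context_dict" comes from this reference; the helper `build_context`, the `table_schemas` argument, and the way schema text and per-table notes are joined are assumptions made for illustration.

```python
from typing import Dict, Optional


def build_context(
    table_schemas: Dict[str, str],
    context_dict: Optional[Dict[str, str]] = None,
    context_str: Optional[str] = None,
) -> str:
    """Hypothetical helper showing context_str / context_dict precedence."""
    if context_str is not None:
        # An explicit context string replaces everything else.
        return context_str
    context_dict = context_dict or {}
    parts = []
    for table, schema in table_schemas.items():
        # Append the per-table note, if any, to the schema description.
        note = context_dict.get(table)
        parts.append(f"{schema} {note}" if note else schema)
    return "\n".join(parts)


schemas = {
    "city_stats": "Schema of city_stats: city_name (TEXT), population (INTEGER).",
}
notes = {"city_stats": "Population counts are for the city proper."}

from_dict = build_context(schemas, context_dict=notes)
from_str = build_context(
    schemas, context_dict=notes, context_str="Use only city_stats."
)
print(from_dict)
print(from_str)
```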

build_context_container(ignore_db_schema: bool = False) SQLContextContainer

Build the SQL context container.

derive_index_from_context(index_cls: Type[BaseGPTIndex], ignore_db_schema: bool = False, **index_kwargs: Any) BaseGPTIndex

Derive index from context.

classmethod from_documents(documents_dict: Dict[str, List[BaseDocument]], sql_database: SQLDatabase, **context_builder_kwargs: Any) SQLContextContainerBuilder

Build context from documents.

query_index_for_context(index: BaseGPTIndex, query_str: Union[str, QueryBundle], query_tmpl: Optional[str] = 'Please return the relevant tables (including the full schema) for the following query: {orig_query_str}', store_context_str: bool = True, **index_kwargs: Any) str

Query index for context.

A simple wrapper around the index.query call which injects a query template to specifically fetch table information, and can store a context_str.

Parameters
  • index (BaseGPTIndex) -- index data structure

  • query_str (QueryType) -- query string

  • query_tmpl (Optional[str]) -- query template

  • store_context_str (bool) -- store context_str
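
A minimal sketch of the wrapper logic, assuming the template is filled with `str.format` (the `{orig_query_str}` placeholder in the default suggests this). `ToyContextBuilder` and `fake_index_query` are hypothetical stand-ins for the builder and for the index query call, and the handling of a `None` template is an assumption.

```python
from typing import Callable, List, Optional

DEFAULT_TMPL = (
    "Please return the relevant tables (including the full schema) "
    "for the following query: {orig_query_str}"
)


class ToyContextBuilder:
    """Hypothetical stand-in for the context container builder."""

    def __init__(self) -> None:
        self.context_str: Optional[str] = None

    def query_index_for_context(
        self,
        query_fn: Callable[[str], str],
        query_str: str,
        query_tmpl: Optional[str] = DEFAULT_TMPL,
        store_context_str: bool = True,
    ) -> str:
        # Inject the user's question into the table-lookup template.
        if query_tmpl is not None:
            full_query = query_tmpl.format(orig_query_str=query_str)
        else:
            full_query = query_str
        context = query_fn(full_query)
        if store_context_str:
            # Keep the answer so it can serve as context_str later.
            self.context_str = context
        return context


seen: List[str] = []


def fake_index_query(query: str) -> str:
    """Hypothetical index query: records the prompt, returns a schema."""
    seen.append(query)
    return "city_stats(city_name, population)"


builder = ToyContextBuilder()
context = builder.query_index_for_context(
    fake_index_query, "Which city is largest?"
)
print(seen[0])
print(builder.context_str)
```

Storing the returned string means a later call to build the context container can reuse it without querying the index again.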