Table Index

Building the Keyword Table Index

Keyword Table Index Data Structures.

class llama_index.indices.keyword_table.GPTKeywordTableIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[KeywordTable] = None, service_context: Optional[ServiceContext] = None, keyword_extract_template: Optional[KeywordExtractPrompt] = None, max_keywords_per_chunk: int = 10, use_async: bool = False, **kwargs: Any)

GPT Keyword Table Index.

This index uses a GPT model to extract keywords from the text.
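As a rough illustration of the data structure behind a keyword table index, the sketch below maps each keyword to the ids of the text chunks that contain it. It is not the library's implementation: `build_keyword_table` and `whitespace_extractor` are hypothetical names, and the extractor is a trivial stand-in for the LLM call that the real index would make.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set


def build_keyword_table(
    chunks: Dict[str, str],
    extract_keywords: Callable[[str], Iterable[str]],
    max_keywords_per_chunk: int = 10,
) -> Dict[str, Set[str]]:
    """Map each keyword to the ids of the chunks it was extracted from."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for chunk_id, text in chunks.items():
        # Keep at most max_keywords_per_chunk keywords for each chunk.
        keywords = list(extract_keywords(text))[:max_keywords_per_chunk]
        for keyword in keywords:
            table[keyword.lower()].add(chunk_id)
    return dict(table)


def whitespace_extractor(text: str) -> List[str]:
    """Hypothetical stand-in for the LLM: unique words longer than three letters."""
    seen: List[str] = []
    for word in text.lower().split():
        word = word.strip(".,;:!?")
        if len(word) > 3 and word not in seen:
            seen.append(word)
    return seen
```

Any callable that turns text into keywords can be passed as `extract_keywords`, which is the only part that differs between the GPT, RAKE and simple variants in this sketch.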

delete(doc_id: str, **delete_kwargs: Any) -> None

Delete a document from the index.

All nodes in the index related to the document will be deleted.

Parameters

doc_id (str) -- document id

property docstore: BaseDocumentStore

Get the docstore corresponding to the index.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) -> IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) -- List of documents to build the index from.

property index_id: str

Get the index id.

property index_struct: IS

Get the index struct.

index_struct_cls

alias of KeywordTable

insert(document: Document, **insert_kwargs: Any) -> None

Insert a document.

refresh(documents: Sequence[Document], **update_kwargs: Any) -> List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.
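The refresh behavior described above can be sketched with a content hash per document. This is a simplified model under stated assumptions, not the library's code: `doc_hash` and `refresh_hashes` are hypothetical helpers, documents are plain `(doc_id, text, extra_info)` tuples, and the returned booleans are assumed to mark which inputs were inserted or updated.

```python
import hashlib
import json
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical document shape: (doc_id, text, extra_info)
Doc = Tuple[str, str, Optional[dict]]


def doc_hash(text: str, extra_info: Optional[dict]) -> str:
    """Hash the fields whose changes should trigger a refresh."""
    payload = json.dumps({"text": text, "extra_info": extra_info}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refresh_hashes(stored_hashes: Dict[str, str], documents: Sequence[Doc]) -> List[bool]:
    """Return one flag per document: True if it was new or had changed."""
    refreshed: List[bool] = []
    for doc_id, text, extra_info in documents:
        new_hash = doc_hash(text, extra_info)
        changed = stored_hashes.get(doc_id) != new_hash
        if changed:
            # A real index would re-extract keywords here (insert or update).
            stored_hashes[doc_id] = new_hash
        refreshed.append(changed)
    return refreshed
```

Unchanged documents are skipped entirely, which is where the saved model calls come from.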

set_index_id(index_id: str) -> None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) -- Index id to set.

update(document: Document, **update_kwargs: Any) -> None

Update a document.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete
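The delete-then-insert equivalence can be shown on a toy keyword table. `TinyKeywordTable` is a hypothetical class written for illustration only; it takes precomputed keywords instead of documents and does not mirror the library's internals.

```python
from typing import Dict, List, Set


class TinyKeywordTable:
    """Toy keyword -> document-id table (illustrative only)."""

    def __init__(self) -> None:
        self.table: Dict[str, Set[str]] = {}

    def insert(self, doc_id: str, keywords: List[str]) -> None:
        for keyword in keywords:
            self.table.setdefault(keyword, set()).add(doc_id)

    def delete(self, doc_id: str) -> None:
        # Iterate over a copy of the keys because empty entries are removed.
        for keyword in list(self.table):
            self.table[keyword].discard(doc_id)
            if not self.table[keyword]:
                del self.table[keyword]

    def update(self, doc_id: str, keywords: List[str]) -> None:
        # Equivalent to deleting the document and inserting it again.
        self.delete(doc_id)
        self.insert(doc_id, keywords)
```

Deleting first ensures that keywords which no longer appear in the document stop pointing to it.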

class llama_index.indices.keyword_table.GPTRAKEKeywordTableIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[KeywordTable] = None, service_context: Optional[ServiceContext] = None, keyword_extract_template: Optional[KeywordExtractPrompt] = None, max_keywords_per_chunk: int = 10, use_async: bool = False, **kwargs: Any)

GPT RAKE Keyword Table Index.

This index uses a RAKE keyword extractor to extract keywords from the text.
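For readers unfamiliar with RAKE (Rapid Automatic Keyword Extraction), the sketch below is a simplified version of the idea: split the text into candidate phrases at stopwords and punctuation, score each word by degree divided by frequency, and score each phrase as the sum of its word scores. The function names and the tiny stopword list are assumptions made for illustration; this is not the extractor the library calls.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Deliberately tiny stopword list, for illustration only.
STOPWORDS = frozenset(
    {"a", "an", "and", "are", "for", "in", "is", "it", "of", "on", "or", "the", "to", "with"}
)


def candidate_phrases(text: str) -> List[List[str]]:
    """Split text into runs of non-stopwords; punctuation also ends a phrase."""
    phrases: List[List[str]] = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current: List[str] = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_keywords(text: str, max_keywords: int = 10) -> List[str]:
    """Return the highest-scoring candidate phrases, best first."""
    phrases = candidate_phrases(text)
    frequency: Dict[str, int] = defaultdict(int)
    degree: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            frequency[word] += 1
            # Degree counts co-occurring words in the phrase, including the word itself.
            degree[word] += len(phrase)
    scores: Dict[str, float] = {}
    for phrase in phrases:
        scores[" ".join(phrase)] = sum(degree[w] / frequency[w] for w in phrase)
    ranked = sorted(scores, key=lambda p: scores[p], reverse=True)
    return ranked[:max_keywords]
```

Because long phrases accumulate higher degree, this scoring tends to favor multi-word keywords over single words.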

delete(doc_id: str, **delete_kwargs: Any) -> None

Delete a document from the index.

All nodes in the index related to the document will be deleted.

Parameters

doc_id (str) -- document id

property docstore: BaseDocumentStore

Get the docstore corresponding to the index.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) -> IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) -- List of documents to build the index from.

property index_id: str

Get the index id.

property index_struct: IS

Get the index struct.

index_struct_cls

alias of KeywordTable

insert(document: Document, **insert_kwargs: Any) -> None

Insert a document.

refresh(documents: Sequence[Document], **update_kwargs: Any) -> List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

set_index_id(index_id: str) -> None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) -- Index id to set.

update(document: Document, **update_kwargs: Any) -> None

Update a document.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete

class llama_index.indices.keyword_table.GPTSimpleKeywordTableIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[KeywordTable] = None, service_context: Optional[ServiceContext] = None, keyword_extract_template: Optional[KeywordExtractPrompt] = None, max_keywords_per_chunk: int = 10, use_async: bool = False, **kwargs: Any)

GPT Simple Keyword Table Index.

This index uses a simple regex extractor to extract keywords from the text.
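A regex-based extractor of this kind can be approximated in a few lines: tokenize with a regular expression, drop stopwords, and keep the most frequent remaining words. The function name, the pattern and the stopword list below are assumptions chosen for illustration, not the library's own.

```python
import re
from collections import Counter
from typing import List

# Deliberately tiny stopword list, for illustration only.
STOPWORDS = frozenset(
    {"a", "an", "and", "are", "for", "in", "is", "it", "of", "on", "or", "the", "to", "with"}
)


def simple_extract_keywords(text: str, max_keywords: int = 10) -> List[str]:
    """Return up to max_keywords of the most frequent non-stopword tokens."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(token for token in tokens if token not in STOPWORDS)
    return [word for word, _ in counts.most_common(max_keywords)]
```

No model call is involved, so this approach trades keyword quality for speed and cost.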

delete(doc_id: str, **delete_kwargs: Any) -> None

Delete a document from the index.

All nodes in the index related to the document will be deleted.

Parameters

doc_id (str) -- document id

property docstore: BaseDocumentStore

Get the docstore corresponding to the index.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) -> IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) -- List of documents to build the index from.

property index_id: str

Get the index id.

property index_struct: IS

Get the index struct.

index_struct_cls

alias of KeywordTable

insert(document: Document, **insert_kwargs: Any) -> None

Insert a document.

refresh(documents: Sequence[Document], **update_kwargs: Any) -> List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

set_index_id(index_id: str) -> None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) -- Index id to set.

update(document: Document, **update_kwargs: Any) -> None

Update a document.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) -- document to update

  • insert_kwargs (Dict) -- kwargs to pass to insert

  • delete_kwargs (Dict) -- kwargs to pass to delete

class llama_index.indices.keyword_table.KeywordTableGPTRetriever(index: BaseGPTKeywordTableIndex, keyword_extract_template: Optional[KeywordExtractPrompt] = None, query_keyword_extract_template: Optional[QueryKeywordExtractPrompt] = None, max_keywords_per_query: int = 10, num_chunks_per_query: int = 10, **kwargs: Any)

Keyword Table Index GPT Retriever.

Extracts keywords using GPT. Set when using retriever_mode="default".

See BaseGPTKeywordTableQuery for arguments.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) -> List[NodeWithScore]

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) -- Either a query string or a QueryBundle object.
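Conceptually, keyword-table retrieval looks up each query keyword in the table and ranks chunks by how many keywords they match. The sketch below assumes that ranking rule and uses a hypothetical `retrieve_chunk_ids` helper over a plain dict; the parameter names echo the constructor arguments above, but the code is not the library's retriever.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set


def retrieve_chunk_ids(
    table: Dict[str, Set[str]],
    query_keywords: Sequence[str],
    max_keywords_per_query: int = 10,
    num_chunks_per_query: int = 10,
) -> List[str]:
    """Rank chunk ids by the number of query keywords they match."""
    hits: Counter = Counter()
    for keyword in query_keywords[:max_keywords_per_query]:
        for chunk_id in table.get(keyword, ()):
            hits[chunk_id] += 1
    # Most matches first; ties are broken by chunk id for determinism.
    ranked = sorted(hits, key=lambda chunk_id: (-hits[chunk_id], chunk_id))
    return ranked[:num_chunks_per_query]
```

The query keywords would come from whichever extractor matches the retriever mode (GPT, RAKE or simple).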

class llama_index.indices.keyword_table.KeywordTableRAKERetriever(index: BaseGPTKeywordTableIndex, keyword_extract_template: Optional[KeywordExtractPrompt] = None, query_keyword_extract_template: Optional[QueryKeywordExtractPrompt] = None, max_keywords_per_query: int = 10, num_chunks_per_query: int = 10, **kwargs: Any)

Keyword Table Index RAKE Retriever.

Extracts keywords using RAKE keyword extractor. Set when retriever_mode="rake".

See BaseGPTKeywordTableQuery for arguments.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) -> List[NodeWithScore]

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) -- Either a query string or a QueryBundle object.

class llama_index.indices.keyword_table.KeywordTableSimpleRetriever(index: BaseGPTKeywordTableIndex, keyword_extract_template: Optional[KeywordExtractPrompt] = None, query_keyword_extract_template: Optional[QueryKeywordExtractPrompt] = None, max_keywords_per_query: int = 10, num_chunks_per_query: int = 10, **kwargs: Any)

Keyword Table Index Simple Retriever.

Extracts keywords using simple regex-based keyword extractor. Set when retriever_mode="simple".

See BaseGPTKeywordTableQuery for arguments.

retrieve(str_or_query_bundle: Union[str, QueryBundle]) -> List[NodeWithScore]

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) -- Either a query string or a QueryBundle object.