Vector Store
Vector stores.
- class llama_index.vector_stores.ChatGPTRetrievalPluginClient(endpoint_url: str, bearer_token: Optional[str] = None, retries: Optional[Retry] = None, batch_size: int = 100, **kwargs: Any)
ChatGPT Retrieval Plugin Client.
In this client, we make use of the endpoints defined by ChatGPT.
- Parameters
endpoint_url (str) -- URL of the ChatGPT Retrieval Plugin.
bearer_token (Optional[str]) -- Bearer token for the ChatGPT Retrieval Plugin.
retries (Optional[Retry]) -- Retry object for the ChatGPT Retrieval Plugin.
batch_size (int) -- Batch size for the ChatGPT Retrieval Plugin.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding_results to index.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Get nodes for response.
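The batch_size parameter suggests that nodes are sent to the plugin in groups rather than one request per node. The sketch below shows that kind of batching in plain Python. It is an illustration only: iter_batches is a hypothetical helper and not part of llama_index, and the client's actual request logic may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 250 stand-in node ids are split into batches of 100, 100 and 50.
batch_sizes = [len(batch) for batch in iter_batches(list(range(250)), batch_size=100)]
```

Each yielded batch would correspond to one upload request, so a larger batch_size means fewer, larger requests.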
- class llama_index.vector_stores.ChromaVectorStore(chroma_collection: Any, **kwargs: Any)
Chroma vector store.
In this vector store, embeddings are stored within a ChromaDB collection.
During query time, the index uses ChromaDB to query for the top k most similar nodes.
- Parameters
chroma_collection (chromadb.api.models.Collection.Collection) -- ChromaDB collection instance
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Return client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.DeepLakeVectorStore(dataset_path: str = 'llama_index', token: Optional[str] = None, read_only: Optional[bool] = False, ingestion_batch_size: int = 1024, ingestion_num_workers: int = 4, overwrite: bool = False)
The DeepLake Vector Store.
In this vector store we store the text, its embedding, and a few pieces of its metadata in a deeplake dataset. This implementation allows the use of an existing deeplake dataset if it was created by this vector store. It also supports creating a new one if the dataset doesn't exist or if overwrite is set to True.
- Parameters
dataset_path (str, optional) -- Path to the deeplake dataset where data will be stored. Defaults to "llama_index".
overwrite (bool, optional) -- Whether to overwrite an existing dataset with the same name. Defaults to False.
token (str, optional) -- The deeplake token that allows you to access the dataset with proper access. Defaults to None.
read_only (bool, optional) -- Whether to open the dataset in read-only mode. Defaults to False.
ingestion_batch_size (int, optional) -- Controls batched data ingestion into the deeplake dataset. Defaults to 1024.
ingestion_num_workers (int, optional) -- Number of workers to use during data ingestion. Defaults to 4.
- Raises
ImportError -- Unable to import deeplake.
UserNotLoggedinException -- When user is not logged in with credentials or token.
TokenPermissionError -- When dataset does not exist or user doesn't have enough permissions to modify the dataset.
InvalidTokenException -- If the specified token is invalid.
- Returns
Vector store that supports add, delete, and query.
- Return type
DeepLakeVectorStore
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add the embeddings and their nodes into DeepLake.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- The embeddings and their data to insert.
- Raises
UserNotLoggedinException -- When user is not logged in with credentials or token.
TokenPermissionError -- When dataset does not exist or user doesn't have enough permissions to modify the dataset.
InvalidTokenException -- If the specified token is invalid.
- Returns
List of ids inserted.
- Return type
List[str]
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.FaissVectorStore(faiss_index: Any)
Faiss Vector Store.
Embeddings are stored within a Faiss index.
During query time, the index uses Faiss to query for the top k embeddings, and returns the corresponding indices.
- Parameters
faiss_index (faiss.Index) -- Faiss index instance
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
NOTE: in the Faiss vector store, we do not store text in Faiss.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Return the faiss index.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None
Save to file.
This method saves the vector store to disk.
- Parameters
persist_path (str) -- The save_path of the file.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
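Because the Faiss index holds only vectors and answers a query with positional indices, the caller has to keep its own mapping from position to node id (and to the text). The sketch below illustrates that split with a brute-force L2 search standing in for Faiss. PositionalIndex and position_to_node_id are hypothetical names used for illustration; none of this is the library's implementation.

```python
from typing import Dict, List, Tuple


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class PositionalIndex:
    """Toy stand-in for a flat vector index: stores vectors only, returns positions."""

    def __init__(self) -> None:
        self.vectors: List[List[float]] = []

    def add(self, vector: List[float]) -> int:
        self.vectors.append(vector)
        return len(self.vectors) - 1  # position of the newly added vector

    def search(self, query: List[float], k: int) -> List[Tuple[int, float]]:
        scored = [(pos, squared_l2(query, vec)) for pos, vec in enumerate(self.vectors)]
        scored.sort(key=lambda pair: pair[1])  # smallest distance first
        return scored[:k]


index = PositionalIndex()
position_to_node_id: Dict[int, str] = {}  # kept outside the index, like the text itself

for node_id, embedding in [("node-a", [0.0, 0.0]), ("node-b", [1.0, 1.0]), ("node-c", [5.0, 5.0])]:
    position_to_node_id[index.add(embedding)] = node_id

hits = index.search([0.9, 1.1], k=2)
nearest_ids = [position_to_node_id[pos] for pos, _ in hits]
```

The index never sees node ids or text; translating positions back to nodes is the job of the surrounding store.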
- class llama_index.vector_stores.LanceDBVectorStore(uri: str, table_name: str = 'vectors', nprobes: int = 20, refine_factor: Optional[int] = None, **kwargs: Any)
The LanceDB Vector Store.
Stores text and embeddings in LanceDB. The vector store will open an existing LanceDB dataset or create the dataset if it does not exist.
- Parameters
uri (str, required) -- Location where LanceDB will store its files.
table_name (str, optional) -- The table name where the embeddings will be stored. Defaults to "vectors".
nprobes (int, optional) -- The number of probes used. A higher number makes search more accurate but also slower. Defaults to 20.
refine_factor (int, optional) -- Refine the results by reading extra elements and re-ranking them in memory. Defaults to None.
- Raises
ImportError -- Unable to import lancedb.
- Returns
VectorStore that supports creating LanceDB datasets and querying them.
- Return type
LanceDBVectorStore
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to vector store.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
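The refine_factor description (read extra elements, then re-rank them in memory) can be illustrated with a small sketch. Here an approximate search over rounded vectors stands in for a compressed index, and the exact distance is used for re-ranking. This is an assumption-laden illustration of the general idea, not LanceDB's implementation; the search function and its data are hypothetical.

```python
from typing import Dict, List, Optional


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(
    vectors: Dict[str, List[float]],
    query: List[float],
    k: int,
    refine_factor: Optional[int] = None,
) -> List[str]:
    # Coarse copy of the data: rounding stands in for a lossy, compressed index.
    coarse = {key: [float(round(x)) for x in vec] for key, vec in vectors.items()}
    # Read k * refine_factor candidates when refining, otherwise just k.
    n_candidates = k * refine_factor if refine_factor else k
    candidates = sorted(coarse, key=lambda key: sq_dist(coarse[key], query))[:n_candidates]
    if refine_factor:
        # Re-rank the extra candidates in memory using the exact vectors.
        candidates.sort(key=lambda key: sq_dist(vectors[key], query))
    return candidates[:k]


data = {"a": [0.6, 0.6], "b": [0.4, 0.9], "c": [3.0, 3.0]}
approximate_only = search(data, [0.0, 0.0], k=1)
refined = search(data, [0.0, 0.0], k=1, refine_factor=2)
```

In this toy data the coarse ranking prefers "b", while the exact distances favor "a"; reading one extra candidate and re-ranking recovers the exact nearest neighbor at the cost of more work per query.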
- class llama_index.vector_stores.MetalVectorStore(api_key: str, client_id: str, index_id: str)
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Return Metal client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query vector store.
- class llama_index.vector_stores.MilvusVectorStore(collection_name: str = 'llamalection', index_params: Optional[dict] = None, search_params: Optional[dict] = None, dim: Optional[int] = None, host: str = 'localhost', port: int = 19530, user: str = '', password: str = '', use_secure: bool = False, overwrite: bool = False, **kwargs: Any)
The Milvus Vector Store.
In this vector store we store the text, its embedding, and a few pieces of its metadata in a Milvus collection. This implementation allows the use of an existing collection if it was created by this vector store. It also supports creating a new one if the collection doesn't exist or if overwrite is set to True.
- Parameters
collection_name (str, optional) -- The name of the collection where data will be stored. Defaults to "llamalection".
index_params (dict, optional) -- The index parameters for Milvus, if none are provided an HNSW index will be used. Defaults to None.
search_params (dict, optional) -- The search parameters for a Milvus query. If none are provided, default params will be generated. Defaults to None.
dim (int, optional) -- The dimension of the embeddings. If it is not provided, collection creation will be done on first insert. Defaults to None.
host (str, optional) -- The host address of Milvus. Defaults to "localhost".
port (int, optional) -- The port of Milvus. Defaults to 19530.
user (str, optional) -- The username for RBAC. Defaults to "".
password (str, optional) -- The password for RBAC. Defaults to "".
use_secure (bool, optional) -- Use https. Required for Zilliz Cloud. Defaults to False.
overwrite (bool, optional) -- Whether to overwrite existing collection with same name. Defaults to False.
- Raises
ImportError -- Unable to import pymilvus.
MilvusException -- Error communicating with Milvus; more details can be found in the logs at debug level.
- Returns
Vector store that supports add, delete, and query.
- Return type
MilvusVectorStore
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add the embeddings and their nodes into Milvus.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- The embeddings and their data to insert.
- Raises
MilvusException -- Failed to insert data.
- Returns
List of ids inserted.
- Return type
List[str]
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- Raises
MilvusException -- Failed to delete the doc.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
doc_ids (Optional[List[str]]) -- list of doc_ids to filter by
- class llama_index.vector_stores.MyScaleVectorStore(myscale_client: Optional[Any] = None, table: str = 'llama_index', database: str = 'default', index_type: str = 'IVFFLAT', metric: str = 'cosine', batch_size: int = 32, index_params: Optional[dict] = None, search_params: Optional[dict] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any)
MyScale Vector Store.
In this vector store, embeddings and docs are stored within an existing MyScale cluster.
During query time, the index uses MyScale to query for the top k most similar nodes.
- Parameters
myscale_client (httpclient) -- clickhouse-connect httpclient of an existing MyScale cluster.
table (str, optional) -- The name of the MyScale table where data will be stored. Defaults to "llama_index".
database (str, optional) -- The name of the MyScale database where data will be stored. Defaults to "default".
index_type (str, optional) -- The type of the MyScale vector index. Defaults to "IVFFLAT".
metric (str, optional) -- The metric type of the MyScale vector index. Defaults to "cosine".
batch_size (int, optional) -- the size of documents to insert. Defaults to 32.
index_params (dict, optional) -- The index parameters for MyScale. Defaults to None.
search_params (dict, optional) -- The search parameters for a MyScale query. Defaults to None.
service_context (ServiceContext, optional) -- Vector store service context. Defaults to None
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- drop() None
Drop the MyScale index and table.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) -- query
- class llama_index.vector_stores.OpensearchVectorClient(endpoint: str, index: str, dim: int, embedding_field: str = 'embedding', text_field: str = 'content', extra_info_field: str = 'extra_info', method: Optional[dict] = None, auth: Optional[dict] = None)
Object encapsulating an Opensearch index that has vector search enabled.
If the index does not yet exist, it is created during init. Therefore, the underlying index is assumed to either: 1) not exist yet or 2) be created due to previous usage of this class.
- Parameters
endpoint (str) -- URL (http/https) of elasticsearch endpoint
index (str) -- Name of the elasticsearch index
dim (int) -- Dimension of the vector
embedding_field (str) -- Name of the field in the index to store embedding array in.
text_field (str) -- Name of the field to grab text from
method (Optional[dict]) -- Opensearch "method" JSON obj for configuring the KNN index. This includes engine, metric, and other config params. Defaults to: {"name": "hnsw", "space_type": "l2", "engine": "faiss", "parameters": {"ef_construction": 256, "m": 48}}
- delete_doc_id(doc_id: str) None
Delete a document.
- Parameters
doc_id (str) -- document id
- do_approx_knn(query_embedding: List[float], k: int) VectorStoreQueryResult
Do approximate knn.
- index_results(results: List[NodeWithEmbedding]) List[str]
Store results in the index.
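The default method value quoted above can be written as a Python dict, and the sketch below shows how it might be embedded in an index-creation body with a knn_vector field. The body layout follows the OpenSearch k-NN plugin's mapping conventions; whether this class sends exactly this body is an assumption, and build_index_body is a hypothetical helper.

```python
from typing import Any, Dict, Optional

# Default "method" configuration as documented for the method parameter.
DEFAULT_METHOD: Dict[str, Any] = {
    "name": "hnsw",
    "space_type": "l2",
    "engine": "faiss",
    "parameters": {"ef_construction": 256, "m": 48},
}


def build_index_body(
    dim: int,
    embedding_field: str = "embedding",
    method: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an index body with one k-NN vector field (hypothetical helper)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                embedding_field: {
                    "type": "knn_vector",
                    "dimension": dim,
                    "method": method if method is not None else DEFAULT_METHOD,
                }
            }
        },
    }


body = build_index_body(dim=1536)
```

Passing a custom method dict replaces the whole default, so engine, space type, and HNSW parameters should all be specified together.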
- class llama_index.vector_stores.OpensearchVectorStore(client: OpensearchVectorClient)
Elasticsearch/Opensearch vector store.
- Parameters
client (OpensearchVectorClient) -- Vector index client to use for data insertion/querying.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.PineconeVectorStore(pinecone_index: Optional[Any] = None, index_name: Optional[str] = None, environment: Optional[str] = None, namespace: Optional[str] = None, insert_kwargs: Optional[Dict] = None, add_sparse_vector: bool = False, tokenizer: Optional[Callable] = None, **kwargs: Any)
Pinecone Vector Store.
In this vector store, embeddings and docs are stored within a Pinecone index.
During query time, the index uses Pinecone to query for the top k most similar nodes.
- Parameters
pinecone_index (Optional[pinecone.Index]) -- Pinecone index instance
insert_kwargs (Optional[Dict]) -- insert kwargs during upsert call.
add_sparse_vector (bool) -- whether to add sparse vector to index.
tokenizer (Optional[Callable]) -- tokenizer to use to generate sparse vectors.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Return Pinecone client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.QdrantVectorStore(collection_name: str, client: Optional[Any] = None, **kwargs: Any)
Qdrant Vector Store.
In this vector store, embeddings and docs are stored within a Qdrant collection.
During query time, the index uses Qdrant to query for the top k most similar nodes.
- Parameters
collection_name (str) -- name of the Qdrant collection
client (Optional[Any]) -- QdrantClient instance from qdrant-client package
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Return the Qdrant client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) -- query
- class llama_index.vector_stores.RedisVectorStore(index_name: str, index_prefix: str = 'llama_index', index_args: Optional[Dict[str, Any]] = None, redis_url: str = 'redis://localhost:6379', overwrite: bool = False, **kwargs: Any)
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to the index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- List of embedding results to add to the index.
- Returns
List of ids of the documents added to the index.
- Return type
List[str]
- Raises
ValueError -- If the index already exists and overwrite is False.
- property client: RedisType
Return the redis client instance
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- delete_index() None
Delete the index and all documents.
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None, in_background: bool = True) None
Persist the vector store to disk.
- Parameters
persist_path (str) -- Path to persist the vector store to. (doesn't apply)
in_background (bool, optional) -- Persist in background. Defaults to True.
fs (fsspec.AbstractFileSystem, optional) -- Filesystem to persist to. (doesn't apply)
- Raises
redis.exceptions.RedisError -- If there is an error persisting the index to disk.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query the index.
- Parameters
query (VectorStoreQuery) -- query object
- Returns
query result
- Return type
VectorStoreQueryResult
- Raises
ValueError -- If query.query_embedding is None.
redis.exceptions.RedisError -- If there is an error querying the index.
redis.exceptions.TimeoutError -- If there is a timeout querying the index.
- class llama_index.vector_stores.SimpleVectorStore(data: Optional[SimpleVectorStoreData] = None, fs: Optional[AbstractFileSystem] = None, **kwargs: Any)
Simple Vector Store.
In this vector store, embeddings are stored within a simple, in-memory dictionary.
- Parameters
data (Optional[SimpleVectorStoreData]) -- data containing the embeddings and doc_ids. See SimpleVectorStoreData for more details.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding_results to index.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- classmethod from_persist_dir(persist_dir: str = './storage', fs: Optional[AbstractFileSystem] = None) SimpleVectorStore
Load from persist dir.
- classmethod from_persist_path(persist_path: str, fs: Optional[AbstractFileSystem] = None) SimpleVectorStore
Create a SimpleVectorStore from a persist path.
- get(text_id: str) List[float]
Get embedding.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None
Persist the SimpleVectorStore to a directory.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Get nodes for response.
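To make the "simple, in-memory dictionary" idea concrete, the sketch below keeps embeddings in a dict keyed by text id and answers queries by cosine similarity. TinyVectorStore is a hypothetical stand-in written for illustration; its method signatures, field names, and the choice of cosine similarity are assumptions and do not reproduce SimpleVectorStore.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: text id -> embedding, text id -> source doc id."""

    def __init__(self) -> None:
        self.embedding_dict: Dict[str, List[float]] = {}
        self.text_id_to_ref_doc_id: Dict[str, str] = {}

    def add(self, text_id: str, ref_doc_id: str, embedding: List[float]) -> str:
        self.embedding_dict[text_id] = embedding
        self.text_id_to_ref_doc_id[text_id] = ref_doc_id
        return text_id

    def get(self, text_id: str) -> List[float]:
        return self.embedding_dict[text_id]

    def delete(self, ref_doc_id: str) -> None:
        # Remove every node that came from the given source document.
        doomed = [t for t, d in self.text_id_to_ref_doc_id.items() if d == ref_doc_id]
        for text_id in doomed:
            del self.embedding_dict[text_id]
            del self.text_id_to_ref_doc_id[text_id]

    def query(self, query_embedding: List[float], similarity_top_k: int = 2) -> List[str]:
        ranked = sorted(
            self.embedding_dict,
            key=lambda text_id: cosine(self.embedding_dict[text_id], query_embedding),
            reverse=True,  # most similar first
        )
        return ranked[:similarity_top_k]


store = TinyVectorStore()
store.add("n1", "doc-1", [1.0, 0.0])
store.add("n2", "doc-1", [0.0, 1.0])
store.add("n3", "doc-2", [0.7, 0.7])

top_ids = store.query([1.0, 0.1], similarity_top_k=2)
store.delete("doc-1")
after_delete = store.query([1.0, 0.1], similarity_top_k=2)
```

A linear scan like this is fine for small collections; the other stores on this page delegate the search to an external index or database instead.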
- class llama_index.vector_stores.SupabaseVectorStore(postgres_connection_string: str, collection_name: str, dimension: int = 1536, **kwargs: Any)
Supabase vector store.
In this vector store, embeddings are stored in a Postgres table using pgvector.
During query time, the index uses pgvector/Supabase to query for the top k most similar nodes.
- Parameters
postgres_connection_string (str) -- postgres connection string
collection_name (str) -- name of the collection to store the embeddings in
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) -- query object containing the query embedding
- class llama_index.vector_stores.WeaviateVectorStore(weaviate_client: Optional[Any] = None, class_prefix: Optional[str] = None, **kwargs: Any)
Weaviate vector store.
In this vector store, embeddings and docs are stored within a Weaviate collection.
During query time, the index uses Weaviate to query for the top k most similar nodes.
- Parameters
weaviate_client (weaviate.Client) -- WeaviateClient instance from weaviate-client package
class_prefix (Optional[str]) -- prefix for Weaviate classes
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
向量存储
Vector stores.
- class llama_index.vector_stores.ChatGPTRetrievalPluginClient(endpoint_url: str, bearer_token: Optional[str] = None, retries: Optional[Retry] = None, batch_size: int = 100, **kwargs: Any)
ChatGPT Retrieval Plugin Client.
In this client, we make use of the endpoints defined by ChatGPT.
- 参数
endpoint_url (str) -- URL of the ChatGPT Retrieval Plugin.
bearer_token (Optional[str]) -- Bearer token for the ChatGPT Retrieval Plugin.
retries (Optional[Retry]) -- Retry object for the ChatGPT Retrieval Plugin.
batch_size (int) -- Batch size for the ChatGPT Retrieval Plugin.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding_results to index.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Get nodes for response.
- class llama_index.vector_stores.ChromaVectorStore(chroma_collection: Any, **kwargs: Any)
Chroma vector store.
In this vector store, embeddings are stored within a ChromaDB collection.
During query time, the index uses ChromaDB to query for the top k most similar nodes.
- 参数
chroma_collection (chromadb.api.models.Collection.Collection) -- ChromaDB collection instance
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Return client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- 参数
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.DeepLakeVectorStore(dataset_path: str = 'llama_index', token: Optional[str] = None, read_only: Optional[bool] = False, ingestion_batch_size: int = 1024, ingestion_num_workers: int = 4, overwrite: bool = False)
The DeepLake Vector Store.
In this vector store we store the text, its embedding and a few pieces of its metadata in a deeplake dataset. This implemnetation allows the use of an already existing deeplake dataset if it is one that was created this vector store. It also supports creating a new one if the dataset doesnt exist or if overwrite is set to True.
- 参数
deeplake_path (str, optional) -- Path to the deeplake dataset, where data will be
"llama_index". (stored. Defaults to) --
overwrite (bool, optional) -- Whether to overwrite existing dataset with same name. Defaults to False.
token (str, optional) -- the deeplake token that allows you to access the dataset with proper access. Defaults to None.
read_only (bool, optional) -- Whether to open the dataset with read only mode.
ingestion_batch_size (bool, 1024) -- used for controlling batched data injestion to deeplake dataset. Defaults to 1024.
injestion_num_workers (int, 1) -- number of workers to use during data injestion. Defaults to 4.
overwrite -- Whether to overwrite existing dataset with the new dataset with the same name.
- 抛出
ImportError -- Unable to import deeplake.
UserNotLoggedinException -- When user is not logged in with credentials or token.
TokenPermissionError -- When dataset does not exist or user doesn't have enough permissions to modify the dataset.
InvalidTokenException -- If the specified token is invalid
- 返回
Vectorstore that supports add, delete, and query.
- 返回类型
DeepLakeVectorstore
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add the embeddings and their nodes into DeepLake.
- 参数
embedding_results (List[NodeWithEmbedding]) -- The embeddings and their data to insert.
- 抛出
UserNotLoggedinException -- When user is not logged in with credentials or token.
TokenPermissionError -- When dataset does not exist or user doesn't have enough permissions to modify the dataset.
InvalidTokenException -- If the specified token is invalid
- 返回
List of ids inserted.
- 返回类型
List[str]
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- 参数
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.FaissVectorStore(faiss_index: Any)
Faiss Vector Store.
Embeddings are stored within a Faiss index.
During query time, the index uses Faiss to query for the top k embeddings, and returns the corresponding indices.
- 参数
faiss_index (faiss.Index) -- Faiss index instance
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
NOTE: in the Faiss vector store, we do not store text in Faiss.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Return the faiss index.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None
Save to file.
This method saves the vector store to disk.
- 参数
persist_path (str) -- The save_path of the file.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- 参数
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.LanceDBVectorStore(uri: str, table_name: str = 'vectors', nprobes: int = 20, refine_factor: Optional[int] = None, **kwargs: Any)
The LanceDB Vector Store.
- Stores text and embeddings in LanceDB. The vector store will open an existing
LanceDB dataset or create the dataset if it does not exist.
- 参数
uri (str, required) -- Location where LanceDB will store its files.
table_name (str, optional) -- The table name where the embeddings will be stored. Defaults to "vectors".
nprobes (int, optional) -- The number of probes used. A higher number makes search more accurate but also slower. Defaults to 20.
refine_factor -- (int, optional): Refine the results by reading extra elements and re-ranking them in memory. Defaults to None
- 抛出
ImportError -- Unable to import lancedb.
- 返回
- VectorStore that supports creating LanceDB datasets and
querying it.
- 返回类型
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to vector store.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- class llama_index.vector_stores.MetalVectorStore(api_key: str, client_id: str, index_id: str)
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeEmbeddingResult]: list of embedding results
- property client: Any
Return Metal client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query vector store.
- class llama_index.vector_stores.MilvusVectorStore(collection_name: str = 'llamalection', index_params: Optional[dict] = None, search_params: Optional[dict] = None, dim: Optional[int] = None, host: str = 'localhost', port: int = 19530, user: str = '', password: str = '', use_secure: bool = False, overwrite: bool = False, **kwargs: Any)
The Milvus Vector Store.
In this vector store we store the text, its embedding and a few pieces of its metadata in a Milvus collection. This implemnetation allows the use of an already existing collection if it is one that was created this vector store. It also supports creating a new one if the collection doesnt exist or if overwrite is set to True.
- 参数
collection_name (str, optional) -- The name of the collection where data will be stored. Defaults to "llamalection".
index_params (dict, optional) -- The index parameters for Milvus, if none are provided an HNSW index will be used. Defaults to None.
search_params (dict, optional) -- The search parameters for a Milvus query. If none are provided, default params will be generated. Defaults to None.
dim (int, optional) -- The dimension of the embeddings. If it is not provided, collection creation will be done on first insert. Defaults to None.
host (str, optional) -- The host address of Milvus. Defaults to "localhost".
port (int, optional) -- The port of Milvus. Defaults to 19530.
user (str, optional) -- The username for RBAC. Defaults to "".
password (str, optional) -- The password for RBAC. Defaults to "".
use_secure (bool, optional) -- Use https. Required for Zilliz Cloud. Defaults to False.
overwrite (bool, optional) -- Whether to overwrite existing collection with same name. Defaults to False.
- 抛出
ImportError -- Unable to import pymilvus.
MilvusException -- Error communicating with Milvus, more can be found in logging under Debug.
- 返回
Vectorstore that supports add, delete, and query.
- 返回类型
MilvusVectorstore
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add the embeddings and their nodes into Milvus.
- 参数
embedding_results (List[NodeWithEmbedding]) -- The embeddings and their data to insert.
- 抛出
MilvusException -- Failed to insert data.
- 返回
List of ids inserted.
- 返回类型
List[str]
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using with ref_doc_id.
- 参数
ref_doc_id (str) -- The doc_id of the document to delete.
- 抛出
MilvusException -- Failed to delete the doc.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- 参数
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
doc_ids (Optional[List[str]]) -- list of doc_ids to filter by
- class llama_index.vector_stores.MyScaleVectorStore(myscale_client: Optional[Any] = None, table: str = 'llama_index', database: str = 'default', index_type: str = 'IVFFLAT', metric: str = 'cosine', batch_size: int = 32, index_params: Optional[dict] = None, search_params: Optional[dict] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any)
MyScale Vector Store.
In this vector store, embeddings and docs are stored within an existing MyScale cluster.
During query time, the index uses MyScale to query for the top k most similar nodes.
- Parameters
myscale_client (httpclient) -- clickhouse-connect httpclient of an existing MyScale cluster.
table (str, optional) -- The name of the MyScale table where data will be stored. Defaults to "llama_index".
database (str, optional) -- The name of the MyScale database where data will be stored. Defaults to "default".
index_type (str, optional) -- The type of the MyScale vector index. Defaults to "IVFFLAT".
metric (str, optional) -- The metric type of the MyScale vector index. Defaults to "cosine".
batch_size (int, optional) -- Number of documents to insert per batch. Defaults to 32.
index_params (dict, optional) -- The index parameters for MyScale. Defaults to None.
search_params (dict, optional) -- The search parameters for a MyScale query. Defaults to None.
service_context (ServiceContext, optional) -- Vector store service context. Defaults to None
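To make the effect of batch_size concrete, here is a small sketch of splitting a list of rows into insert batches. It is an illustration only: `iter_batches` is a hypothetical helper, and how MyScaleVectorStore batches rows internally is an assumption, not something this reference states.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start : start + batch_size])


# 70 hypothetical rows with the default batch size give two full batches
# and one partial batch.
rows = [f"node-{i}" for i in range(70)]
batch_sizes = [len(batch) for batch in iter_batches(rows, batch_size=32)]
```

A larger batch_size means fewer round trips to the cluster at the cost of larger individual requests.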
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- drop() None
Drop the MyScale index and table.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) -- query
- class llama_index.vector_stores.OpensearchVectorClient(endpoint: str, index: str, dim: int, embedding_field: str = 'embedding', text_field: str = 'content', extra_info_field: str = 'extra_info', method: Optional[dict] = None, auth: Optional[dict] = None)
Object encapsulating an Opensearch index that has vector search enabled.
If the index does not yet exist, it is created during init. The underlying index is therefore assumed either 1) not to exist yet or 2) to have been created by a previous use of this class.
- Parameters
endpoint (str) -- URL (http/https) of elasticsearch endpoint
index (str) -- Name of the elasticsearch index
dim (int) -- Dimension of the vector
embedding_field (str) -- Name of the field in the index to store embedding array in.
text_field (str) -- Name of the field to grab text from
method (Optional[dict]) -- Opensearch "method" JSON obj for configuring the KNN index. This includes engine, metric, and other config params. Defaults to: {"name": "hnsw", "space_type": "l2", "engine": "faiss", "parameters": {"ef_construction": 256, "m": 48}}
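As a rough illustration of where the "method" object ends up, the sketch below builds an index-creation body in the shape used by the OpenSearch k-NN plugin (a `knn_vector` field with `dimension` and `method`). The helper `build_index_body` is hypothetical, and whether OpensearchVectorClient sends exactly this body is an assumption; only the default method values are taken from the parameter description above.

```python
import json
from typing import Any, Dict, Optional

# Default "method" object, copied from the parameter description above.
DEFAULT_METHOD: Dict[str, Any] = {
    "name": "hnsw",
    "space_type": "l2",
    "engine": "faiss",
    "parameters": {"ef_construction": 256, "m": 48},
}


def build_index_body(
    dim: int,
    embedding_field: str = "embedding",
    method: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a k-NN index body with one knn_vector field (illustrative only)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                embedding_field: {
                    "type": "knn_vector",
                    "dimension": dim,
                    "method": method if method is not None else DEFAULT_METHOD,
                }
            }
        },
    }


body = build_index_body(dim=1536)
print(json.dumps(body, indent=2))
```

Passing a custom `method` dict (for example with a different `space_type`) replaces the default as a whole rather than merging with it in this sketch.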
- delete_doc_id(doc_id: str) None
Delete a document.
- Parameters
doc_id (str) -- document id
- do_approx_knn(query_embedding: List[float], k: int) VectorStoreQueryResult
Run an approximate k-nearest-neighbor (k-NN) search.
- index_results(results: List[NodeWithEmbedding]) List[str]
Store results in the index.
- class llama_index.vector_stores.OpensearchVectorStore(client: OpensearchVectorClient)
Elasticsearch/Opensearch vector store.
- Parameters
client (OpensearchVectorClient) -- Vector index client to use for data insertion/querying.
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.PineconeVectorStore(pinecone_index: Optional[Any] = None, index_name: Optional[str] = None, environment: Optional[str] = None, namespace: Optional[str] = None, insert_kwargs: Optional[Dict] = None, add_sparse_vector: bool = False, tokenizer: Optional[Callable] = None, **kwargs: Any)
Pinecone Vector Store.
In this vector store, embeddings and docs are stored within a Pinecone index.
During query time, the index uses Pinecone to query for the top k most similar nodes.
- Parameters
pinecone_index (Optional[pinecone.Index]) -- Pinecone index instance
insert_kwargs (Optional[Dict]) -- insert kwargs during upsert call.
add_sparse_vector (bool) -- whether to add sparse vector to index.
tokenizer (Optional[Callable]) -- tokenizer used to generate sparse vectors
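The relationship between add_sparse_vector and tokenizer can be sketched as follows. This is an assumption-laden illustration, not the library's code: `toy_tokenizer`, `VOCAB`, and `build_sparse_vector` are hypothetical, and weighting each token id by its count is only one plausible scheme. The `indices`/`values` layout follows Pinecone's sparse vector format.

```python
from collections import Counter
from typing import Callable, Dict, List, Union

# Hypothetical three-word vocabulary mapping words to integer token ids.
VOCAB: Dict[str, int] = {"vector": 0, "store": 1, "query": 2}


def toy_tokenizer(text: str) -> List[int]:
    """Map known lowercase words to token ids; unknown words are skipped."""
    return [VOCAB[word] for word in text.lower().split() if word in VOCAB]


def build_sparse_vector(
    text: str, tokenizer: Callable[[str], List[int]]
) -> Dict[str, List[Union[int, float]]]:
    """Turn token-id counts into an {"indices": [...], "values": [...]} dict."""
    counts = Counter(tokenizer(text))
    indices = sorted(counts)
    return {"indices": indices, "values": [float(counts[i]) for i in indices]}


sparse = build_sparse_vector("vector store vector query", toy_tokenizer)
```

In this sketch the token id `0` ("vector") gets weight 2.0 because the word appears twice, while the other two ids get weight 1.0.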
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Return Pinecone client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) -- query embedding
similarity_top_k (int) -- top k most similar nodes
- class llama_index.vector_stores.QdrantVectorStore(collection_name: str, client: Optional[Any] = None, **kwargs: Any)
Qdrant Vector Store.
In this vector store, embeddings and docs are stored within a Qdrant collection.
During query time, the index uses Qdrant to query for the top k most similar nodes.
- Parameters
collection_name (str) -- name of the Qdrant collection
client (Optional[Any]) -- QdrantClient instance from qdrant-client package
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Return the Qdrant client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) -- query
- class llama_index.vector_stores.RedisVectorStore(index_name: str, index_prefix: str = 'llama_index', index_args: Optional[Dict[str, Any]] = None, redis_url: str = 'redis://localhost:6379', overwrite: bool = False, **kwargs: Any)
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to the index.
- Parameters
embedding_results (List[NodeWithEmbedding]) -- List of embedding results to add to the index.
- Returns
List of ids of the documents added to the index.
- Return type
List[str]
- Raises
ValueError -- If the index already exists and overwrite is False.
- property client: RedisType
Return the Redis client instance.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- delete_index() None
Delete the index and all documents.
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None, in_background: bool = True) None
Persist the vector store to disk.
- Parameters
persist_path (str) -- Path to persist the vector store to. Not used by this store.
in_background (bool, optional) -- Persist in background. Defaults to True.
fs (fsspec.AbstractFileSystem, optional) -- Filesystem to persist to. Not used by this store.
- Raises
redis.exceptions.RedisError -- If there is an error persisting the index to disk.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query the index.
- Parameters
query (VectorStoreQuery) -- query object
- Returns
query result
- Return type
VectorStoreQueryResult
- Raises
ValueError -- If query.query_embedding is None.
redis.exceptions.RedisError -- If there is an error querying the index.
redis.exceptions.TimeoutError -- If there is a timeout querying the index.
- class llama_index.vector_stores.SimpleVectorStore(data: Optional[SimpleVectorStoreData] = None, fs: Optional[AbstractFileSystem] = None, **kwargs: Any)
Simple Vector Store.
In this vector store, embeddings are stored within a simple, in-memory dictionary.
- Parameters
simple_vector_store_data_dict (Optional[dict]) -- data dict containing the embeddings and doc_ids. See SimpleVectorStoreData for more details.
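The idea of keeping embeddings in an in-memory dictionary can be sketched with a tiny stand-in class. Everything here is hypothetical: `TinyVectorStore` and its two dict fields are illustrative names chosen for this example and are not claimed to match the fields of SimpleVectorStoreData.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TinyVectorStore:
    """Hypothetical stand-in: two plain dicts keyed by node id."""

    embedding_dict: Dict[str, List[float]] = field(default_factory=dict)
    node_to_ref_doc: Dict[str, str] = field(default_factory=dict)

    def add(self, node_id: str, ref_doc_id: str, embedding: List[float]) -> str:
        """Store one embedding and remember which document it came from."""
        self.embedding_dict[node_id] = embedding
        self.node_to_ref_doc[node_id] = ref_doc_id
        return node_id

    def get(self, text_id: str) -> List[float]:
        """Return the embedding stored under text_id."""
        return self.embedding_dict[text_id]

    def delete(self, ref_doc_id: str) -> None:
        """Remove every node that belongs to ref_doc_id."""
        doomed = [n for n, d in self.node_to_ref_doc.items() if d == ref_doc_id]
        for node_id in doomed:
            del self.embedding_dict[node_id]
            del self.node_to_ref_doc[node_id]


store = TinyVectorStore()
store.add("n1", "doc_a", [0.1, 0.2])
store.add("n2", "doc_a", [0.3, 0.4])
store.add("n3", "doc_b", [0.5, 0.6])
store.delete("doc_a")
remaining = sorted(store.embedding_dict)
```

Deleting by ref_doc_id removes both nodes derived from `doc_a` and leaves the node from `doc_b` in place.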
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding_results to index.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- classmethod from_persist_dir(persist_dir: str = './storage', fs: Optional[AbstractFileSystem] = None) SimpleVectorStore
Load from persist dir.
- classmethod from_persist_path(persist_path: str, fs: Optional[AbstractFileSystem] = None) SimpleVectorStore
Create a SimpleVectorStore from a persist path.
- get(text_id: str) List[float]
Get embedding.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None
Persist the SimpleVectorStore to a file.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Get nodes for response.
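The persist and from_persist_path methods listed for this class suggest a save-and-reload cycle. The sketch below shows that pattern with plain JSON under stated assumptions: the functions `persist_dict` and `load_dict` and the sample payload are hypothetical, and the real on-disk layout of SimpleVectorStore is not described in this reference.

```python
import json
import os
import tempfile
from typing import Any, Dict


def persist_dict(data: Dict[str, Any], persist_path: str) -> None:
    """Write data as JSON, creating the parent directory if needed."""
    dirpath = os.path.dirname(persist_path)
    if dirpath:
        os.makedirs(dirpath, exist_ok=True)
    with open(persist_path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def load_dict(persist_path: str) -> Dict[str, Any]:
    """Read back a dict previously written by persist_dict."""
    with open(persist_path, encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "storage", "vector_store.json")
    original = {"embedding_dict": {"n1": [0.1, 0.2]}, "node_to_ref_doc": {"n1": "doc_a"}}
    persist_dict(original, path)
    restored = load_dict(path)
```

The file name mirrors the default persist_path shown in the signature above; the temporary directory stands in for './storage'.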
- class llama_index.vector_stores.SupabaseVectorStore(postgres_connection_string: str, collection_name: str, dimension: int = 1536, **kwargs: Any)
Supabase vector store.
In this vector store, embeddings are stored in a Postgres table using pgvector.
During query time, the index uses pgvector/Supabase to query for the top k most similar nodes.
- Parameters
postgres_connection_string (str) -- postgres connection string
collection_name (str) -- name of the collection to store the embeddings in
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete doc.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (List[float]) -- query embedding
- class llama_index.vector_stores.WeaviateVectorStore(weaviate_client: Optional[Any] = None, class_prefix: Optional[str] = None, **kwargs: Any)
Weaviate vector store.
In this vector store, embeddings and docs are stored within a Weaviate collection.
During query time, the index uses Weaviate to query for the top k most similar nodes.
- Parameters
weaviate_client (weaviate.Client) -- WeaviateClient instance from weaviate-client package
class_prefix (Optional[str]) -- prefix for Weaviate classes
- add(embedding_results: List[NodeWithEmbedding]) List[str]
Add embedding results to index.
- Args
embedding_results: List[NodeWithEmbedding]: list of embedding results
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) -- The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.