API reference¶

Documentation of the different classes, starting from the app layer down to the model layer.

App¶

class insightvault.app.base.BaseApp(name: str = 'insightvault.app.base', config_path: str = './config.yaml')¶

Bases: object

add_documents(documents: list[Document]) → None¶: Add documents to the database

async async_add_documents(documents: list[Document]) → None¶: Async version of add_document

async async_delete_all_documents() → None¶: Async version of delete_all_documents

async async_list_documents() → list[Document] | None¶: Async version of list_documents

delete_all_documents() → None¶: Delete all documents from the database

async init() → None¶: Initialize the app

list_documents() → list[Document] | None¶: List all documents in the database

insightvault.app.cli.main() → None¶: Entry point for the CLI

class insightvault.app.rag.RAGApp(name: str = 'insightvault.app.rag')¶

Bases: SearchApp

RAG application for retrieval-augmented generation

This application extends the SearchApp with RAG-specific query functionality. All other methods (add_documents, delete_documents, etc.) are inherited from SearchApp.

async async_clear() → None¶: Async version of clear

async async_query(query: str) → list[str]¶: Async version of query

clear() → None¶: Clears the chat history

async init() → None¶: Initialize the RAG app

query(query: str) → list[str]¶

Query the database for documents similar to the query

This RAG-specific implementation returns Document objects instead of strings.

class insightvault.app.search.SearchApp(name: str = 'insightvault.app.search')¶

Bases: BaseApp

Search application for semantic search

This application is used to query the database and add documents to the database.

Attributes:: db (Database): The database service.

async async_query(query: str) → list[str]¶: Async version of query

async init() → None¶: Initialize the search app

query(query: str) → list[str]¶

Query the database for documents similar to the query.

Returns an alphabetically sorted list of document titles.

class insightvault.app.summarizer.SummarizerApp(name: str = 'insightvault.app.summarizer')¶

Bases: BaseApp

Summarizer application

This application is used to summarize documents.

async async_summarize(text: str) → str | None¶: Async version of summarize

async init() → None¶: Initialize the summarizer app

summarize(text: str) → str | None¶: Summarize a list of documents

Services¶

class insightvault.services.database.AbstractDatabaseService¶

Bases: ABC

Abstract database service

abstract async add_documents(documents: list[Document]) → None¶: Add a list of documents to the database

abstract async delete_all_documents() → None¶: Delete all documents from the database

abstract async get_documents() → list[Document] | None¶: Get all documents from the database

abstract async query(query_embedding: Sequence[float], collection_name: str = 'default', filter_docs: bool = True) → list[Document]¶: Query the database for documents similar to the query embedding

class insightvault.services.database.ChromaDatabaseService(config: DatabaseConfig)¶

Bases: AbstractDatabaseService

Chroma database service

This service is used to interact with the Chroma database.

Embedding functions are not provided here, so the caller must provide them.

async add_documents(documents: list[Document], collection_name: str = 'default') → None¶: Add a list of documents to the database. The documents must have embeddings.

async delete_all_documents(collection_name: str = 'default') → None¶: Delete all documents in the database

async get_documents(collection_name: str = 'default') → list[Document] | None¶: List all documents in the database

async query(query_embedding: Sequence[float], collection_name: str = 'default', filter_docs: bool = True) → list[Document]¶: Query the database for documents similar to the query embedding

class insightvault.services.embedding.EmbeddingService(config: EmbeddingConfig)¶

Bases: object

Service for generating embeddings from text using sentence-transformers.

To use it, you must first call await get_client() to ensure the model is loaded.

Attributes:: config: The configuration for the embedding service client: The embedding model client loading_task: The task that loads the embedding model logger: The logger for the embedding service

async embed(texts: list[str]) → list[list[float]]¶

Generate embeddings for a list of texts

Args:: texts: List of text strings to embed
Returns:: List of embedding vectors (as lists of floats)

async init() → None¶: Initialize the embedding service

class insightvault.services.llm.AbstractLLMService(model_name: str)¶

Bases: ABC

abstract async chat(prompt: str) → str | None¶: Generate a response from the model while maintaining chat history.

abstract async clear_chat_history() → None¶: Clear the chat history.

abstract async init() → None¶: Prepare the LLM service for use, such as loading model weights.

abstract async query(prompt: str) → str | None¶: Generate a one-off response from the model without chat history.

class insightvault.services.llm.BaseLLMService(model_name: str)¶: Bases: AbstractLLMService

class insightvault.services.llm.OllamaLLMService(model_name: str = 'llama3')¶

Bases: BaseLLMService

Ollama LLM service

async chat(prompt: str) → str | None¶: Generate a response from the model while maintaining chat history.

async clear_chat_history() → None¶: Clear the chat history.

async init() → None¶: Initialize the LLM service

async query(prompt: str) → str | None¶: Generate a one-off response from the model without chat history.

class insightvault.services.prompt.PromptService¶

Bases: object

Prompt service

get_prompt(prompt_type: str, context: dict[str, str] | None = None) → str¶

Retrieves a predefined prompt for a specific use case and injects context if needed. The context can include parameters like ‘text’, etc.

Args:: prompt_type (str): The type of prompt to retrieve. context (dict | None): The context to inject into the prompt.
Returns:: str: The prompt with the injected context.

class insightvault.services.splitter.SplitterService(config: SplitterConfig)¶

Bases: object

Splitter service

Attributes:: config: The configuration for the splitter

split(document: Document) → list[Document]¶: Split a document into chunks of a given size

Models¶

class insightvault.models.database.DistanceFunction(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶

Bases: Enum

COSINE = 'cosine'¶

L2 = 'l2'¶

class insightvault.models.document.Document(*, id: str = <factory>, title: str, content: str, metadata: ~collections.abc.Mapping[str, ~typing.Any] = <factory>, embedding: ~collections.abc.Sequence[float] | None = None, created_at: ~datetime.datetime = <factory>, updated_at: ~datetime.datetime = <factory>)¶

Bases: BaseModel

Document model

Attributes:: id: str title: str content: str metadata: dict[str, Any] embedding: list[float] | None created_at: datetime updated_at: datetime

content: str¶

created_at: datetime¶

embedding: Sequence[float] | None¶

id: str¶

metadata: Mapping[str, Any]¶

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str¶

updated_at: datetime¶