API reference

Documentation of the different classes, starting from the app layer down to the model layer.

App

class insightvault.app.base.BaseApp(name: str = 'insightvault.app.base', config_path: str = './config.yaml')

Bases: object

add_documents(documents: list[Document]) -> None

Add documents to the database

async async_add_documents(documents: list[Document]) -> None

Async version of add_documents

async async_delete_all_documents() -> None

Async version of delete_all_documents

async async_list_documents() -> list[Document] | None

Async version of list_documents

delete_all_documents() -> None

Delete all documents from the database

async init() -> None

Initialize the app

list_documents() -> list[Document] | None

List all documents in the database
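
Each synchronous method above has an async_ counterpart. This reference does not say how the two are connected, so the sketch below shows only one common arrangement, in which the sync method drives the coroutine with asyncio.run. ToyApp and its in-memory list are hypothetical stand-ins, not part of insightvault.

```python
from __future__ import annotations

import asyncio


class ToyApp:
    """Hypothetical stand-in that mirrors the BaseApp method names."""

    def __init__(self) -> None:
        self._documents: list[str] = []  # in-memory stand-in for the database

    async def async_add_documents(self, documents: list[str]) -> None:
        self._documents.extend(documents)

    async def async_list_documents(self) -> list[str] | None:
        # Assumption: None signals "nothing stored", matching the `| None` return type.
        return list(self._documents) or None

    def add_documents(self, documents: list[str]) -> None:
        # Assumption: the sync method simply runs its async counterpart to completion.
        asyncio.run(self.async_add_documents(documents))

    def list_documents(self) -> list[str] | None:
        return asyncio.run(self.async_list_documents())


app = ToyApp()
app.add_documents(["notes.txt", "paper.pdf"])
listed = app.list_documents()
print(listed)
```

Under this arrangement, code that already runs inside an event loop should await the async_ variants directly, because asyncio.run raises RuntimeError when a loop is already running in the same thread.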

insightvault.app.cli.main() -> None

Entry point for the CLI

class insightvault.app.rag.RAGApp(name: str = 'insightvault.app.rag')

Bases: SearchApp

RAG application for retrieval-augmented generation

This application extends SearchApp with RAG-specific query functionality. All other methods (add_documents, delete_all_documents, etc.) are inherited from SearchApp.

async async_clear() -> None

Async version of clear

async async_query(query: str) -> list[str]

Async version of query

clear() -> None

Clears the chat history

async init() -> None

Initialize the RAG app

query(query: str) -> list[str]

Query the database for documents similar to the query

This RAG-specific implementation returns Document objects instead of strings.

class insightvault.app.search.SearchApp(name: str = 'insightvault.app.search')

Bases: BaseApp

Search application for semantic search

This application is used to query the database and add documents to the database.

Attributes:

db (Database): The database service.

async async_query(query: str) -> list[str]

Async version of query

async init() -> None

Initialize the search app

query(query: str) -> list[str]

Query the database for documents similar to the query.

Returns an alphabetically sorted list of document titles.
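
The return shape described above can be illustrated with a small sketch. Hit is a hypothetical stand-in for the Document objects returned by the database, and titles_from_hits is an illustrative helper rather than a library function. Whether the library removes duplicate titles is not stated here, so the sketch keeps them.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a Document returned by the database."""

    title: str


def titles_from_hits(hits: list[Hit]) -> list[str]:
    # Sort the titles alphabetically, as the query() description states.
    return sorted(hit.title for hit in hits)


hits = [Hit("Zebra facts"), Hit("Apple pie"), Hit("Moon landing")]
titles = titles_from_hits(hits)
print(titles)  # ['Apple pie', 'Moon landing', 'Zebra facts']
```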

class insightvault.app.summarizer.SummarizerApp(name: str = 'insightvault.app.summarizer')

Bases: BaseApp

Summarizer application

This application is used to summarize documents.

async async_summarize(text: str) -> str | None

Async version of summarize

async init() -> None

Initialize the summarizer app

summarize(text: str) -> str | None

Summarize the given text

Services

class insightvault.services.database.AbstractDatabaseService

Bases: ABC

Abstract database service

abstract async add_documents(documents: list[Document]) -> None

Add a list of documents to the database

abstract async delete_all_documents() -> None

Delete all documents from the database

abstract async get_documents() -> list[Document] | None

Get all documents from the database

abstract async query(query_embedding: Sequence[float], collection_name: str = 'default', filter_docs: bool = True) -> list[Document]

Query the database for documents similar to the query embedding
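
To show what implementing this interface involves, here is a minimal in-memory sketch. It redefines a reduced copy of the abstract class locally so that it runs without insightvault installed. Doc, DatabaseService, and InMemoryDatabase are hypothetical names, and ranking by cosine similarity is an assumption about what "similar" means.

```python
from __future__ import annotations

import asyncio
import math
from abc import ABC, abstractmethod
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for insightvault's Document model."""

    title: str
    embedding: Sequence[float]


class DatabaseService(ABC):
    """Local, reduced copy of the abstract interface (two methods only)."""

    @abstractmethod
    async def add_documents(self, documents: list[Doc]) -> None: ...

    @abstractmethod
    async def query(self, query_embedding: Sequence[float]) -> list[Doc]: ...


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat zero vectors as dissimilar


class InMemoryDatabase(DatabaseService):
    def __init__(self) -> None:
        self._docs: list[Doc] = []

    async def add_documents(self, documents: list[Doc]) -> None:
        self._docs.extend(documents)

    async def query(self, query_embedding: Sequence[float]) -> list[Doc]:
        # Most similar first.
        return sorted(
            self._docs,
            key=lambda d: cosine_similarity(query_embedding, d.embedding),
            reverse=True,
        )


async def main() -> list[str]:
    db = InMemoryDatabase()
    await db.add_documents([Doc("dogs", [0.0, 1.0]), Doc("cats", [1.0, 0.0])])
    return [d.title for d in await db.query([0.9, 0.1])]


ranked = asyncio.run(main())
print(ranked)  # ['cats', 'dogs']
```

Because every method is marked abstract, instantiating DatabaseService directly raises TypeError; only subclasses that override all abstract methods can be created.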

class insightvault.services.database.ChromaDatabaseService(config: DatabaseConfig)

Bases: AbstractDatabaseService

Chroma database service

This service is used to interact with the Chroma database.

Embedding functions are not provided here, so the caller must provide them.

async add_documents(documents: list[Document], collection_name: str = 'default') -> None

Add a list of documents to the database. The documents must have embeddings.

async delete_all_documents(collection_name: str = 'default') -> None

Delete all documents in the database

async get_documents(collection_name: str = 'default') -> list[Document] | None

List all documents in the database

async query(query_embedding: Sequence[float], collection_name: str = 'default', filter_docs: bool = True) -> list[Document]

Query the database for documents similar to the query embedding

class insightvault.services.embedding.EmbeddingService(config: EmbeddingConfig)

Bases: object

Service for generating embeddings from text using sentence-transformers.

To use it, you must first call await get_client() to ensure the model is loaded.

Attributes:

config: The configuration for the embedding service
client: The embedding model client
loading_task: The task that loads the embedding model
logger: The logger for the embedding service

async embed(texts: list[str]) -> list[list[float]]

Generate embeddings for a list of texts

Args:

texts: List of text strings to embed

Returns:

List of embedding vectors (as lists of floats)

async init() -> None

Initialize the embedding service

class insightvault.services.llm.AbstractLLMService(model_name: str)

Bases: ABC

abstract async chat(prompt: str) -> str | None

Generate a response from the model while maintaining chat history.

abstract async clear_chat_history() -> None

Clear the chat history.

abstract async init() -> None

Prepare the LLM service for use, such as loading model weights.

abstract async query(prompt: str) -> str | None

Generate a one-off response from the model without chat history.
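
The difference between chat and query is easiest to see with a toy service. The sketch below is hypothetical: EchoLLMService calls no model and only echoes prompts, but it follows the four documented method names and keeps history in chat alone.

```python
from __future__ import annotations

import asyncio


class EchoLLMService:
    """Hypothetical toy service; a real one would call a model."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name
        self.history: list[str] = []

    async def init(self) -> None:
        pass  # nothing to load for the toy

    async def query(self, prompt: str) -> str | None:
        # One-off: the history is neither read nor written.
        return f"[{self.model_name}] {prompt}"

    async def chat(self, prompt: str) -> str | None:
        # Stateful: every prompt is recorded, so the turn counter grows.
        self.history.append(prompt)
        return f"[{self.model_name}] turn {len(self.history)}: {prompt}"

    async def clear_chat_history(self) -> None:
        self.history.clear()


async def main() -> list[str | None]:
    llm = EchoLLMService("toy")
    await llm.init()
    replies = [await llm.chat("hello"), await llm.chat("again")]
    replies.append(await llm.query("one-off"))
    await llm.clear_chat_history()
    replies.append(await llm.chat("fresh start"))
    return replies


replies = asyncio.run(main())
for reply in replies:
    print(reply)
```

The final chat call reports turn 1 again because clear_chat_history emptied the recorded history, while the query call in between left it untouched.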

class insightvault.services.llm.BaseLLMService(model_name: str)

Bases: AbstractLLMService

class insightvault.services.llm.OllamaLLMService(model_name: str = 'llama3')

Bases: BaseLLMService

Ollama LLM service

async chat(prompt: str) -> str | None

Generate a response from the model while maintaining chat history.

async clear_chat_history() -> None

Clear the chat history.

async init() -> None

Initialize the LLM service

async query(prompt: str) -> str | None

Generate a one-off response from the model without chat history.

class insightvault.services.prompt.PromptService

Bases: object

Prompt service

get_prompt(prompt_type: str, context: dict[str, str] | None = None) -> str

Retrieves a predefined prompt for a specific use case and injects the context into it when one is given. The context maps placeholder names, such as 'text', to the values to insert.

Args:

prompt_type (str): The type of prompt to retrieve.
context (dict | None): The context to inject into the prompt.

Returns:

str: The prompt with the injected context.
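
A minimal sketch of template lookup plus context injection follows. The PROMPTS table, its keys, and its wording are hypothetical, since the actual prompt types are not listed in this reference. Using str.format for the injection is likewise an assumption.

```python
from __future__ import annotations

# Hypothetical templates; the real prompt types and wording are not documented here.
PROMPTS = {
    "summarize": "Summarize the following text:\n{text}",
    "greeting": "Say hello.",
}


def get_prompt(prompt_type: str, context: dict[str, str] | None = None) -> str:
    template = PROMPTS[prompt_type]  # raises KeyError for an unknown prompt type
    # Fill each {placeholder} from the context; no context means no substitution.
    return template.format(**(context or {}))


print(get_prompt("summarize", {"text": "Chroma stores embeddings."}))
```

In this sketch a template that expects 'text' raises KeyError when the context omits it, which surfaces a missing parameter early instead of sending a half-filled prompt to the model.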

class insightvault.services.splitter.SplitterService(config: SplitterConfig)

Bases: object

Splitter service

Attributes:

config: The configuration for the splitter

split(document: Document) -> list[Document]

Split a document into chunks of a given size
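
Fixed-size chunking can be sketched on plain strings as follows. The parameter names chunk_size and chunk_overlap are hypothetical, because the fields of SplitterConfig are not listed in this reference. The real service may also split on tokens or sentences rather than characters.

```python
def split_text(content: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split content into character chunks of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [content[i : i + chunk_size] for i in range(0, len(content), step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij', 'j']
```

Each chunk after the first starts with the last character of the previous one, which is the one-character overlap requested.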

Models

class insightvault.models.database.DistanceFunction(value)

Bases: Enum

COSINE = 'cosine'
L2 = 'l2'
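
For orientation, the two members correspond to the following textbook distances. Distance and distance are hypothetical local names, and the formulas are the conventional definitions. The backing store may use a variant, for example squared L2.

```python
import math
from collections.abc import Sequence
from enum import Enum


class Distance(Enum):
    """Local stand-in with the same member values as DistanceFunction."""

    COSINE = "cosine"
    L2 = "l2"


def distance(a: Sequence[float], b: Sequence[float], fn: Distance) -> float:
    if fn is Distance.L2:
        # Euclidean distance: square root of the summed squared differences.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Cosine distance: 1 minus the cosine of the angle between the vectors.
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm


print(distance([3.0, 4.0], [0.0, 0.0], Distance.L2))      # 5.0
print(distance([1.0, 0.0], [0.0, 1.0], Distance.COSINE))  # 1.0
```

Cosine distance ignores vector length and compares direction only, whereas L2 is sensitive to magnitude. The two can therefore rank the same neighbours differently unless the embeddings are normalized.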

class insightvault.models.document.Document(*, id: str = <factory>, title: str, content: str, metadata: Mapping[str, Any] = <factory>, embedding: Sequence[float] | None = None, created_at: datetime = <factory>, updated_at: datetime = <factory>)

Bases: BaseModel

Document model

Attributes:

id: str
title: str
content: str
metadata: Mapping[str, Any]
embedding: Sequence[float] | None
created_at: datetime
updated_at: datetime

content: str
created_at: datetime
embedding: Sequence[float] | None
id: str
metadata: Mapping[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

title: str
updated_at: datetime