thml.rag

Classes:

  • RAG

    Retrieval Augmented Generation (RAG) system.

Functions:

RAG(rag_path: str = '', doc_path: str = None, llm: object = None, embedding: object = None, text_splitter: object = None, db: object = None, rerank: bool = False, style: str = 'simple')

Retrieval Augmented Generation (RAG) system. Supported vectorstore types: 'FAISS' or 'Chroma'. Default is 'FAISS'.

Optionally use a reranker to improve the quality of the retrieved documents. Default is False. Note that the reranker model must be different from the embedding model.

Initialize the RAG system.
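A minimal construction sketch based on the signature above; the folder path is hypothetical, and omitted arguments fall back to the defaults (FAISS vectorstore, no reranking, 'simple' chain style):

```python
from thml.rag import RAG

# Build a RAG system over a folder of documents using the defaults:
# FAISS vectorstore, no reranking, 'simple' chain style.
rag = RAG(doc_path="./my_docs")  # hypothetical folder of .pdf/.docx/.txt/.md files

# With reranking enabled; the reranker model must differ from the
# embedding model (see the note above).
rag_reranked = RAG(doc_path="./my_docs", rerank=True, style="fusion")
```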

Methods:

  • ask

Perform a question-answering task and generate an answer to the given query, drawing its content from the documents.

  • ask_llm

Ask the LLM to generate an answer to the given query. Unlike ask, which only returns an answer when the documents contain the information, this function generates the answer directly from the LLM.

  • search

Search for information in the documents. This performs retriever.invoke to retrieve information from the vectorstore. Refer: https://python.langchain.com/docs/use_cases/question_answering/quickstart#retrieval-and-generation-retrieve.

  • set_retriever

    Define parameters for the retriever. See vectorstore.as_retriever for more information.

  • set_chain

    Set the style of the RAG system. The style can be 'simple', 'multi_query', or 'fusion'.

Attributes:

embedding = embedding instance-attribute

db = db instance-attribute

text_splitter = text_splitter instance-attribute

retriever = self.set_retriever(search_type='similarity', search_kwargs={'k': 6}) instance-attribute

compressor = reranker_model() instance-attribute

compression_retriever = ContextualCompressionRetriever(base_compressor=self.compressor, base_retriever=self.retriever) instance-attribute

llm = llm instance-attribute

info property

ask(question='what are the documents about?') -> str

Perform a question-answering task and generate an answer to the given query, drawing its content from the documents.

Refer: https://python.langchain.com/docs/use_cases/question_answering/quickstart#retrieval-and-generation-retrieve
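An illustrative call, assuming a rag instance constructed as in the sketch above:

```python
# Answer a question using content retrieved from the indexed documents.
answer = rag.ask("what are the documents about?")
print(answer)
```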

ask_llm(question='Who are you?') -> str

Ask the LLM to generate an answer to the given query. Unlike ask, which only returns an answer when the documents contain the information, this function generates the answer directly from the LLM.
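A sketch contrasting the two methods, under the same assumptions as above:

```python
# ask() answers from the retrieved documents; ask_llm() lets the LLM
# answer from its own knowledge, without grounding in the documents.
grounded = rag.ask("what are the documents about?")
ungrounded = rag.ask_llm("Who are you?")
```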

search(question: str = 'summary') -> list[Document]

Search for information in the documents. This performs retriever.invoke to retrieve information from the vectorstore. Refer: https://python.langchain.com/docs/use_cases/question_answering/quickstart#retrieval-and-generation-retrieve.

Parameters:

  • question (str) –

    The query to search for.

  • k (int) –

    The number of documents to return.

Returns: results (list[Document]): The documents that match the query.
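An illustrative retrieval call; the Document fields (page_content, metadata) follow LangChain's Document interface:

```python
# Retrieve the documents most relevant to the query.
docs = rag.search("summary")
for doc in docs:
    print(doc.metadata, doc.page_content[:100])
```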

set_retriever(search_type: str = 'similarity', search_kwargs: dict = None)

Define parameters for the retriever. See vectorstore.as_retriever for more information. Ref: https://python.langchain.com/docs/modules/data_connection/retrievers/vectorstore

Parameters:

  • search_type (Optional[str], default: 'similarity' ) –

    Defines the type of search that the Retriever should perform. Can be "similarity" (default), "mmr", or "similarity_score_threshold".

  • search_kwargs (Optional[Dict], default: None ) –

    Keyword arguments to pass to the search function. Can include things like:

    - k: Amount of documents to return (Default: 4)
    - score_threshold: Minimum relevance threshold for similarity_score_threshold
    - fetch_k: Amount of documents to pass to MMR algorithm (Default: 20)
    - lambda_mult: Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum. (Default: 0.5)
    - filter: Filter by document metadata
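A configuration sketch; per the attribute listing above, set_retriever returns the retriever, so the result is reassigned here (an assumption about external usage):

```python
# Switch to MMR search, trading relevance against result diversity.
rag.retriever = rag.set_retriever(
    search_type="mmr",
    search_kwargs={"k": 6, "fetch_k": 20, "lambda_mult": 0.5},
)
```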

set_chain(style: str = 'simple') -> None

Set the style of the RAG system. The style can be 'simple', 'multi_query', or 'fusion'.
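An illustrative call; the non-default styles presumably map to LangChain's multi-query and RAG-fusion retrieval patterns:

```python
# Rebuild the chain with a different retrieval style.
rag.set_chain(style="multi_query")
```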

embedding_model(provider: str = 'huggingface', model_name: str = None, model_kwargs: dict = None) -> object

Define the embedding function to use. See the list of available models in LangChain.

Check the latest performance benchmarks for text embedding models at the MTEB leaderboards hosted by Hugging Face. The fields to consider are:

  • Score: the score we should focus on is "average" and "retrieval average". Both are highly correlated, so focusing on either works.

  • Sequence length: tells us how many tokens a model can consume and compress into a single embedding. Generally speaking, we wouldn't recommend stuffing more than a paragraph of heft into a single embedding, so models supporting up to 512 tokens are usually more than enough.

  • Model size: the size of a model indicates how easy it will be to run. All models near the top of MTEB are reasonably sized. One of the largest is instructor-xl (requiring 4.96GB of memory), which we can easily run on consumer hardware.

Note
  • Embedding model may be referred to as SentenceTransformer in HF.

Some HF embedding models:

  • mixedbread-ai/mxbai-embed-large-v1

  • BAAI/bge-large-en-v1.5

Parameters:

  • provider (str, default: 'huggingface' ) –

    The provider of the embeddings.

  • model_name (str, default: None ) –

    The name of the embedding model to use.

  • model_kwargs (dict, default: None ) –

    The parameters to pass to the embedding model.
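A wiring sketch, assuming embedding_model is importable from thml.rag alongside RAG; the model name is one of those listed above, and the folder path is hypothetical:

```python
from thml.rag import RAG, embedding_model

# Create a HuggingFace embedding object and pass it to the RAG system.
emb = embedding_model(provider="huggingface",
                      model_name="BAAI/bge-large-en-v1.5")
rag = RAG(doc_path="./my_docs", embedding=emb)
```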

llm_model(service: str = 'web_opengpts', **kwargs: dict) -> LLM

Predefined LangChain-style LLM models to be used in the RAG system.

Parameters:

  • service (str, default: 'web_opengpts' ) –

    The LLM model service. Available options: 'openai', 'web_openai', 'web_opengpts', 'web_phind', 'web_llama2', 'web_bing'.

  • **kwargs (dict) –

    The model parameters, depending on the service.

Returns: LLM: The LangChain-style LLM.
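A wiring sketch under the same import assumption; the kwargs depend on the chosen service and are omitted here:

```python
from thml.rag import RAG, llm_model

# Create an LLM from one of the listed services and plug it into RAG.
llm = llm_model(service="openai")
rag = RAG(doc_path="./my_docs", llm=llm)  # hypothetical folder
```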

load_document(doc_path: str = '', ext: str = None) -> list[Document]

Load documents from the given path, using LangChain's document_loaders. Supported file types: .pdf, .docx, .txt, .md, .lnk (Windows shortcuts).

Parameters:

  • doc_path (str, default: '' ) –

    The path to the folder containing the documents.

  • ext (str, default: None ) –

    The file extension of the documents to be loaded, e.g., '.pdf'. By default, all files in the folder are loaded.

Returns: list[Document]: A list of Document objects, containing the loaded documents.
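A loading sketch, assuming load_document is importable from thml.rag; the folder path is hypothetical:

```python
from thml.rag import load_document

# Load only PDFs from a folder; omit ext to load all supported files.
docs = load_document(doc_path="./my_docs", ext=".pdf")
print(f"Loaded {len(docs)} documents")
```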