IndexingPipeline¶

async indexing_api.chunk_rag_data()¶

Dummy endpoint for data chunking (RAG).

async indexing_api.crawl_data()¶

Dummy endpoint for data crawling.

async indexing_api.index_data(url: str, question: str, answer: str, language: str)¶

Upsert a single entry into the FAQ dataset.

Parameters:
  • url (str) – URL where the entry article can be found

  • question (str) – The FAQ question

  • answer (str) – The answer to the question

  • language (str) – The article language

Returns:

The article id, url, question, answer and language upon successful completion of the process

Return type:

dict
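
A minimal usage sketch for this endpoint. The base URL, route path, HTTP method, and query-parameter transport below are assumptions about how indexing_api is deployed, not part of the documented API; the sample entry values are likewise illustrative:

    import asyncio

    import httpx

    # Assumed base URL and route path; adjust to the actual deployment.
    BASE_URL = "http://localhost:8000"

    async def upsert_faq_entry() -> None:
        async with httpx.AsyncClient(base_url=BASE_URL) as client:
            # Pass the four documented arguments as query parameters.
            response = await client.post(
                "/index_data",
                params={
                    "url": "https://faq.bsv.admin.ch/example-article",
                    "question": "What is the reference retirement age?",
                    "answer": "The reference age is 65.",
                    "language": "en",
                },
            )
            response.raise_for_status()
            # Per the docstring, the response carries the article id, url,
            # question, answer, and language.
            print(response.json())

    asyncio.run(upsert_faq_entry())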

async indexing_api.index_faq_data(sitemap_url: str = 'https://faq.bsv.admin.ch/sitemap.xml', proxy: str = None, k: int = 0)¶

Add and index data for Autocomplete to the FAQ database. The data is obtained by scraping the website whose sitemap is sitemap_url.

Parameters:
  • sitemap_url (str, default ‘https://faq.bsv.admin.ch/sitemap.xml’) – the sitemap.xml URL of the website to scrape

  • proxy (str, optional) – Proxy URL if necessary

  • k (int, default 0) – Number of articles to scrape and log when testing the method.

Returns:

Confirmation message upon successful completion of the process

Return type:

str
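
A sketch of triggering a scraping run over HTTP; the route path, method, and base URL are again assumptions about the deployment. Setting k to a small value limits the run for testing, as documented above:

    import asyncio

    import httpx

    async def reindex_faq() -> None:
        # Scraping a full sitemap can take a while, so disable the
        # client-side timeout for this call.
        async with httpx.AsyncClient(
            base_url="http://localhost:8000", timeout=None
        ) as client:
            # Assumed route path; k=5 scrapes and logs only five articles.
            response = await client.post("/index_faq_data", params={"k": 5})
            response.raise_for_status()
            print(response.text)  # confirmation message

    asyncio.run(reindex_faq())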

async indexing_api.index_faq_vectordb()¶

Add and index test data for Autocomplete to the FAQ database.

Returns:

Confirmation message upon successful completion of the process

Return type:

str

async indexing_api.index_rag_vectordb()¶

Add and index test data for RAG to the embedding database.

Returns:

Confirmation message upon successful completion of the process

Return type:

str
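
Both test-data endpoints (index_faq_vectordb and index_rag_vectordb) take no arguments, so seeding the Autocomplete and RAG stores amounts to a pair of parameterless calls. A sketch, assuming route paths that mirror the function names:

    import asyncio

    import httpx

    async def seed_test_data() -> None:
        async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
            # Assumed route paths; each call returns a confirmation message.
            for path in ("/index_faq_vectordb", "/index_rag_vectordb"):
                response = await client.post(path)
                response.raise_for_status()
                print(f"{path}: {response.text}")

    asyncio.run(seed_test_data())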

async indexing_api.parse_faq_data()¶

Dummy endpoint for FAQ data parsing.

async indexing_api.parse_rag_data()¶

Dummy endpoint for data parsing (RAG).

async indexing_api.scrap_data()¶

Dummy endpoint for data scraping.