Matching

class database.service.matching.MatchingService(model=typing.Type[sqlalchemy.orm.decl_api.Base])

Class that provides services for matching text against database entries

get_exact_match(db: Session, user_input: str, language: str = None, k: int = 0, tags: List[str] = None)

Get exact matches from the database

Parameters:
  • db (Session) – Database session

  • user_input (str) – User input to match database entries

  • language (str, optional) – Question and results language

  • k (int, optional) – Number of results to return

  • tags (List[str], optional) – Filter by tags

Return type:

list of dict

get_fuzzy_match(db: Session, user_input: str, threshold: int = 150, language: str = None, k: int = 0, tags: List[str] = None)

Get fuzzy matches from the database using Levenshtein distance

Parameters:
  • db (Session) – Database session

  • user_input (str) – User input to match database entries

  • threshold (int, optional) – Levenshtein distance threshold, defaults to 150

  • language (str, optional) – Question and results language

  • k (int, optional) – Number of results to return

  • tags (List[str], optional) – Filter by tags

Return type:

list of dict
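To make the distance named above concrete, here is a minimal pure-Python sketch of Levenshtein distance with threshold filtering. It is not the service's implementation. The names `levenshtein` and `fuzzy_filter` are hypothetical, and the sketch assumes that `threshold` is an inclusive maximum distance and that `k = 0` means "return all results"; the source confirms neither for this method.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(user_input: str, entries: List[str],
                 threshold: int = 150, k: int = 0) -> List[Dict]:
    """Keep entries within `threshold` edits, closest first (assumed semantics)."""
    scored = [{"text": e, "distance": levenshtein(user_input, e)} for e in entries]
    kept = sorted((s for s in scored if s["distance"] <= threshold),
                  key=lambda s: s["distance"])
    # Assumption: k = 0 returns every match, otherwise only the first k.
    return kept[:k] if k > 0 else kept
```

For example, `fuzzy_filter("kitten", ["sitting", "kitten", "mitten"], threshold=1)` keeps "kitten" (distance 0) and "mitten" (distance 1) and drops "sitting" (distance 3).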

get_trigram_match(db: Session, user_input: str, threshold: int = 0.4, language: str = None, k: int = 0, tags: List[str] = None)

Get trigram matches from the database

Parameters:
  • db (Session) – Database session

  • user_input (str) – User input to match database entries

  • threshold (int, optional) – Trigram similarity threshold, defaults to 0.4 (a float value, although the signature annotates the parameter as int)

  • language (str, optional) – Question and results language

  • k (int, optional) – Number of results to return, defaults to 0 (return all results)

  • tags (List[str], optional) – Filter by tags
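The source does not say how trigram similarity is computed. The sketch below assumes the scheme used by PostgreSQL's pg_trgm extension: lowercase the text, pad each word with two leading spaces and one trailing space, and divide the number of shared trigrams by the number of distinct trigrams overall. The function names are hypothetical.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Return the set of padded, lowercased trigrams of every word in text."""
    grams: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Under this sketch, "word" and "two words" share 4 of 11 distinct trigrams, giving a similarity of about 0.36, which is below the documented default threshold of 0.4.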

async get_semantic_match(db: Session, user_input: str, language: str = None, k: int = 0, symbol: str = '<=>', tags: List[str] = None, source: List[str] = None, organizations: List[str] = None, user_uuid: str = None, embedding_field: str | List[str] = 'text_embedding')

Get semantic similarity matches from the database

Parameters:
  • db (Session) – Database session

  • user_input (str) – Input text to match against

  • language (str, optional) – Filter by language

  • k (int, optional) – Number of results to return (0 for all)

  • symbol (str, optional) – Operator symbol for similarity comparison

  • tags (List[str], optional) – Filter by tags

  • source (List[str], optional) – Filter by source URLs

  • organizations (List[str], optional) – Filter by organizations

  • user_uuid (str, optional) – User UUID for personal documents

  • embedding_field (Union[str, List[str]], optional) – Field(s) containing embeddings

Returns:

Matched documents sorted by similarity

Return type:

List[dict]
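The default `symbol` of `'<=>'` matches the cosine distance operator of the pgvector extension, though the source does not name the backend. On that assumption, here is a minimal in-memory sketch of ranking by cosine distance with the documented `k` behavior (0 for all). The document layout, the `rank_documents` helper, and the "any shared tag" filter rule are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_documents(query: Sequence[float], documents: List[Dict],
                   k: int = 0, tags: Optional[List[str]] = None) -> List[Dict]:
    """Sort documents by ascending cosine distance to the query embedding."""
    # Assumption: a document passes the filter if it shares any requested tag.
    candidates = [d for d in documents
                  if tags is None or set(tags) & set(d["tags"])]
    ranked = sorted(candidates,
                    key=lambda d: cosine_distance(query, d["text_embedding"]))
    # k = 0 returns all results, as documented for this method.
    return ranked[:k] if k > 0 else ranked
```

In the real service the query embedding would be derived from `user_input`; this sketch starts from a ready-made vector to stay self-contained.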

async semantic_similarity_match_l1(db: Session, user_input: str, language: str = None, k: int = 0, tags: str = None)

Get semantic similarity match using L1 distance

Parameters:
  • db (Session) – Database session

  • user_input (str) – Input text to match against

  • language (str, optional) – Filter by language

  • k (int, optional) – Number of results (0 for all)

  • tags (str, optional) – Filter by tags

Returns:

Matched documents sorted by L1 distance

Return type:

List[dict]

async semantic_similarity_match_l2(db: Session, user_input: str, language: str = None, k: int = 0, tags: str = None)

Get semantic similarity match from database using L2 distance

async semantic_similarity_match_inner_prod(db: Session, user_input: str, language: str = None, k: int = 0, tags: str = None)

Get semantic similarity match from database using inner product
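The three metrics named by the `_l1`, `_l2`, and `_inner_prod` methods can be sketched over plain Python sequences as follows. This only illustrates the formulas, not how the service queries the database, and the function names are hypothetical.

```python
import math
from typing import Sequence


def l1_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(u, v))


def l2_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance: square root of summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product: sum of coordinate-wise products."""
    return sum(x * y for x, y in zip(u, v))
```

For L1 and L2, a smaller value means a closer match, while for the inner product a larger value means a closer match. If the backend is pgvector (an assumption), its inner-product operator returns the negated product so that ascending order still puts the best match first.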