docsense.models

Model implementations for DocSense.

class EmbeddingModel(model_name='Qwen/Qwen2-7B', device='cuda', max_length=512, normalize_embeddings=True)[source]

Text embedding model using Qwen.

Parameters:
  • model_name (str)

  • device (str)

  • max_length (int)

  • normalize_embeddings (bool)

__init__(model_name='Qwen/Qwen2-7B', device='cuda', max_length=512, normalize_embeddings=True)[source]

Initialize the embedding model.

Parameters:
  • model_name (str) – Name or path of the Qwen model

  • device (str) – Device to run the model on (‘cuda’ or ‘cpu’)

  • max_length (int) – Maximum sequence length for tokenization

  • normalize_embeddings (bool) – Whether to L2-normalize the embeddings

encode(texts, batch_size=8)[source]

Generate embeddings for the given texts.

Parameters:
  • texts (Union[str, List[str]]) – Single text or list of texts to encode

  • batch_size (int) – Number of texts to process at once

Return type:

ndarray

Returns:

numpy array of embeddings with shape (num_texts, embedding_dim)
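The batching and normalization behavior described above can be sketched as follows. This is a minimal illustration, not the DocSense implementation: `fake_embed` is a hypothetical stand-in for the model forward pass, and the embedding dimension of 64 is arbitrary.

```python
import numpy as np

def fake_embed(batch, dim=64):
    """Hypothetical stand-in for the model forward pass (not DocSense code)."""
    rng = np.random.default_rng(len(batch))
    return rng.standard_normal((len(batch), dim))

def encode(texts, batch_size=8, normalize_embeddings=True):
    # Accept a single string or a list of strings, as encode() does.
    if isinstance(texts, str):
        texts = [texts]
    chunks = []
    # Process batch_size texts at a time to bound memory use.
    for i in range(0, len(texts), batch_size):
        chunks.append(fake_embed(texts[i:i + batch_size]))
    embeddings = np.vstack(chunks)
    if normalize_embeddings:
        # L2-normalize each row so cosine similarity reduces to a dot product.
        embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings

emb = encode(["hello", "world", "foo"], batch_size=2)
print(emb.shape)  # (3, 64)
```

With `normalize_embeddings=True`, every row has unit L2 norm, which is why downstream similarity search can use a plain dot product.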

get_embedding_dim()[source]

Get the dimension of the embeddings.

Return type:

int

Returns:

Embedding dimension

class LLMModel(model_name='Qwen/Qwen2-7B', device='cuda', max_length=2048, temperature=0.0, top_p=1.0, repetition_penalty=1.1)[source]

Wrapper for Qwen language model.

Parameters:
  • model_name (str)

  • device (str)

  • max_length (int)

  • temperature (float)

  • top_p (float)

  • repetition_penalty (float)
__init__(model_name='Qwen/Qwen2-7B', device='cuda', max_length=2048, temperature=0.0, top_p=1.0, repetition_penalty=1.1)[source]

Initialize the LLM model.

Parameters:
  • model_name (str) – Name of the Qwen model to use

  • device (str) – Device to run the model on (‘cuda’ or ‘cpu’)

  • max_length (int) – Maximum sequence length for generation

  • temperature (float) – Sampling temperature (0.0 for deterministic output)

  • top_p (float) – Nucleus sampling parameter (1.0 for no filtering)

  • repetition_penalty (float) – Penalty for repeating tokens
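The three decoding parameters above can be illustrated with a single sampling step. The sketch below shows the commonly assumed semantics (greedy decoding at `temperature=0.0`, nucleus filtering at `top_p`, and a divisive repetition penalty); it is not DocSense's or Qwen's actual implementation, and `sample_next_token` is a hypothetical helper.

```python
import numpy as np

def sample_next_token(logits, generated, temperature=0.0, top_p=1.0,
                      repetition_penalty=1.1):
    """Illustrative decoding step (assumed semantics, not DocSense internals)."""
    logits = np.array(logits, dtype=float)
    # Repetition penalty: push down the score of tokens already generated.
    for tok in set(generated):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty
    # temperature == 0.0 means greedy (deterministic) decoding.
    if temperature == 0.0:
        return int(np.argmax(logits))
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Nucleus (top-p) filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p; top_p == 1.0 keeps everything.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()
    return int(np.random.choice(keep, p=kept))

# Greedy decoding with the defaults: the highest penalty-adjusted logit wins.
print(sample_next_token([2.0, 1.0, 0.5], generated=[0]))  # 0
```

With the defaults (`temperature=0.0`, `top_p=1.0`), output is deterministic, which suits question answering; raising `temperature` and lowering `top_p` trades determinism for diversity.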

generate(question, context)[source]

Generate answer based on context.

Parameters:
  • question – The question to answer

  • context – Context text used to ground the answer

Return type:

Dict[str, Any]
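Presumably `generate()` combines the question and the retrieved context into a single prompt bounded by `max_length`. The sketch below illustrates that pattern with a naive whitespace "tokenizer"; `build_prompt` and its template are hypothetical, not DocSense's actual prompt format.

```python
def build_prompt(question, context, max_length=2048):
    """Hypothetical prompt assembly; DocSense's real template may differ."""
    header = (
        "Answer the question using the context below.\n\n"
        f"Question: {question}\n\nContext:\n"
    )
    # Naive whitespace split as a stand-in for the model tokenizer.
    budget = max_length - len(header.split())
    # Truncate the context so the whole prompt fits within max_length tokens.
    context_tokens = context.split()[:max(budget, 0)]
    return header + " ".join(context_tokens)

prompt = build_prompt("What is DocSense?",
                      "DocSense indexes documents. " * 10,
                      max_length=30)
```

A real implementation would use the model's tokenizer for counting and reserve room for the generated answer, but the truncation logic follows the same shape.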

get_max_length()[source]

Get the maximum sequence length.

Return type:

int

Returns:

Maximum sequence length

Modules

embeddings

Text embedding model implementation using Qwen.

llm

LLM (Large Language Model) wrapper implementation.