bridge.services.huggingface.huggingface_provider module
LLM provider that wraps huggingface_hub’s InferenceClient to provide chat-style text generation for pipelines.
- class bridge.services.huggingface.huggingface_provider.HuggingFaceProvider(model='Qwen/Qwen3-8B', provider='featherless-ai')
Bases: LLMProvider
Hugging Face provider for chat-capable models using InferenceClient. Supports models compatible with the chat.completions API.
- Parameters:
model (str) – The Hugging Face model identifier to use for chat generation. Default is “Qwen/Qwen3-8B” (https://huggingface.co/Qwen/Qwen3-8B).
provider (HF_Provider) – The inference provider to use. Default is “featherless-ai”.
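A minimal instantiation sketch, assuming the class is importable from the module path above; both keyword arguments mirror the signature shown in the class header.

```python
from bridge.services.huggingface.huggingface_provider import HuggingFaceProvider

# The defaults documented above, written out explicitly for illustration.
provider = HuggingFaceProvider(
    model="Qwen/Qwen3-8B",      # any model compatible with the chat.completions API
    provider="featherless-ai",  # the Hugging Face inference provider to route through
)
```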
- async generate(messages, **kwargs)
Generate a chat-based response from the model.
- Parameters:
messages (list[ChatMessage]) – A list of chat messages forming the conversation history.
**kwargs – Additional generation parameters such as max_new_tokens and temperature.
- Returns:
The generated response from the model as a chat message.
- Return type:
ChatMessage
- Raises:
ValueError – If messages is empty.
RuntimeError – If the model response is missing required fields.
Exception – For any other errors during generation.
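A hedged usage sketch for generate(). The import path for ChatMessage and its role/content constructor fields are assumptions here; adjust them to where the type actually lives in your tree.

```python
import asyncio

from bridge.services.huggingface.huggingface_provider import HuggingFaceProvider
# Assumed import path for ChatMessage; not confirmed by this page.
from bridge.services.llm_provider import ChatMessage


async def main():
    provider = HuggingFaceProvider()  # defaults: Qwen/Qwen3-8B via featherless-ai

    # Conversation history; an empty list would raise ValueError.
    messages = [
        ChatMessage(role="system", content="You are a concise assistant."),
        ChatMessage(role="user", content="Summarize what an LLM provider does."),
    ]

    # Extra generation parameters are forwarded via **kwargs.
    reply = await provider.generate(messages, max_new_tokens=256, temperature=0.7)
    print(reply)


asyncio.run(main())
```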