bridge.services.huggingface.huggingface_provider module
LLM provider that wraps huggingface_hub’s InferenceClient to provide chat-style text generation for pipelines.
- class bridge.services.huggingface.huggingface_provider.HuggingFaceProvider(model='Qwen/Qwen3-8B', provider='featherless-ai')
Bases: LLMProvider
Hugging Face provider for chat-capable models using InferenceClient. Supports models compatible with the chat.completions API.
- Parameters:
model (str) – The Hugging Face model identifier to use for chat generation. Default is “Qwen/Qwen3-8B” (https://huggingface.co/Qwen/Qwen3-8B).
provider (HF_Provider) – The inference provider to use. Default is “featherless-ai”.
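A minimal instantiation sketch, assuming the class is importable from the module path above; both keyword arguments mirror the signature shown in the class header.

```python
from bridge.services.huggingface.huggingface_provider import HuggingFaceProvider

# The defaults documented above, written out explicitly for illustration.
provider = HuggingFaceProvider(
    model="Qwen/Qwen3-8B",      # any model compatible with the chat.completions API
    provider="featherless-ai",  # the Hugging Face inference provider to route through
)
```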
- async generate(messages, **kwargs)
Generate a chat-based response from the model.
- Parameters:
messages (list[ChatMessage]) – A list of chat messages forming the conversation history.
**kwargs – Additional generation parameters such as max_new_tokens and temperature.
- Returns:
The generated response from the model as a chat message.
- Return type:
ChatMessage
- Raises:
ValueError – If messages is empty.
RuntimeError – If the model response is missing required fields.
Exception – For any other errors during generation.
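A hedged usage sketch for generate(). The import path for ChatMessage and its role/content constructor fields are assumptions here; adjust them to where the type actually lives in your tree.

```python
import asyncio

from bridge.services.huggingface.huggingface_provider import HuggingFaceProvider
# Assumed import path for ChatMessage; not confirmed by this page.
from bridge.services.llm_provider import ChatMessage


async def main():
    provider = HuggingFaceProvider()  # defaults: Qwen/Qwen3-8B via featherless-ai

    # Conversation history; an empty list would raise ValueError.
    messages = [
        ChatMessage(role="system", content="You are a concise assistant."),
        ChatMessage(role="user", content="Summarize what an LLM provider does."),
    ]

    # Extra generation parameters are forwarded via **kwargs.
    reply = await provider.generate(messages, max_new_tokens=256, temperature=0.7)
    print(reply)


asyncio.run(main())
```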