Scaleway Now Available as an Inference Provider on Hugging Face for Fast, Secure AI Model Access
Scaleway is now a supported Inference Provider on the Hugging Face Hub, marking a significant step in expanding access to high-performance AI models. The integration lets developers run popular open-weight models such as gpt-oss, Qwen3, DeepSeek R1, and Gemma 3 directly from Hugging Face model pages through Scaleway's serverless inference service, with seamless support in both the Python and JavaScript client SDKs and no infrastructure to manage. You can explore Scaleway's model offerings at https://huggingface.co/scaleway and browse trending models powered by Scaleway at https://huggingface.co/models?inference_provider=scaleway&sort=trending.

Scaleway Generative APIs is a fully managed, serverless service that delivers access to cutting-edge AI models from leading research labs through simple API calls, with competitive pay-per-token pricing starting at €0.20 per million tokens. The service runs on secure infrastructure in Paris, France, ensuring data sovereignty and low latency for users across Europe.

The platform supports advanced capabilities including structured outputs, function calling, and multimodal processing of both text and images, and it serves text-generation as well as embedding models. Designed for production use, Scaleway delivers sub-200 ms time to first token, making it well suited to interactive applications and AI agents.

Using Scaleway as an inference provider is straightforward. In the Hugging Face UI, select Scaleway from the provider dropdown on a model page, where options are sorted by your preferences. In the client SDKs, specify the provider directly.
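The function-calling support mentioned above is exposed through the same OpenAI-compatible chat interface. Here is a minimal sketch; the `get_weather` tool and its schema are purely illustrative, and the request only runs when an `HF_TOKEN` environment variable is set:

```python
import os

from huggingface_hub import InferenceClient

# Illustrative tool definition: get_weather is a hypothetical function
# whose JSON schema tells the model which arguments it may produce.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

if os.environ.get("HF_TOKEN"):  # only issue the request when a token is configured
    client = InferenceClient(provider="scaleway", api_key=os.environ["HF_TOKEN"])
    completion = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[{"role": "user", "content": "What is the weather in Paris?"}],
        tools=tools,
        tool_choice="auto",
    )
    # If the model decides to call the tool, the arguments arrive here.
    print(completion.choices[0].message.tool_calls)
```

Whether the model actually emits a tool call depends on the model and prompt, so production code should also handle a plain text reply.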
For example, in Python using huggingface_hub (version 0.34.6 or later), you can call a model like this:

```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="scaleway",
    api_key=os.environ["HF_TOKEN"],
)

messages = [
    {
        "role": "user",
        "content": "Write a poem in the style of Shakespeare",
    }
]

completion = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=messages,
)

print(completion.choices[0].message)
```

A similar implementation is available in JavaScript using @huggingface/inference.

Billing is handled transparently. When using a direct API key from Scaleway, you are billed directly on your Scaleway account. When using a Hugging Face token for routed requests, you pay only the standard provider rates with no additional markup; Hugging Face may introduce revenue-sharing agreements with providers in the future.

Hugging Face PRO users receive $2 in monthly inference credits usable across all supported providers. Upgrading to PRO also unlocks benefits like ZeroGPU, Spaces Dev Mode, higher usage limits, and more. Free users get a limited inference quota; for sustained usage, upgrading is recommended.

Feedback is welcome. Share your experience or suggestions in the Hugging Face discussion thread at https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49.
