Hugging Face
Hugging Face is the main platform for sharing open AI models. It provides inference in two ways: Inference Providers and Inference Endpoints.
Inference Providers
Inference Providers is a serverless service powered by external inference providers, routed through Hugging Face, and billed per token.
You can create an access token on Hugging Face and set the priority of your providers in your settings, then reference the provider in your configuration:
name: My Config
version: 0.0.1
schema: v1
models:
  - name: deepseek
    provider: huggingface-inference-providers
    model: deepseek-ai/DeepSeek-V3.2-Exp
    apiKey: <YOUR_HF_TOKEN>
    apiBase: https://router.huggingface.co/v1
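The router at apiBase exposes an OpenAI-compatible API, so a request is an ordinary authenticated POST to /chat/completions. The sketch below only assembles such a request without sending it; the token value is a placeholder, and the helper name is our own.

```python
import json

HF_TOKEN = "<YOUR_HF_TOKEN>"  # placeholder: substitute your real access token
API_BASE = "https://router.huggingface.co/v1"

def build_chat_request(model: str, prompt: str) -> tuple[str, dict, bytes]:
    """Assemble (url, headers, body) for an OpenAI-compatible chat completion call."""
    url = f"{API_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = build_chat_request("deepseek-ai/DeepSeek-V3.2-Exp", "Hello")
```

The assembled request can then be sent with any HTTP client; per-token billing is handled by Hugging Face based on the provider that serves the model.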
Inference Endpoints
Inference Endpoints is a dedicated service that lets you run open models on dedicated hardware. It is a more advanced way to get inference from Hugging Face models, giving you more control over the whole process.
Before you can use Inference Endpoints, you need to create an endpoint. You can do this by going to Inference Endpoints and clicking on "Create Endpoint".
name: My Config
version: 0.0.1
schema: v1
models:
  - name: deepseek
    provider: huggingface-inference-endpoints
    model: <ENDPOINT_ID>
    apiKey: <YOUR_HF_TOKEN>
    apiBase: https://<YOUR_ENDPOINT_ID>.aws.endpoints.huggingface.cloud
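A dedicated endpoint typically serves the same OpenAI-compatible API, just at its own base URL. The sketch below assembles a request against such a base without sending it; the endpoint hostname, token, and model name "tgi" (a placeholder name that Text Generation Inference endpoints commonly accept) are all assumptions for illustration.

```python
import json

def chat_completions_request(api_base: str, token: str, model: str, prompt: str):
    """Assemble (url, headers, json_body) for a chat completion call
    against a dedicated endpoint's OpenAI-compatible API (assumed layout)."""
    url = api_base.rstrip("/") + "/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = chat_completions_request(
    "https://my-endpoint.aws.endpoints.huggingface.cloud",  # hypothetical endpoint
    "<YOUR_HF_TOKEN>",  # placeholder token
    "tgi",
    "Hello",
)
```

Because the hardware is dedicated, you pay for the running instance rather than per token, and the endpoint must be in a running state before requests succeed.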