Hugging Face · Enterprise

Everything Hugging Face gives your team.

Hugging Face Enterprise is a lot more than a private Hub. This page walks you through how it works in plain language: how seats work, what every seat already includes, and how you scale storage, inference, and training as you grow.

Governance isn't a control problem. It's a visibility problem.

By the time leadership defines an AI strategy, developers are already building with open models. Blocking access doesn't stop it, it just moves it to personal accounts where you can't see it. The real question isn't whether you can control AI, it's whether you can see it: which models are running, where, and on what data.

Open source is what makes governance possible, not what puts it at risk. A proprietary API is a black box that changes when the vendor decides. An open model is an artifact you own: pin the version, inspect it, run it in your own environment, and apply the same identity, audit, and approval controls you already use for everything else. AI stops being something you consume and becomes infrastructure you operate.

"Not your weights, not your brain."Andrej Karpathy

How it works

A common mix-up: people think you buy storage or inference separately. You don't. It all runs on seats. Here is the whole model in three steps.

Start with seats

Everyone who works in your org gets a seat. Seats are the foundation: they decide who has access and which plan you're on, Enterprise or Enterprise Plus.

Every seat includes a lot

Each seat already comes with private storage, monthly inference credits, faster compute, and the full governance and security suite. You're not buying features one by one, they arrive together.

Scale up when you need to

Need more storage, dedicated inference, or GPUs for a big training run? Add them on top of your plan as you grow. Nothing is locked away, it scales with you.

Enterprise

$50 /user/mo

The governed workspace your team needs to build safely on open models.

★ Most complete

Enterprise Plus

$200 /user/mo

Everything in Enterprise, plus the security, provisioning, and support larger orgs require.

What every seat includes

On either plan, every seat comes with all of this. You don't need to use it all, most teams switch on what fits their stack and grow into the rest.

Govern & secure your AI

This is the heart of Enterprise. Connect your identity provider, see every action your team takes, and decide exactly who can touch what. Open models stop being a blind spot and start being something you manage like any other part of your stack.

Storage built for models

Every seat includes 1 TB of private storage, backed by Xet so repeated data is stored once and uploads only move the bytes that changed. Mount any repo as local files with hf-mount, and keep data in the US or EU for residency. No egress fees, ever.

Storage Buckets (Xet) →
Storage regions (US / EU) →

Credits for inference

Each seat comes with monthly credits you spend on inference. Run hundreds of open models serverless through one OpenAI-compatible endpoint, billed to your org with spend limits and provider controls you set. The same credits can also cover pay-as-you-go storage when you need it.

Inference Providers →

Compute for builders

Higher-priority ZeroGPU for Spaces, on-demand GPU upgrades when an app needs more muscle, and Dev Mode for a fast, local-feeling workflow. Your team spends less time waiting on hardware and more time shipping.

Scale up as you grow

When you outgrow what's included, add more without leaving your plan. This is where most of the value shows up at scale.

More storage, at commit prices

Storage gets much cheaper as a committed volume, dropping toward $9 per TB at petabyte scale, and there are never any egress fees. That's where the savings against S3, GCS, and Azure get large. Put in your numbers and see for yourself.

Open the storage cost calculator →

Dedicated inference

When you need reserved capacity and predictable latency, deploy any model to its own managed endpoint that autoscales with your traffic. Great for production apps where the serverless credits aren't enough on their own.

Inference Endpoints →

Bigger GPUs on demand

Spin up larger GPU hardware for heavier Spaces and internal apps whenever you need it, then scale back down. You only pay for what you run, and it's all in the same place as your models and data.

Spaces GPU options →

Beyond the platform

For the most demanding security and training needs, we go further than the standard product. These are the offerings we build and run with you.

In active development

Model Gateway

For platform & security teams governing model usage

A self-hosted proxy and cache for the Hub that brings supply-chain security to how your team pulls models and datasets. Think "Artifactory for AI/ML": audit every download, enforce license and file-format policies, set allow and block rules per user and per repo, and scan for malware with the tools you already trust, all inside your own network.

A custom product we build with you.

With NVIDIA · waitlist

Training Cluster

For teams training or fine-tuning models at scale

Train your own models on reserved GPU capacity, H100 and H200, through NVIDIA DGX Cloud Lepton. Size a cluster to your run and pay only for the time you actually use, with no bills for idle hardware, plus white-glove support across North America, Europe, and East Asia.

Join the waitlist →

Custom

Custom security & deployment

For orgs with specific compliance requirements

On-prem deployments, private networking, and bespoke governance and compliance. If your stack has requirements the standard product doesn't cover yet, we'll scope them and build them with you rather than ask you to compromise.

Custom, built together.

See what fits? Let's talk.

Send your setup and we'll map the exact plan, what's included, and any scale-up or custom pieces with you.

Talk to Adrian →