OpenAI’s massive 120-billion-parameter open-weight model can now be used entirely for free, no GPU required, through DuckDuckGo’s privacy-first chatbot Duck.ai. The surprise addition, first spotted by Windows Central’s reviewer, gives anyone with a web browser instant access to gpt-oss:120b—a model so large that running it locally demands workstation-class hardware and north of 60 GB of VRAM. Instead of wrestling with quantized weights and multi-GPU setups, Windows users can simply open duck.ai, select the model from a dropdown, and start chatting.

The move signals a shift in how open-weight frontier models reach end users. DuckDuckGo’s gateway anonymizes all traffic, strips metadata, and enforces contractual no-training obligations on model providers, making it one of the more trustworthy hosted options available today. Yet as with any cloud-based AI service, convenience comes with limitations. The hosted interface does not expose the model’s chain-of-thought reasoning, file uploads are absent, and—crucially—the presence of gpt-oss:120b in Duck.ai’s public model roster has not been confirmed in DuckDuckGo’s own documentation at the time of this writing.

The technical muscle behind gpt-oss

OpenAI released the gpt-oss family under the Apache 2.0 license, offering two flavors: the larger 120-billion-parameter variant and a more compact 20-billion-parameter sibling. The 120b edition uses a Mixture-of-Experts (MoE) architecture with roughly 117 billion total parameters, activating only about 5.1 billion per token. This design keeps compute costs down despite the enormous weight count. Both models support context windows up to 128,000 tokens and are tuned for chain-of-thought reasoning and tool use.

Quantization plays a critical role in making these models deployable. OpenAI’s guidance points to MXFP4 format for the 120b version, bringing the download size to around 65 GB. In theory, that fits on a single high-end professional GPU such as an NVIDIA H100 or a 60–80 GB class card. Consumer hardware, even a pair of RTX 5090s, falls short—hence the appeal of a hosted inference solution. The 20b model, by contrast, can run on a 16 GB consumer GPU with appropriate quantization, making it a viable local option for many Windows users.

Duck.ai’s privacy play

Duck.ai acts as an anonymizing proxy between users and third-party model providers. When you send a prompt, it first hits DuckDuckGo’s servers, which strip identifying headers and forward the request to the model endpoint. The provider sees no persistent user ID, no IP address that can be traced back to you, and no device fingerprint. DuckDuckGo also negotiates contractual terms that prevent providers from storing conversations or using them for model training.

On the client side, Duck.ai stores your recent chats locally on your device—no account required. A prominent “Fire Button” clears that local data instantly. The combination has earned Duck.ai a reputation as one of the few genuinely privacy-respecting AI chat services. For Windows users who want to test a large model without feeding their prompts into an opaque training pipeline, Duck.ai removes a significant layer of risk.

The Windows Central hands-on: fast, free, but missing pieces

The Windows Central reviewer’s experience with gpt-oss:120b on Duck.ai was overwhelmingly positive in terms of raw performance. Responses came back as quickly as a local 20b model running on an RTX 5090, which makes sense since inference happens on DuckDuckGo’s infrastructure, not the user’s machine. The web app feels polished, with a familiar ChatGPT-like sidebar for recent chats and extensive settings to customize response style. It supports multiple models, not just OpenAI’s latest, and switching between them requires only a couple of clicks.

Two notable absences stood out. First, Duck.ai does not reveal the model’s chain-of-thought. Tools like Ollama or LM Studio, when running gpt-oss locally, display the internal “thinking” steps the model uses to arrive at an answer. The Windows Central author found this visibility valuable for debugging—it can reveal where the model’s reasoning went astray. Duck.ai’s interface only delivers the final reply, a deliberate design choice but one that will frustrate power users and developers.

Second, file uploads are not supported for gpt-oss sessions. Other models on the platform allow images, but document or code file uploads are absent across the board. DuckDuckGo likely views the omission as a privacy safeguard—receiving user files creates additional data-sharing obligations—but it limits the service’s utility for tasks like summarizing reports, analyzing spreadsheets, or reviewing codebases.

Verifying the gpt-oss:120b claim

While Windows Central’s report is credible and grounded in firsthand testing, the community discussion raises a legitimate caution: Duck.ai’s public help pages, at last check, do not list gpt-oss:120b among the supported models. The documented roster includes Anthropic’s Claude, Meta’s Llama family, Mistral, and OpenAI’s more mainstream offerings like GPT-4o Mini. Gpt-oss does not appear in any publicly indexed support article.

This gap could simply mean the rollout is still underway. DuckDuckGo frequently tunes its model lineup, and static documentation often lags behind live experiments. The Windows Central reviewer accessed the model through the in-app selector, so the most reliable way to confirm availability is to visit duck.ai directly and check the sidebar. If gpt-oss:120b appears, you’re good to go; if not, other powerful options—likely including gpt-oss:20b on paid tiers down the line—may fill the void.

Until DuckDuckGo issues an official statement or updates its documentation, treat the 120b availability as provisionally true but subject to change. That’s a prudent stance for anyone relying on a specific model version for a project.

Hosted vs. local: which path should you choose?

The arrival of gpt-oss on Duck.ai sharpens a perennial trade-off for Windows users: convenience and privacy versus transparency and control.

Run locally (Ollama, LM Studio, Windows AI Foundry) if:
- You need to see chain-of-thought traces to audit reasoning or debug outputs.
- You handle regulated data that must never leave your hardware.
- You want to experiment with model quantization, fine-tuning, or custom tool integrations.
- You own a GPU with at least 16 GB of VRAM for the 20b model, or a workstation-class setup for the 120b.

Use Duck.ai or another hosted gateway if:
- Your hardware cannot comfortably run the model size you want.
- You prioritize instant, frictionless access over feature depth.
- Privacy is important but you are comfortable with a reputable intermediary enforcing data-removal contracts.
- You are casually exploring the model’s capabilities without long-term reliance.

For many Windows enthusiasts, the answer will be “both.” You can try the huge model on Duck.ai to see what it can do, then switch to a local 20b deployment for tasks that demand full introspection.

What this means for the Windows AI landscape

OpenAI’s release of open-weight models was always about more than just shipping code. By publishing under Apache 2.0 and providing clear deployment guidance, OpenAI invited the ecosystem—cloud providers, tool vendors, and privacy platforms—to build novel access pathways. DuckDuckGo is the first major consumer-facing privacy brand to step up, and its move may pressure competitors to offer similar anonymized gateways.

For Windows users, the immediate impact is lower barriers to entry. Not long ago, sampling a 120-billion-parameter model meant either owning a server rack or paying for API credits. Today, a student on a budget laptop can open a browser tab and hold a sophisticated, private conversation with one of the most capable open models available. The friction has dropped from “impossible” to “instant”—at least for casual exploration.

That said, the trade-offs are real. The lack of reasoning visibility may lead some to misattribute errors to the model rather than to a missing step in the thought chain. And the absence of file-upload support reminds us that privacy guardrails, however valuable, can also fence off entire use cases. The ideal middle ground—a hosted service that optionally exposes chain-of-thought and lets users upload files into an ephemeral, non-training sandbox—does not yet exist, but Duck.ai has shown that the demand is there.

A practical checklist before you jump in

If you’re ready to try gpt-oss:120b on Duck.ai, a few steps will ensure you get the most out of the experience while staying safe:

  • Check the model selector: Open duck.ai, look at the model dropdown or sidebar, and confirm that gpt-oss:120b is listed. If it’s not, you’ll know the rollout hasn’t reached your region yet.
  • Review the privacy terms: DuckDuckGo’s help pages outline retention policies and anonymization guarantees. Read them if you plan to discuss sensitive topics. They are among the strongest in the industry, but contractual terms can evolve.
  • Understand the limitations: No chain-of-thought, no file uploads. If your workflow requires those features, plan a local deployment in parallel.
  • Test the free tier first: Duck.ai requires no account, so you lose nothing by experimenting. If the model suits your needs, you might later consider a paid subscription for access to even newer models when they become available.

For those who’d rather run on their own hardware, start with the 20b variant. Download it from Hugging Face or through Ollama’s model library, and ensure your GPU drivers and CUDA toolkit support the required quantization runtimes. The Windows AI Foundry project provides additional tooling for the Windows-on-Arm ecosystem if you’re on a Snapdragon X device.

The big picture is clear: the era of “big models only in the cloud” is receding. With DuckDuckGo’s hosted gatekeeper and OpenAI’s open-weight muscle, Windows users now have a genuine choice between zero-cost anonymous access and full local control. Which one you pick will depend on whether you value convenience or transparency more—but for the first time, both doors are open.