Tenstorrent has revealed a new kind of desktop AI machine aimed squarely at developers and teams who want serious on-prem performance without building out a server room. Called the TT-QuietBox 2 (Blackhole), it’s a liquid-cooled AI workstation built around the RISC-V architecture, designed to run large models locally, privately, and quietly—starting at $9,999.
At the heart of TT-QuietBox 2 is Tenstorrent’s Blackhole accelerator card, a purpose-built AI compute engine powered by 16 large RISC-V cores and up to 32 GB of GDDR6 memory per card. The workstation can be configured with up to four Blackhole cards, bringing total accelerator memory to as much as 128 GB of GDDR6. On top of that, the system includes 256 GB of onboard DDR5 system memory, giving it the capacity to handle demanding AI workloads without constantly bouncing data to remote servers.
Tenstorrent’s pitch is simple: the people building and deploying AI should be able to see, control, and own the entire stack—from silicon to compiler to kernels. TT-QuietBox 2 is positioned as a practical alternative for developers, small-to-medium businesses, research teams, and regulated environments that need local deployment but don’t want racks, specialized power, or the operational overhead of a dedicated data center.
One of the headline claims is that TT-QuietBox 2 can run models of up to 120 billion parameters directly at your desk. Tenstorrent says GPT-OSS 120B runs fully on-device, enabling private inference with no cloud token limits and no need to send sensitive prompts or data off-site. For other popular AI workloads, the company cites Llama 3.1 70B running at 476.5 tokens per second and Qwen3-32B operating as a private coding agent that can reason across large codebases locally.
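To put the 120-billion-parameter claim next to the quoted memory figures, here is a rough back-of-the-envelope sketch. It counts weights only and ignores KV cache, activations, and runtime overhead. The bit widths are illustrative assumptions, not precisions Tenstorrent has stated for any particular model.

```python
# Rough check: do the weights of a 120B-parameter model fit in the
# accelerator memory quoted in the article? Weights only; KV cache,
# activations, and runtime overhead are deliberately ignored.

CARDS = 4          # maximum Blackhole cards per workstation (from the article)
GB_PER_CARD = 32   # GDDR6 per card in GB (from the article)
PARAMS = 120e9     # nominal parameter count (from the article)


def weight_footprint_gb(params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


total_gb = CARDS * GB_PER_CARD  # 4 x 32 = 128 GB of GDDR6

# The bit widths below are illustrative assumptions, not vendor figures.
for bits in (16, 8, 4):
    need = weight_footprint_gb(PARAMS, bits)
    verdict = "fits" if need <= total_gb else "does not fit"
    print(f"{bits:>2}-bit weights: {need:6.1f} GB -> {verdict} in {total_gb} GB")
```

Under these assumptions, 16-bit weights (240 GB) would exceed the 128 GB of GDDR6, while 8-bit (120 GB) and 4-bit (60 GB) weights would fit. That suggests a model of this size would be running in a reduced-precision format.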
The workstation is also presented as a strong fit for creative and multimodal work. Tenstorrent notes that image generation with Flux and video synthesis with Wan 2.2 can be done entirely locally, which is increasingly important for teams that need to keep intellectual property and customer assets off external servers.
In scientific and research computing, Tenstorrent points to biomolecular modeling performance with Boltz-2. One example it provides is predicting the structure of a 686-amino-acid protein in 49 seconds on a single Blackhole processor, a workload that could take a modern CPU around 45 minutes. The company also says the system can run four protein structure predictions in parallel for a 4x throughput increase, and frames this as competitive with top-tier workstation GPUs at a lower overall cost.
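The Boltz-2 figures can be turned into a speedup and an hourly throughput with simple arithmetic. The sketch below uses only the numbers quoted above. It assumes the four parallel predictions scale perfectly and that each takes the same 49 seconds, which the article's "4x" figure implies but does not spell out.

```python
# Arithmetic on the Boltz-2 figures quoted in the article.

CPU_SECONDS = 45 * 60   # "around 45 minutes" on a modern CPU
CARD_SECONDS = 49       # one prediction on a single Blackhole processor
PARALLEL_JOBS = 4       # predictions run side by side in the full system


def predictions_per_hour(seconds_per_prediction: float, parallel_jobs: int = 1) -> float:
    """Throughput, assuming every job takes the same time and scaling is perfect."""
    return parallel_jobs * 3600 / seconds_per_prediction


speedup = CPU_SECONDS / CARD_SECONDS  # 2700 / 49, roughly 55x

print(f"Single-card speedup over CPU: {speedup:.1f}x")
print(f"CPU:        {predictions_per_hour(CPU_SECONDS):6.1f} predictions/hour")
print(f"One card:   {predictions_per_hour(CARD_SECONDS):6.1f} predictions/hour")
print(f"Four cards: {predictions_per_hour(CARD_SECONDS, PARALLEL_JOBS):6.1f} predictions/hour")
```

On these numbers a single card is roughly 55 times faster than the CPU baseline, and four cards running in parallel would complete close to 294 predictions per hour if scaling were ideal.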
For developers who don’t want to be locked into a narrow model list, Tenstorrent emphasizes TT-Forge, its open-source AI compiler. TT-Forge is designed to run models from common ecosystems such as PyTorch, ONNX, TensorFlow, JAX, and PaddlePaddle, compiling them to run directly on the hardware. The message is clear: if your model runs in a standard framework, the goal is for it to run on QuietBox 2.
Under the hood, the system’s four Blackhole ASICs operate together as a unified internal mesh inside a desk-friendly chassis. Tenstorrent states the configuration delivers 480 Tensix cores and up to 2,654 TFLOPS at BlockFP8 precision. The architecture is designed to avoid common throughput limitations by moving tensors efficiently through on-chip memory, reducing reliance on the kinds of memory patterns that can bottleneck conventional designs. Tenstorrent also highlights that its use of GDDR6 plus on-chip SRAM sidesteps today’s high-bandwidth memory supply constraints that have helped push AI hardware prices upward.
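For a sense of scale, the system-level totals can be broken down per ASIC and per core. This is only a sketch: it assumes the quoted totals are split evenly across the four Blackhole ASICs, which the article does not state, and it treats the "up to" TFLOPS figure as a peak rather than sustained performance.

```python
# Per-ASIC and per-core breakdown of the quoted system totals.
# Assumption: cores and peak throughput are divided evenly across the ASICs.

ASICS = 4               # Blackhole ASICs in the system (from the article)
TENSIX_TOTAL = 480      # Tensix cores in the full configuration (from the article)
PEAK_TFLOPS = 2654      # "up to" TFLOPS at BlockFP8 (from the article)

tensix_per_asic = TENSIX_TOTAL // ASICS      # cores per ASIC
tflops_per_asic = PEAK_TFLOPS / ASICS        # peak TFLOPS per ASIC
tflops_per_core = PEAK_TFLOPS / TENSIX_TOTAL # peak TFLOPS per Tensix core

print(f"Tensix cores per ASIC:       {tensix_per_asic}")
print(f"Peak TFLOPS per ASIC:        {tflops_per_asic:.1f}")
print(f"Peak TFLOPS per Tensix core: {tflops_per_core:.2f}")
```

With an even split, that works out to 120 Tensix cores and about 663.5 peak TFLOPS per ASIC, or roughly 5.5 peak TFLOPS per Tensix core at BlockFP8.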
While the performance story is important, TT-QuietBox 2 is also built to be easier to deploy than traditional enterprise AI infrastructure. It runs Ubuntu 24.04 out of the box, plugs into a standard 120V wall outlet, and is intended to operate without racks, specialized electrical work, or a dedicated server room.
Another major focus is transparency and full-stack control. Tenstorrent says every layer of the QuietBox 2 software is open source, aiming to give developers visibility rather than a black-box experience. That includes TT-Forge for compilation and optimization workflows, TT-Metalium as a low-level AI SDK for kernel control and deterministic execution, and TT-LLK for low-level kernel software. This open approach is positioned as especially valuable for sovereign AI, regulated industries, and research institutions that need to verify exactly how data is handled.
Tenstorrent is also leaning into developer experience and day-to-day usability. Alongside Ubuntu 24.04, the workstation ships pre-configured with the complete open-source stack and TT-Studio to speed up setup and iteration. The company claims engineering improvements have cut idle power consumption and heat output by about 50% compared to previous generations, and says the liquid-cooled chassis is designed to stay quiet under sustained heavy workloads on a desk, in keeping with the “QuietBox” name.
TT-QuietBox 2 is scheduled to ship globally in Q2 2026, with a starting price of $9,999.






