Run DeepSeek-R1 70B, LLaMA 3, and Qwen at production speed. Performance matches an 8× RTX 5090D server at roughly one-third the cost. Purpose-built for teams deploying LLMs on-premises.
For teams that need to run 70B+ parameter models locally but can't justify $120K+ for H100/5090D clusters.
DeepSeek-R1 70B FP16 inference: 1,213 tokens/s at peak — virtually identical to RTX 5090D 32G × 8 at 1,211 tokens/s.
$44,800 per server vs $120,000+ for comparable 5090D setups. Same inference throughput at roughly 63% lower CapEx.
No 8-12 week lead times. Units are built and ready to ship. FOB Hong Kong, global logistics support.
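The price and throughput figures quoted above can be combined into a cost per unit of peak throughput. The snippet below is a minimal sketch that uses only the numbers on this page; it treats the 5090D price as exactly $120,000 (the page states "$120,000+", so the real gap may be larger), and the variable and function names are illustrative.

```python
# Figures quoted on this page. The 5090D price is a lower bound ("$120,000+").
B70_PRICE_USD = 44_800
B70_PEAK_TOKENS_PER_S = 1_213
RTX5090D_PRICE_USD = 120_000
RTX5090D_PEAK_TOKENS_PER_S = 1_211


def usd_per_token_per_s(price_usd: float, tokens_per_s: float) -> float:
    """Purchase price divided by peak aggregate throughput."""
    return price_usd / tokens_per_s


b70_cost = usd_per_token_per_s(B70_PRICE_USD, B70_PEAK_TOKENS_PER_S)
rtx_cost = usd_per_token_per_s(RTX5090D_PRICE_USD, RTX5090D_PEAK_TOKENS_PER_S)
capex_saving = 1 - B70_PRICE_USD / RTX5090D_PRICE_USD

print(f"B70 server:   ${b70_cost:.2f} per token/s")                 # $36.93
print(f"5090D server: ${rtx_cost:.2f} per token/s (lower bound)")   # $99.09
print(f"CapEx saving: {capex_saving:.1%} or more")                  # 62.7%
```

This compares purchase price only. Power, cooling, and rack space are not included.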
Real-world throughput across concurrency levels. B70 matches 5090D across the board.
Actual terminal and monitoring screenshots from our FG4812T-G4 server running DeepSeek-R1-Distill-Llama-70B in FP16. All benchmarks were captured live on the machine.
📸 All screenshots captured on the actual FG4812T-G4 server running DeepSeek-R1-Distill-Llama-70B in FP16 mode · Benchmark tool: conbench serve
Want a live demo?
| Component | Specification |
|---|---|
| Chassis | FG4812T-G4 · 4U Rackmount |
| CPU | 2× Intel Xeon Platinum 8568Y+ · 48 Cores / 96 Threads per CPU · 2.3 GHz · 350W TDP |
| Memory | 128 GB DDR5 Registered ECC (RECC) |
| Storage | 3.84 TB U.2 NVMe SSD · Enterprise Grade |
| GPU | 8× Intel Arc Pro B70 32GB GDDR6 · Dual-Slot Turbine |
| Total VRAM | 256 GB |
| Power Supply | 4× 2,700W Redundant PSU (10,800W total) |
| Form Factor | 4U · Standard 19" Rack |
Actual hardware photographed at our facility. What you see is what you get.
Gunnir turbine module · TSMC N5 · Xe2 Architecture
DeepSeek-R1 70B, LLaMA 3 70B, Qwen-72B — full FP16 precision with 256GB total VRAM. 1,200+ tokens/s at peak concurrency.
Serve 50-200 concurrent users on a single server for enterprise chatbot deployments. Ideal for internal AI assistants.
Fine-tune models up to 30B parameters. Experiment with quantized 70B+ models. Full control over your compute.
Hardware AV1/H.265 encoding across 8 GPUs. Batch image generation, video transcoding, and media workflows.
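For a rough sense of how the LLM inference and chatbot figures above fit together, the sketch below estimates the FP16 weight footprint of a 70B model against the 256 GB of total VRAM, and the average per-user rate if the quoted peak throughput were shared evenly. It is a back-of-the-envelope calculation under stated assumptions: it ignores KV cache, activations, and runtime overhead, it assumes peak throughput is actually reached at the given concurrency, and all names are illustrative.

```python
PARAMS_BILLION = 70          # e.g. DeepSeek-R1-Distill-Llama-70B
BYTES_PER_PARAM_FP16 = 2     # FP16 stores each weight in 2 bytes
NUM_GPUS = 8
VRAM_PER_GPU_GB = 32
PEAK_TOKENS_PER_S = 1_213    # peak aggregate throughput quoted on this page

# 1 billion parameters at 2 bytes each is 2 GB (decimal), weights only.
weights_gb = PARAMS_BILLION * BYTES_PER_PARAM_FP16
total_vram_gb = NUM_GPUS * VRAM_PER_GPU_GB
headroom_gb = total_vram_gb - weights_gb


def per_user_tokens_per_s(concurrent_users: int) -> float:
    """Average rate per user, assuming an even split of peak throughput."""
    return PEAK_TOKENS_PER_S / concurrent_users


print(f"FP16 weights: {weights_gb} GB of {total_vram_gb} GB VRAM")
print(f"Left for KV cache and overhead: {headroom_gb} GB")
for users in (50, 200):
    print(f"{users} users: about {per_user_tokens_per_s(users):.1f} tokens/s each")
```

Actual per-user speed depends on prompt length, context size, and the serving stack, so these averages are only a starting point for capacity planning.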
Every unit is assembled, tested, and verified before shipping. Here's a look at our real workshop.
We accept irrevocable L/C at sight — the standard for secure international B2B transactions. T/T also available.
Units ship from Hong Kong. 7-day delivery to most Asian markets, 14-21 days to US/EU. Freight forwarding support available.
Commercial invoice, packing list, certificate of origin, and bill of lading provided. Authorized Gunnir dealer.
Fill in the form below — we respond within 24 hours.
📞 Contact Us Directly
We respond within 24 hours · L/C or T/T payment · Global shipping