TensorPool

You focus on the models.

We handle the infrastructure.

Train foundation models without thinking about infrastructure.

A CLI for GPUs

CLI COMMANDS
tp cluster create -t 8xB200 -n 4
Multi-node GPU clusters on demand. Always 3.2 Tb/s InfiniBand.
tp storage create -s 10000
Storage volumes for clusters. 10 GB/s reads & 5 GB/s writes. Faster than local NVMe.
tp job push train.toml
Kick off training jobs with a Git-style interface. Never SSH into an instance again.
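Taken together, these commands compose into a simple end-to-end workflow. The sketch below reuses only the commands and flags shown above; the cluster type, storage size, and train.toml file are illustrative.

# Provision a 4-node cluster of 8xB200 machines on InfiniBand
tp cluster create -t 8xB200 -n 4

# Create a storage volume for datasets and checkpoints (size flag as in the example above)
tp storage create -s 10000

# Push the training job defined in train.toml; no SSH required
tp job push train.toml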

Why TensorPool?

FEATURES

High Performance Storage

Storage infrastructure engineered for distributed training. Optimized for high-throughput & low-latency workloads with sustained aggregate performance: up to 300 GB/s reads, 150 GB/s writes, 1.5M read IOPS, 750k write IOPS.

Git-style CLI

Train with no orchestration burden. Manage experiments with a familiar push/pull paradigm that eliminates idle GPU time. Standard SSH access is also available.

Reliability

Before delivering GPUs, TensorPool conducts an extensive burn-in process to ensure GPU reliability.

Capacity

TensorPool provides on-demand multi-node clusters. We do this by scheduling & bin-packing users onto TensorPool-operated clusters run in partnership with compute providers (Nebius, Lightning AI, Verda, Lambda Labs, Google Cloud, Microsoft Azure, and others).

Frequently Asked Questions