"The hardest part of running Claude Managed Agents isn't the model. It's everything underneath it."
A great build story on running long-lived Claude Managed Agents with Tensorlake
Covers suspend/resume, persistent state, checkpointing, branching workflows, and why state matters more than compute for many agent workloads.
pub.towardsai.net/building-long-…
Terminal-Bench 2.1 oracle, all 89 tasks: Tensorlake hits a perfect 1.000 — and runs ~1.7× faster than the next provider at the same concurrency.
Why oracle = 1.0 is the only honest baseline, and how we got there:
tensorlake.ai/blog/accelerat…
Test driving our iOS app. This shell is a PTY session that you can reattach to anytime you open your phone or iPad!
Beyond running shells, we built some cool features in the app that extend what builders can do on iOS devices. If you want a TestFlight invite, please DM me!
Soon you won't have to leave a party to check in on your coding agents!
@tensorlake on iOS coming soon! The sandbox shell is powered by libghostty so the editing experience should be really good.
He built a research agent comparing Pinecone vs Weaviate pricing across 7 pages. Then he hit it with two suspensions and a full machine restart, mid-run. The state survived. It resumed right where it left off. What he measured: create ~800ms, resume ~400ms, suspend ~200ms.
That's the whole idea behind a sandbox-native cloud for agents. Pause, resume, no lost work.
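A minimal sketch of the resume-where-you-left-off pattern described above, written at the application level. It is not the Tensorlake API: a sandbox suspend preserves the whole machine without any code like this. The function name `run_research`, the `stop_after` parameter, and the JSON state layout are all hypothetical and exist only for illustration.

```python
import json
import os
import tempfile


def run_research(pages, state_path, stop_after=None):
    """Process pages in order, persisting progress after each one.

    stop_after (hypothetical) simulates a suspension or restart after
    that many pages have been processed in this call.
    """
    # Load prior progress if a checkpoint exists; otherwise start fresh.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"next_index": 0, "notes": []}

    processed = 0
    while state["next_index"] < len(pages):
        if stop_after is not None and processed >= stop_after:
            return state  # interrupted: progress is already on disk
        page = pages[state["next_index"]]
        state["notes"].append(f"summary of {page}")  # stand-in for real work
        state["next_index"] += 1
        # Write to a temp file, then rename, so an interruption mid-write
        # does not leave a half-written checkpoint behind.
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, state_path)
        processed += 1
    return state


pages = [f"page-{i}" for i in range(7)]
state_path = os.path.join(tempfile.mkdtemp(), "agent_state.json")

run_research(pages, state_path, stop_after=3)  # first interruption
run_research(pages, state_path, stop_after=2)  # second interruption
final = run_research(pages, state_path)        # runs to completion
```

Each call picks up at `next_index`, so the seven pages are processed exactly once across the three runs.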
We are happy with where the architecture of Tensorlake is right now. It gives us a foundation to go in many directions in the future to help engineers who need compute for many different use cases:
1. CAS and streaming file systems help with GPUs.
2. Multi-Driver Dataplane architecture helps us run Firecracker/Cloud Hypervisor on bare-metal machines, and gVisor-based instances in VMs of neo-clouds like Nebius and Coreweave.
3. mTLS-based outbound Dataplane protocol allows us to onboard new BYOC users in under 30 minutes.
4. Multi-Raft-based Scheduler scales the control plane to 5 million sandboxes easily, and can burst up to 1–2 million per project.
5. Tunneling infrastructure through proxy and Dataplane enables computer-use use cases such as browser agents.
We have a ways to go before we’re at the top of the hill, but the foundation is being laid nicely.
Tensorlake sandboxes run on a block-based filesystem that sits behind the virtio-blk interface in KVM. On a representative real-world workload, such as running a database like SQLite, it's ~2–4x faster than Modal, E2B, and Daytona sandboxes.
We focused on filesystems because coding agents compile a lot of code inside sandboxes, and we want users to get work done faster. We also believe a lot of the apps these agents generate will end up running in sandboxes, so the filesystem matters more over time.
As we've been working on launching GPUs, we learned the same filesystem doesn't fit GPU workloads, and the reason is the economics of the market. GPU capacity is heavily constrained, so we'll likely have to run clusters across more than one neo-cloud or hyperscaler and move workloads between them as capacity opens up. A block-based filesystem makes that slow, because shifting a workload across clouds means shipping the whole image every time.
So we built a content-deduplicated filesystem: instead of moving the entire image, we store and transfer only the parts that are actually new. The space efficiency is large: adding a fresh ~7 GiB PyTorch image only costs ~2 GiB of new data, because 67–74% is already in the store.
It also deduplicates files added to a running VM when it's snapshotted to create new VMs. If the same file exists across multiple sandboxes, you pay for exactly one copy; every other copy is free.
That lets us deliver near-instant startups and add GPU capacity to Tensorlake very quickly as demand grows. This will hit production soon when we launch GPUs!
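The idea of storing and transferring only the new parts can be sketched with a small content-addressed chunk store. This is an illustration under stated assumptions, not Tensorlake's implementation: fixed-size chunking, SHA-256 digests, the tiny `CHUNK_SIZE`, and the `ChunkStore` class are all hypothetical choices made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for illustration; a real store would use far larger chunks


class ChunkStore:
    """Hypothetical content-addressed store: each unique chunk is kept once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def add_image(self, data: bytes):
        """Store an image; return (manifest, bytes_newly_stored)."""
        manifest = []
        new_bytes = 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks the store has never seen cost space or transfer.
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            manifest.append(digest)
        return manifest, new_bytes

    def read_image(self, manifest):
        """Rebuild an image from its ordered list of chunk digests."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()
base = b"AAAABBBBCCCCDDDD"    # stand-in for an existing image
update = b"AAAABBBBCCCCEEEE"  # new version sharing three of four chunks

m1, new1 = store.add_image(base)
m2, new2 = store.add_image(update)
```

Adding `update` stores only the one chunk that differs from `base`. For scale, 67–74% reuse on a ~7 GiB image leaves roughly 1.8–2.3 GiB of new data, which is consistent with the ~2 GiB figure quoted above.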
The table below shows incremental space efficiency as users add new versions of PyTorch to our image registry.
If thousands of rollouts all need the same starting world, repeatedly rebuilding that world can become surprisingly expensive.
Good deep dive into snapshots, forks, and why statefulness matters for coding agents.
medium.com/@sebuzdugan/sn…
Dev Deputies is a new open-source control plane for coding agents. Unlike terminal CLIs, these agents run in cloud sandboxes that wake up when needed and go to sleep when they're done working.
Each thread on @DevDeputies runs on isolated sandboxes, unlocking unlimited scale for coding.
New sandbox provider in @DevDeputies!
🥁🥁🥁
@tensorlake!
I was chatting with @diptanu and saw that they met all of my criteria from yesterday's "What Makes a Good Sandbox for Background Agents" post, so I added it!
Parallel agents don't automatically mean efficient agents.
If every worker rebuilds the same environment, you're still paying setup cost N times.
Snapshots + forks change the equation.
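A rough cost model for the point above, as a sketch. The function names and every number here (100 workers, 120 s setup, 1 s fork, 30 s task) are hypothetical and chosen only to make the arithmetic concrete; they are not measurements.

```python
def rebuild_cost(workers, setup_s, task_s):
    """Total compute-seconds when every worker rebuilds the environment."""
    return workers * (setup_s + task_s)


def fork_cost(workers, setup_s, fork_s, task_s):
    """Total compute-seconds when setup runs once, is snapshotted,
    and each worker starts from a fork of that snapshot."""
    return setup_s + workers * (fork_s + task_s)


# Hypothetical inputs for illustration only.
workers, setup_s, fork_s, task_s = 100, 120, 1, 30

rebuild_total = rebuild_cost(workers, setup_s, task_s)
fork_total = fork_cost(workers, setup_s, fork_s, task_s)
```

With these inputs the rebuild path spends 15,000 compute-seconds and the fork path 3,220, because setup is paid once instead of N times.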
Interesting write-up on a pattern we're seeing more often in production agent systems.
Full breakdown (the architecture, the real code, and the 3 mistakes that cost me 3 rebuilds) is here:
medium.com/@yadavdivy296/…
If you're building anything with multiple AI agents, check this before you touch anything else.
The future bottleneck isn't execution. It's preserving and reusing work.
Nice to see people are building agents using stateful Tensorlake sandboxes
medium.com/@rohanmistry23…
Most sandboxes can execute code.
Increasingly, the challenge is preserving state across long-running agent workflows.
Installed packages, generated files, checkpoints, intermediate results, session history.
This walkthrough explores how stateful sandboxes enable a code interpreter to suspend, resume, snapshot, and fork instead of starting over each time.
Great writeup from the community 👏
medium.com/@rohanmistry23… #AIAgents #CodeInterpreter #Sandboxes
Importing public and private Docker Hub images into Tensorlake is now supported in both the Python and TypeScript SDKs, bringing the same functionality previously available in the CLI to programmatic workflows.
docs.tensorlake.ai/sandboxes/imag…