Looks like there's one feature missing from this that I care about: I'd like more finely grained control over what outbound internet connections code running on the box can make.
As far as I can tell it's all or nothing right now:
this.ctx.container.start({
enableInternet: false,
});
I want to run untrusted code (from users or LLMs) in these containers, and I'd like to avoid someone malicious using my container to launch attacks against other sites. As such, I'd like to be able to allow-list specific network endpoints. Maybe I'm OK with the container talking to an API I provide but not to the world at large. Or perhaps I'm OK with it fetching data from npm and PyPI but I don't want it to be able to access anything else (a common pattern these days; Claude's Code Interpreter does this, for example).
Cloudflare has Outbound Workers for exactly this use-case: https://developers.cloudflare.com/cloudflare-for-platforms/w...
If these aren't enabled for containers / sandboxes yet, I bet they will be soon.
You may be interested in the Dynamic Worker Loader API, which lets you set up isolate-based sandboxes (instead of containers) and gives you extremely fine-grained, object-capability-based control over permissions.
It was announced as part of the code mode blog post:
https://blog.cloudflare.com/code-mode/
API docs: https://developers.cloudflare.com/workers/runtime-apis/bindi...
I’m extending the Packj sandbox for agentic code execution [1]. You can specify an allowlist for network/fs.
1. https://github.com/ossillate-inc/packj/blob/main/packj/sandb...
This simple feature bumps up the complexity of such a firewall by several orders of magnitude, which is why no similar runtime (like Deno) offers it.
Networking as a whole can easily be controlled by the OS or any intermediate layer. For controlling access to specific sites, you need to either filter at the DNS level, which can be trivially bypassed, or bake something into the application binary itself. But if you are running untrusted code and giving that code access to a raw TCP channel, then it is effectively impossible to restrict what it can or cannot access.
The most convincing implementation I've seen of this so far is to lock down access to just a single IP address, then run an HTTP proxy server at that IP address which can control what sites can be proxied to.
Then inject HTTP_PROXY and HTTPS_PROXY environment variables so tools running in the sandbox know what to use.
Codex remote environments seem to do this; we had to add support (via two lines of code) for these proxy environment variables to our CLI so it could talk to GitHub from those environments.
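A minimal sketch of that proxy approach in Node-style TypeScript. The allowlist contents and all names here are illustrative assumptions, not what Codex actually runs: a forward proxy that only tunnels CONNECT requests to approved hosts.

```typescript
import http from "node:http";
import net from "node:net";

// Hosts the sandbox is allowed to reach (illustrative choices).
const ALLOWED_HOSTS = new Set(["github.com", "registry.npmjs.org", "pypi.org"]);

function isAllowedHost(host: string): boolean {
  // Strip an optional :port suffix before checking the allowlist.
  const name = host.replace(/:\d+$/, "").toLowerCase();
  return ALLOWED_HOSTS.has(name);
}

// Build (but don't start) a proxy that rejects disallowed CONNECT targets.
function createFilteringProxy(): http.Server {
  const server = http.createServer((req, res) => {
    // Plain-HTTP proxying would be filtered the same way via req.headers.host;
    // omitted here to keep the sketch short.
    res.writeHead(403).end("plain-HTTP proxying not implemented in this sketch\n");
  });
  server.on("connect", (req, clientSocket) => {
    const target = req.url ?? "";
    if (!isAllowedHost(target)) {
      clientSocket.end("HTTP/1.1 403 Forbidden\r\n\r\n");
      return;
    }
    const [host, port = "443"] = target.split(":");
    const upstream = net.connect(Number(port), host, () => {
      clientSocket.write("HTTP/1.1 200 Connection Established\r\n\r\n");
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });
    upstream.on("error", () => clientSocket.end());
  });
  return server;
}
```

Inside the sandbox you would then point HTTP_PROXY/HTTPS_PROXY at wherever this listens, and block all other egress with ordinary firewall rules.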
At least on macOS, there is a third way where you can control the network connection on the PID/binary level by setting up a network system extension and then setting up a content filter so you can allow/deny requests. It is pretty trivial to set this up, but the real challenge is usually in how you want to express your rules.
Little Snitch does this pretty well: https://www.obdev.at/products/littlesnitch/index.html
> This simple feature bumps up the complexity of such a firewall by several orders of magnitude, which is why no similar runtime (like Deno) offers it.
My uneducated question, why not BPF? It's the actual original use case. Declare a filter rule (using any DSL you like), enforce it within the sandbox, move processing to the "real" firewall/kernel where applicable, etc.
That’s true, but Cloudflare is uniquely positioned to avoid this complexity by leveraging the functionality of all their existing products. For them, sandboxing the network is probably the easiest problem to solve for this product…
If a single TCP channel is all that is allowed, on a single port to a single orchestrator IP, and the only service attached to that channel on the other end is the orchestrator which reports results to the host worker, why would you need anything to do with DNS? Isn't this a simple thing to do with a firewall rule, once you know the orchestrator's network-local IP?
(Certainly this would prevent things like package manager installations, etc... but if you're in a use case where you really want to sandbox things, you wouldn't want people to have e.g. NPM access as I'm sure there are ways to use that for exfiltration/C&C!)
deno does support per-host network permissions https://docs.deno.com/runtime/fundamentals/security/#network...
You cannot bypass DNS within Cloudflare’s environment.
What does that mean? That's essentially like saying "you cannot bypass HTTP" within Cloudflare's environment. It doesn't make any sense.
Do you mean they force you to use their DNS? What about DOH(s)? What about just skipping domain lookup entirely and using a raw IP address?
You can restrict outbound network access to HTTP using the outbound worker mentioned elsewhere in the thread and filter the domain name of each outbound request against a whitelist of domains you control. The DNS resolution of the domain happens within the CF network stack, which you have no control over and which can’t be overridden in any way, meaning that if you restrict outbound to google.com, there’s no way for that request to end up anywhere else. The whitelist filter you put in place would disallow raw IP addresses, and DoH isn’t relevant because, again, your whitelist of servers you control can just not expose DoH.
When you say that the filter would disallow connecting directly to IP addresses, how would that work? When I open a tcp connection, there's no reference to any domain name. Do you think CF would proactively resolve all the domain names in my whitelist (repeatedly, in case the IPs change) and check the IP I'm connecting to against the list of IPs those domains would resolve to? That sounds like a very brittle solution.
It sounds like you haven’t done the requisite research and are asking me to do it for you. That’s not very nice. The TL;DR is that the outbound request doesn’t go directly to the internet. It first goes through your interposer worker, where you can block direct TCP requests and only allow HTTP requests through after filtering by domain.
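For illustration, the domain check such an interposer would apply might look like the sketch below. This is a hand-rolled example, not Cloudflare's actual Outbound Worker API, and the whitelist entries are just placeholders. Note it rejects raw IP literals outright, so no DNS lookup ever happens for them.

```typescript
// Illustrative hostname filter for an outbound interposer: allow only
// named hosts on a whitelist, and reject raw IP literals outright.
const WHITELIST = new Set(["google.com", "www.google.com"]); // example domains

function isIpLiteral(hostname: string): boolean {
  // IPv4 dotted quad, or IPv6 (URL parsers keep the square brackets).
  return /^\d{1,3}(\.\d{1,3}){3}$/.test(hostname) || hostname.startsWith("[");
}

function outboundAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable -> deny
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  if (isIpLiteral(url.hostname)) return false;
  return WHITELIST.has(url.hostname.toLowerCase());
}
```

An outbound interposer would run a check like this on every request before forwarding it; the resolution of whatever passes then happens inside Cloudflare's own stack.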
Can I send a UDP packet to a server on port 53 and receive a packet back?
You choose. But you can also choose to block that.
The pricing with such offerings is the biggest turn-off. This one comes out to more than $58/month for just 1 vCPU and 1 GiB RAM when used continuously.
Compare this with instances from Hetzner or Contabo or the like. They are 35+ times cheaper.
This means my total usage across an entire month on a Cloudflare sandbox cannot exceed even a single day of non-stop usage if I want to break even with Hetzner/Contabo/others.
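For reference, the arithmetic behind that ~$58 figure, using the per-second rates quoted elsewhere in this thread and assuming a 30-day month:

```typescript
// Continuous usage of 1 vCPU + 1 GiB RAM for a 30-day month,
// at the per-second rates quoted elsewhere in this thread.
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // 2,592,000

const vcpuPerSecond = 0.00002;  // $ per vCPU-second
const memPerSecond = 0.0000025; // $ per GiB-second

const vcpuMonthly = vcpuPerSecond * SECONDS_PER_MONTH; // ~$51.84
const memMonthly = memPerSecond * SECONDS_PER_MONTH;   // ~$6.48
const totalMonthly = vcpuMonthly + memMonthly;

console.log(totalMonthly.toFixed(2)); // prints "58.32"
```

Against a ~$1.60/month budget VPS, that is indeed roughly a 35x difference for always-on workloads.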
That's like comparing any serverless offering to a continuously running host.
Continuously running means you're doing it wrong / it's a bad fit. But yes the ratio is somewhat extreme.
Ideal when traffic is sporadic.
It's interesting to also compare this to getting a bare metal instance and provisioning microVMs on it using Firecracker. (Obviously something you shouldn't roll yourself in most cases.)
You can get a bare metal AX162 from Hetzner for 200 EUR/mo, with 48 cores and 128GB of RAM. For 4:1 virtual:physical oversubscription, you could run 192 guests on such a machine, yielding a cost of 200/192 = 1.04 EUR/mo, and giving each guest a bit over 1GiB of RAM. Interestingly, that's not groundbreakingly cheaper than just getting one of Hetzner's virtual machines!
"Interestingly, that's not groundbreakingly cheaper than just getting one of Hetzner's virtual machines!" ... yeah, because this is what these companies are doing behind the scenes :)
Cloudflare bills by CPU time. You'll come out on top if you have very irregular traffic, as Hetzner bills you 24/7 for your instance.
Exactly. The key word is "occasional" usage.
As I calculated and mentioned before, if that occasional usage ends up being more than a day in total, dedicated instances end up cheaper.
It's sad to see Cloudflare slowly adding egress fees to their new services.
It was a core differentiator to never* have to worry about egress with them.
*: unless it's so large that it borders on abuse or require a larger plan
I agree. It’s the enshittification of the internet. Luckily we still have infrastructure providers with more sensible offerings. We don’t have to use aws, gcp, etc.
Looks nice.
We rolled our own that does pretty much the same thing, but perhaps more: our solution can also mount persistent storage that can be carried between multiple runners. It does take 1-5 seconds to boot the environment (Firecracker VMs). If this sandbox is faster, I will instruct the team to consider it for fast startup.
This is also very similar to Vercel's sandbox thing. The same technology?
What I don't like about this approach is the github repo bootstrap setup. Is it more convenient compared to docker images pushed to some registry? Perhaps. But docker benefits from having all the artefacts prebuilt in advance, which in our case is quite a bit.
> It does take 1-5 seconds to boot the environment (firecracker vms).
I'd say 1-5 secs is fast. Curious to know what use cases require faster boot up, and today suffer from this latency?
When your agent performs 20 tasks saving seconds here and there becomes a very big deal. I cannot even begin to describe how much time we've spent on optimising code paths to make the overall execution fast.
Last week I was on a call with a customer. They were running OpenAI side by side with our solution. I was pleased that we managed to fulfil the request in under a minute while OpenAI took 4.5 minutes.
The LLM is not the biggest contributor to latency in my opinion.
Thanks! While I agree with you on "saving seconds" and overall latency argument, according to my understanding, most agentic use cases are asynchronous and VM boot up time may just be a tiny fraction of overall task execution time (e.g., deep research and similar long running tasks in the background).
Have you tried e2b or Daytona fast start vms?
1-5 seconds seems high for Firecracker, depending on your requirements.
We boot VMs (using Firecracker) at ~20-50ms.
Obviously, depending on the base image/overlay/etc., your system might need to fetch resources, making it a network-bound boot, but based on what you've said it seems you should be able to make your system much faster!
I browsed through the documents, but it does not seem to be possible to auto-destroy a sandbox after a certain amount of idle time. This forces whoever is implementing this to do their own cleanup. It is kind of a missed opportunity if you ask me, as this is a big pain. It is sold as fire-and-forget, but it seems that more serious workflows will also require a lot of supporting infrastructure.
You can easily set an alarm in the durable object to check if it should be killed and then call destroy yourself. Just a couple lines of code.
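A sketch of that pattern. The 10-minute idle window and the storage key are arbitrary choices, and `destroy()` stands in for whatever teardown the Sandbox SDK actually exposes:

```typescript
// Idle-timeout cleanup via a Durable Object alarm (sketch).
const IDLE_LIMIT_MS = 10 * 60 * 1000; // arbitrary: destroy after 10 idle minutes

function isIdle(lastActivityMs: number, nowMs: number, limitMs = IDLE_LIMIT_MS): boolean {
  return nowMs - lastActivityMs >= limitMs;
}

// Inside the Durable Object class (skeleton, shown as comments since it
// needs the Workers runtime; `destroy()` is an assumed teardown hook):
//
//   async fetch(request: Request): Promise<Response> {
//     await this.ctx.storage.put("lastActivity", Date.now());
//     await this.ctx.storage.setAlarm(Date.now() + IDLE_LIMIT_MS);
//     // ... handle the request ...
//   }
//
//   async alarm(): Promise<void> {
//     const last = (await this.ctx.storage.get<number>("lastActivity")) ?? 0;
//     if (isIdle(last, Date.now())) await this.destroy();
//     else await this.ctx.storage.setAlarm(Date.now() + IDLE_LIMIT_MS);
//   }
```

Each request refreshes the timestamp and re-arms the alarm, so the alarm only fires into the destroy branch once a full idle window has elapsed.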
Nice. Thanks for the tip. I did not know that this was a thing. I will look it up.
Is there some sort of competition for awful looking websites going on?
This bizarre anti-aesthetic has been pushed in the web devex space for a few years now to appeal to other web devex companies.
I thought it was cute and easy to read.
They didn't test it with FF apparently.
Looks perfectly fine in FF 144.0 on Mac OS.
Yeah I'm in FF Linux 143 and I love the page.
If anyone is curious, more details on our SDK can be found here actually https://github.com/cloudflare/sandbox-sdk
Mind answering the question here: https://news.ycombinator.com/item?id=45611301 ?
This looks rough for e2b.dev, Beam, and others in this space. Even with e2b's fresh $20M raise, taking on Cloudflare is going to be tough.
e2b has a Python SDK; that's why I would use them when I start a new project (knowing Cloudflare, they probably won't add one).
Why not use VMs from AWS (EC2) or GCP? E2b is built on top of GCP anyway.
Looking at pricing for this and especially for E2B, I've decided to work on a lightweight alternative to Cloudflare's Sandbox, because I ran into the same pain points with current solutions:
- limited parallelization
- expensive
My focus is:
- simple SDK primitives for code execution, file ops, and git checkout, with no boilerplate stacks
- transparent pricing (per-second compute with monthly caps, no surprise egress)
- sandboxes priced 50-60% less than competitors
Would love to get your feedback at https://usesandbox.dev/. Just finalized the main pieces today.
Cloudflare's docs are written so heavily for web devs. Can you host a monolith app that isn't serving HTTP traffic on Cloudflare tech like Containers? Like, can you spawn a container and have it handle TCP or UDP connections until you manually shut it down? The container docs say they auto-shutdown after not receiving requests...
It's a really good platform for Typescript microservices which scale-to-zero (up to very high theoretical limits), but it wouldn't be a platform you'd migrate a monolith PHP app to (for example).
Who's the designer? I assume the same as agents.cloudflare.com, finally something that looks creative and not based on purple gradients
Nanda is great! https://x.com/nandafyi?s=21&t=eqRzxHltMiNrLXlna5kBEQ
There is an open question about how file persistence works.
The docs claim they persist the filesystem even when they move the container to an idle state, but it's unclear exactly what that means - https://github.com/cloudflare/sandbox-sdk/issues/102
To me, the docs answer it pretty clearly. The defined directories persist until you destroy().
The part that's unclear to me is how billing works for the disk of a sandbox that's asleep, because container disks are ephemeral and don't survive sleep [2], yet the sandbox pricing points you to Containers, which says "Charges stop after the container instance goes to sleep".
https://developers.cloudflare.com/sandbox/concepts/sandboxes...
[2] https://developers.cloudflare.com/containers/faq/#is-disk-pe...
Yeah, that's basically the issue. If container disks are ephemeral, how are they persisting it? And however they are doing it, what's the billing for it?
Sandbox is built on top of their Durable Objects; the underlying storage is $0.20/GB-month.
You’re saying the file system in the container is persisted to the durable object storage? That doesn’t sound right.
Whilst it is in the idle state. Not whilst it is stopped.
Cloudflare Containers (and therefore Sandbox) pricing is way too expensive. The pricing is also cumbersome to understand: it is inconsistent with the pricing of other Cloudflare products in terms of units, and it is split between memory, CPU and disk instead of combined per instance. The worst part is that it is given in tiny fractions per second:
Memory: $0.0000025 per additional GiB-second
vCPU: $0.000020 per additional vCPU-second
Disk: $0.00000007 per additional GB-second
The smaller instance types have super low processing power, getting only a fraction of a vCPU. But if you calculate the monthly cost, it comes to:
Memory: $6.48 per GiB
vCPU: $51.84 per vCPU (!!!)
Disk: $0.18 per GB
These prices are more expensive than the already expensive prices of the big cloud providers. For example a t2d-standard-2 on GCP with 2 vCPUs and 8GB with 16GB storage would cost $63.28 per month while the standard-3 instance on CF would cost a whopping $51.84 + $103.68 + $2.90 = $158.42, about 2.5x the price.
Cloudflare Containers also don't have persistent storage and are by design intended to shut down when not used, but I could then also go for a spot VM on GCP, which would bring the price down to $9.27 (less than 6% of the CF container cost), and I get persistent storage plus a ton of other features on top.
What am I missing?
You can’t compare these with regular VMs from AWS or GCP. These sandbox VMs are expected to boot up in milliseconds and can be stopped/killed in milliseconds, and you are charged per second of usage. The sandboxes are ephemeral and meant for AI coding agents; a typical sandbox session runs for less than 30 minutes. The premium is for the flexibility that comes with that.
Seriously, what flexibility?
I could easily spin up a Firecracker VM on demand and put it behind an API. It boots up in under 200 milliseconds, and I get to control it however I wish. And all costs are under my control.
I compared the costs with instances purchased from Hetzner or Contabo here: https://news.ycombinator.com/item?id=45613653
Bottom line: by doing this small amount of work myself, I can save 35x.
In my case, it is ignorance. I am not familiar with how to wield Firecracker VMs and manage their lifecycle without putting a hole in my pocket. These sandbox services (e2b, Daytona, Vercel, etc.) package them in an intuitive SDK for me to consume in my application. Since sandboxing is not the main differentiator for me, I am okay with leveraging the external providers to fill in for me. That said, I will be grateful if you can point me to the right resources on how to do this myself :)
This is a pretty good use-case for an open-source project then.
For a guide, just follow their official docs. I did those again today, literally copy-pasted shell commands one after the other, and voila: had a Firecracker VM running and booting a full-fledged Ubuntu VM.
It was so damn fast that when it started, I thought my terminal had crashed because its prompt changed. But nope. It was just that fast: even while looking right at it, I was not able to catch the moment it actually booted up.
By the way, two open-source projects already exist:
1. NodeJS: https://github.com/apocas/firecrackerode
2. Python: https://github.com/Okeso/python-firecracker
I think you can absolutely compare them, and there is no added flexibility; in fact, there is less. There is added convenience, though.
For the huge factor in price difference you can keep spare spot VMs on GCP idle and warm all the time and still be an order of magnitude cheaper. You have more features and flexibility with these. You can also discard them at will, they are not charged per month. Pricing granularity in GCP is per second (with 1min minimum) and you can fire up firecracker VMs within milliseconds as another commenter pointed out.
Cloudflare Sandboxes have less functionality at a significantly increased price. The tradeoff is simplicity: they are focused on a specific use case for which they don't need additional configuration or tooling. The downside is that they can't do everything a proper VM can do.
It's a fair tradeoff, but I argue the price difference is very much out of balance. Then again, it seems to be a feature primarily going after AI companies, and there is infinite VC money to burn at the moment.
It doesn't really make sense to compare this to regular VM pricing I think.
This is an on-demand managed container service with a convenient API, logging, global placement in 300+ locations, ...
AWS Lambda is probably closer in terms of product match. (sans the autoscaling)
Depending on what you do , Sandbox could be roughly on par with Lambda, or considerably cheaper.
The 1TB of included egress alone would be like $90 on AWS.
Of course on lambda you pay per request. But you also apparently pay for Cloudflare Worker requests with Sandbox...
I reckon ... it's complicated.
Startups would build on big tech, so they are likely to add their own margins. Have you looked into (bulk) discounts from GCP/AWS?
Cloudflare Containers feel a lot pricier compared to Workers, though I think they could provide a more streamlined experience. Still, for a complete cost analysis, I sometimes wonder how CF Containers vs. Workers vs. Hetzner/dedicated/shared VPS/GCP etc. would work out for the same thing.
Honestly, the more I think about it, for my own sanity I want to use Hetzner/others for the Golang/binary-related stuff and CF Workers with SvelteKit for the frontend.
That way we could have the best of both worlds and probably glue things together using protobuf or something. I guess people don't like managing two codebases, but SvelteKit is a pleasure to work with and can easily be learnt by anybody in 3-4 weeks (maybe a bit more for Golang). I might look more into CF Containers/GCP or whatever, but my heart wants Hetzner with Golang for the backend, while extracting as much juice as I can from CF Workers with SvelteKit in the meanwhile.
Thoughts on my stack?
This looks interesting.
Instead of having to code this up using TypeScript, is there an MCP server or API endpoint I can use?
Basically, I want to connect an MCP server to an agent and tell it it can run TypeScript code in order to solve a problem or verify something.
Hey, I'm building a similar thing to sandbox SDK
Are you interested in code execution only, or something else? File operations, git checkout etc?
Does this relate to workerd in any way or is it something else entirely?
The `getSandbox(env.Sandbox, "test-env");` code is running in workerd.
Can I run Claude Code or Codex inside this?
Yes, you can; we will publish an example soon.
You can in Cloudflare containers
My one annoyance with Cloudflare: everything is JavaScript. Every example, all the things. But I guess that's catering to their audience. Over the past year you could definitely see them shift their services more in line with other cloud providers, because that's the inevitable requirement to penetrate enterprise and a broader audience. But part of that should include opening up to a bigger audience from a language perspective too, e.g. backend languages. One time I'd like to see a Go example first, or even a tabbed example. Just my opinion.
This code has to run inside their Workers platform; _technically_ you can compile Go to WASM and get it to run, but practically it is JavaScript only.
IIUC this particular product runs in Linux containers, not in V8 isolates like Workers uses. The upside of this is broader ecosystem compatibility (you can use whatever language you want even if it doesn't compile efficiently to Wasm); the downside is that each container instance runs in a particular data center and you have to worry about the latency implications of that, whereas every Worker runs in every location.
This comment was complaining that the example for launching the container was in JavaScript; that bit has to run in V8.
Python for Workers is coming eventually.
It's already live. Has been available for a while.
I’ll never use metered Cloudflare services. Just reckless to expose myself to the hazard of mega bills for small mistakes or DoS attacks. I wish more companies allowed prepayment for plans like Bunny.net does.
How much `power` do they have?
No thanks, way too many things running Cloudflare already
These CF website relaunches are just that, right? Workers last week (https://workers.cloudflare.com) and now this one yesterday. I mean, if CF has something newsworthy here, they should do a blog post announcing it, because otherwise it's just a refreshed website. It's hard to tell if there's anything new here.
It's the same SDK stuff from earlier this year, right? https://developers.cloudflare.com/changelog/2025-06-24-annou...
The workers website looks extremely based.
Love the evocative animations and it manages to still be readable and well organized.
There’s also the changelog https://developers.cloudflare.com/changelog/
It barely had any features then; this version is full of new functionality: streaming logs, long-running processes, a code interpreter, and lots of other things, plus a full docs site as well.