This post from fly.io [1] has a pretty comprehensive survey of the tech available for running users' code safely. It's a good read.
I've been investigating something similar for a feature I want to launch. I'm currently leaning towards running users' code in Kubernetes using Firecracker or gVisor.
My main takeaway has been that while there are good solutions for isolating users' code, there's going to be a lot of work involved in orchestrating it at scale: building and storing images, spinning up containers, managing storage, tracking/billing minutes and bandwidth, killing timed-out containers, etc. I have not found a good library for that. It seems like a good use case for a Kubernetes operator, so I think that's what I'll wind up building.
I used a K8S cluster to run untrusted code. User code was executed inside of a container running as a job, rather than a naked pod or deployment. To monitor/track/handle abuse, I used a sidecar container running alongside the user's container.
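For anyone wanting to picture that setup, here is a rough sketch of what such a Job could look like, built as a plain Python dict and dumped to JSON (which kubectl accepts as well as YAML). This is not my actual manifest. The image names, the resource limits, the `monitor` sidecar name, and a RuntimeClass called `gvisor` existing on the cluster are all assumptions for illustration.

```python
import json


def build_sandbox_job(name, user_image, timeout_seconds=60):
    """Return a Kubernetes Job manifest (as a dict) for one run of user code."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Kubernetes terminates the Job's pods once this deadline passes.
            "activeDeadlineSeconds": timeout_seconds,
            # Do not retry failed user code.
            "backoffLimit": 0,
            # Clean up the finished Job object after five minutes.
            "ttlSecondsAfterFinished": 300,
            "template": {
                "spec": {
                    # Assumption: a RuntimeClass named "gvisor" is installed.
                    "runtimeClassName": "gvisor",
                    "restartPolicy": "Never",
                    # User code gets no Kubernetes API credentials.
                    "automountServiceAccountToken": False,
                    "containers": [
                        {
                            "name": "user-code",
                            "image": user_image,
                            "resources": {
                                "limits": {"cpu": "500m", "memory": "256Mi"}
                            },
                        },
                        {
                            # Hypothetical sidecar for monitoring and abuse handling.
                            # It has to exit on its own when the user container is
                            # done, because a pod only completes once every
                            # container has exited.
                            "name": "monitor",
                            "image": "example.com/sandbox-monitor:latest",
                        },
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    job = build_sandbox_job("run-1234", "example.com/user-code:abc123")
    print(json.dumps(job, indent=2))
```

The main reason to prefer a Job over a bare pod here is that `activeDeadlineSeconds` and `backoffLimit` give you timeout and no-retry behavior without writing your own reaper.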
The real challenge in running users' code isn't running the code, per se. It's storage! I was never able to come up with a good solution for allowing users to create a very large number of files, such as the number created when setting up a React app.
One powerful way to deal with these problems is event sourcing. It's a reasonably elegant way to materialize a single application-specific cache based on many different data sources. Two great resources:
Thanks! gVisor intercepts app syscalls and serves them in user space (inside separate VMs, one for each container), which reduces runtime performance significantly. Both Firecracker and gVisor use VMs to sandbox container code.
Kwarantine, on the other hand, directly runs container code on the hardware (no VMs). It uses MMU/page tables to provide a different kernel to each container.
Makes sense. Why do you think Google and Amazon didn't pursue that approach for services like Cloud Functions and Lambda? Is there a trade-off or is it a matter of complexity?
They've been around for years but it seems they've only recently started getting more focused attention. I wonder if they can point to a deliberate strategy as the cause of that, or if it's mostly just good timing and getting hyped by the right people (e.g. PG on Twitter).
Looks like a useful product! I think it's neat how you built it directly on Google Sheets. I'd love to hear more about why you made that decision (versus building an independent tool).
Some feedback on the landing page:
- A 2-3 minute demo video would be nice
- Put the screenshots (or demo video) closer to the top, so they're visible without scrolling
- Some of the copywriting could be clearer. For example: "Pre-built financial models for SaaS companies, with plug-and-play software built directly in Google Sheets" could be just "Financial models for SaaS companies, directly in Google Sheets"
I think building it in Google Sheets is important because finance folks are comfortable with it and (if needed) can customize it to suit their needs. It's also hard to replicate Google Sheets' collaboration features like edit history on individual cells, real-time updates, comments/suggestions, etc.
- A demo video is a great idea, I'll get started on that this week.
- Screenshots/video closer to the top is a good idea too, think maybe I need an image in the hero.
- Good point on the copy, I'll have to revisit this.
Last year I read Masters of Doom by David Kushner after someone mentioned it on Hacker News. It was the best book I had read in a long time. It won't improve your skills, but I think it will motivate and inspire you to immerse yourself (if we're talking programming, doing small projects and getting feedback is a better way to improve your skills anyway).
I think you could find some inspiration in cryptocurrency exchanges. Most of them expose public websockets for prices, order books, trades, etc. They're high volume with lots of subscribers.
If you could tell me a little more about the data format, data volumes, number of subscribers, and how you get the data on the backend, I can try to give you some more concrete advice.
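In the meantime, here is a rough sketch of the fan-out pattern I have in mind for a high-volume public feed: one publisher, many subscribers, and a bounded queue per subscriber so a slow client loses old updates instead of slowing everyone else down. I'm not claiming any particular exchange does it exactly this way. `PriceHub`, the message shape, and the drop-oldest policy are my own assumptions, and the actual websocket handling is left out.

```python
import asyncio


class PriceHub:
    """In-process fan-out: each subscriber gets its own bounded queue."""

    def __init__(self, max_pending=100):
        self._subscribers = set()
        self._max_pending = max_pending

    def subscribe(self):
        queue = asyncio.Queue(maxsize=self._max_pending)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    def publish(self, message):
        for queue in self._subscribers:
            if queue.full():
                # Slow subscriber: drop its oldest update to make room.
                queue.get_nowait()
            queue.put_nowait(message)


async def demo():
    hub = PriceHub(max_pending=2)
    fast, slow = hub.subscribe(), hub.subscribe()

    fast_seen = []
    for price in (100, 101, 102):
        hub.publish({"symbol": "BTC-USD", "price": price})
        # The fast subscriber reads every update as it arrives.
        fast_seen.append((await fast.get())["price"])

    # The slow subscriber only reads at the end and has lost the oldest update.
    slow_seen = []
    while not slow.empty():
        slow_seen.append(slow.get_nowait()["price"])
    return fast_seen, slow_seen


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ([100, 101, 102], [101, 102])
```

Dropping the oldest update works for prices, where only the latest value matters. For order book diffs or trades you would probably want to disconnect the slow client and make it resync from a snapshot instead.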
[1] https://fly.io/blog/sandboxing-and-workload-isolation/