Eventually, large language models will be the end of open source. That's ok, jus...

zmgsabst · on Oct 13, 2024

Why wouldn’t you use LLMs to write even more open source?

The cost of contributions falls dramatically: e.g., $100 buys 200M tokens of GPT-3.5, so that's enough to spend 10,000 tokens developing each line of a 20 kloc project (amortized).
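
As a quick sanity check on those numbers, here is a minimal sketch. The variable and function names are illustrative, and the per-million-token price is only what the $100 / 200M figure implies, not a quoted rate.

```python
# Back-of-the-envelope check of the figures in the comment above.
# All inputs come from the comment; the derived price is an implication,
# not an official rate.

BUDGET_USD = 100              # the single donation
TOKENS_BOUGHT = 200_000_000   # 200M tokens for that budget
PROJECT_LINES = 20_000        # a 20 kloc project


def implied_price_per_million(budget_usd: float, tokens: int) -> float:
    """Price in USD per one million tokens implied by budget and token count."""
    return budget_usd / (tokens / 1_000_000)


def tokens_per_line(tokens: int, lines: int) -> int:
    """Amortized token budget available for each line of the project."""
    return tokens // lines


price = implied_price_per_million(BUDGET_USD, TOKENS_BOUGHT)
per_line = tokens_per_line(TOKENS_BOUGHT, PROJECT_LINES)

print(f"Implied price: ${price:.2f} per 1M tokens")
print(f"Amortized budget: {per_line:,} tokens per line")
```

So the arithmetic holds: 200M tokens spread over 20,000 lines is 10,000 tokens per line, assuming the whole budget goes to tokens and nothing else.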

That’s a moderate project for a single donation and an afternoon of managing a workflow framework.

atomic128 · on Oct 13, 2024

What you're describing is "open slop", and yes, there will be a lot of it.

Open source as we know it today, not so much.

gspr · on Oct 13, 2024

I don't understand this take.

If LLMs will be the end of open source, then they will constitute that end for exactly the reason you write:

> Large language models are used to aggregate and interpolate intellectual property.

> This is performed with no acknowledgement of authorship or lineage, with no attribution or citation.

> In effect, the intellectual property used to train such models becomes anonymous common property.

And if those things are true and allowed to continue, then any IP relying on copyright is equally threatened. That could of course be the case, but it's hardly unique to open source. Open source is no different, here. Or are you suggesting that non-open-source copyrighted material (code or otherwise) is protected by keeping the "source" (or equivalent) secret? Good luck making money on that blockbuster movie if you don't dare show it to anyone, or that novel if you don't dare let people read it.

> The social rewards (e.g., credit, respect) that often motivate open source work are undermined.

First of all: Those aren't the only social rewards that motivate open source work. I'd even wager they aren't the most common motivators. Those rewards seem more like the image that actors that try to social-network-ify or gamify open source work want to paint.

Second: Why would those things go away? The artistic joy that drives a portrait painter didn't go away when the camera was invented. Sure, the pure monetary drive might suffer, but that drive is perhaps the drive that's least specific to open source work.

A4ET8a8uTh0 · on Oct 13, 2024

<< Why would those things go away?

I think that is because, overall, human nature does not change that much.

<< Open source is no different, here. Or are you suggesting that non-open-source copyrighted material (code or otherwise) is protected by keeping the "source" (or equivalent) secret? Good luck making money on that blockbuster movie if you don't dare show it to anyone, or that novel if you don't dare let people read it.

You may be conflating several different media types and we don't even know what the lawsuit tea leaves will tell us about that kind of visual/audio IP. As far as code goes, I think most companies have already shown how they protect themselves from 'open' source code.

yapyap · on Oct 13, 2024

no it won’t, it’ll just make it more niche than it already is.

atomic128 · on Oct 13, 2024

LLM users are feeding their entropy into the model, and paying for the privilege.

These LLM users produce the new training data. They are being assimilated into the tool.

This is the future of "open source": Anonymous common property continuously harvested from, and distributed to, LLM users.