Why wouldn’t you use LLMs to write even more open source?
The cost of contributions falls dramatically. For example, $100 buys 200M tokens of GPT-3.5, which is enough to spend 10,000 tokens developing each line of a 20 kLOC project (amortized).
That’s a moderate project for a single donation and an afternoon of managing a workflow framework.
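For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a flat rate of $0.50 per million tokens, which is only what the $100 / 200M figures imply, not a quoted price, and it treats input and output tokens the same. The function and variable names are made up for illustration.

```python
def tokens_per_line(budget_usd, usd_per_million_tokens, lines_of_code):
    """Amortized token budget available for each line of the project."""
    total_tokens = budget_usd / usd_per_million_tokens * 1_000_000
    return total_tokens / lines_of_code


# Assumed figures: a $100 donation, $0.50 per 1M tokens (implied, not quoted),
# and a 20 kLOC project.
budget_usd = 100
usd_per_million_tokens = 0.50
lines_of_code = 20_000

print(tokens_per_line(budget_usd, usd_per_million_tokens, lines_of_code))
```

Plug in a different per-token price or project size to see how quickly the per-line budget shrinks or grows.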
If LLMs will be the end of open source, then they will constitute that end for exactly the reason you write:
> Large language models are used to aggregate and interpolate intellectual property.
> This is performed with no acknowledgement of authorship or lineage, with no attribution or citation.
> In effect, the intellectual property used to train such models becomes anonymous common property.
And if those things are true and allowed to continue, then any IP relying on copyright is equally threatened. That could of course be the case, but it's hardly unique to open source. Open source is no different, here. Or are you suggesting that non-open-source copyrighted material (code or otherwise) is protected by keeping the "source" (or equivalent) secret? Good luck making money on that blockbuster movie if you don't dare show it to anyone, or that novel if you don't dare let people read it.
> The social rewards (e.g., credit, respect) that often motivate open source work are undermined.
First of all: Those aren't the only social rewards that motivate open source work. I'd even wager they aren't the most common motivators. Those rewards seem more like the image that actors who try to social-network-ify or gamify open source work want to paint.
Second: Why would those things go away? The artistic joy that drives a portrait painter didn't go away when the camera was invented. Sure, the pure monetary drive might suffer, but that is perhaps the drive least specific to open source work.
I think that is because, overall, human nature does not change that much.
> Open source is no different, here. Or are you suggesting that non-open-source copyrighted material (code or otherwise) is protected by keeping the "source" (or equivalent) secret? Good luck making money on that blockbuster movie if you don't dare show it to anyone, or that novel if you don't dare let people read it.
You may be conflating several different media types and we don't even know what the lawsuit tea leaves will tell us about that kind of visual/audio IP. As far as code goes, I think most companies have already shown how they protect themselves from 'open' source code.
> Large language models are used to aggregate and interpolate intellectual property.
> This is performed with no acknowledgement of authorship or lineage, with no attribution or citation.
> In effect, the intellectual property used to train such models becomes anonymous common property.
> The social rewards (e.g., credit, respect) that often motivate open source work are undermined.
That's how it ends.