Right, but I am talking about the general government response trajectory.
And also, even though Anthropic may not have labs themselves directly, there is a funnel of research that comes in the form of papers and conference tracks.
The AI community is pretty tight-knit, and not having access to frontier models affects everyone.
I recently built a very large test bench for SystemVerilog.
I ran a bunch of different compilers on it, including some open source ones.
Some of them failed some tests, and it was natural to have my LLM (Claude Fable 5) root-cause the issues, and to double-check my test bench wasn't to blame.
But then I was left with all these patches that I couldn't just throw at the upstream maintainers all at once. I ended up just filing a few issues and moved on to other things.
It felt weird to just file issues when my LLM had already spent a lot of time root-causing and fixing the issues. But then, maybe they could just have their LLMs do the same.
> I ended up just filing a few issues and moved on to other things.
This is the most valuable contribution you had time for, hopefully with a minimum-viable bug reproduction.
Drive-by patches/PRs are usually a net negative because the maintainer has to reverse-engineer the intent from GenAI code, and then make changes so it fits in with the rest of the project.
> It felt weird to just file issues when my LLM had already spent a lot of time root-causing and fixing the issues
There are countless ways to fix any issue, and only a few right ways (subjectively). The maintainers' role is to decide which ways are right for their project. You shouldn't worry too much about "wasting" code you already generated. GenAI made that step very cheap, but did little for taste and roadmapping.
> The rate of fundamental, broad-based breakthroughs lifting all LLM applications has clearly slowed with many of the most impactful recent discoveries being in scaling, optimization, tuning and productization toward specific domains.
To me it definitely feels like it's still accelerating, with the most impactful recent discovery being RL training reasoning models (late '24, early '25).
There's an interesting article called "sigmoids won't save you" https://www.astralcodexten.com/p/the-sigmoids-wont-save-you which argues that (unless you have privileged information) you should always assume a process will continue about as long as it’s continued already. (Lindy's Law)
With that in mind the current disruption should last another 10-15 years (assuming it started in '10 or '17.)
I would bet money Anthropic and OpenAI are actually profitable on inference. The problem is they have to spend large sums of money to train models that are essentially worthless after a few months.
They make more money from inference on a model than they spend training it, but the next model is so much more expensive to train that their annual figures have been in the red.
One could say "that's a great point, we should take more direct ideological action to address this issue!", but expounding upon the finer details would likely get one banned here.
What I truly don't understand, as a daily heavy Opus 4.7 user, is how you can coherently prompt 15 different parallel conversations at the same time.
For me it's not even a "what the hell are you working on" so much as a complete inability to understand how you can keep so many different processes working on distinct tasks. It simply doesn't map onto how I use these tools.
I spend most of my day writing extremely detailed prompts and that's how I'm able to get the sort of excellent results that confound skeptics. But I have to be honest with you: I don't think I can write (or think) fast enough to do two of these at a time, much less 15.
I definitely could not review what they are generating with any degree of confidence.
I'm really hoping you can explain what the heck your usage pattern actually looks like, because reading this makes me feel like I'm missing something.
Yeah good luck with that. I find SystemVerilog is probably the thing that AI is worst at, presumably because there's not that much training data out there, and pretty much everything about the commercial tools is paywalled.
SystemVerilog Assertions. Hardware designs (for silicon ASICs, and often FPGAs too) are written in a language called SystemVerilog. It has a feature called "concurrent assertions", which is usually just called SVA.
These are sort of temporal regexes, e.g. you can write a property which means: if the rst signal fell (changed to 0), then foo must be 1, and 1-20 cycles later it must be 0.
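To spell those semantics out: a property matching that description would plausibly be written as `$fell(rst) |-> foo ##[1:20] !foo` (a reconstruction from the description, not necessarily the exact original). Below is a rough Python sketch that checks the same condition over a sampled trace. The function name `check_fell_then_drop` and the `(rst, foo)` trace format are made up for illustration, and this is a simulation-style check on one trace, not formal verification.

```python
from typing import List, Sequence, Tuple


def check_fell_then_drop(
    trace: Sequence[Tuple[int, int]], lo: int = 1, hi: int = 20
) -> List[int]:
    """Return the cycle indices at which the property is violated.

    trace[i] is the (rst, foo) pair sampled at clock cycle i.
    Property (assumed): if rst fell at cycle t, foo must be 1 at t,
    and foo must be 0 at some cycle in t+lo .. t+hi.
    """
    violations = []
    for t in range(1, len(trace)):
        prev_rst, _ = trace[t - 1]
        rst, foo = trace[t]
        # Antecedent: rst was 1 on the previous cycle and is 0 now.
        if not (prev_rst == 1 and rst == 0):
            continue
        # First part of the consequent: foo is high on the same cycle.
        if foo != 1:
            violations.append(t)
            continue
        # Second part: foo goes low somewhere in the delay window.
        window = trace[t + lo : t + hi + 1]
        if any(f == 0 for _, f in window):
            continue
        # Only report a failure if the whole window was observed;
        # a trace that ends early is treated as inconclusive.
        if len(window) == hi - lo + 1:
            violations.append(t)
    return violations


if __name__ == "__main__":
    ok_trace = [(1, 0), (0, 1), (0, 1), (0, 0)]
    stuck_trace = [(1, 0)] + [(0, 1)] * 21
    print(check_fell_then_drop(ok_trace))
    print(check_fell_then_drop(stuck_trace))
```

A formal tool proves the property for every reachable state of the design, whereas this only tells you about the one trace you feed it.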
The nice thing about them is that there are a few commercial tools that can formally verify them. They're super expensive (~$100k/year for one license), but fairly widely used because they work really well.
It's probably the most successful application of formal verification because it doesn't require much expertise to use, unlike software formal verification, which pretty much immediately requires you to become an expert on loop invariants, termination measures, Hoare triples etc. At least that has been my experience.
The human savant will remember where they read it and give you credit. It might lead more people to read your work, and ultimately you make money.
The AI won't even know where the page of text it's seeing came from, and people will avoid your book as they can just ask the AI. So you make less money. (Talking about specialized technical books here.)
> In 1983 David DeWitt (https://en.wikipedia.org/wiki/David_DeWitt) published benchmarking results showing poor performance for Oracle databases. Larry Ellison wasn't happy with the results and it's said that he tried to have DeWitt fired.
> Given how difficult it is to fire professors when there's actual misconduct, the probability of Ellison successfully getting someone fired for doing legitimate research in their field was pretty much zero. It's also said that, after DeWitt's non-firing, Larry banned Oracle from hiring Wisconsin grads and Oracle added a term to their EULA forbidding the publication of benchmarks. Over the years, many major commercial database vendors added a license clause that made benchmarking their database illegal.
Now, instead of letting car owners pay for the public space they use (street parking), you are forcing anyone without a car to waste their own private space, in case somebody wants to park there.
The subtle difference is between American parking minimums imposed on property owners - “you must reserve space on your private property for this many cars whether you own them or not” vs Japanese parking requirements imposed on car owners - “you must reserve space on some private property for your car if you want to own it”