Hacker News | socketcluster's comments

Wow the token leaderboard idea is nuts. It's similar to trying to measure the productivity of software engineers based on number of lines of code.

The message isn't subtle, and isn't meant to be: "we don't care how, but we expect you to stick your nose into AI tools and find some way to fit them into your workflow".

Which indicates: the management believes there are productivity gains from AI use, but adoption lags due to inertia and reluctance to change existing workflows.


> Which indicates: the management believes there are productivity gains from AI use, but adoption lags due to inertia and reluctance to change existing workflows.

Methinks adoption lags due to management's inability to align incentives such that productivity gains are rewarded.


You say that as if "align incentives such that productivity gains are rewarded" isn't one of the hardest, most fundamental problems in all of organization and management.

Because it isn't. Aligning incentives is really easy.

"If you do something that causes a productivity gain worth X, I will give you some reward with a value Y where Y is less than X but greater than the cost Z of the effort you needed to put in to generate the productivity gain. If the cost Z would be equal to or greater than X then don't do it."

Managers make their work immensely harder on themselves by unnecessarily adding the constraint that they can't get people to do things by paying them fair amounts to do them. Now certainly there are some highly skilled managers out there who can still succeed despite this handicap, and if that earns them a fat paycheck then good for them. But if you don't have those skills you don't get to excuse your failures with an inefficiency you created for yourself.


It also indicates several different levels of 'cannot manage their way out of a paper bag'. If you had a construction foreman who decided that, because nailguns were the way of the future vs hammers, the metrics would be based on the number of nails used, and who wound up with thoroughly nailed pieces of lumber but no houses built, he should be fired as a complete incompetent who has absolutely no business on a job site.

Due diligence, judgement, and ability to know what the hell is going on are essential skills for management. The token metrics are a complete abdication of all of the above. It isn't a cream you just slather on to boost productivity.


more of "we whined and cried and screamed that we needed new budget in order to buy these tools or we would literally die. now we have them, they don't work as well as we hoped, they aren't leading to productivity gains, and they're actively alienating our workforce and users alike. we're so screwed we literally have no idea how to reverse this."

Fully agree. Shipping a complete product with a functioning user acquisition funnel is much harder. It's like; you have to build the whole product first with lots of features and then you have to try to create a highly condensed overview of all those features to expose them all on the landing page.

If you can't make the visitor understand your entire complex product in 10 seconds, then you've lost them.

Your product has to be complex because that's where the software market is at. All of the low-hanging fruits have been taken by the time you identify them. Sure, someone will find a way to make money using new low-hanging fruits that arise due to technological changes but it's not going to be you. You probably don't have the business connections to make that work.


I'm not entirely sure how that dismisses the CEO's putative argument: they go big on AI precisely because shipping end-to-end is hard, so they think they shouldn't waste resources on tasks that can be automated.

The structure of a good argument would be something like: certain tasks are fundamentally human and impossible to automate (which and why?) and by pushing AI use beyond what is optimal you are actually hurting your employees' ability to do those hard parts.

A weaker but still useful argument is that most everything can probably be automated, but frontier models aren't there yet.


I wouldn't say it "dismisses" their argument, but I think AI marketing encourages them to take an over-simplified view of what it takes to ship product. Most folks like a good, simple story, as opposed to the unvarnished truth.

> "There's always an easy solution to every human problem; Neat, plausible and wrong."

-- H. L. Mencken

It's like the classic scenario, where you lash-up a barely functional UI demo, and the manager cuts your development schedule by 90%, because you "already have it working." That taught me to never do a lash-up demo. If I show something to someone, it is ship-quality (but often incomplete). It's a technique that I've used for years, and is a great way to involve nontechnical stakeholders, without risking stuff like "it's already working."

All that said, I think that AI definitely could automate a lot of the repetitive stuff involved in shipping. It's just that the CEO would fire the folks that could teach it, before it can learn, because they think that what they do, is "unimportant."


I hate to use a throwaway, but this bit:

> with a functioning user acquisition funnel

How do you actually get this? I've got a product, the site is hand crafted, shows the complex product really well (and has had good feedback on it) but how do I acquire the users?

It seems as the cost of creating software has plummeted, it's the actual sales side of it that's going to matter even more. I'm stuck at this point.


"How do I acquire users" is the entire function of sales and marketing. A single HN comment explaining how to do sales and marketing, which is highly dependent on your product and market (and much more difficult than technical people tend to believe), is a bit unrealistic. And a great opportunity to use Claude/ChatGPT for something other than code. There's no silver bullet but as a springboard you can think about:

Who is your ideal customer profile (look up buyer personas) -- if you're B2B figure out both the profile of the company who would buy, as well as the person who would actually buy, and the person who would actually use the software: remember that buyer != user in B2B scenarios, and you'll have to figure out if the buyer, user or both is the best path to getting a sale. If you're B2C figure out your buyer personas so you know where to advertise.

Why would people want your product; sounds like you may already have this down but be ready to explain your value proposition concisely.

How will these people hear about your product -- a SaaS that falls in the woods doesn't make a sound, you need people to learn your product exists before they can pay for it. This is the point of figuring out buyer personas, you need to meet your customers where they are, and you can't know where they are unless you know who they are. This is highly dependent on your product/personas, and could range from running LinkedIn ads to SEO to having a Bluesky brand account to going to local meetup groups or conferences and trying to get your first handful of users in-person.


I really appreciate you writing this down. Thank you. This has helped a lot more than you probably think.

Get a dozen users word of mouth? They will tell friends? Won’t scale forever but it gets you going.

Sorry to burst your bubble but the cost of creating software has not plummeted; the cost of creating bloatware definitely has.

Whilst I understand your sentiment and don't get me wrong, there's _a lot_ of bloatware out there now, the game has fundamentally changed.

The sooner you realise this the better you will be moving forward. I won't debate you here, only returned to thank the other user that helped. Reflect on what is possible now compared to 3 years ago.


Interesting reading this because this is essentially the principle behind https://socketcluster.io/ scalability; the sharding of channels across available brokers is pseudo-random. It uses a hash function for determinism but the distribution appears to be random and that was also the best way I could find to distribute load evenly between available nodes. It is key to its embarrassingly parallel design.

It's interesting to see it being done at the data centre level as well.
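To make the hash-based sharding idea concrete, here is a minimal sketch. It assumes a plain "hash the channel name, take it modulo the broker count" scheme; the function names and broker labels are hypothetical, and this is not claimed to be SocketCluster's actual mapping.

```python
import hashlib
from collections import Counter


def pick_broker(channel: str, brokers: list) -> str:
    """Map a channel name to a broker. Same input always gives the same broker."""
    digest = hashlib.sha256(channel.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the broker count.
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


def distribution(channels: list, brokers: list) -> Counter:
    """Count how many channels land on each broker."""
    return Counter(pick_broker(c, brokers) for c in channels)


if __name__ == "__main__":
    brokers = ["broker-0", "broker-1", "broker-2", "broker-3"]  # hypothetical names
    channels = [f"chat/room-{i}" for i in range(10_000)]
    for broker, count in sorted(distribution(channels, brokers).items()):
        print(broker, count)
```

The mapping is deterministic, so every node can compute it independently, yet the counts per broker come out close to even. One caveat of plain modulo: changing the number of brokers remaps most channels, which is why schemes like consistent hashing exist.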


That's a different thing entirely; that assumes you already have a physical layer that allows any client to connect to any broker. This is about building that physical layer.

I started building something pretty obscure about 14 years ago; https://socketcluster.io/ an open source, WebSocket-based RPC + pub/sub library with a focus on in-order async stream-processing with backpressure monitoring.

It didn't start out like that. Initially, it was just another WebSocket library with a focus on making it easier to scale to multiple processes.

It's kind of mind-bending to me though that it still feels like it's "too early." You'd think that the ability to efficiently process RPCs and pub/sub messages from clients whilst maintaining ordering would be critical... Yet if you look around the industry, callback-based event handlers are still the norm for most application logic and people are still not using queues where they should be. People think of queues as some expensive/bulky system with overhead which requires additional architecture (e.g. RabbitMQ, Kafka, STOMP, NSQ) and always requires exactly-once delivery; they have not tried to make the idea a core part of their application logic. Software today is FULL of race conditions because of this blind spot. Yet I still cannot communicate my message. It's too difficult to explain the benefits.
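A small sketch of the ordering point, using Python's asyncio rather than any SocketCluster API. The handler, message shape and delays are all made up for illustration: firing a handler per message lets a slow message finish after later ones, while a single consumer draining an in-process queue keeps the original order.

```python
import asyncio


async def handle(message: dict, log: list) -> None:
    """Simulated handler whose latency differs per message."""
    await asyncio.sleep(message["delay"])
    log.append(message["id"])


async def callback_style(messages: list) -> list:
    """Every message starts its own handler at once; completion order follows latency."""
    log = []
    await asyncio.gather(*(handle(m, log) for m in messages))
    return log


async def queue_style(messages: list) -> list:
    """One consumer drains a FIFO queue; each message finishes before the next starts."""
    log = []
    queue = asyncio.Queue()
    for m in messages:
        queue.put_nowait(m)

    async def consumer() -> None:
        while True:
            m = await queue.get()
            await handle(m, log)
            queue.task_done()

    worker = asyncio.create_task(consumer())
    await queue.join()  # wait until every queued message has been handled
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return log


MESSAGES = [
    {"id": 1, "delay": 0.03},
    {"id": 2, "delay": 0.01},
    {"id": 3, "delay": 0.02},
]

if __name__ == "__main__":
    print(asyncio.run(callback_style(MESSAGES)))  # [2, 3, 1]
    print(asyncio.run(queue_style(MESSAGES)))     # [1, 2, 3]
```

Nothing here needs a broker or extra infrastructure; the queue is just a data structure inside the application logic.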


I had a similar issue. The blind spot was unit tests.

I think the issue is just that it's incredibly hard to sell an abstract idea and incredibly hard to convince people to abandon ingrained habit.

I created a testing framework where you wrote half a test in YAML and the framework filled in the rest based on program output.

It made writing tests quick, easy and even kinda fun.
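Roughly the idea, as a hypothetical sketch and not the actual framework: plain dicts stand in for the YAML files, and `run_story` and `slugify` are invented names. A story without an expected value gets it recorded from the program's real output; a story that already has one is checked against it.

```python
def run_story(story: dict, program) -> dict:
    """Run one story. Record 'expect' if it is missing, otherwise compare."""
    actual = program(**story["given"])
    if "expect" not in story:
        # Half-written test: fill in the expectation from real output.
        return {**story, "expect": actual, "status": "recorded"}
    status = "passed" if story["expect"] == actual else "failed"
    return {**story, "status": status}


def slugify(title: str) -> str:
    """Toy program under test."""
    return "-".join(title.lower().split())


stories = [
    {"name": "simple title", "given": {"title": "Hello World"}},
    {"name": "extra spaces", "given": {"title": "  Many   Spaces "}, "expect": "many-spaces"},
]

results = [run_story(s, slugify) for s in stories]

if __name__ == "__main__":
    for r in results:
        print(r["name"], r["status"], r["expect"])
```

The recorded value still needs a human to glance at it before it is committed, otherwise the test only pins down whatever the program happens to do today.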

Moreover if you added a bit of explanation prose to the YAML and used a slightly nicer example scenario it would generate you guaranteed up-to-date readable markdown how to docs. For free.

But, these things are culturally chorey and there's a shame culture built around them.


If people aren’t doing as you describe, maybe it’s not cracked up to all you think it is?

I can’t think of many places, even one if I’m being honest, where I’ve needed what you describe.


It makes me wonder about the state of their codebase if devs need to consume more than $1500 per month.

It's interesting that AI is finally forcing businesses to think about coding maintenance costs though.

When I started working on https://saasufy.com/ as a dev tool many years ago, I was frustrated that no big company cared about software maintenance costs and I really couldn't imagine a world where maintenance costs would be a problem (which is what my platform was addressing). So this is one positive thing from my perspective, I guess. But how much longer before people put 2-and-2 together and realize that architectural complexity is the leading cause? That's the real moment I'm still waiting for.

Will what's left of the socio-economic system be sufficiently capitalist that I will be able to capitalize on that? That's my next problem.


Why do you think the cap has anything to do with the quality of their codebase? Employees could've been tokenmaxxing for various reasons: learning, experimenting, trying to impress the management, ... Naturally, this leads to AI spending skyrocketing while the business value may not be totally clear. Which leads to caps being introduced to keep the budget under control and discourage/limit tokenmaxxing.

It's based on my experience as a software engineer who has worked on both clean and messy codebases with AI.

It's a very different experience with a messy codebase. In this case, the agent spends most of its time trying to gather the relevant context and it's like a game of whac-a-mole. The agent burns through tokens and can take a long time to resolve the issue with a lot of human intervention required. I would say it takes possibly just as long or longer than a human engineer would. Also, psychologically, the temptation for the engineer to trust the AI is massive because they don't want to load themselves up with all that ugly, complex context. They are more likely to let the agent create more hacks on top.

On a relatively well-structured codebase with loose coupling and high cohesion, the experience is usually very positive, mind-blowing, even; because it feels like the agent is reading your mind and fast-forwarding you. You don't need to correct it as much. And when you do, it's usually minor things.

The first case represents a net loss of value because tech debt is being added and compounding the complexity each time a problem is 'solved'. On the other hand, the second case is a significant speedup, for me, I would say it's at least a 5x speedup. I love using AI in this way. I'm in control and not at the mercy of the agent.


I don't argue against the fact that codebase complexity increases token consumption on building context. My main point was that there are other factors affecting token consumption beyond just codebase complexity. Some of them may be related to engineering culture (verbose logs, flaky tests, lack of docs, weird hacks all over the place, etc.), some of them are organizational/social.

Sure. A lot of these things tend to go together. Weird hacks is a bad one. Those AI agents love to cheat and if they see highly elaborate hacks in the code, they won't hold back either.

I have no idea how much I’ve spent, it’s invisible to me, the company doesn’t share it with me. I have no idea what “1 credit” means in terms of $$$: is that $1? $0.10? $0.01? Is it even a fixed price? I have no idea how much a given task will cost. Well, I can ask for a plan and extrapolate from that, but all perfectly reasonable looking plans eventually end up in a rabbit hole. Providers keep introducing new models and each is more expensive while offering modest improvements, it’s a silent inflation.

So I personally can easily believe that. Especially since a lot of people will just try to see if the model can make that huge improvement or refactoring they’ve been hoping for a reality, or run tons of experiments to validate ideas.


If for each story the developer needs to fetch context for tens of microservices I could see them using a lot of tokens.

True. I've worked on projects which required updating 3+ repos for each feature. Required carefully-timed staggered deployments.

It's often a sign of poor separation of concerns. Tight coupling and low cohesion.

On a good codebase with microservices, this should happen on rare occasions, but not every single time you add a new feature. Been there. Agreed those are particularly hard to work with using AI.


I've been advocating for this approach for years. It's useful for any kind of data processing. You can't avoid race conditions without using some kind of queueing mechanism and you need backpressure to measure queue capacity. I built this into every aspect of https://socketcluster.io/ - From pub/sub channels, RPCs to event listeners.
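For anyone wondering what "backpressure to measure queue capacity" looks like in practice, here is a minimal asyncio sketch. The `MonitoredQueue` class and its `backpressure` property are hypothetical names for illustration, not SocketCluster's API: the backlog size is the backpressure reading, and a bounded queue makes a fast producer wait instead of growing memory without limit.

```python
import asyncio


class MonitoredQueue:
    """A bounded FIFO queue that exposes its backlog as a backpressure reading."""

    def __init__(self, capacity: int) -> None:
        self._queue = asyncio.Queue(maxsize=capacity)
        self.peak = 0  # highest backlog seen so far

    @property
    def backpressure(self) -> int:
        """Number of items waiting to be processed."""
        return self._queue.qsize()

    async def put(self, item) -> None:
        await self._queue.put(item)  # waits while the queue is full
        self.peak = max(self.peak, self.backpressure)

    async def get(self):
        return await self._queue.get()


async def demo(total: int = 20, capacity: int = 5):
    queue = MonitoredQueue(capacity)
    processed = []

    async def producer() -> None:
        for i in range(total):
            await queue.put(i)

    async def consumer() -> None:
        for _ in range(total):
            item = await queue.get()
            await asyncio.sleep(0.001)  # simulated slow work
            processed.append(item)

    await asyncio.gather(producer(), consumer())
    return processed, queue.peak, queue.backpressure


if __name__ == "__main__":
    processed, peak, remaining = asyncio.run(demo())
    print(processed == list(range(20)), peak, remaining)
```

The items come out in the order they went in, the backlog never exceeds the capacity, and the reading drops back to zero once the consumer catches up. A real system would act on the reading, for example by slowing or disconnecting a client whose backlog keeps growing.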


This. The core problem is that people assume that all software is necessarily unreliable.

The fact is that, because they themselves are not capable of producing perfectly reliable software, they assume that everyone else is the same. With this narrow-minded worldview, you would expect software to require constant updates as the maintainer is essentially playing a never-ending game of whac-a-mole.

Not all technologies change. Often, low-level engine APIs are very stable and essentially never change... So why should the software built on top change?

According to OP, the kind of reliable software that we need in the AI slop era would fall in the category of 'dead project'. So they are doomed to create AI slop on top of other AI slop. Good luck to them.


It's interesting reading this.

Preventing these kinds of concurrency issues is exactly why I built https://socketcluster.io years ago. Though it solves the problem at the app layer rather than the storage layer.

But not many developers care about these race conditions it seems.

It's not just an issue with SQL but a more general issue with many programming languages and approaches.

This is a great example because it shows how concurrent executions can lead to significant issues.
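A generic illustration of the kind of race I mean (not the article's SQL scenario; the account dict and function names are invented for this sketch). Several tasks read a value, yield to the event loop, then write back a result based on a stale read, so updates are lost. Processing the same operations one at a time from a queue gives the right answer.

```python
import asyncio


async def unsafe_deposit(account: dict, amount: int) -> None:
    balance = account["balance"]            # read
    await asyncio.sleep(0)                  # stand-in for I/O; other tasks run here
    account["balance"] = balance + amount   # write based on a possibly stale read


async def run_concurrently(n: int) -> int:
    """All deposits start at once, so every task reads the same starting balance."""
    account = {"balance": 0}
    await asyncio.gather(*(unsafe_deposit(account, 10) for _ in range(n)))
    return account["balance"]


async def run_queued(n: int) -> int:
    """Deposits are queued and applied one at a time, so no read is stale."""
    account = {"balance": 0}
    queue = asyncio.Queue()
    for _ in range(n):
        queue.put_nowait(10)
    while not queue.empty():
        await unsafe_deposit(account, queue.get_nowait())
    return account["balance"]


if __name__ == "__main__":
    print(asyncio.run(run_concurrently(5)))  # 10: four of the five updates are lost
    print(asyncio.run(run_queued(5)))        # 50
```

The deposit function is identical in both cases; only the scheduling differs.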


A no-code platform packaged as an AI tool for building data-driven applications; it also serves as a data store that lets AI interact with your data: https://saasufy.com/ - Tested with Claude Code and pi.


This is why I built https://saasufy.com/ - There are 23 generic HTML components which can be assembled to render any kind of data, plus form elements to update the data (or show errors when validation fails). It's fully declarative so there is very little room for errors. I find that this helps a lot when working with LLMs. There are no complex bugs. The only kinds of bugs you might encounter are syntax or UX related. No weird race conditions or complex technical issues.

