Hacker News

I like how some people are accusing them of reducing overall token usage to screw over Claude Code users, while yet other people are accusing them of deliberately increasing token usage to screw over API users (or maybe to get subscription users to upgrade, I'm not really sure).


I suspect the real issue is that they just change stuff "randomly" and the experience gets worse or better, cheaper or more expensive.

Since you have no way of knowing when they change stuff, you can't really know if they did change something or it's just bias.

I've experienced that so many times in the last month that I switched to codex. The worst part is, it could be entirely in my head. It's so hard to quantify these changes, and the effort it takes isn't worth it to me. I just go by "feeling".


They don't even need to do anything. LLMs are effectively random anyway. Even ignoring temperature and inadvertent nondeterminism in inference, the change in outputs from a change in inputs is unpredictable and basically pseudorandom. That's not to say they aren't useful, just that Anthropic could make zero changes and people would still see variations that they'd attribute to malice.


The issue is business and transparency. Transparency is often in the customer's interest at the individual business's expense.

There are very, very few things that can be completely transparent without giving competitors an advantage. The nice solution to this is to be better and faster than your competitors, but sometimes it's easier just to remove transparency.


I expect "model transparency" to become the new "SSO" enterprise feature differentiator.

Enterprise use cases have to have it (or else pawn the YOLO off on their users), so it will be a key way to bucket customers into non-enterprise vs enterprise pricing.


Nobody is accusing them of making the models more efficient.

People are complaining they are changing how many tokens you get on a subscription plan.

Why would anyone dislike getting more service for less (or the same) amount of money?


> People are complaining they are changing how many tokens you get on a subscription plan.

They didn't change this. It's the same number of tokens, just a different tokenizer.


They absolutely do change this all the time - session limits vary wildly. The most damning proof of this is that there's absolutely no information about how many tokens you get per session with each subscription level, it's just terms like 5x, 20x. But 5x what? Who knows?


That's not proof of anything. Also the usage is not solely based on tokens because you also have to factor in things like prompt caching costs (and savings). So it's based on the actual API cost.


You and I have no way of knowing that.


Except that the API cost is literally logged on disk for every session and it's easy to analyze those logs.
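To illustrate the kind of analysis this comment is pointing at, here is a minimal sketch of summing per-request usage out of session logs. The log location (something like ~/.claude/projects/<project>/<session>.jsonl) and the field names are assumptions about current Claude Code behavior, not a documented spec, so treat them as placeholders:

```python
import json

def sum_usage(jsonl_lines):
    """Total up the usage blocks found in a session's JSONL transcript.

    Field names (message.usage.input_tokens etc.) are assumed, not a spec.
    """
    totals = {"input_tokens": 0, "output_tokens": 0,
              "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0}
    for line in jsonl_lines:
        entry = json.loads(line)
        usage = entry.get("message", {}).get("usage") or {}
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals

# Two fabricated log lines standing in for a real session transcript.
sample = [
    json.dumps({"type": "assistant", "message": {"usage": {
        "input_tokens": 12, "output_tokens": 450,
        "cache_creation_input_tokens": 2000, "cache_read_input_tokens": 0}}}),
    json.dumps({"type": "assistant", "message": {"usage": {
        "input_tokens": 8, "output_tokens": 300,
        "cache_creation_input_tokens": 0, "cache_read_input_tokens": 2000}}}),
]
print(sum_usage(sample))
# {'input_tokens': 20, 'output_tokens': 750, 'cache_creation_input_tokens': 2000, 'cache_read_input_tokens': 2000}
```

In practice you'd glob the real JSONL files instead of using fabricated sample lines; community tools that do exactly this kind of log analysis already exist.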


We aren't talking about API costs or number of tokens consumed, we are talking about number of tokens in a monthly subscription.


Again, it is not based on number of tokens. If it was solely based on number of tokens then things like cache misses would not impact the usage so much. It's based on the actual cost which includes things like the caching costs.
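A rough sketch of why cost and raw token count diverge so much: cache writes and cache reads are typically billed at different multiples of the plain input rate. The rates and multipliers below are illustrative placeholders, not Anthropic's actual price sheet:

```python
# All rates are assumed, per million tokens; the 1.25x write / 0.1x read
# multipliers mirror how prompt-cache pricing is commonly structured.
INPUT_RATE = 3.00          # $/M uncached input tokens (assumed)
OUTPUT_RATE = 15.00        # $/M output tokens (assumed)
CACHE_WRITE_MULT = 1.25    # cache writes cost more than plain input (assumed)
CACHE_READ_MULT = 0.10     # cache hits cost far less (assumed)

def request_cost(input_toks, output_toks, cache_write_toks=0, cache_read_toks=0):
    return (input_toks * INPUT_RATE
            + cache_write_toks * INPUT_RATE * CACHE_WRITE_MULT
            + cache_read_toks * INPUT_RATE * CACHE_READ_MULT
            + output_toks * OUTPUT_RATE) / 1_000_000

# Same 100k prompt tokens, very different cost depending on cache behavior:
hit  = request_cost(0, 1_000, cache_read_toks=100_000)   # warm cache
miss = request_cost(100_000, 1_000)                      # cold cache
print(f"hit: ${hit:.4f}  miss: ${miss:.4f}")
# hit: $0.0450  miss: $0.3150
```

Under these assumed rates, a cache miss on the same prompt costs roughly 7x a cache hit, which is why usage measured in dollars can swing wildly while token counts stay flat.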


I think this is the case. In the early GPT-4 days I tested the same model side by side across the subscription and API. The API always produced a longer, better answer. To me it felt like the API model was working how it was supposed to work while the subscription model tried to reduce its token usage. From a business perspective that would make sense. I then switched to API only because I felt it was worth the extra cost.

I did a similar test with sonnet about 6 months ago and noticed no difference, except that the subscription was way cheaper than API access. This is not the case anymore, at least not for me. The subscription these days only lasts for a few requests before it hits the usage limit and goes over to "extra usage" billing. Last week I burned through my entire subscription budget and $80 worth of extra usage in about 1h. That is not sustainable for me and the reason I started looking at alternatives.

From a business perspective it all makes sense. Anthropic recently gave away a ton of extra usage for free. Now people have balance on their accounts that Anthropic needs to pay for with compute, and suddenly they release a model that seems to burn those tokens faster than ever. Last week I felt like the model did the opposite: it was stopping mid-implementation and forgetting things after only 2 turns. Based on the responses I got, it seemed like they were running out of compute, lobotomized their model and made it think less, give shorter answers etc. Probably they are also doing A/B testing on every change, so my experience might be wildly different from someone else's.


The UIs all bake in system prompts and other tunable configs that the API leaves open, so does Claude Code and other harnesses. So anything you notice different over the API when you're controlling the client is almost certainly that. Note that this is kind of something they have to do because consumer UI users will do stuff like ask models their name or date, or want it to respond politely and compassionately, and get upset/confused when they just get what's in the weights.

The problem with subscriptions for this kind of stuff is that it's just incompatible with their cost structure. The worst part is that subscription usage follows a diurnal pattern that overlaps with business/API users, so peak load has to be offloaded to compute partners who most likely charge by the resource-second. And it's a competitive market: anybody who wants usage-based pricing can just get that.

So you basically end up with adverse selection with consumer subscription models. It's just kind of an incoherent business model that only works when your value proposition is more than just compute (which has a usage-based, pretty fungible market)


> In the early GPT-4 days I tested the same model side by side across the subscription and API. The API always produced a longer better answer.

If you are comparing responses in ChatGPT to the API, it's apples and oranges, since one applies a very opinionated system prompt and the other does not.

Since you haven't figured that out in 3 years, I didn't bother reading the rest of your comment.


this comment feels pretty rude and disrespectful for no real reason?


I don’t know about ChatGPT, but in Claude Code I _have_ been able to do a side-by-side comparison of API-based metered billing vs subscription billing, in the same UI. You just switch from one to the other using /login.

You should probably not be so quick to dismiss what people say as nonsense.


It's almost as if there are different people with different motivations and ideas about how the world should work



