I had a revelation the other day. Our IT team (top 500 US company) had an MS exec come in and pitch to the entire leadership team (100s of people) on the MS AI strategy and what it could mean for us. 100% sales pitch and corporate bullshit but our execs sucked this guy’s dick so fucking hard. And it hit me… We’re not a tech company. These IT professionals got stuck in this backward company and wish they were working for one of the big five - where the cool kids work. They don’t give any shits about how or whether AI works at a real level, they’re just cosplaying like they are big time tech bros. That’s it, full stop.
The more you take logic out of it the more it makes sense.
Headline sucks; it blames workers when this is a management problem.
At my company we received an email telling us to not use AI for unnecessary tasks. Of course this is after they crammed AI down our throats.
But I bet a fat nickel that they also track token usage and will recognize the top user each month as an “AI Super Hero” or some other nonsensical bullshit.
My employer has been pushing AI as well so I’ve been using Claude to help me when it makes sense. I’ve had it write python scripts to interact with various API’s, had it analyze log files, etc. I might use it two or three times every other day for between 10 to 30 minutes at a time. So while I’m not using it excessively, I’m still using it what I consider to be a reasonable amount.
I checked my Claude account the other day and it said I have used a whopping 1% of my quota for the past month. That really makes me wonder how all these companies are burning through their usage the way they are reporting. Do they have teams of employees now that are doing nothing but interacting with AIs for 8 hours a day? It seems like that would be the only way for me to put a dent in my monthly quota with Claude…
You can use it all day and stay well below the quota. Small context, with the right model for the job. Surgical precision.
But… At some point you shut off your brain, use the most expensive model on the highest reasoning level with your whole codebase as context and just wait for tens of minutes while it burns all the tokens. To speed this up you then send six agents to tackle the same problem from all angles.
I’ve been experimenting a bit with adding LLMs into my workflow, and even when using it constantly for a full 8h workday, it barely uses any of my quota. I’m guessing that those who burn through an excessive number of tokens are probably just letting a bunch of them run unattended and automatically allowing everything. There’s just no way to verify that much of its output.
People are using agentic harness software, where the LLM streams output directly to your computer terminal. With this setup the harness recognizes structured commands from the LLM, which lets it read files on your PC using local tools. This supposedly lets the LLM figure out how to engage with the project better. I’ve consumed 500k tokens in no time using this approach.
We had people using it to reply to emails. And using it for personal purposes.
In other words, “tokenminning,” short for “token minimizing,” is now in.
-
🙏🫧🪡
-
Oh gods, is minning now a thing?
I’ve been maxminming my entire life.
It means I don’t max anything. Highly different from minmaxing of course.
-
Hey guys, let’s make everyone use a product that is going to increase our cost by 30% -Dumbest timeline ever
It’s true happened just this week at work. We are looking into openrouter models now no more anthropic




