Tokens: The New Oil

How to manage API rate limits and token costs during hypergrowth

When your AI product suddenly scales from hundreds to thousands of users, token consumption becomes an existential constraint, not a cost-optimization problem. Kent Beck argues that expansion-stage growth demands a mental shift: abandon the careful, reversible decisions of exploration and treat token capacity like ammunition in wartime. Your job is to push bottlenecks into the future with rapid, imperfect solutions (multiple API accounts, aggressive caching, feature deletion) rather than elegant architecture.


Questions this essay answers

  • How do I prioritize speed over code quality when API limits threaten to kill my product?
  • Should I reduce users or delete features to stay under rate limits during hypergrowth?
  • What's the fastest way to scale token capacity when investors are throwing money at you but API limits won't budge?