Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards. Join the community: Create a free account Support DU (and get rid of ads!): Become a Star Member Latest Breaking News Editorials & Other Articles General Discussion The DU Lounge All Forums Issue Forums Culture Forums Alliance Forums Region Forums Support Forums Help & Search

General Discussion

Showing Original Post only (View all)

usonian

(20,179 posts)
Fri Sep 5, 2025, 07:52 PM Friday

Where's the Shovelware? Why AI Coding Claims Don't Add Up --- Mike Judge Sep 03, 2025 [View all]

https://substack.com/inbox/post/172538377

With receipts.

"But lo! men have become the tools of their tools." --- Henry David Thoreau, "Walden"


I was an early adopter of AI coding and a fan until maybe two months ago, when I read the METR study (1) and suddenly got serious doubts. In that study, the authors discovered that developers were unreliable narrators of their own productivity. They thought AI was making them 20% faster, but it was actually making them 19% slower. This shocked me because I had just told someone the week before that I thought AI was only making me about 25% faster, and I was bummed it wasn’t a higher number. I was only off by 5% from the developer’s own incorrect estimates.

This was unsettling. It was impossible not to question if I too were an unreliable narrator of my own experience. Was I hoodwinked by the screens of code flying by and had no way of quantifying whether all that reading and reviewing of code actually took more time in the first place than just doing the thing myself?

So, I started testing my own productivity using a modified methodology from that study. I’d take a task and I’d estimate how long it would take to code if I were doing it by hand, and then I’d flip a coin, heads I’d use AI, and tails I’d just do it myself. Then I’d record when I started and when I ended. That would give me the delta, and I could use the delta to build AI vs no AI charts, and I’d see some trends. I ran that for six weeks, recording all that data, and do you know what I discovered?

I discovered that the data isn’t statistically significant at any meaningful level. That I would need to record new datapoints for another four months just to prove if AI was speeding me up or slowing me down at all. It’s too neck-in-neck.


(1) https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

6 replies = new reply since forum marked as read
Highlight: NoneDon't highlight anything 5 newestHighlight 5 most recent replies
Latest Discussions»General Discussion»Where's the Shovelware? W...