Weekly AI Reads for 7/7/2025
Cracks in transformer hype and the promise of diffusion models, datasets are all you need, GaslightGPT
A HUGE First Week
I want to thank you all for reading, commenting and sharing wtf.ai with your friends and colleagues. It means the world to me that anyone would be interested in my thoughts and opinions on the state of AI today. I’ve also been surprised by how many smart people have come forward with personal stories of how the current state of AI has not aligned with the narrative on HackerNews or the drumbeat of the investor class. We hear you!
That said, AI clearly has demonstrable business value, and I don’t want this to just be a blog taking easy AI pot shots. The clever will find a way to leverage the current state of AI tooling to their advantage, reducing costs or improving the level of service in a way that would be impossible through traditional means. I'd love to share more of those stories — has generative AI tooling made a difference in your business or role? Please let me know!
Transformers aren’t all you need?
This week, the realities of AI coding tools and the limitations of transformers went mainstream - every third post on Reddit and Hacker News complained about gaslighting, sycophantic LLMs or the rising cost of Cursor, or highlighted vibe-coded messes that will be costly to reimplement. Transformers, RLHF and Deep Reasoning sparked the generative AI revolution and laid the tracks for the next great technology race, but they’re likely not going to bring about AGI.
So what are the options for the next technology shift in AI? Interest is beginning to grow around Diffusion Language Models. Diffusion Models were popularized for image and video generation, but they may find new application in text generation as well. Whereas Transformers generate one output token at a time, Diffusion Models generate several output tokens simultaneously, increasing output speed and reducing cost. Inception Labs' Mercury is the first commercial implementation, and Gemini Diffusion is not far behind.
Diffusion might also offer a solution to objective drift, where a model gradually loses track of the original intent of the prompt: because the entire output is generated and refined as a whole, the full objective stays in view at every refinement step.
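To make the contrast concrete, here's a toy Python sketch of the two decoding styles. Everything in it is a hypothetical stand-in (random choices instead of real model calls, a five-word vocabulary), not the Mercury or Gemini Diffusion API:

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]
MASK = "<mask>"

def autoregressive_decode(prompt, length=5):
    # One token per step; each step conditions only on the prefix,
    # so later tokens can drift away from the prompt's intent.
    tokens = list(prompt)
    for _ in range(length):
        tokens.append(random.choice(VOCAB))  # stand-in for next-token sampling
    return tokens

def diffusion_decode(prompt, length=5, steps=4):
    # All positions start masked and are refined in parallel; each
    # pass re-scores the whole output against the prompt at once.
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        for i, tok in enumerate(tokens):
            # Stand-in for the denoiser: unmask the positions it is
            # most confident about, more of them with each pass.
            if tok == MASK and random.random() < step / steps:
                tokens[i] = random.choice(VOCAB)
    return list(prompt) + tokens

print(autoregressive_decode(["where", "is", "the", "cat", "?"]))
print(diffusion_decode(["where", "is", "the", "cat", "?"]))
```

The structural difference is in the loops: the autoregressive version makes `length` sequential model calls, while the diffusion version makes `steps` passes that each touch every position, which is where the speed and anti-drift claims come from.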
This is just one example of what might come next; I’ll dive into others next week.
Topics this Week
There are no new ideas in AI, only new datasets
Link to original article.
We just spoke about how Transformers might have reached their peak, but Jack Morris frames the problem slightly differently — maybe AI is really just a data problem.
The obvious takeaway is that our next paradigm shift isn’t going to come from an improvement to RL or a fancy new type of neural net. It’s going to come when we unlock a source of data that we haven’t accessed before, or haven’t properly harnessed yet.
One obvious source of information that a lot of people are working towards harnessing is video. According to a random site on the Web, about 500 hours of video footage are uploaded to YouTube *per minute*. This is a ridiculous amount of data, much more than is available as text on the entire internet. It’s potentially a much richer source of information too as videos contain not just words but the inflection behind them as well as rich information about physics and culture that just can’t be gleaned from text.
...
A final contender for the next “big paradigm” in AI is a data-gathering system that is in some way embodied – or, in the words of a regular person, robots. We’re currently not able to gather and process information from cameras and sensors in a way that’s amenable to training large models on GPUs. If we could build smarter sensors or scale our computers up until they can handle the massive influx of data from a robot with ease, we might be able to use this data in a beneficial way.
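For a sense of scale, here's a quick back-of-the-envelope calculation using that 500-hours-per-minute figure (the rate itself is the article's claim, not mine):

```python
hours_per_minute = 500
minutes_per_year = 60 * 24 * 365
hours_per_year = hours_per_minute * minutes_per_year
print(f"{hours_per_year:,} hours of video uploaded per year")  # 262,800,000
```

That's over a quarter of a billion hours of new video every year, before you even touch the back catalog.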
First, data is the real play here. If your organization has a significant amount of proprietary data that is not internet accessible, you should be looking at ways to monetize or productize that data right now.
Second, I want to play devil's advocate on Jack's post — we're going to run out of novel, low-background steel (human-generated data untouched by model output) very quickly. Will techniques emerge that let models train on fewer samples?
Writing code was never the bottleneck
Link to original article.
The marginal cost of adding new software is approaching zero, especially with LLMs. But what is the price of understanding, testing, and trusting that code? Higher than ever.
We've known since the 1980s that lines of code are a poor predictor of a developer's productivity. This is doubly true in the LLM era.
But LLMs don’t remove the need for clear thinking, careful review, and thoughtful design. If anything, those become even more important as more code gets generated.
Tavares gets it right here. If you're a good developer in the current era, you'd be smart to work on your ability to communicate in spoken and written form and to tell your colleagues the story behind your work. AI can’t do that just yet.
Maybe diffusion is all you need?
Link to original video.
If you're a visual learner like me, this talk by Oleksiy Syvokon on Diffusion Language Models is a gem. He even demonstrates it in action inside the very cool Zed IDE.
Moment of Zen: GaslightGPT
Link to original article.
Maybe at some point I'll tire of ChatGPT schadenfreude, but… not yet.