Sora, OpenAI’s new text-to-video model, was announced Feb 16 (Hacker News discussion). It’s remarkable: the videos it generates from text prompts capture physical space, motion, and interaction.
I’m planning to go through how Sora works in a mini ‘Hard Parts’ post, but this week let’s explore the questions it raises for the fields I work in: technology and education. What does it actually take to build on top of these kinds of step changes?
We get blown away (and maybe terrified) by the potential of new tools, and yet actually integrating them so often proves far harder (the productivity gains of computers took years to show up in the statistics). Why is it so difficult?
Early adopters are usually stellar creative problem solvers and can make remarkable things happen quickly with the new tools.
The kinds of problems they work on, though, are typically orders of magnitude less entangled with existing systems than those of the people and organizations who might apply the new tools at scale (cf. every Fortune 500 company’s struggle to work out what to do with AI).
More cynically, there’s a degree of incentive to launch products in ways that show off their strengths rather than how they integrate with existing workflows.
How many people have managed to generate an exceptional, high-end image in Midjourney that also serves a creative or commercial goal? Becoming effective with even these transformative tools still requires hard learning and specialization.
That highlights what makes ChatGPT so exceptional: 100M users in two months, plus a Google-like ‘general use’ and ‘progressive user experience’ that introduces you to its possibilities as you use it. Even so, it will take time for people and teams to figure out how best to incorporate it.
This parallel evolution of capability and complexity keeps me optimistic about what the future of AI can achieve, in partnership with humans and not just in competition with them.
I think a lot about this. Preparing exceptional engineers for roles typically reserved for more tenured candidates is ultimately a core part of Codesmith’s mission.
One striking thing about today’s market is the rising demand for engineering roles in nontraditional areas like healthcare, finance, and energy. What unites these domains is that, until recently, their complexity made it difficult to deploy technology at scale.
Two related shifts show how this is changing.
On one side, large tech firms that have dominated relatively straightforward domains such as e-commerce are running out of runway, and are increasingly looking to these more complex fields for growth.
On the other, traditional organizations are building mature engineering teams of their own: JPMorgan, Blue Cross Blue Shield, and the New York Times all hired multiple Codesmith grads in 2023, including into leadership positions.
Meanwhile, pioneering AI tools are making complex domains more tractable, allowing engineers to tackle problems that were previously unassailable.
As this unfolds, we wrestle with how best to prepare software engineers of the future.
The tools will evolve; we’re adding modules on subjects like copilot tooling, fine-tuning models, and operationalizing ML. But we won’t change our focus on capacities rather than skills.
Skills run a never-ending risk of becoming obsolete. Capacities transcend contextual shifts and retain their relevance, whether in software engineering or elsewhere.
I’ll explore this further in future posts, but here I’ll share an experience that helped me grasp the distinction between a capacity and a skill, and that eventually inspired me to start Codesmith.
When I was a university student at Oxford, I had the opportunity to meet regularly with a brilliant political science professor, Walter Mattli, in one-on-one and small group settings.
I would bring essays that I naïvely thought comprehensively dissected the topic of the week. The next week I’d do the same for the next topic. Steadily I compiled a collection of what were essentially listicles of facts and observations.
Mattli pushed us all towards a deeper level of understanding: one that goes under the hood of ‘what’ is happening, down to a reusable explanatory framework for ‘why and how’ it happened. In other words, toward a model: a system with an input, a process, and an output.
As trivial as this simplification sounds, building a model is profoundly hard. In political science, in AI, and elsewhere.
I had no idea this way of thinking even existed, and it was a long time before I could start to build and apply mental models of my own.
Beyond its usefulness, being taught a capacity can be identity-shaping. The regard our professor demonstrated for our understanding legitimized not just the lessons being taught, but the person doing the learning. Equally if not more important than the capacity itself was the care he showed in passing it on.
From the very start of Codesmith, that’s been the basic goal underlying all the conspicuous program features (pair programming, challenging open-source work, the free ‘Hard Parts’ workshops). They’re all really about empowering ‘residents’ (students) with capacities, by trying to emulate the approach of my professor back at Oxford.
This enables grads to tackle problems they haven’t seen before, including those that have only recently entered the scope of what an engineer can do.
It’s not easy to build capacity, or instill it in others. It takes genuine care, from both the learner and the instructor.
At the time I was mystified as to why Mattli did it. Why did he leave a prestigious faculty position in the States for a role that demanded more teaching?
Why does Eric (on the Codesmith team) spend so much time advising residents on their job search when he could easily go back to a high-end role at a tech company, with all its perks?
My sense is that they’ve come to realize the limits of powerful new tools on their own, and the need for people with the capacity to wield those tools and to navigate the systems they shape for relative good.
And where they find a community with a mix of people reflective of society as a whole, they are compelled to instill in them the confidence and the capacity to do so.
With thanks to Ciaran for the illustrations: “precut holes only ready to accept ‘legacy shapes’”
The use of copilot tools accelerates the need for system design skills. Small bugs in boilerplate setup are easier to fix, but the amount of code you can write without really understanding it has gone up 5-10x.
In some ways LLMs are ushering in an era similar to what PHP did in the mid-’90s, when it allowed any shitty engineer to build a working CRUD app of their own, while generating almost as many dysfunctional portions of the codebase along the way.
Another item I’m undecided on is whether copilots accelerate the skills of senior or junior engineers more. If you’re junior you can hop over a lot of small puddles 10x faster, but the tools have a tendency to autocomplete what they think you already want to do, meaning they can accelerate the build-out of your inferior design patterns if you’re not good at what you’re doing.
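To make that concrete, here’s a hypothetical sketch (the handlers, data shapes, and duplication are all invented for illustration) of how an autocomplete tool can entrench a pattern: the second handler is the kind of completion a copilot tends to offer after seeing the first, duplication included.

```typescript
// Hypothetical illustration: two handlers, where the second mirrors the
// first because that's what the tool predicts you want next.

type User = { id: number; name: string; email: string };
const users: User[] = [];

// Handler written by hand, with validation inlined.
function createUser(body: { name?: string; email?: string }) {
  if (!body.name) return { status: 400, error: "name required" };
  if (!body.email) return { status: 400, error: "email required" };
  const user: User = { id: users.length + 1, name: body.name, email: body.email };
  users.push(user);
  return { status: 201, user };
}

// Handler "autocompleted" in the same style. The inlined validation (and any
// bug hiding in it) now lives in two places; by the tenth handler it's a
// design problem no single suggestion will surface.
function updateUser(id: number, body: { name?: string; email?: string }) {
  if (!body.name) return { status: 400, error: "name required" };
  if (!body.email) return { status: 400, error: "email required" };
  const user = users.find((u) => u.id === id);
  if (!user) return { status: 404, error: "user not found" };
  user.name = body.name;
  user.email = body.email;
  return { status: 200, user };
}

console.log(createUser({ name: "Ada", email: "ada@example.com" }));
console.log(updateUser(1, { name: "Ada Lovelace", email: "ada@example.com" }));
```

Nothing here fails a compile or a demo, which is the point: noticing that the validation belongs in one shared place is a system design judgment, and it’s the one thing the tool won’t autocomplete for you.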