The Next Problem in AI Isn't Intelligence
I think lately we're all seeing how effective AI and agents are at solving quite a few problems. We set an objective, and the AI executes. The more verifiable the domain, the better the agents perform. It's why agents are great at things like coding, while more complex agentic feedback loops require more robust evaluation mechanisms.
The bull case for AI reaching true "general intelligence" is that AI can solve any problem, and is therefore "generally" intelligent. At face value, I think this makes a lot of sense. It's also why today's systems can fairly be called generally intelligent: they provide significant value in accelerating tasks. Yet I think what people really mean by AGI is removing or reducing the human-in-the-loop, not just... changing how much a human can do in a given domain. That's AI-assisted – and it's no different from giving any worker a new tool to improve their productivity.
But when I look at what people really do, it's not just solving problems – it's figuring out what to solve. We like to distill a junior engineer's role down to "given a task, do it," but even juniors have to decide how to spend their time: learning, executing, exploring. The more senior you are, the more obvious this becomes – you don't have time to do everything you want to do, and you need to adjust not just the priority but the scope of your projects to keep everything balanced and oriented toward some global optimum.
The problem is that this global optimum is really nuanced. It's exactly why reward models in RL training are so hard to get right. When you introduce a reward, you may also introduce an edge case that lets the agent short-circuit the reward without actually producing the intended result. It's like when humans get stuck in dopamine loops that don't lead them to better, more fulfilling lives. How do we get these agents to do the most meaningful work they can?
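To make that failure mode concrete, here's a minimal sketch – a toy bandit I made up for illustration, where every name (`proxy_reward`, `true_value`, the two actions) is hypothetical. The agent only ever observes the measurable proxy, so it happily converges on the action that games the metric:

```python
# Toy illustration of reward hacking: the agent optimizes a measurable
# proxy reward that diverges from the outcome we actually wanted.
import random

def proxy_reward(action: str) -> float:
    # What we measure and train against: volume of output produced.
    return {"solve_task": 5.0, "spam_output": 9.0}[action]

def true_value(action: str) -> float:
    # What we actually wanted: the task genuinely completed.
    return {"solve_task": 1.0, "spam_output": 0.0}[action]

def greedy_agent(episodes: int = 1000) -> str:
    # Epsilon-greedy bandit that only ever sees the proxy reward.
    estimates = {"solve_task": 0.0, "spam_output": 0.0}
    counts = {"solve_task": 0, "spam_output": 0}
    for _ in range(episodes):
        if random.random() < 0.1:
            action = random.choice(list(estimates))  # explore
        else:
            action = max(estimates, key=estimates.get)  # exploit
        counts[action] += 1
        # Incremental mean update of the proxy-reward estimate.
        r = proxy_reward(action)
        estimates[action] += (r - estimates[action]) / counts[action]
    return max(estimates, key=estimates.get)

best = greedy_agent()
print(f"agent converges to: {best}")                       # spam_output
print(f"true value of that policy: {true_value(best)}")    # 0.0
```

Nothing in the training loop is buggy – the agent is doing exactly what it was told. The divergence lives entirely in the gap between the reward we can measure and the outcome we actually want.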
This, to me, is the core challenge with the "singularity" version of AGI. There's perhaps a base case, which builds off today's trajectory of increasingly powerful models with increasing autonomy and unclear reward functions. These are likely to have unintended consequences, but I'm not sure they materialize as the AI 2027 future with its complex planning. That future requires an AI with a well-defined strategy and a reward function it can strategize against. Realistically, we probably just get some ineffective agents and messes to clean up.
The reality is we don't know how to craft a good objective function in complex domains. Look at how varied performance evaluation criteria are for human workers. We have multiple performance axes, multiple methods of evaluating performance, and multiple ways to set up organizations. Success is possible under many different frameworks – which means we don't have a silver bullet framework. Instead, we manage to succeed as individuals because we can somehow reconcile all this feedback into progress.
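You can see this "no silver bullet" point in miniature with a toy calculation (the workers, axes, and weights below are entirely made up): the same people rank differently under two equally defensible evaluation frameworks.

```python
# Two plausible ways to weight the same performance axes produce
# opposite rankings of the same workers.
workers = {
    "alice": {"output": 9, "mentorship": 3, "reliability": 6},
    "bob":   {"output": 5, "mentorship": 9, "reliability": 7},
}

frameworks = {
    "ship_fast": {"output": 0.7, "mentorship": 0.1, "reliability": 0.2},
    "grow_team": {"output": 0.2, "mentorship": 0.6, "reliability": 0.2},
}

for name, weights in frameworks.items():
    # Score each worker as a weighted sum over the shared axes.
    scores = {
        w: sum(weights[axis] * axes[axis] for axis in weights)
        for w, axes in workers.items()
    }
    best = max(scores, key=scores.get)
    print(f"{name}: {scores} -> best = {best}")
    # ship_fast picks alice; grow_team picks bob.
```

Neither weighting is wrong. Which one is "correct" depends on what the organization needs right now – and that judgment is exactly the part we don't know how to encode.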
I'm not saying humans are perfect – we certainly aren't – but this challenge of figuring out what we should be doing is hard. When we ask AI to be fully autonomous, we're basically telling it: given what you know, go spend your time wisely. Think of all the competing things going on in your own life – your family, your career ambitions, the last meal you ate, your childhood dreams eating away at you... all of it somehow compiles into a program that directs you into a feedback loop of taking exactly the actions you're taking now.
This is what we need AI to do to be infinitely valuable in the "AGI" sense. We need it to take in all the inputs, understand the nuanced objectives and paths, and take actions that guide it toward a global optimum. We're not there right now, but in the interim we can scope that down to a fixed set of inputs and a less nuanced set of objectives, and reap the rewards of our newly built productivity tools.
These models will get more and more powerful, and in turn encourage us to zoom out and add ambiguity back in. But this ultimately leads back to the core questions of humanity when it comes to guiding the model: how to live a good life, how to do the most impactful things, how to be the most fulfilled? All of human history hasn't produced a consistent answer, and I think it's quite optimistic to believe we can produce one for AI.
But I will leave you with a happy bit of irony. Chasing that optimal objective function isn't just an answer for AI – it's an answer for you as well. Consider all your inputs, all your rewards, what really matters. That fulfillment that guides your prioritization toward the global optimum is one and the same, human or machine.