The past 18 months in AI have been a whirlwind-clusterfuck of acceleration, innovation, and chaos.
First, image diffusion models cracked the feedback loop and became radically better in the span of around 6 months. They went from a niche research area to commodified magic at scale, disrupting every visual creative field.
Then LLMs went mainstream. The average Joe suddenly had access to general intelligence in the browser. ChatGPT’s explosion in popularity tapped the economic and feedback flywheels, enabling usability and capabilities to go superlinear.
After decades of being that abstruse thing PhD-wielding geniuses did in some wing of Google, AI finally arrived in every script kiddie’s toolbox. Overnight.
The game was afoot. And it was fast.
Every couple of months, just as new products and services were beginning to find market niches, a major player like OpenAI would release an update, slaughtering thousands of would-be startups in broad daylight. The ruthless sacrifices served as both a terrifying warning against stasis and the catalyst for the next wave of gladiators to try their luck in the arena.
We’re all trying to build the future, but the present keeps shifting beneath our feet.
Yesterday’s SOTA is today’s API wrapper.
A few weeks ago, the first large multi-modal models were “open sourced”, giving AI developers the chance to incorporate visual processing into LLM-driven applications.
At a weekend hackathon, I spent hours configuring and spinning up GPU-accelerated servers to host the openly licensed LLaVA model, wrapped in an API. My team worked late into the night to integrate it with Anthropic’s Claude 2 so we could demo an AI medical counsellor that leveraged both the visual input of LLaVA and the large context window and broad knowledge base of Claude. It enabled us to build something unique: a lightning-fast doctor in your pocket that you could just “show” your issues to, and that could keep track of your entire patient profile to help you find solutions. The “where does it hurt” AI you could call on at any moment.
With a heterogeneous collection of models, expensive GPUs, advanced prompting techniques, and hacky frontend tricks, we were able to build magic.
The next day, OpenAI released multimodal GPT-4, a faster GPT-4-turbo with a context window larger than Claude’s, and an entire low-code platform for building applications with them.
What had required a dozen hours of work, years of specialised knowledge, and hundreds of lines of code the day before became a low-code tutorial project. It also became hundreds of times cheaper to run and maintain.
Yesterday’s cutting-edge is today’s table stakes.
And so the cycle continues, gaining momentum with every loop. Always accelerating.
One person (with a tab open to ChatGPT) can build and launch a revolutionary product in a week. But ideas only come as quickly as synapses can fire, and synapses require sleep and nutrients.
The human nervous system is now the bottleneck.
Developers can only make things when they’re working on them. Inspiration is limited by energy, which is limited by ATP. Adoption is limited by human attention, which is limited by dopamine. Revenue is limited by the human capacity to desire and buy new things, which is limited by the amino-acid clock speed on which we compute our hedonic adaptation.
Already, our not-quite-superintelligent systems have superseded us. Silicon has outstripped carbon by compounding 1% daily improvements we can no longer match.
Someday soon, even OpenAI releases will no longer be rate limited by Greg Brockman and his colleagues needing to eat, sleep, and shit. And then we are truly moving.
But moving towards what?
The sci-fi nerd in me hopes it’s the techno-optimist utopia. Limitless clean energy, curing all disease, abundance of time and resources — winning the cosmic battle over entropy.
But perhaps we’re all destined to work ourselves to irrelevance trying to build our silicon successors, whilst choking on the acrid fumes of burning GPUs.
We have become death. Creators of gods.
But we yearn to create, so we keep on building.