Distributed Rebellion: A thesis on crypto x AI from Delphi Labs
AI represents arguably the biggest technological revolution in history, and has kickstarted a technological arms race the likes of which the world has never seen before. Current AI models are already scoring in the top decile on most standardized college tests and outperforming humans at many tasks including AI research itself. Even at its current level, this is already transformative to many industries such as search, customer service, content creation, programming, education, and more.
We expect AI's capabilities, funding, and effect on society to only accelerate from here. All the big tech giants understand AI is existential to their businesses and are investing accordingly. NVIDIA's revenue, arguably the best proxy for AI CapEx, is on track to exceed $100b in 2024, more than double that of 2023 and more than 4x that of the year before.
Google CEO Sundar Pichai on AI investments:
"The risk of underinvesting is dramatically greater than the risk of overinvesting for us here."
At the same time, startups sense that AI is a disruptive force with which they can unseat multi-decade incumbents, and an estimated $83b has been invested into AI startups over the last 18 months.
Given that AI capabilities have tended to scale exponentially with the compute applied to them, it's very likely we will reach something like AGI within the decade.
In this piece, we argue that competitive dynamics will result in a world of millions of models, and crypto is the ideal substrate for this many-model world. We'll start by discussing why we think a many-models world is the logical end-game for AI. We then go over the unique differentiators crypto provides to AI. Finally, we cover the crypto x AI stack as we see it, and provide specific examples of the kinds of projects we're excited about.
There are strong philosophical and moral reasons why open-source AI and crypto x AI are a better state of affairs for humanity, and these are excellently covered elsewhere. While we agree with them entirely and this is part of what motivates us to build in this space, for the purposes of this piece we will focus purely on the practical reasons why crypto x AI will win, rather than the moral arguments for why it should win.
God-model vs many-models
Right now, we're tracking towards a world where a few large, vertically-integrated tech companies produce "God-models" that dominate everything else.
However, we don't think this is the end-game, for a few reasons:
Rug risk: Organisations, entrepreneurs and developers building experiences on top of AI don't want to be dependent on a single closed-source company which can change the model, alter the terms of use, or even stop serving them entirely.
Cost-performance tradeoff: The extremely large, generalised models favoured by the big tech companies are necessarily much more expensive, both to train and to run. This renders them overpriced and overpowered for many use cases. While this isn't as big a consideration right now as people aren't thinking about profitability, as AI reaches scale people will optimise to get the lowest cost possible for the level of performance they're looking for. For many tasks, large models will not be competitive here. There is extensive research to support this, showing much smaller, specialised models can outperform the generalised models at everything from medical imaging diagnosis to fraud detection, speech recognition and much more.
Vertical integration: As Apple has repeatedly demonstrated, the best products often result from vertical integration across the entire stack. Ambitious entrepreneurs building AI-enabled products will seek to gain a competitive advantage by building on top of their own specialised models.
These products will also be able to capture more value, attracting more investment, etc.
Privacy concerns: AI will be at the core of organisational workflows in a way that arguably no other technology has been. Many organisations are reluctant to entrust their sensitive data to these models.
For these reasons, we believe we're much more likely to end up in a world with many smaller, specialised models that are tailored and cost-effective for particular use cases. Application developers and users will leverage open source models such as LLaMA or those from @MistralAI as a base from which to fine-tune their own dedicated models, often using proprietary data. Many models will continue to run on servers, but smaller, more privacy-sensitive applications will run locally on client devices, while those that require censorship-resistance might use decentralised compute networks.
This is a world of modular AI legos, where devs and entrepreneurs compete to provide value to users, and users are able to pick, choose and combine different services to suit their particular needs. Routing, orchestration, synthesis, payments, and all sorts of other infrastructure will need to be built to unbundle the "God-model" stack and serve this emergent AI economy.
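To make the cost-performance argument concrete, here is a minimal sketch of the kind of routing logic such infrastructure might include: pick the cheapest model that clears a quality bar for a given task. The model names, prices and quality scores below are made-up illustrative values, and `cheapest_adequate` is a hypothetical helper, not any real router's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # illustrative price, arbitrary units
    quality: float             # illustrative task-specific score in [0, 1]


# Hypothetical catalogue for one narrow task (e.g. fraud detection).
# All numbers are invented for illustration only.
CATALOGUE = [
    ModelOption("general-large", cost_per_1k_tokens=10.0, quality=0.92),
    ModelOption("specialised-small", cost_per_1k_tokens=0.5, quality=0.94),
    ModelOption("tiny-local", cost_per_1k_tokens=0.05, quality=0.70),
]


def cheapest_adequate(options: list[ModelOption], min_quality: float) -> Optional[ModelOption]:
    """Return the lowest-cost model whose quality meets the bar, or None."""
    adequate = [m for m in options if m.quality >= min_quality]
    if not adequate:
        return None
    return min(adequate, key=lambda m: m.cost_per_1k_tokens)


if __name__ == "__main__":
    choice = cheapest_adequate(CATALOGUE, min_quality=0.9)
    print(choice.name if choice else "no model meets the bar")
```

Under these assumed numbers the large general model is never selected for this task: it is adequate, but a cheaper specialised model is adequate too. A real router would also need to account for latency, availability and trust in the provider.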
This also happens to be the world where crypto thrives.
Crypto x AI
Crypto intuitively feels like an area which can find utility in this many-models world. However, the resulting hype has led to significant capital allocation in the space, often from under-informed investors. Much like the infra bubble before it, many projects are being funded and built which perhaps should not be. As a result, it's not easy to determine which subsectors in the crypto x AI space genuinely have merit, leading many to dismiss the whole space as a meme without fundamental value.
We don't think it's a meme, but it's true that this many-models world could theoretically exist without crypto. Therefore, it was important for us to focus on the unique properties of crypto that allow us to create radically better products or, ideally, ones that couldn't be built without it. In order to do this, we start by identifying those properties and how they could apply to AI in a way that results in better products. We'll then go over the crypto x AI stack and provide examples of use cases that we think fit this.
Trustlessness: Crypto rails tend to be trustless, which means you can have cryptographic assurances that they don't change, that access cannot be unexpectedly withdrawn, and that execution is as expected. This is important for the modular AI stack because, unlike with an integrated approach, builders will need to compose with a bunch of primitives they don't control and users will need to inherently trust a number of services, many of which they don't even know about.
Censorship-resistance: If deployed as immutable contracts, applications running on crypto rails are unstoppable. Even where they are upgradeable, upgrades are often controlled by a DAO, which requires a quorum of tokenholders to reach consensus. Assuming AI becomes as powerful as we expect, it's highly likely governments will seek to control and influence it. In fact, we're already seeing this happen. Just as Bitcoin and crypto provide money/financial rails that sit outside the system, crypto x AI provides unstoppable intelligence.
The crypto x AI stack
Given these benefits, what applications do we think are particularly interesting at the intersection of crypto x AI?
Data Centers and Compute
The utility of compute for models broadly falls into two categories: training and inference. We see merit in using decentralised compute for both of these and we'll expand on each below.
Training on Decentralised Compute
Distributed training is currently difficult due to the heavy communication and latency requirements between nodes. There are many teams trying to solve this problem and, given the size of the prize and the quality of talent working on it, we're fairly confident it will be solved. A few promising approaches here include @NousResearch's DisTrO and @PrimeIntellect's OpenDiLoCo.
In addition to solving the hard technical problems of distributed training and building a product that abstracts away this complexity, winners will also have to figure out:
1. How to ensure quality and accountability on a permissionless network
2. How to bootstrap a supply-side, ideally of data centers and clusters rather than consumer hardware
Token incentives will probably be table stakes for incentivising a supply-side, and more creative approaches may include giving compute providers ownership in the resulting model.
Fundamentally, the advantage of a distributed compute marketplace is that you can tap into the lowest marginal cost of compute around the world. This becomes increasingly important as rising costs from incumbent service providers cause more companies/orgs to push back and seek out cheaper alternatives. The disadvantages are latency, heterogeneous hardware, and the lack of all the optimisations and economies of scale that come from building and operating your own data centers. It remains to be seen how this plays out.
Verifiable Inference
Broadly, we see the use case for verifiable inference as extending trust-minimised systems with AI capabilities. It's not practical to embed a model into a smart contract, but it is possible to run the model off-chain and post an attestation or proof on-chain that it ran as expected. For instance, projects could trustlessly offload governance decisions (e.g. decisions regarding risk parameters in a money-market) to an off-chain model.
Verifiable inference could also be used for open or closed-source models more generally, giving users assurances that the output came from the model they expected. This may become important as applications and users leverage AI for increasingly mission-critical tasks. Many projects are tackling this in various ways, including Delphi Ventures portfolio company Inference Labs (@inference_labs).
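As a rough illustration of the "run off-chain, attest on-chain" idea, the sketch below has an operator publish hash commitments to the model, input and output, and a verifier check them by re-executing a deterministic toy model. Everything here is an assumption made for illustration: `toy_model`, `Attestation`, `run_and_attest` and `verify` are hypothetical names, and naive re-execution stands in for whatever proof system a real project would use (we would expect approaches such as zero-knowledge proofs, trusted hardware or optimistic challenges in practice).

```python
import hashlib
import json
from dataclasses import dataclass


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def encode(obj) -> bytes:
    # Canonical JSON so the same value always hashes to the same digest.
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def toy_model(weights: list[float], features: list[float]) -> float:
    """Deterministic stand-in for a real model: a plain dot product."""
    return sum(w * x for w, x in zip(weights, features))


@dataclass(frozen=True)
class Attestation:
    """What the operator would post on-chain in this sketch."""
    model_hash: str
    input_hash: str
    output_hash: str


def run_and_attest(weights: list[float], features: list[float]) -> tuple[float, Attestation]:
    """Operator side: run the model off-chain and commit to what was run."""
    output = toy_model(weights, features)
    attestation = Attestation(
        model_hash=sha256_hex(encode(weights)),
        input_hash=sha256_hex(encode(features)),
        output_hash=sha256_hex(encode(output)),
    )
    return output, attestation


def verify(att: Attestation, trusted_model_hash: str,
           weights: list[float], features: list[float]) -> bool:
    """Verifier side: check commitments, then re-execute and compare."""
    if att.model_hash != trusted_model_hash:
        return False  # operator ran a different model than the agreed one
    if sha256_hex(encode(weights)) != trusted_model_hash:
        return False  # verifier's copy of the weights is not the agreed model
    if att.input_hash != sha256_hex(encode(features)):
        return False  # attestation refers to a different input
    recomputed = toy_model(weights, features)
    return att.output_hash == sha256_hex(encode(recomputed))


if __name__ == "__main__":
    weights, features = [0.5, 2.0], [4.0, 1.0]
    trusted = sha256_hex(encode(weights))
    output, att = run_and_attest(weights, features)
    print(output, verify(att, trusted, weights, features))
```

Re-execution only works when the verifier can afford to run the model and has access to the weights, which is exactly what the approaches mentioned in the lead-in try to avoid; the sketch is meant to show what is being committed to, not how to prove it cheaply.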
Data
Training LLMs today is a multi-step process requiring various kinds of data and human intervention. It starts with pre-training, where LLMs train on cleaned, curated versions of the Common Crawl and other freely available data sets. During post-training, the models are trained on smaller, more specific, labeled datasets to make them proficient in specific areas (e.g. chemistry), often with the help of experts.
In order to ensure fresh and/or proprietary data, AI labs often secure deals with owners of large data sources. For example, OpenAI and Reddit signed a deal worth a rumoured $60m. Similarly, the Wall Street Journal reported that News Corp's deal with OpenAI was valued at more than $250 million over five years. It's clear that data is more valuable than ever.
We believe that crypto networks are well placed to help teams source the data and resources required by every stage of this process. Perhaps the most interesting sector is data collection, where crypto incentives can bootstrap the supply side and unlock much of the significant long tail of data sources.
For example, Grass AI (@getgrass_io) incentivises users to share their idle internet bandwidth to help scrape the web for data which is then structured, cleaned and made accessible for AI training. If Grass can bootstrap enough of a supply-side, it can effectively act as an API providing fresh internet data for use in models.
@Hivemapper is another good example - the network was launched in November 2022 and collects millions of kilometers of road-level imagery every week, having already mapped 25% of the world. It's easy to see how similar models could be applied to other forms of multi-modal data and monetised by selling to AI labs.
As the News Corp and Reddit deals show, data is valuable, but many of the companies that own valuable data are either too small or lack the connections to AI labs to monetise it. Similarly, for AI labs, striking deals with individual small providers may not be worth the effort. A well-designed data marketplace could mitigate this by connecting providers to AI labs in a somewhat uniform manner. There are a few challenges here, the primary ones being solving for quality of data, as well as fungibility of both APIs and data.
Finally, data preparation is a significant set of tasks involving labeling, cleaning, enrichment, transformations and so on. A small team may not have all these skills in-house and may look to outsource. Scale AI (@scale_AI) is a centralised company offering these services - currently estimated to have revenue of around $700m and growing fast. We believe a well-designed marketplace and workflow system based on crypto rails can do well here. Lightworks is one that Delphi Ventures invested in and there are a few others - all at quite an early stage.
Model
To paraphrase Delphi Digital's report, The Tower & The Square, the production and control of AI models are on track to rest almost entirely with "the tower" - big tech and governments.
This is arguably an even more dystopian state of affairs than government-controlled money, as it allows them not only to control the most important economic resource, but also to control the narrative by censoring and manipulating information, cutting certain "undesirable" people off from the system entirely, using people's private AI interactions against them, or simply using AI to maximize ad revenue.
There are many smart people working to create "the square" - a decentralised network with the goal of producing a fully neutral, censorship-resistant model accessible to all. So just as Bitcoin and crypto provide money/financial rails that sit outside the system, crypto x AI would provide intelligence that sits outside the system.
Such projects aim to create a God-model that rivals GPT and LLaMA by decentralising every part of the model creation process - the network sources and prepares data, trains on its own decentralised compute, runs inference on that same compute, and coordinates the whole process through decentralised governance. No part of the process is centralised and thus the model is truly community-owned and uncontrollable by "the tower".
Obviously, creating a decentralised model that comes anywhere close to rivaling frontier models is going to be extremely difficult. We can't expect that a large percentage of users will tolerate a worse product for moral reasons. We consider this class of projects to be "moonshots": unlikely to succeed by definition, but incredibly valuable if they do - and we sincerely hope they do.
It's also worth mentioning centralised AI labs that embrace crypto ideals and are likely to have a token or leverage crypto rails in some other way. @NousResearch and @PondGNN are some examples that Delphi Ventures has invested in.
Lastly, model creation infrastructure such as Bittensor by @opentensor falls under the model part of the stack. Bittensor has been discussed thoroughly elsewhere, however, so we won't get into the pros and cons of it here.
Continued:
https://x.com/delphi_labs/status/1834247706103160939?s=09