When DeepSeek released its open-source model, it did more than alarm US tech companies and hit NVIDIA’s stock price. It likely shifted the industry’s focus away from expensive custom model training and toward pairing open-source models with quality data.
Every model had to be trained at some point, and DeepSeek benefitted from resources developed for ChatGPT. The competitive advantage of the models themselves, however, has now been diluted. As models and interfaces become commoditized, AI projects must compete on other fronts.
The Data Advantage
Financial and social data are among the most valuable kinds of data. Traditional systems restrict access to this information, whereas blockchains store financial data transparently and publicly.
Onchain data matters because it is expensive to store: writing it costs a fee, so market forces put a price on what gets recorded and establish a hierarchy of value. This makes it a valuable resource for AI agents that need quality data at inference time.
The Accessibility Problem
Despite blockchain’s openness, retrieving onchain data poses significant challenges:
- Scale (1+ terabyte and growing)
- Fragmentation across chains, L2s, and systems like IPFS
- Lack of standardized approaches
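To illustrate the fragmentation and standardization points above, here is a minimal sketch of mapping two differently shaped transfer records onto one common schema. Everything in it is an assumption made for illustration: the `Transfer` class, the two adapter functions, the chain names, and the raw field names are hypothetical and are not taken from any particular chain, indexer, or from SQD.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transfer:
    """Hypothetical common schema for a value transfer on any chain."""
    chain: str
    tx_id: str
    sender: str
    recipient: str
    amount: Decimal  # whole tokens, not base units


def from_evm_style(chain: str, raw: dict, decimals: int = 18) -> Transfer:
    # Assumed EVM-like shape: hex-encoded value in base units.
    return Transfer(
        chain=chain,
        tx_id=raw["hash"],
        sender=raw["from"].lower(),
        recipient=raw["to"].lower(),
        amount=Decimal(int(raw["value"], 16)) / (Decimal(10) ** decimals),
    )


def from_account_style(chain: str, raw: dict, decimals: int = 9) -> Transfer:
    # Assumed non-EVM shape: integer amount and different field names.
    return Transfer(
        chain=chain,
        tx_id=raw["signature"],
        sender=raw["source"],
        recipient=raw["destination"],
        amount=Decimal(raw["units"]) / (Decimal(10) ** decimals),
    )


# Two records with incompatible shapes end up in one comparable list.
records = [
    from_evm_style(
        "chain-a",
        {"hash": "0xabc", "from": "0xAA", "to": "0xBB",
         "value": "0xde0b6b3a7640000"},
    ),
    from_account_style(
        "chain-b",
        {"signature": "sig1", "source": "alice", "destination": "bob",
         "units": 2_500_000_000},
    ),
]
total = sum(r.amount for r in records)
```

Each new chain or storage system needs its own adapter of this kind, which is why the absence of a standard approach gets more costly as the number of sources grows.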
SQD’s Evolution
SQD is transitioning from offering raw blockchain data to becoming a hybrid data lake and warehouse. The platform currently indexes over 200 blockchains, and it is restructuring to provide structured data that AI agent developers can consume immediately, rather than requiring builders to create custom indexers.
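To give a sense of the work a custom indexer does on raw data, the sketch below decodes a raw EVM event log into a structured ERC-20 transfer row. It is not SQD’s API. The `decode_transfer_log` function and the output field names are hypothetical, and the input is assumed to follow the usual JSON-RPC log shape (`address`, `topics`, `data`). The topic constant is the standard hash of the `Transfer(address,address,uint256)` event signature.

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), the ERC-20 Transfer topic.
TRANSFER_TOPIC = (
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
)


def decode_transfer_log(log: dict) -> Optional[dict]:
    """Turn a raw log into a structured row, or None if it is not an
    ERC-20 Transfer (three topics: signature, from, to)."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    return {
        "token": log["address"].lower(),
        # Indexed addresses are left-padded to 32 bytes; keep the last 20.
        "from": "0x" + topics[1][-40:].lower(),
        "to": "0x" + topics[2][-40:].lower(),
        # The unindexed uint256 amount is in the data field, in base units.
        "value": int(log["data"], 16),
    }


raw_log = {
    "address": "0x" + "c" * 40,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "0" * 24 + "a" * 40,
        "0x" + "0" * 24 + "b" * 40,
    ],
    "data": "0x" + "0" * 63 + "a",
}
row = decode_transfer_log(raw_log)
```

This covers a single event type on a single family of chains. A builder working from raw data has to repeat this kind of decoding for every contract and event of interest, which is the effort that pre-structured data is meant to remove.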