TuringData: Powering the AI Factory Era with High-Speed Data Infrastructure

Speaker: Nikhil Madan, VP of AI Infrastructure at TuringData
Topic: TuringData: The Unified Data Platform Powering the AI Factory Era
Date/Time: Wednesday, May 20, 2026, 3:40 PM to 4:00 PM (20 minutes)
Location: The AI Summit Stage, Hall 3, Asia Tech x Singapore 2026

At Asia Tech x Singapore 2026, Nikhil Madan, VP of AI Infrastructure at TuringData, took the stage at the AI Summit to address a pressing paradox. While the AI industry is projected to reach a trillion dollars well before 2030, actual AI deployments are lagging behind the hype. According to Madan, the culprit is not a shortage of GPUs, but a deeper, more fundamental bottleneck: data infrastructure.

TuringData provides a full-stack AI data infrastructure that supports the entire AI lifecycle, enabling faster data access, efficient model training, scalable enterprise-level inference, and intelligent data orchestration and management. In his session, Madan laid out a clear, pragmatic roadmap for enterprises struggling to move from AI experimentation to production at scale.

The Numbers Don’t Lie: A Trillion Dollar Reality

Madan began by recalibrating the audience’s expectations. The AI industry was valued at roughly $185 billion in 2024, with forecasts reaching into the trillions by 2030.

The Four Horsemen of AI Deployment Challenges

Despite the excitement, Madan identified four critical barriers holding organisations back:

High Capex: Deploying AI at scale with good data and processing power requires significant investment. There is no cheap shortcut to enterprise-grade AI.
Traffic and Congestion: The industry has moved beyond kilobytes or megabytes per second. Modern AI demands hundreds of gigabytes per second of data throughput, and congestion is already a major problem.
Legacy Data: Massive amounts of valuable data remain trapped in end devices, data centres, and the cloud. Without a way to bring this legacy data into the AI pipeline, models cannot reach their full intelligence.
Accelerated Inferencing: As NVIDIA’s Jensen Huang recently emphasised, the focus has shifted from training to inferencing. The question is no longer just how to train a model, but how to perform fast, smart, and accelerated inferencing.

The Real Bottleneck: Starving GPUs

Perhaps Madan’s most striking revelation was about GPU utilisation. “Some of the benchmarks I saw myself show GPU utilisation is about 30% to 40%,” he said. “That means 60% to 70% of the time, the GPU is waiting for data to come in.”

He likened expensive AI infrastructure to a high-performance Ferrari with no fuel. The problem is not the engine; it is data I/O. The speed at which data travels to the GPUs determines whether AI can be delivered with quality and at scale. This data bottleneck is directly resulting in fewer production deployments than the stock market’s AI-driven rally might suggest.
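The utilisation figures Madan quoted translate directly into cost. The Python sketch below works through that arithmetic. The hourly GPU price is a hypothetical placeholder, not a figure from the talk, and the model assumes that all non-utilised time is spent waiting on data, as his remark suggests.

```python
def idle_fraction(utilisation: float) -> float:
    """Fraction of time the GPU is not doing useful work."""
    if not 0.0 < utilisation <= 1.0:
        raise ValueError("utilisation must be in (0, 1]")
    return 1.0 - utilisation


def cost_per_useful_hour(hourly_price: float, utilisation: float) -> float:
    """Effective price of one hour of actual GPU work at a given utilisation."""
    idle_fraction(utilisation)  # reuse the range check above
    return hourly_price / utilisation


# Assumed price in USD per GPU-hour, for illustration only.
HYPOTHETICAL_PRICE = 3.0

if __name__ == "__main__":
    # 30% and 40% are the figures quoted in the talk; 90% is a made-up target.
    for u in (0.30, 0.40, 0.90):
        print(
            f"utilisation {u:.0%}: idle {idle_fraction(u):.0%}, "
            f"effective cost ${cost_per_useful_hour(HYPOTHETICAL_PRICE, u):.2f} "
            "per useful hour"
        )
```

Under these assumptions, a GPU running at 30% utilisation costs roughly 3.3 times its list price for every hour of useful work, which is the sense in which the engine is fine but the fuel line is not.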

TuringData’s Pragmatic Solution

Rather than simply throwing more GPUs at the problem, TuringData has developed a lightweight, scalable approach built for the AI era. Their platform addresses four key requirements:

Lightweight but Scalable: An entry-level, three-node, industry-standard appliance allows companies to begin their AI journey without taking delivery of truckloads of hardware. As needs grow, the system scales seamlessly to exabytes.
Non-Negotiable Performance: TuringData delivers world-class performance starting at half a terabyte per second (500 GB/s). Without this performance level, Madan argued, companies should not even attempt AI.
Legacy Integration: The platform does not force data ingestion. Instead, it provides AI visibility into existing data, whether it resides in the cloud or scattered across a data centre, enabling intelligent processing without costly migration.
Accelerated Inferencing: Products like the Turing Data Cache Fabric solve the KV cache problem, while Turing Data Flash harnesses NVMe power to speed up inferencing, delivering the ROI that enterprises need.

Real-World Results and Future Focus

TuringData is not a standalone proprietary vendor. Madan emphasised their growing consortium with semiconductor and systems leaders such as NVIDIA, Dell, Lenovo, and Solidigm. This collaborative approach is already yielding results for some of the world’s largest telco operators, GPU cloud providers, hedge funds (enabling billion-dollar quant trades in milliseconds), and autonomous driving companies.

Looking ahead, TuringData is working closely with NVIDIA on the next inferencing architecture, known as CMX, specifically the CMX 3.5 layer. Standards are still evolving, but the goal is clear: to make inferencing faster, smarter, and better orchestrated as AI adoption continues to explode.

NVIDIA® CMX™ context memory storage is an AI‑native context tier for long‑context, multi‑turn, and agentic AI inference. Powered by the NVIDIA BlueField®‑4 storage processor, it extends GPU memory with a shared, pod‑level context tier optimized for ephemeral key-value (KV) cache. The platform provides a high‑bandwidth path that reduces latency, cost, and power overhead for large-scale inference workloads, helping deliver higher throughput and better power efficiency on NVIDIA Rubin platforms.
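To see why a dedicated tier for KV cache matters, a back-of-the-envelope sizing sketch helps. The formula (one key tensor and one value tensor per layer, per token) is the standard one for transformer attention. The model dimensions below are hypothetical and are not taken from the talk or from any NVIDIA or TuringData specification. The transfer estimate assumes the full 500 GB/s quoted in the session were available to a single inference session, which is an idealisation.

```python
def kv_cache_bytes(
    layers: int,
    kv_heads: int,
    head_dim: int,
    tokens: int,
    bytes_per_value: int = 2,  # 2 bytes per value corresponds to FP16/BF16
) -> int:
    """Size of the KV cache for one sequence, in bytes."""
    # The leading 2 accounts for storing both keys and values.
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


def transfer_seconds(num_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Idealised time to move num_bytes at a given bandwidth (decimal GB/s)."""
    return num_bytes / (bandwidth_gb_per_s * 1e9)


if __name__ == "__main__":
    # Hypothetical model: 80 layers, 8 KV heads, head dimension 128,
    # serving a single 128,000-token context in FP16.
    size = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, tokens=128_000)
    print(f"KV cache: {size / 1e9:.1f} GB")
    print(f"Reload at 500 GB/s: {transfer_seconds(size, 500) * 1000:.0f} ms")
```

With these assumed dimensions a single long context occupies tens of gigabytes, so a handful of concurrent sessions can exceed the memory of one GPU. That is the pressure a shared context tier is meant to relieve.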

The Takeaway

The AI era has arrived, but its full potential remains locked behind a data infrastructure bottleneck. As Nikhil Madan concluded, the industry needs to stop blaming GPUs and start fixing data I/O. With its high-speed platform, legacy integration, and future-focused consortium, TuringData is building the unified data platform that will power the AI factory era.

Visit their booth at the summit to see how half a terabyte per second can transform your AI deployments.
