Decentralized Training for Frontier AI

Product Science provides the end-to-end orchestration for decentralized foundation model training. Our hardware-agnostic approach enables enterprises, research labs, and public institutions to train specialized models across fragmented, geo-distributed resources, leveraging everything from general-purpose GPUs to specialized ASICs, with configurable data sovereignty.

Bridging Research and Scalable Execution

Team

Product Science is a collective of researchers and engineers with deep expertise in distributed training and decentralized AI infrastructure. We are pioneering the future of decentralized AI. Having previously incubated Gonka, a high-growth decentralized network purpose-built for AI training and inference, we are now architecting the foundational protocols needed to evolve beyond traditional GPU clusters into high-efficiency, trustless, and permissionless training environments.

Backed by

Frontier AI has a gatekeeper: the centralized data center

Substantial barriers to entry

Massive infrastructure requirements restrict frontier AI innovation to a handful of Big Tech corporations. The scarcity and cost of NVIDIA chips have become the single largest barrier to entry in frontier-grade AI training.

Chaos of scattered compute

Running training across geo-distributed, heterogeneous hardware introduces compounding engineering challenges: communication overhead at scale, fault tolerance across unstable nodes, and coordination without trusted intermediaries.

Fragmented Solutions

The current training landscape consists of specialized protocols that each address an isolated problem. None provides the end-to-end orchestration that frontier-scale decentralized training demands.

Moving from centralized clusters to global orchestration

Unifying fragmented hardware across the globe unlocks elastic capacity that bypasses the physical limits of individual data centers. Spanning general-purpose GPUs and specialized ASICs allows market dynamics to regulate pricing and opens opportunities for hardware optimizations that make training fundamentally more efficient. Removing the requirement for specific, high-tier facilities lets organizations retain greater control over where and when their workloads are deployed, aiding compliance with data-sovereignty and IP requirements. When the system is designed for heterogeneous hardware and diverse participants, fault tolerance becomes an inherent property rather than a reactive patch. The result is a permissionless environment for frontier-scale training that remains resilient regardless of individual node stability or location.

Gonka Protocol

Infrastructure Validation at Scale

We started the Gonka project to dismantle the compute monopoly held by a few hyperscalers and to democratize access to high-performance AI infrastructure. Gonka is a decentralized AI infrastructure designed to optimize computational power specifically for AI model training and inference, offering a competitive alternative to traditional centralized cloud providers.

10,000 NVIDIA H100 GPUs

In just a few months, the Gonka Protocol has scaled to the equivalent power of 10,000 NVIDIA H100 GPUs and is currently serving inference for open-source LLMs. This unparalleled capacity allows us to redefine training orchestration at scale, paving the way for the next generation of frontier models to be trained entirely on global, distributed resources.

[Chart: Gonka network capacity, in NVIDIA H100 GPU equivalents — SEP: 254, OCT: 509, NOV: 3,096, DEC: 9,688, JAN 2026: 11,646]

Research

We design systems for conditions where:

Computation is geo-distributed, and communication is constrained by the open internet. The environment is trustless and adversarial. There is no single point of control.

Research Seminar

Dr. Eduard Gorbunov

MBZUAI

Monday, Dec 1, 2025

Byzantine-Tolerant Distributed Training

Abstract: Based on the ICML paper Secure Distributed Training at Scale. What happens when you train a large model across many untrusted nodes and some of them start lying? We discuss how to keep the learning process working in that setting.
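The core idea behind Byzantine-tolerant training can be illustrated with a minimal sketch (this is an illustration only, not the algorithm from the paper, which uses more sophisticated machinery): replace the usual gradient average with a robust aggregate such as the coordinate-wise median, so a minority of corrupted workers cannot steer the update arbitrarily.

```python
import numpy as np

def robust_aggregate(gradients: list[np.ndarray]) -> np.ndarray:
    """Aggregate worker gradients with a coordinate-wise median.

    Unlike simple averaging, the median tolerates a minority of
    arbitrarily corrupted ("lying") gradients: as long as fewer than
    half the workers are Byzantine, every coordinate of the result
    lies within the range of honest values.
    """
    stacked = np.stack(gradients)  # shape: (n_workers, n_params)
    return np.median(stacked, axis=0)

# Toy example: four honest workers agree, one Byzantine worker sends garbage.
honest = [np.array([1.0, 2.0, 3.0]) for _ in range(4)]
byzantine = [np.array([1e9, -1e9, 1e9])]
agg = robust_aggregate(honest + byzantine)  # close to the honest gradient
```

With a plain mean, the single malicious gradient would dominate the update; the median ignores it entirely here.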

Contact us for collaboration

Join the newsletter

Product Science, Inc. All rights reserved. 2026