| Page 176 | Kisaco Research

Ashu Dubey

Founder & CEO

Alhena AI

Jason Rogers

CEO

Invary

Alexandra Thornton

Investor

Felix Capital

Rohit Kinra

SVP & GM, Hyperscale

Iron Mountain

Optimized Inference Infrastructure: MoAI Inference Framework: Powering the Fastest Serving of the New AI Era

As AI evolves into agentic systems where dozens to hundreds of LLMs and specialized models work in concert, running them efficiently in data centers poses immense software challenges. From disaggregating massive LLMs and aggregating smaller models to dynamically scheduling diverse and unpredictable user requests, every layer requires precise optimization. Inference workloads must navigate continuous growth and rapid innovation, pushing the limits of conventional software stacks. Just as DeepSeek redefined its entire infrastructure beyond CUDA, Moreh is partnering with leading LLM players to deliver the fastest, most advanced distributed inference framework on AMD and Tenstorrent accelerators.

Optimized Infrastructure

Inferencing

Author:

Gangwon Jo

CEO

Moreh

Gangwon Jo is the co-founder and CEO of Moreh Inc., a startup funded by AMD that develops optimized infrastructure software, enabling more efficient and flexible AI infrastructure at scale. Moreh’s software supports AI workloads on AMD as well as other accelerator platforms, such as Tenstorrent.

Gangwon was the chief architect of “Chundoong,” a supercomputer built using AMD consumer GPUs, which was recognized as one of the world’s TOP500 supercomputers in 2012.

In 2022, Gangwon was named one of MIT Technology Review’s “Innovators Under 35.” He holds a Ph.D. in Electrical and Computer Engineering from Seoul National University.

Author:

Xiaotong Jiang

Infrastructure Engineer

Databricks

Xiaotong is an infrastructure engineer at Databricks with a background in database optimization. Her current focus is LLM inference optimization as an SGLang contributor.

George Song

Principal

Strand Equity