Notion Slashes AI Embedding Costs 80% After Ditching Spark for Ray

James Ding Apr 09, 2026 16:48

Notion migrated from Spark on EMR to Ray, cutting embedding costs 80% and improving query latency 10x. Uber and Salesforce shared similar AI infrastructure wins.

Notion has slashed its AI embedding pipeline costs by more than 80% after migrating from Apache Spark to Ray, the distributed computing framework backed by Anyscale. The productivity software company also achieved 10x improvements in query latency while consolidating three separate jobs per region into one.

The migration details emerged at Ray Day Seattle on April 9, 2026, where ML engineers from Notion, Uber, Salesforce, and Apple shared hard-won lessons about scaling AI infrastructure.

What Notion Actually Changed

Mickey Liu, a software engineer on Notion's search platform team, walked through the overhaul. The original setup was a three-step Spark pipeline running on Amazon EMR: data chunking, third-party API calls for embedding generation, and writes to a vector store.

The pain points were predictable but severe. Double compute costs. Third-party API rate limits throttling throughput. Debugging nightmares when failures occurred across tools. Driver and executor logs weren't even persisted in YARN.

The new architecture streams Kafka data directly into a Ray cluster handling CPU chunking, GPU embedding generation, and vector store writes in a single pipeline. No intermediate S3 handoffs. What started as the backend for a Q&A feature in 2023 now powers all of Notion AI and custom agents.
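The shape of that single pipeline can be sketched in plain Python. This is a minimal illustration, not Notion's code and not Ray's API: generators stand in for the streaming stages, the "embedding" is a toy placeholder vector, and an in-memory dict stands in for the vector store. All function names here are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def chunk(docs: Iterable[str], size: int = 5) -> Iterator[str]:
    """CPU stage: split each document into chunks of at most `size` words."""
    for doc in docs:
        words = doc.split()
        for i in range(0, len(words), size):
            yield " ".join(words[i:i + size])


def embed(chunks: Iterable[str]) -> Iterator[Tuple[str, List[float]]]:
    """Embedding stage: toy stand-in for a GPU model (assumption, not a real model)."""
    for text in chunks:
        # Placeholder vector: [character count, word count].
        yield text, [float(len(text)), float(len(text.split()))]


def write(records: Iterable[Tuple[str, List[float]]],
          store: Dict[str, List[float]]) -> int:
    """Sink stage: write vectors into a dict standing in for a vector store."""
    written = 0
    for text, vector in records:
        store[text] = vector
        written += 1
    return written


def run_pipeline(docs: Iterable[str], store: Dict[str, List[float]],
                 size: int = 5) -> int:
    """Chain the three stages; each record flows through without being staged to disk."""
    return write(embed(chunk(docs, size)), store)


store: Dict[str, List[float]] = {}
count = run_pipeline(["a b c d e f g", "h i"], store)
print(count, "chunks written")
```

Because the stages are chained lazily, each chunk moves from splitting to embedding to the store without an intermediate handoff, which is the property the article attributes to the new design.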

Uber and Salesforce Report Similar Gains

Uber's Peng Zhang detailed how their Michelangelo ML platform evolved from TensorFlow/Horovod to Ray with PyTorch. The standout move: separating CPU data-loading nodes from GPU training nodes in a heterogeneous cluster design. Result? GPU utilization jumped 20%, and training time dropped roughly 50% in select pipelines.
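The idea of decoupling data loading from training can be illustrated with a small producer-consumer sketch. It is an assumption-laden stand-in, not Uber's implementation: threads play the role of CPU data-loading nodes, the main thread plays the GPU trainer, a bounded queue is the handoff, and doubling and summing numbers replace real preprocessing and training steps.

```python
import queue
import threading

SENTINEL = None  # marks the end of one loader's stream


def loader(batches, out_q):
    """CPU-side worker: prepare batches and hand them to the trainer."""
    for batch in batches:
        out_q.put([x * 2 for x in batch])  # stand-in for decode/augment work
    out_q.put(SENTINEL)


def train(in_q, n_loaders):
    """Trainer: consume ready batches; the 'training step' here is just a sum."""
    finished, total = 0, 0
    while finished < n_loaders:
        batch = in_q.get()
        if batch is SENTINEL:
            finished += 1
            continue
        total += sum(batch)
    return total


def run(shards, prefetch=4):
    """Start one loader per shard and train on whatever batch is ready first."""
    q = queue.Queue(maxsize=prefetch)  # bounded so loaders cannot run far ahead
    threads = [threading.Thread(target=loader, args=(shard, q)) for shard in shards]
    for t in threads:
        t.start()
    total = train(q, len(shards))
    for t in threads:
        t.join()
    return total


print(run([[[1, 2], [3]], [[4]]]))
```

The point of the split is that the consumer only waits when no batch is ready, so loader capacity can be scaled independently of the trainer.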

Salesforce tackled a different beast: summarizing documents up to 200,000 tokens long (roughly the length of a full novel) with P95 latency under 15 seconds. Their team used Ray to chunk documents and run parallel inference across a distributed actor pool with vLLM, then merge results. They landed on 1-2 GPU data parallelism as the sweet spot after running scaling experiments directly on Ray.
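The chunk, fan out, merge pattern described above can be sketched as follows. This is a hedged illustration only: a standard-library thread pool stands in for the distributed actor pool, whitespace-separated words stand in for tokens, and `summarize_chunk` is a hypothetical placeholder for a model call (it simply keeps the first two tokens). None of these names come from Salesforce, Ray, or vLLM.

```python
from concurrent.futures import ThreadPoolExecutor


def split_tokens(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most `chunk_size`."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def summarize_chunk(chunk):
    """Toy stand-in for an inference call: keep the first two tokens."""
    return " ".join(chunk[:2])


def summarize(text, chunk_size=4, workers=2):
    """Chunk the document, summarize chunks in parallel, then merge in order."""
    chunks = split_tokens(text.split(), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the merge stays deterministic.
        partials = list(pool.map(summarize_chunk, chunks))
    return " | ".join(partials)


print(summarize("a b c d e f g h i"))
```

In this sketch `workers` plays the role of the parallelism knob; the article reports that Salesforce settled on 1-2 GPUs after running its own scaling experiments.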

Why This Matters Beyond These Companies

Robert Nishihara, Ray's co-creator and Anyscale co-founder, opened the event by framing the core problem: AI infrastructure keeps getting harder. Multimodal data processing, reinforcement learning workloads, and multi-node LLM inference are pushing existing tools past their limits.

Every speaker landed on the same conclusion from different angles: their previous tooling ran out of road.

Apple engineers Charlie Chen and Haocheng Bian highlighted foundation model training challenges: massive unstructured data, billion-plus parameters, and sparse architectures like Mixture of Experts. Traditional engines fail because data pipelines and training frameworks run in separate environments with no shared context.

What's Next

Ray Day Seattle kicked off Anyscale's 2026 "Ray on the Road" tour, which covers eight cities across three countries. The company is also running invite-only customer roundtables at each stop to preview its product roadmap.

For teams hitting similar walls with Spark or other distributed frameworks, Notion's full technical writeup is available on their engineering blog under "Two Years of Vector Search at Notion." The 80% cost reduction and 10x latency improvement offer a concrete benchmark for anyone evaluating similar migrations.
