OpenAI and Broadcom have launched Jalapeño, their first custom AI inference processor built to improve efficiency and reduce reliance on Nvidia chips.
AI inference startup Modal Labs is reportedly in talks to raise new funding at a valuation of about $2.5 billion.
Yotta and Mirror Security have launched encrypted AI inference as a service, aiming to secure sensitive enterprise data during AI model deployment.
Nvidia has invested $150 million in AI inference startup Baseten, highlighting the growing focus on scalable infrastructure for deploying AI models in real-world enterprise applications.
Akamai plans to establish an AI inference cloud in India, focusing on medium-sized AI models, local innovation, and scalable cloud-edge infrastructure.
Positron AI has raised $51.6 million in a Series A round backed by Intel Capital, Celesta, and M12 to scale its energy-efficient AI inference hardware.