In-person + Virtual
November 6-9

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2023 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: Times in this schedule are shown in Central Standard Time (UTC -6). The schedule is subject to change.
Wednesday November 8, 2023 11:55am - 12:30pm CST


Deploying LLMs is challenging. This talk is a case study in how cloud native technologies, specifically Kubernetes and OCI artifacts, simplify private LLM deployments. Allowing teams to run models in their own infrastructure solves significant data governance & security challenges. However, it is still difficult to efficiently share large artifacts between model developers and model consumers. Autumn and Marwan share how open standards helped them overcome these challenges and simplified LLM delivery. First, we explore how Kubernetes made it possible to rapidly deliver a highly portable, cloud native inference stack. Second, we look at OCI artifacts, which have been underutilized as a delivery mechanism for anything beyond container images, and explore how we achieved significant efficiency gains by reducing duplicate storage, increasing download speed, and minimizing governance overhead. Walk away having learned how you can leverage Kubernetes and OCI in your MLOps journey.
Speakers
Autumn Moulder

Director of Infrastructure & Security, Cohere
Autumn is the Director of Infrastructure & Security at Cohere. She’s been with the company since September 2022, scaling teams & tools. Prior to buying into the startup life, she spent 3 years in financial services and 14 years at a large non-profit. Her passion is helping innovative...
Marwan Ahmed

Member of Technical Staff, Cohere
Marwan is a Member of Technical Staff on the Infrastructure team at Cohere. He has contributed to several Kubernetes projects since 2018, most notably Cluster API Azure and Cluster Autoscaler. He has previously worked at Twitter on the Distributed Coordination team and Microsoft on...
W185 (Ground Level)
  ML/AI + Data Processing + Storage