In-person + Virtual
November 6-9

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2023 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in Central Standard Time (UTC -6). The schedule is subject to change.
Wednesday, November 8 • 3:25pm - 4:00pm
Kube of Thoughts – Scaling Generative AI Models with Kubernetes and Inference Decision Trees - Asheesh Goja, Amazon Web Services

Generative AI technologies such as LLMs are accelerating the pace of innovation. However, using such models comes at a steep computational cost, which is a major barrier to entry for many businesses and researchers. This talk proposes a novel approach to improving LLM efficiency on complex reasoning tasks: composing containerized LLMs into an inference decision tree called Tree of Thoughts (ToT) and orchestrating it with various K8s ecosystem projects. ToT is an inference framework for combinatorial problem spaces in which each node represents a specific train of thought. The approach improves inference efficiency by refining prompts from the initial inputs through ‘conversations’ between models. All this sounds abstract, so you will learn how to build a concrete ToT implementation using CNCF projects such as Operator Framework and KEDA, together with Redis and Neo4j. With real-time visualization of graphs of intermediate LLM prompts and thoughts, you will see how ToT on Kubernetes enhances the computational efficiency of LLM inference.
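
To make the tree idea in the abstract more concrete, here is a minimal sketch of a Tree of Thoughts style search in Python. It is not the speaker's implementation. The names `ThoughtNode`, `tree_of_thoughts`, `toy_propose`, and `toy_evaluate` are hypothetical, and the two toy functions are deterministic stand-ins for what would be calls to containerized LLMs. The sketch assumes a simple breadth-first expansion that keeps only the best-scoring thoughts at each level.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ThoughtNode:
    """One node in the tree: a partial line of reasoning (a 'thought')."""
    text: str
    score: float = 0.0
    parent: Optional["ThoughtNode"] = field(default=None, repr=False, compare=False)
    children: List["ThoughtNode"] = field(default_factory=list, repr=False, compare=False)

    def path(self) -> List[str]:
        """Return the chain of thoughts from the root down to this node."""
        chain = []
        node: Optional[ThoughtNode] = self
        while node is not None:
            chain.append(node.text)
            node = node.parent
        return list(reversed(chain))


def tree_of_thoughts(
    root_prompt: str,
    propose: Callable[[str], List[str]],
    evaluate: Callable[[str], float],
    depth: int = 3,
    beam_width: int = 2,
) -> ThoughtNode:
    """Expand the tree level by level, keeping the top `beam_width` thoughts."""
    root = ThoughtNode(text=root_prompt)
    frontier = [root]
    for _ in range(depth):
        candidates = []
        for node in frontier:
            # Assumption: in a real system this would be a model call.
            for text in propose(node.text):
                child = ThoughtNode(text=text, score=evaluate(text), parent=node)
                node.children.append(child)
                candidates.append(child)
        if not candidates:
            break
        # Prune: only the most promising thoughts are expanded further.
        candidates.sort(key=lambda n: n.score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=lambda n: n.score)


# Toy stand-ins for the models (hypothetical, for illustration only).
def toy_propose(text: str) -> List[str]:
    return [text + "a", text + "b"]


def toy_evaluate(text: str) -> float:
    return float(text.count("b"))


best = tree_of_thoughts("", toy_propose, toy_evaluate, depth=3, beam_width=2)
print(best.path())
```

Pruning to a small beam at each level is what limits the number of model calls: with two proposals per thought and a beam of two, each level evaluates at most four candidates instead of the tree growing exponentially.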

Speakers
Asheesh Goja

Principal Solutions Architect, Amazon Web Services
Asheesh Goja is a Principal Solutions Architect at Amazon Web Services. Previously, he worked at Cisco and UPS to drive cloud native adoption and innovation by way of ideation, co-design, incubation and venture products. He holds several hardware and software patents that include...



Wednesday November 8, 2023 3:25pm - 4:00pm CST
W185 (Ground Level)
  ML/AI + Data Processing + Storage