June 25, 2024 | The Hibernia | San Francisco, CA
Jam-Packed Agenda
- Morning Session
- Autonomous Systems & Robotics
- Foundational Models, LLMs, and GenAI
- Lightning Talks
Morning Session
-
10:00AM - 10:30AM
NEW QUALITY STANDARDS FOR AUTONOMOUS DRIVING
Fireside chat featuring Mo Elshenawy, President and CTO of Cruise Automation, and Mohamed Elgendy, CEO and Co-founder of Kolena. In this discussion, Mo Elshenawy will delve into the comprehensive AI philosophy that drives Cruise Automation. He will share unique insights into how Cruise is developing its quality standards from the ground up, with a particular focus on defining and achieving “perfect” driving. This fireside chat offers valuable perspectives on the rigorous processes involved in teaching autonomous vehicles to navigate with precision and safety.
-
Mo Elshenawy
President & CTO, Cruise
-
-
10:30AM - 11:00AM
AI and Government Regulation
Gerrit De Vynck of The Washington Post will moderate a panel delving into NIST, government-implemented standards, and their roles in the development of AI.
-
Gerrit De Vynck
Tech Reporter, The Washington Post
-
-
11:30AM - 12:00PM
The future of trust in LLMs
Richard Socher, CEO and founder of You.com and AIX Ventures, will share insights from his decade-long journey in AI and NLP: from the invention of prompt engineering to founding You.com, an AI assistant that was the first to integrate an LLM with live web access for accurate, up-to-date answers with citations. Richard will discuss tackling the biggest challenges facing LLMs, from hallucinations to generic responses. Gain insight into the potential for these advancements to be adopted by other LLM-based platforms.
-
Richard Socher
CEO & Founder, You.com
-
-
12:00PM - 12:30PM
The dollars and cents behind the AI VC boom
Natasha Mascarenhas will moderate a panel of leading VCs who have backed the top AI companies and understand the correction within the boom. They'll discuss the flight to quality, what happens when OpenAI eats your lunch, how founders should think about giving big tech a spot on their cap tables, and how to invest at the speed of innovation right now.
-
Natasha Mascarenhas
Reporter, The Information
-
-
TBD
Evaluation of ML Systems in the Real World
Evaluation seeks to assess the quality, reliability, latency, cost, and generalizability of ML systems, given assumptions about operating conditions in the real world. That is easier said than done! This talk presents some of the common pitfalls that ML practitioners ought to avoid and makes the case for tying model evaluation to business objectives.
-
Mohamed El-Geish
CTO & Co-Founder, Monta AI
-
Autonomous Systems & Robotics
-
2:30PM - 3:00PM
Eighty-Thousand Pound Robots: AI Development & Deployment at Kodiak Speed
Kodiak is on a mission to automate the driving of every commercial vehicle in the world. Today, Kodiak operates a nationwide autonomous trucking network 24x7x365, on the highway, in the dirt, and everywhere in between. We also release and deploy software about 30 times per day across this fleet, software that is not just mission critical but safety critical. Our AI development process must match this criticality and speed, providing fast engineering iteration while guaranteeing the high level of quality that safety requires. In this talk, we'll share the details of that process, from how the system is architected, trained, and evaluated, to the validation CI/CD pipeline, which is the lifeblood of the development flywheel. We'll talk about how we collect cases, how we iterate on models, and how we do quality assurance, data, and release management - all in a way that seamlessly keeps our robots truckin' across the US.
-
Collin Otis
Director of Autonomy, Kodiak Robotics
-
-
TBA
More talks coming soon!
Foundational Models, LLMs, and GenAI
-
1:00 PM - 1:30 PM
TO RAG OR NOT TO RAG?
Retrieval-Augmented Generation (RAG) is a powerful technique for reducing hallucinations from Large Language Models (LLMs) in GenAI applications. However, large context windows (e.g., 1M tokens for Gemini 1.5 Pro) can be a potential alternative to the RAG approach. This talk contrasts both approaches and highlights when a large context window is a better option than RAG, and vice versa.
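The tradeoff the abstract describes can be sketched in a few lines. The keyword-overlap retriever, document set, and `build_prompt` helper below are illustrative toys, not any real product's implementation: RAG selects only the top-scoring passages to keep the prompt small, while the long-context approach stuffs everything in and relies on the model's window.

```python
# Toy sketch: RAG-style retrieval vs. long-context prompt stuffing.
# All names and the scoring function are illustrative assumptions.

def retrieve(query, docs, k=2):
    """RAG-style: score each doc by word overlap with the query
    and keep only the top-k passages for the prompt."""
    def score(doc):
        q, d = set(query.lower().split()), set(doc.lower().split())
        return len(q & d)
    return sorted(docs, key=score, reverse=True)[:k]

def build_prompt(query, docs, use_rag=True, window_limit=50):
    """Long-context style (use_rag=False): put every doc in the prompt
    and rely on the model's window; RAG keeps the prompt targeted."""
    context = retrieve(query, docs) if use_rag else docs
    prompt = "\n".join(context) + "\nQ: " + query
    assert len(prompt.split()) <= window_limit, "context window exceeded"
    return prompt

docs = [
    "Gemini models support very long context windows.",
    "RAG retrieves passages relevant to the user query.",
    "Unrelated note about quarterly sales figures.",
]
print(build_prompt("how does RAG handle a user query?", docs))
```

With a huge window the stuffing approach is simpler, but the RAG path keeps irrelevant passages (and their token cost) out of the prompt entirely.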
-
Amr Awadallah
CEO & Co-Founder, Vectara
-
-
1:30PM - 2:00PM
THE ERA OF GENERATIVE AI
Weights & Biases CEO and Co-Founder Lukas Biewald will share his perspective on the generative AI industry: where we've come from, where we are today, and where we're headed.
-
Lukas Biewald
Co-founder & CEO, Weights & Biases
-
-
2:00PM - 2:30PM
A BLUEPRINT FOR SCALABLE & RELIABLE ENTERPRISE AI/ML SYSTEMS
Enterprise AI leaders continue to explore productivity solutions that solve business problems, mitigate risks, and increase efficiency. Building reliable and secure AI/ML systems requires industry standards, an operating framework, and best practices that accelerate and streamline scalable architectures capable of producing the expected business outcomes.
This session, featuring veteran practitioners, focuses on building scalable, reliable, high-quality AI and ML systems for the enterprise.
-
Hira Dangol
VP AI/ML & Automation, Bank of America
-
Rama Akkiraju
VP Enterprise AI/ML, NVIDIA
-
Nitin Aggarwal
Head of AI Services, Google
-
Steven Eliuk
VP AI & Governance, IBM
-
-
2:30PM - 3:00PM
If you like sentences so much, name every single sentence
What do AI models see when they read and generate text and images? What are the units of meaning they use to understand the world? I'll share some encouraging updates from my continuing exploration of how models process their inputs and generate data, enabled by recent breakthroughs in interpretability research. I'll also discuss and share some demos of how this work opens up possibilities for radically different, more natural interfaces for working with generative AI models.
-
Linus Lee
Research Engineer, Notion
-
-
3:00PM - 3:30PM
The New AI Stack with Foundation Models
How has the ML engineering stack changed with foundation models? While the generative AI landscape is still rapidly evolving, some patterns have emerged. This talk discusses these patterns. Spoilers: the principles of deploying ML models into production remain the same, but we’re seeing many new challenges and new approaches. This talk is the result of Chip Huyen's survey of 900+ open source AI repos and discussions with many ML platform teams, both big and small.
-
Chip Huyen
VP of AI & OSS, Voltron Data
-
-
3:30PM - 4:00PM
SIMPLE, PROVEN METHODS FOR IMPROVING AI QUALITY IN PRODUCTION
In this talk, Shreya will share a candid look back at a year dedicated to developing reliable AI tools in the open-source community. The talk will explore which tools and techniques have proven effective and which ones have not, providing valuable insights from real-world experiences. Additionally, Shreya will offer predictions on the future of AI tooling, identifying emerging trends and potential breakthroughs. This presentation is designed for anyone interested in the practical aspects of AI development and the evolving landscape of open-source technology, offering both reflections on past lessons and forward-looking perspectives.
-
Shreya Rajpal
CEO, Guardrails AI
-
-
4:00PM - 4:30PM
FROM PREDICTIVE TO GENERATIVE: UBER'S JOURNEY
Today, Machine Learning (ML) plays a key role in Uber's business, powering business-critical decisions like ETA prediction, rider-driver matching, Eats homefeed ranking, and fraud detection. As Uber's centralized ML platform, Michelangelo has been instrumental in driving Uber's ML evolution since it was first introduced in 2016. It offers a comprehensive set of features covering the end-to-end ML lifecycle, empowering Uber's ML practitioners to develop and productize high-quality ML applications at scale.
-
Kai Wang
Lead PM, AI Platform, Uber
-
Raajay Viswanathan
Software Engineer, Uber
-
-
4:30PM - 5:00PM
Integrating LLMs into products
Learn about best practices when integrating Large Language Models (LLMs) into product development. We will discuss the strengths of modern LLMs like Claude and how they can be leveraged to enable and enhance various applications. The presentation will cover simple prompting strategies and design patterns that facilitate the effective incorporation of LLMs into products.
-
Emmanuel Ameisen
Research Engineer, Anthropic
-
-
5:00PM - 5:30PM
BUILDING SAFER AI: BALANCING DATA PRIVACY WITH INNOVATION
The balance between AI innovation and data security and privacy is a major challenge for ML practitioners today. In this talk, I'll discuss policy and ethical considerations that matter for those of us building ML and AI solutions, particularly around data security, and describe ways to make sure your work doesn't create unnecessary risks for your organization. With planning and thoughtful development strategies, it is possible to create incredible advances in AI without risking breaches of sensitive data or damaging customer confidence.
-
Stephanie Kirmer
Senior Machine Learning Engineer, DataGrail
-
Lightning Talks
-
1:00PM
Self-improving RAG
Higher-quality retrieval isn't just about more complex retrieval techniques. Using user feedback to improve model results is a tried-and-true technique from the ancient days of *checks notes* recommender systems. And if you know something about the patterns in your data and user queries, even synthetic data can produce fine-tuned models that significantly improve retrieval quality.
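The recommender-systems trick the abstract alludes to can be sketched very simply: fold thumbs-up/down signals back into retrieval scores. The class, scoring function, and documents below are hypothetical illustrations, not LanceDB's API.

```python
from collections import defaultdict

# Hypothetical sketch of feedback-driven retrieval: user votes adjust
# per-document boosts that are added to a plain keyword-overlap score.

class FeedbackRetriever:
    def __init__(self, docs):
        self.docs = docs
        self.boost = defaultdict(float)  # doc -> learned preference

    def record_feedback(self, doc, helpful):
        # A thumbs-up nudges the doc up; a thumbs-down nudges it down.
        self.boost[doc] += 1.0 if helpful else -1.0

    def search(self, query, k=1):
        def score(doc):
            overlap = len(set(query.split()) & set(doc.split()))
            return overlap + self.boost[doc]
        return sorted(self.docs, key=score, reverse=True)[:k]

r = FeedbackRetriever([
    "install guide for lancedb",
    "install guide draft (outdated)",
])
# Both docs tie on keyword overlap; user feedback breaks the tie.
r.record_feedback("install guide for lancedb", helpful=True)
print(r.search("install guide"))
```

In a production system the same signals would more likely feed a fine-tuning or reranking dataset than a simple additive boost, but the loop (serve, collect feedback, improve retrieval) is the same.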
-
Chang She
CEO / Co-founder, LanceDB
-
-
2:00PM
Building Robust and Trustworthy Gen AI Products: A Playbook
A practitioner's take on how you can consistently build robust, performant, and trustworthy GenAI products at scale. The talk will touch on different parts of the GenAI product development cycle, covering the must-haves, the gotchas, and insights from existing products in the market.
-
Faizaan Charania
Senior Product Manager, ML, LinkedIn
-
-
2:30PM
Beyond Benchmarks: Measuring Success for Your AI Initiatives
Join us as we move beyond benchmarks and explore a more nuanced take on model evaluation and its role in the process of specializing models. We'll discuss how to ensure that your AI model development aligns with your business objectives and results, while also avoiding common pitfalls that arise when training and deploying. We'll share tips on how to design tests and define quality metrics, and provide insights into the various tools available for evaluating your model at different stages in the development process.
-
Salma Mayorquin
CEO, Remyx AI
-
-
3:30PM
BUILDING ADVANCED QUESTION-ANSWERING AGENTS OVER COMPLEX DATA
Large Language Models (LLMs) are revolutionizing how users can search for, interact with, and generate new content, leading to a huge wave of developer-led, context-augmented LLM applications. Some recent stacks and toolkits around Retrieval-Augmented Generation (RAG) have emerged, enabling developers to build applications such as chatbots using LLMs on their private data.
However, while setting up basic RAG-powered QA is straightforward, solving complex question-answering over large quantities of complex data requires new data, retrieval, and LLM architectures. This talk provides an overview of these agentic systems, the opportunities they unlock, how to build them, as well as remaining challenges.
-
Jerry Liu
CEO, LlamaIndex
-
-
4:00PM
AIOps, MLOps, DevOps, Ops: Enduring Principles and Practices
It may be hard to believe, but AI apps powered by big Transformers are not actually the first complex system that engineers have dared to try to tame. In this talk, I will review one thread in the history of these attempts, the "ops" movements, beginning in mid-20th-century Japanese factories and passing, through Lean startups and the leftward shift of deployment, to the recent past of MLOps and the present and future of LLMOps/AIOps. I will map these principles, from genchi genbutsu and poka yoke to observability and monitoring, onto emerging practices in the operationalization of and quality management for AI systems.
-
Charles Frye
AI Engineer, Modal Labs
-
-
4:30PM
Less is not more: How to serve more models efficiently
While building content generation platforms for filmmakers and marketers, we learned that professional creatives need personalized, on-brand AI tools. However, running generative AI models at scale is incredibly expensive, and most models suffer from throughput and latency constraints that have negative downstream effects on product experiences. We are now building infrastructure to help organizations developing generative AI assets train and serve models more cheaply and efficiently than was previously possible, starting with visual systems.
-
Julia Turc
Co-CEO, Storia-AI
-
-
5:00PM
Redefining Code Quality in an Increasingly AI-first World
How do you enforce code quality in large codebases?
Static analysis tools like eslint are an invaluable resource for doing this at the AST level, but this is really just table stakes. What about at the architecture level? What about higher-level best practices that have an outsized impact on your program's correctness, security, performance, and, perhaps most importantly, your team's ability to ship fast without breaking things?
These are all areas where we rely on senior engineers to manually enforce best practices during code reviews, which is an inefficient and error-prone use of their time.
So the question becomes: can we use AI to better enforce code quality, and if so, what would an ideal solution look like?
This talk introduces GPTLint, a fundamentally new, open-source approach to code quality, and will walk you through everything you need to know to give your senior engineers a much-needed break.
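The "AST level" the abstract calls table stakes can be illustrated with Python's built-in ast module. This is a toy analogue of an eslint rule (the specific rule and function name are illustrative, not part of GPTLint): it walks the syntax tree and flags bare `except:` clauses, which silently swallow errors.

```python
import ast

# Toy AST-level lint check: report line numbers of `except:` handlers
# that name no exception type. Purely illustrative, not a GPTLint rule.

def find_bare_excepts(source):
    """Return line numbers of bare `except:` clauses in source code."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

code = """
try:
    risky()
except:
    pass
"""
print(find_bare_excepts(code))  # flags the bare except on line 4
```

Checks like this are mechanical because the pattern is visible in a single AST node; the architecture-level and best-practice concerns the talk targets are exactly the ones such local rules cannot express.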
-
Travis Fischer
Founder, Agentic
-