ML@GT Seminar Series | Lessons from Pre-training Llama 3

Featuring Mike Lewis, Facebook AI Research

Abstract: Large language models have revolutionized artificial intelligence, but many details of their creation remain shrouded in mystery due to their cost and commercial value. I will describe the pre-training of Llama 3, a highly competitive open model. Research in pre-training is challenging due to the need to make many decisions accurately, based on ablations run at scales orders of magnitude below the final model size. However, this project demonstrates that a state-of-the-art model can be created with a surprisingly simple recipe, based around carefully optimizing data curation, building efficient infrastructure, and minimizing complexity elsewhere. I will also contrast life in a large pre-training research team with more academic projects, and discuss outstanding research questions in the field.

Bio: Mike Lewis is a research scientist at Meta, currently leading pre-training research for the Llama models. His research interests include pre-training language models (e.g., Llama 3, BART, and RoBERTa), retrieval augmentation (e.g., kNN-LM and RAG), and negotiation dialogue agents (such as the Cicero Diplomacy model). Previously he was a postdoc at the University of Washington (working with Luke Zettlemoyer), and he has a PhD from the University of Edinburgh (advised by Mark Steedman). He received a Best Paper Award at EMNLP 2016, a Best Resource Paper Award at ACL 2017, and a Best Paper Honourable Mention at ACL 2018. His work has been extensively covered in the media, with varying levels of accuracy.

Event Details

Date/Time:

  • Wednesday, November 20, 2024
    12:00 pm - 1:00 pm
Location: CODA 9th Floor Atrium

For More Information Contact

Shelli Hatcher, Program and Operations Manager
