Visualizing Deep Learning Models at Facebook
This post summarizes recent joint research by researchers at Georgia Tech and Facebook on using visualization to make sense of deep learning models, published at IEEE VIS’17, a top visualization conference.
While powerful deep learning models have significantly improved prediction accuracy, understanding these models remains a major challenge. Deep learning models are harder to interpret than most other machine learning models because of their nonlinear structure and huge number of parameters. In practice, people therefore often use them as “black boxes”, which can be detrimental: when a model does not perform satisfactorily, users cannot understand the cause or know how to fix it.
Visualization has recently become a popular means of interpreting such complex deep learning models. Data visualization and visual analytics help people make sense of data and discover insights by transforming abstract data into meaningful, interactive visual representations. Deep learning models can be visualized by presenting intermediate data produced by the models (e.g., activations, weights) or by revealing relationships between datasets and model results. With such visualizations, users can better understand why and how the models produce results for their datasets. Several visualization tools have already been developed and released, including TensorBoard and Embedding Projector by Google’s Big Picture group, the Deep Visualization Toolbox, and so on.
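To ground the idea of presenting intermediate data, here is a minimal sketch of capturing per-layer activations with forward hooks. It uses PyTorch purely for illustration; none of the tools above is tied to this exact code.

```python
import torch
import torch.nn as nn

# A small feed-forward network standing in for any deep learning model.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 2),
)

activations = {}

def make_hook(name):
    # Forward hooks receive (module, input, output); we keep a detached
    # copy of the output so it can later be exported for visualization.
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Record the output of every ReLU layer during the forward pass.
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

x = torch.randn(8, 20)   # a batch of 8 instances
_ = model(x)             # one forward pass fills `activations`
for name, act in activations.items():
    print(name, tuple(act.shape))   # e.g. 1 (8, 64) and 3 (8, 32)
```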
You can check out our survey paper here.
Despite the increasing interest in visualization for deep learning interpretation, the complexity of the large-scale models and datasets used in industry, as at Facebook, poses unique design challenges. For example, tools designed for real-world deployment must, as a high priority, be flexible and scalable, adapting to the wide variety of models and datasets in use. These observations motivated us to design and develop ActiVis, a visual analytics system for industry-scale deep neural network models.
Participatory Design Process
To learn about users’ actual needs, we conducted participatory design sessions with over 15 Facebook engineers, researchers, and data scientists across multiple teams. From these sessions, we identified six key design challenges, spanning data, models, and analytics, that existing deep learning visualization tools have not adequately addressed. These challenges include the need to support:
- Diverse input data sources
- High data volume
- Complex model architecture
- A great variety of models
- Diverse subset definitions for analytics
- Both instance- and subset-level analyses
These challenges shape the main design goals of ActiVis.
Based on the design challenges we identified, we designed and developed ActiVis, a visual analytics system for deep neural network models, now deployed on Facebook’s machine learning platform. ActiVis’s main contributions include:
- A novel visual representation that unifies instance- and subset-level inspections of neuron activation, facilitating comparison of activation patterns for multiple instances.
- An interface that tightly integrates an overview of graph-structured complex models and local inspection of neuron activations, allowing users to explore the model at different levels of abstraction.
- A deployed system that scales to large datasets and models.
Here’s what the ActiVis interface looks like:
ActiVis consists of multiple coordinated views that give users a high-level overview of the model, from which they can drill down to perform localized inspection of activations. ActiVis visualizes how neurons are activated by user-specified instances or instance subsets, helping users understand how a model derives its predictions. Subsets can be flexibly defined using data attributes, features, or output results (e.g., the set of documents that contain a particular word, or the set of instances whose value for feature A is greater than 0.5), enabling model inspection from multiple angles. While many existing deep learning visualization tools support instance-level exploration (i.e., how individual instances contribute to a model’s accuracy), ActiVis is the first tool that simultaneously supports instance- and subset-level exploration. This is especially beneficial for the huge datasets common in industry, which may consist of millions or billions of data points. By exploring instance subsets and comparing them with individual instances, users can learn how the models respond to many different slices of the data.
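To make the instance- and subset-level distinction concrete, here is a small illustrative sketch with toy data; it is hypothetical code, not ActiVis’s actual implementation. It defines subsets with simple predicates over features and outputs, then compares a single instance’s activations against a subset’s average activation pattern:

```python
import numpy as np

# Toy data: 1,000 instances with 10 input features, plus the recorded
# activations of a 64-neuron layer for each instance.
rng = np.random.default_rng(0)
features = rng.random((1000, 10))
layer_activations = rng.random((1000, 64))
predictions = rng.integers(0, 2, size=1000)

# Subsets defined flexibly from features or output results, mirroring
# the kinds of subset definitions described above.
subsets = {
    "feature_A > 0.5": features[:, 0] > 0.5,
    "predicted class 1": predictions == 1,
}

# Subset-level view: each neuron's mean activation over the subset.
for name, mask in subsets.items():
    mean_act = layer_activations[mask].mean(axis=0)
    print(name, "most active neurons:", np.argsort(mean_act)[-3:])

# Instance-level view: how one instance deviates from a subset's pattern.
instance = layer_activations[42]
diff = instance - layer_activations[subsets["predicted class 1"]].mean(axis=0)
print("largest deviations:", np.argsort(np.abs(diff))[-3:])
```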
Deployment on Facebook’s ML Platform
We have deployed ActiVis on FBLearner Flow, Facebook’s machine learning platform. Developers who want to use ActiVis for their model can easily do so by adding only a few lines of code, which instruct their model’s training process to generate the information needed for visualization. ActiVis users at Facebook (e.g., data scientists) can then train models and use ActiVis via FBLearner Flow’s web interface, without writing any additional code.
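FBLearner Flow’s API is internal to Facebook and not public, so the snippet below is only a hypothetical illustration of what those few lines might look like; the `activis` module and `log_activations` function are invented names for this sketch.

```python
# Hypothetical sketch only: `activis` and `log_activations` are invented
# names illustrating the kind of lightweight instrumentation described above.
import activis

model = build_model()            # the developer's existing model-building code
activis.log_activations(model)   # ask training to record activation data

train(model)                     # training proceeds as usual; the recorded
                                 # data is what the ActiVis UI later visualizes
```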
Case Studies with Potential Users at Facebook
To better understand how ActiVis may help users with their interpretation of deep neural network models, we recruited Facebook engineers and data scientists to use the latest version of ActiVis to explore text classification models relevant to their work. Here are key observations from these studies:
- Spot-checking models with “test cases”
- Engineers often maintain test cases for their datasets, and ActiVis helped them check whether a model behaves as expected on those cases.
- Graph architecture view as entry point
- Our computation graph view was especially helpful for people less familiar with newer deep learning models: it let them understand a model’s overall structure first and then dive into activation details.
- Debugging hints from activation patterns
- ActiVis reveals activation patterns that give users hints for further improving their models. For example, when some neurons were never activated, users considered decreasing the number of neurons in that layer (see the sketch after this list).
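One such signal, neurons that never activate, can be checked with a few lines of illustrative NumPy. This is a toy sketch with planted dead neurons, not part of ActiVis itself:

```python
import numpy as np

# Post-ReLU activations recorded over a dataset:
# shape (num_instances, num_neurons), as in the earlier sketches.
rng = np.random.default_rng(1)
layer_activations = np.maximum(rng.normal(size=(1000, 64)), 0.0)
layer_activations[:, [3, 17]] = 0.0   # plant two "dead" neurons

# A neuron that is never active on any instance is a candidate for removal.
dead = np.where((layer_activations > 0).sum(axis=0) == 0)[0]
print("dead neurons:", dead)          # -> dead neurons: [ 3 17]
```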
ActiVis is a visual analytics system for deep neural network models, deployed on Facebook’s machine learning platform. From participatory design sessions with researchers and engineers across many teams at Facebook, we identified key design challenges. Based on them, we developed ActiVis, which unifies instance- and subset-level exploration and tightly integrates a model-architecture overview with localized activation inspection. Our case studies indicate that ActiVis helps users explore and understand complex deep learning models, specifically for spot-checking models, understanding architectures, and obtaining debugging hints.
For more information, please check out the full version of the ActiVis paper, our project webpage, demo video, and presentation slides:
- Paper published in IEEE TVCG Journal
- Project webpage with videos and presentation slides
- Demo video (2 min)
ICLR 2018 accepted papers and ML@GT
The list of accepted papers at ICLR 2018 was released last week, and Machine Learning at Georgia Tech (ML@GT) had a strong presence. Out of 935 submissions, 23 oral and 314 conference papers were accepted (roughly 36% overall). We are pleased to announce that Georgia Tech had 10 conference papers this year, 1 of them an oral (orals had roughly a 2% acceptance rate) and 2 others within the top 100, as well as 1 additional workshop paper.
This puts GT in the top 15 among all institutions, and 9th if you count only academic institutions (a conservative estimate, as the analysis appears to miss several GT papers). It is a testament to Georgia Tech’s strong research in ML, and we expect this presence to grow significantly given the ML Ph.D. program and the number of new faculty hired in this area every year.
The list of accepted ICLR 2018 papers with Georgia Tech affiliation is below.
- Deep Mean Field Games for Learning Optimal Behavior Policy of Large Populations (oral). Jiachen Yang (Georgia Tech), Xiaojing Ye (Georgia State), Rakshit Trivedi (Georgia Tech), Huan Xu (Georgia Tech), Hongyuan Zha (Georgia Tech)
- Boosting the Actor with Dual Critic. Bo Dai (Georgia Tech), Albert Shaw (Georgia Tech), Niao He (UIUC), Lihong Li (Google), Le Song (Georgia Tech)
- Cascade Adversarial Machine Learning Regularized with a Unified Embedding. Taesik Na (Georgia Tech), Jong Hwan Ko (Georgia Tech), Saibal Mukhopadhyay (Georgia Tech)
- Learning to Cluster in order to Transfer Across Domains and Tasks (top 100). Yen-Chang Hsu (Georgia Tech), Zhaoyang Lv (Georgia Tech), Zsolt Kira (Georgia Tech Research Institute)
- Multi-Agent Compositional Communication Learning from Raw Visual Input. Edward Choi (Georgia Tech), Angeliki Lazaridou (DeepMind), Nando de Freitas (University of Oxford)
- Generative Models of Visually Grounded Imagination (top 100). Ramakrishna Vedantam (Georgia Tech), Ian Fischer (Google), Jonathan Huang (Google), Kevin Murphy (Google)
- Initialization matters: Orthogonal Predictive State Recurrent Neural Networks. Krzysztof Choromanski (Google), Carlton Downey (CMU), Byron Boots (Georgia Tech)
- Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples. Ashwin Kalyan (Georgia Tech), Abhishek Mohta (MSR), Oleksandr Polozov (Univ. of Washington), Dhruv Batra (Georgia Tech), Prateek Jain (UT Austin), Sumit Gulwani (MSR)
- Syntax-Directed Variational Autoencoder for Structured Data. Hanjun Dai (Georgia Tech), Yingtao Tian (State University of New York), Bo Dai (Georgia Tech), Steven Skiena (State University of New York), Le Song (Georgia Tech)
- Truncated Horizon Policy Search: Deep Combination of Reinforcement and Imitation. Wen Sun (CMU), J. Andrew Bagnell (CMU), Byron Boots (Georgia Tech)
- Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms (workshop). Tom Zahavy (Technion), Bingyi Kang (National University of Singapore), Alex Sivak (Technion), Jiashi Feng (National University of Singapore), Huan Xu (Georgia Tech), Shie Mannor (Technion)