Colloquia 2018-2019

Friday, June 28, 2019 - 10:00am - Gould Simpson (GS) 906

Speaker: Torsten Hoefler, Ph.D.

Title: "Demystifying, Optimizing, and Benchmarking Large-Scale Deep Learning"

Abstract: We introduce schemes to optimize communication in deep learning workloads. For this, we use properties of the standard SGD algorithm that allow us to delay the sending of some parts of the gradient updates. Our implementation, SparCML, speeds up practical workloads significantly. We then discuss Deep500: the first customizable benchmarking infrastructure that enables fair comparison of the plethora of deep learning frameworks, algorithms, libraries, and techniques. The key idea behind Deep500 is its modular design, where deep learning is factorized into four distinct levels: operators, network processing, training, and distributed training. Our evaluation illustrates that Deep500 is customizable (enables combining and benchmarking different deep learning codes) and fair (uses carefully selected metrics). Moreover, Deep500 is fast (incurs negligible overheads), verifiable (offers infrastructure to analyze correctness), and reproducible. Finally, as the first distributed and reproducible benchmarking system for deep learning, Deep500 provides software infrastructure to utilize the most powerful supercomputers for extreme-scale workloads.

Bio: Torsten is an Associate Professor of Computer Science at ETH Zürich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum where he chairs the "Collective Operations and Topologies" working group. Torsten won best paper awards at the ACM/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI'13, HPDC'15, HPDC'16, IPDPS'15, and other conferences. He published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. He received the Latsis prize of ETH Zurich as well as an ERC starting grant in 2015. His research interests revolve around the central topic of "Performance-centric System Design" and include scalable networks, parallel programming techniques, and performance modeling. Additional information about Torsten can be found on his homepage at

Faculty Host: Dr. Dave Lowenthal


Thursday, June 13, 2019 - 11:00am - Gould Simpson (GS) 701

Speaker: Radoslav Fulek, Ph.D.

Title: "Eliminating Crossings in Drawings of Graphs"

Abstract: It has been a long-standing open problem whether the Hanani-Tutte theorem extends to surfaces other than the plane and projective plane, although the problem was first explicitly stated in print by Schaefer and Stefankovic in 2013. They conjectured that the Hanani-Tutte theorem extends to every orientable surface for every graph G. We find a graph of genus 5 and its drawing on the orientable surface of genus 4 with every pair of nonadjacent edges crossing an even number of times. This shows that the Hanani-Tutte theorem cannot be extended to the orientable surface of genus 4. We complement our negative result by providing an approximate version of the Hanani-Tutte theorem on orientable surfaces.

Bio: Radoslav Fulek holds an MS degree in Computer Science from Simon Fraser University and a PhD degree in Mathematics from Ecole Polytechnique Federale de Lausanne. He has worked as a postdoc at Charles University in Prague, Columbia University in New York, and currently at the Institute of Science and Technology in Austria. His research is in computational geometry and graph theory. 

Faculty Host: Dr. Stephen Kobourov


Tuesday, May 07, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Clayton Morrison, Ph.D.

Title: "Assembling Models of the World"

Abstract: Modeling large, complex systems, such as projecting "food insecurity" at the sub-national scale, requires integrating knowledge from a diverse array of disciplines, each with their own modeling technology and heterogeneous data. It currently takes analysts on the order of 6 months to 2 years to put together reports that project how these systems might behave a couple of years into the future and propose mitigating interventions. A growing community of scientists and engineers seeks to reduce the time and resources it takes to build such models, so that this can be instead done in a matter of weeks. In this talk I will present two recent threads of work I am engaged in that are aimed at developing technology to enable rapid assembly of large-scale models for prediction and planning. In the first part, I will describe the Delphi framework for assembling executable models from natural language descriptions as they appear in reports and academic literature. In the second part, I will present our new effort developing the AutoMATES system for assembling semantically rich models from text, equations, and scientific software code.

Bio: Clay Morrison is an Associate Professor in the School of Information at the University of Arizona and a regular faculty member of the Statistics and Cognitive Science GIDPs. Dr. Morrison has 15+ years of experience in artificial intelligence and machine learning, during which he has published over 70 peer-reviewed articles. Dr. Morrison directs the Machine Learning for Artificial Intelligence Laboratory. His projects have focused on developing computer systems that can be taught through natural human instruction, developing machine learning algorithms for learning structured, latent representations from data, and modeling the relationship between human facial expressions, emotion, and decision-making. Dr. Morrison's current work includes development of the MUSICA collaborative musical composition system (funded under the DARPA Communicating with Computers Program), developing methods for machine reading of biological context in biomedical texts, and building frameworks for automating assembly of scientific models from text and software (funded under the DARPA World Modelers and ASKE programs).

Faculty Host: Dr. Mihai Surdeanu


Tuesday, April 16, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Kwang-Sung Jun, Ph.D.

Title: "Accelerating Discovery Rate in Adaptive Experiments via Bandits with Low-Rank Structure"

Abstract: In many applications, a unit of experiment is to test a pair of objects from two different entity types. For example, imagine the drug discovery application where one has to repeatedly test drug-protein pairs via expensive experiments in order to find those pairs with the desired interaction. How can one maximize the discovery rate at the least experimental cost? In this talk, we show how one can leverage the model structure (low-rank) to accelerate the discovery rate for these problems involving two entity types. At the heart of the solution is a new multi-armed bandit formulation called bilinear low-rank bandits, which leverages feature information of the two entities (e.g., drug and protein) in order to maximize the rewards (e.g., discovery rate). We first show that existing linear bandit algorithms can solve the problem via a reduction, but their convergence rate to the optimal strategy is very slow. To improve the convergence, we exploit the low-rank structure of the model, commonly observed in real-world data, with a two-stage algorithm that first identifies important subspaces and then invokes linear bandits with a novel use of a data-dependent regularizer to bias the algorithm toward those identified subspaces. The convergence of the proposed algorithm is now sensitive to the rank of the unknown, never worse than linear bandits, and significantly better in some cases. Our result sheds light on solving bandit problems with complex but structured models.

Bio: Kwang-Sung Jun is a postdoctoral researcher at the Hariri Institute at Boston University with Prof. Francesco Orabona and will join the Department of Computer Science at the University of Arizona as an assistant professor in Fall 2019. His research focuses on adaptive and interactive machine learning that arises in real-world and interdisciplinary applications. Specifically, he works on multi-armed bandits and online optimization, which have applications in online advertising, personalized product recommendation, and adaptive biological experiments. He received a Ph.D. in Computer Science from the University of Wisconsin-Madison under the supervision of Prof. Xiaojin (Jerry) Zhu and was a postdoc at the University of Wisconsin-Madison Wisconsin Institute for Discovery, advised by Profs. Robert Nowak, Rebecca Willett, and Stephen Wright.

Faculty Host: Dr. Mihai Surdeanu


Monday, April 01, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Snigdha Chaturvedi, Ph.D.

Title: "Structured Approaches to Natural Language Understanding"

Abstract: Despite recent advancements in Natural Language Processing, computers today cannot understand text in the ways that humans can. My research aims at creating computational methods that not only read but also understand text. To accomplish this, I develop machine-learning methods that incorporate linguistic cues as well as the context in which they appear to understand language. In this talk, I will discuss two specific applications of language understanding that focus on comprehension of narratives: (i) Choosing correct endings to stories, and (ii) Automatically generating narratives. I will also discuss my ongoing and future work on applications of language understanding in domains like education, digital humanities and mental health care.

Bio: Snigdha Chaturvedi is an Assistant Professor in the Department of Computer Science and Engineering at the University of California, Santa Cruz. She specializes in the field of Natural Language Processing with an emphasis on developing methods for natural language understanding. Her research has been recognized with the IBM Ph.D. Fellowship (twice), a best paper award at NAACL, and first prize at the ACM student research competition held at the Grace Hopper Conference. Previously, she was a postdoctoral fellow at the University of Illinois, Urbana-Champaign, and the University of Pennsylvania, working with Professor Dan Roth. She earned her Ph.D. in Computer Science at the University of Maryland, College Park in 2016 (advisor: Dr. Hal Daume III) and a Bachelor of Technology from the Indian Institute of Technology, Kanpur in 2009. She was also a Blue Scholar at IBM Research, India from 2009 to 2011.

Faculty Host: Dr. Kobus Barnard


Thursday, March 28, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Weihao Kong, Ph.D.

Title: "Robust Learning and Recent Progress on Learning Populations of Parameters"

Abstract: How accurately can one learn, if a significant fraction of the training set is extremely biased, outlying, or corrupted? Both in theory and in practice, most learning algorithms are brittle to the presence of such data. In the first part of the talk, I will discuss our recent results on robust linear regression, a prototypical problem in robust learning. We focus on the fundamental setting where the covariates of the uncorrupted samples are drawn from a Gaussian distribution N(0, Sigma) on R^d, and an epsilon fraction of the data is arbitrarily (or even maliciously) corrupted. For this setting, we give a natural algorithm and show that it attains nearly optimal performance guarantees. In the second part of this talk, I will present our very recent progress on the problem of learning populations of parameters (which builds on the results I discussed when I visited in November). Consider the following estimation problem: there are N entities, each with an unknown parameter p_i in [0,1], drawn from an unknown distribution P, and we observe N independent random variables, X_1,...,X_N, with X_i ~ Binomial(t, p_i). How accurately can one recover P? This problem arises in numerous domains, including "federated learning" settings, and biological and medical settings, where the size of the population under study, N, is large in comparison to the number of observations per individual, t. In this work, we show that the maximum likelihood estimator is both statistically minimax optimal and efficiently computable. Precisely, the MLE achieves the information-theoretic optimal error bound of max(1/t, 1/sqrt(t log N)), with respect to the Wasserstein distance. This improves on the performance of the previous 'method of moments' approach when N < exp(t), and is significantly better than the 1/sqrt(t) error of the naive empirical estimator.

Bio: Weihao Kong is a Ph.D. student from the Computer Science Department at Stanford University. He received his B.S. from Shanghai Jiao Tong University in 2013. His research interests span statistical learning, high-dimensional statistics and theoretical computer science.

Faculty Host: Dr. Mihai Surdeanu


Tuesday, March 26, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Berkay Celik, Ph.D.

Title: "Automated IoT Safety and Security Analysis"

Abstract: The introduction of Internet of Things (IoT) devices that integrate online processes and services with the physical world has had profound effects on society. Yet, while IoT systems have been widely embraced by consumers and industry alike, safety and security failures have raised questions about the risks of embracing IoT-augmented lives. These failures range from compromised baby monitors to vehicle crashes and monetary theft. As with traditional security problems, many of these failures are a consequence of software bugs, user error, poor configuration, or faulty design. In this talk, we will examine a new class of failures: interactions within the physical domain that lead to unsafe or insecure environments. I will then demonstrate how to model the interactions between devices within physical spaces through source code analysis and formally verify via model checking not only the correct operation of one device, but the joint behavior of all of the devices in an environment. Using these techniques, we successfully identify threats to safety and security, and enforce the correct operation of IoT devices and environments in physical spaces. In so doing, we create a richer model of IoT safety and security, and provide consumers, developers, and industry with systems that mitigate threats to IoT in practice.

Bio: Berkay Celik is a PhD candidate in Computer Science and Engineering at the Pennsylvania State University, where he is advised by Professor Patrick McDaniel. Berkay has researched a variety of security topics, including machine learning systems, network security, and privacy enhancing technologies. His dissertation is in the area of Internet of Things (IoT), particularly the construction of systems that ensure safety, security, and privacy in IoT implementations through program analysis. He received his B.Sc in Computer Science from Naval Academy (Istanbul) and his M.S. in Computer Science and Engineering with a minor in Computational Science from Pennsylvania State University. He expects to earn his PhD in the Spring of 2019. Berkay has had several internships in industry, including at VMware and Vencore Labs.

Faculty Host: Dr. John Hartman


Friday, March 15, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Daniel Khashabi, Ph.D.

Title: "Natural Language Understanding with Minimal Supervision"

Abstract: Can we solve language understanding tasks without relying on task-specific annotated data? This could be important in scenarios where the inputs range across various domains and it is expensive to create annotated data. I discuss two different language-understanding problems (Question Answering and Entity Typing) which have traditionally been tackled with a strong reliance on direct supervision. For these problems, I present two recent works where exploiting properties of the underlying representations and indirect signals helps us move beyond traditional paradigms. As a result, we observe better generalization across domains.

Bio: Daniel Khashabi is a recent PhD graduate from the University of Pennsylvania, under the supervision of Prof. Dan Roth. His interests lie at the intersection of computational intelligence and natural language processing, with the ultimate goal of improving natural language “understanding” and broadening its applications. He has published dozens of articles in prestigious conferences on natural language processing and artificial intelligence. He is the co-organizer of the Student Research Workshop at ACL 2019.

Faculty Host: Dr. John Kececioglu


Tuesday, March 12, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Shashank Srivastava, Ph.D.

Title: "Conversational Machine Learning"

Abstract: Humans can efficiently learn and communicate new knowledge about the world through natural language (e.g., the concept of important emails may be described through explanations like 'late night emails from my boss are usually important'). Can machines be similarly taught new tasks and behavior through natural language interactions with their users? In this talk, we'll explore two approaches towards language-based learning for classification tasks. First, we'll consider how language can be leveraged for interactive feature space construction for learning tasks. I'll present a method that jointly learns to understand language and learn classification models, by using explanations in conjunction with a small number of labeled examples of the concept. Second, we'll examine an approach for using language as a substitute for labeled supervision for training machine learning models, which leverages the semantics of quantifier expressions in everyday language ('definitely', 'sometimes', etc.) to enable learning in scenarios with limited or no labeled data.

Bio: Shashank Srivastava received his PhD from the Machine Learning department at CMU in 2018, and currently works at Microsoft Research. Shashank's research interests lie in conversational learning, interactive AI and grounded language understanding, and his dissertation focuses on helping machines learn from human interactions. Shashank has an undergraduate degree in Computer Science from IIT Kanpur, and a Master's degree in Language Technologies from CMU. He received the Yahoo InMind Fellowship for 2016-17. His research has been covered by popular media outlets including GeekWire and New Scientist.

Faculty Host: Dr. Mihai Surdeanu


Thursday, February 28, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Chicheng Zhang, Ph.D.

Title: "Efficient and Robust Interactive Learning"

Abstract: Unlike traditional machine learning, interactive learning allows a learner to be involved in the data collection process through interacting with the environment. An interactive learner can carefully avoid collecting redundant information, thus being able to make accurate predictions with a small amount of data. Two key questions in interactive learning research are of central interest. First, can we design and analyze interactive learning algorithms that have data efficiency, computational efficiency and robustness guarantees? Second, can we identify novel interaction models which learners can benefit from? In this talk, I will answer both questions in the affirmative. In the first part of the talk, I will present our work on efficient noise-tolerant active learning of linear classifiers with near-optimal label requirements. In the second part, I will describe new algorithms and tools for contextual bandit learning with continuous action spaces. In the last part, I will discuss a new interactive learning model, namely warm-starting contextual bandits, and present an algorithm in this model with robustness guarantees. I will conclude my talk by outlining several promising directions for future research.

Bio: Chicheng Zhang is a postdoctoral researcher in the machine learning group at Microsoft Research New York City. He received a bachelor's degree from Peking University in 2012 and a PhD in Computer Science from UC San Diego in 2017. Chicheng has broad research interests in machine learning, including but not limited to active learning, contextual bandits, unsupervised learning and confidence-rated prediction. His main research interests lie in the design and analysis of interactive machine learning algorithms.

Faculty Host: Dr. John Kececioglu


Tuesday, February 26, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Jiang Ming, Ph.D.

Title: "Study and Mitigation of Malware Threats Armored by Application Virtualization"

Abstract: Application virtualization refers to a general technique that encapsulates programs from the underlying operating system. For example, new application-level virtualization can support running multiple copies of the same Android app on a single device. However, in the realm of software security, the idea of application virtualization has become one of the most sophisticated code obfuscation techniques of the past decade. Code virtualization translates program code into custom bytecode and interprets the bytecode at run time via an embedded emulator. In this way, the original code never reappears in memory. This strong resilience to reverse engineering has also attracted the attention of malware authors, who always seek more advanced methods to stay under the detection radar. Rapid analysis of virtualization-obfuscated malware is vital for a swift response to emerging threats such as ransomware and cryptocurrency mining malware. In this talk, I will present VMHunt, a generic approach to automatically locate and simplify virtualized code sections from an execution trace. We develop semantics-based slicing and multiple granularity symbolic execution techniques to optimize the obfuscated code further. With the developed techniques, VMHunt is able to advance the state of the art in malicious software analysis and help better defend against emerging cyber attacks. In addition, I will briefly introduce my ongoing research work in this direction: 1) towards the ultimate goal of code de-virtualization: from virtualized code back to the original; 2) defeating emerging attacks on the Android platform: stealthy malware powered by the VirtualApp technique.

Bio: Jiang Ming is an Assistant Professor in the Department of Computer Science and Engineering at the University of Texas at Arlington. He received his Ph.D. from Pennsylvania State University, and he also holds an M.Eng. from Peking University and a B.S. from Wuhan University in China. His research interests span cybersecurity and software engineering, with a focus on binary code analysis, software virtualization security, hardware-assisted malware analysis, and IoT systems security. Jiang Ming's work has been published in prestigious security and software engineering conferences (IEEE S&P, Usenix Security, CCS, FSE, and ASE). His FSE'14 paper on obfuscation-resilient binary code similarity comparison was nominated for a Distinguished Paper Award, and his IEEE S&P'17 paper was selected as one of the CSAW'17 Applied Research Competition Top 10 Finalists.

Faculty Host: Dr. Saumya Debray


Tuesday, February 19, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Jeff Clune, Ph.D.

Title: "Understanding and Improving Deep Neural Networks"

Abstract: With deep learning, deep neural networks have produced state-of-the-art results in a number of different areas of machine learning, including computer vision, natural language processing, robotics and reinforcement learning. I will summarize three projects on better understanding deep neural networks and improving their performance. First I will describe our sustained effort to study how much deep neural networks know about the images they classify. Our team initially showed that deep neural networks are “easily fooled,” meaning they will declare with near certainty that completely unrecognizable images are everyday objects, such as guitars and starfish. These results suggested that deep neural networks do not truly understand the objects they classify. However, our subsequent results reveal that, when augmented with powerful priors, deep neural networks actually have a surprisingly deep understanding of objects, which enables them to be incredibly effective generative models that can produce a wide diversity of photo-realistic images. Second, I will summarize our Nature paper on learning algorithms that enable robots, after being damaged, to adapt in 1-2 minutes and soldier on with their mission. This work combines a novel stochastic optimization algorithm with Bayesian optimization to produce state-of-the-art robot damage recovery. Third, I will describe our recent Go-Explore algorithm, which dramatically improves the ability of deep reinforcement learning algorithms to solve previously unsolvable problems wherein reward signals are sparse, meaning that intelligent exploration is required. Go-Explore solves Montezuma’s Revenge, considered by many to be a grand challenge of AI research. I will also very briefly summarize a few other machine learning projects from my career, including our PNAS paper on automatically identifying, counting, and describing wild animals in images taken remotely by motion-sensor cameras.

Bio: Jeff Clune is the Loy and Edith Harris Associate Professor in Computer Science at the University of Wyoming and a Senior Research Manager and founding member of Uber AI Labs, which was formed after Uber acquired a startup he helped lead. Jeff focuses on robotics and training neural networks via deep learning, including deep reinforcement learning. Since 2015, a robotics paper he co-authored was on the cover of Nature, a deep learning paper from his lab was on the cover of the Proceedings of the National Academy of Sciences, he won an NSF CAREER award, he received the Distinguished Young Investigator Award from the International Society for Artificial Life, he was an invited speaker at the NeurIPS Deep Reinforcement Learning Workshop, and his deep learning papers were awarded honors (best paper awards and/or oral presentations) at the top machine learning conferences (NeurIPS, CVPR, ICLR, and an ICML workshop). His research is regularly covered in the press, including the New York Times, NPR, NBC, Wired, the BBC, the Economist, National Geographic, the Atlantic, the New Scientist, the Daily Telegraph, Science, Nature, and U.S. News & World Report. Prior to becoming a professor, he was a Research Scientist at Cornell University, and he received degrees from Michigan State University (a PhD and a master's degree) and the University of Michigan (a bachelor's degree). More information about Jeff's research is available at

Faculty Host: Dr. Mihai Surdeanu


Tuesday, February 12, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Jason Pacheco, Ph.D.

Title: "Probabilistic Reasoning in Complex Systems: Algorithms and Applications"

Abstract: Statistical machine learning approaches to scientific applications are complicated by high-dimensional, continuous, and nonlinear interactions that typically arise in such settings. In this talk I will discuss several general-purpose approaches to reasoning in these systems. I will introduce Diverse Particle Max-Product (D-PMP), a particle-based extension of max-product belief propagation (BP) for maximum a posteriori (MAP) inference, and will show how D-PMP isolates multiple distinct, locally optimal configurations through a diverse particle selection process. I will further demonstrate how D-PMP can be flexibly adapted to problems of articulated human pose estimation in images and video, as well as protein structure prediction from low-resolution experimental data. In the second half of the talk I will present robust and efficient algorithms for sequential decision making in information gathering systems. These algorithms sequentially choose among actions to maximally reduce uncertainty about quantities of interest. Using gene regulatory network inference as an example, I will demonstrate how these approaches meet or exceed the performance of domain-specific methods for Bayesian experimental design.

Bio: Jason Pacheco is a postdoctoral associate at MIT in the Computer Science and Artificial Intelligence Laboratory (CSAIL) with John Fisher III. Prior to joining MIT, Jason completed his graduate work at Brown University with Erik Sudderth. Jason's research interests are in statistical machine learning, probabilistic graphical models, approximate inference algorithms, and information-theoretic decision making.

Faculty Host: Dr. Kobus Barnard


Thursday, February 07, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: David Inouye, Ph.D.

Title: "Deeper Understanding of Deep Learning via Shallow Learning and Model Explanations"

Abstract: Despite tremendous empirical success, modern deep learning is still relatively new, and thus, there are significant gaps in understanding compared to classical shallow learning. This lack of deep understanding of deep learning hinders practitioners from systematically and reliably developing models, especially in new contexts, so development is often relegated to laborious trial and error. In addition, this lack of understanding hinders users from adopting deep models in real-world applications. As one approach for deeper understanding, I will discuss how to leverage well-understood shallow learning to construct deep models so that the algorithms and insights from shallow learning can be lifted into the deep context. Specifically, I will present a destructive process that iteratively finds patterns in the data via shallow learning and then destroys these patterns via transformations. I will then show an application of this unconventional deep learning approach to deep probabilistic models. As a different approach, model explanations can increase users' understanding of deep models and thereby aid them in deciding if they should adopt a deep model or not. Thus, I will also describe how to create model explanations based on the concept of counterfactuals that are simultaneously exact and non-local, in contrast to explanations based on local approximations of the model. I will present a new framework for finding useful model explanations, conceptualized as lines or curves in the input space that compress the full model into a few relevant trajectories for a given target point, where relevancy depends on the context. In conclusion, I will discuss promising future directions for both destructive learning and model explanation.

Bio: David Inouye is a postdoctoral researcher at Carnegie Mellon University in the Machine Learning Department working with Prof. Pradeep Ravikumar. He completed his PhD in CS at The University of Texas at Austin where he was advised by Prof. Inderjit Dhillon and Prof. Pradeep Ravikumar. David's research interests include deep generative models, probabilistic graphical models, model explanations and model visualizations. He completed his BS in EE at Georgia Institute of Technology and was awarded the NSF GRFP graduate research fellowship during his senior year.

Faculty Host: Dr. Carlos Scheidegger


Tuesday, January 22, 2019 - 11:00am - Gould Simpson (GS) 906

Speaker: Cynthia Bailey, Ph.D.

Title: "What Can I Do Today to Create a More Inclusive Community in CS?"

Abstract: Many people who work in STEM wish that our field were more diverse and earnestly want to be part of the solution, yet aren't sure where to begin. This talk will focus on concrete, actionable suggestions that anyone in our field can implement to help create a more inclusive and equitable environment. Drawing on lessons learned in her teaching, research, and mentorship programs, Dr. Lee will provide an inclusion toolkit ranging from 5-minute targeted interventions to lifelong habits.

Bio: Cynthia Lee is a Lecturer in the Computer Science Department at Stanford University. Her research focuses on education and social impacts of technology. She founded to assist faculty in redesigning their CS coursework around research-based best practices. Her work experience includes NASA Ames and startups in the machine learning, children’s education apps, and augmented reality spaces. She holds a PhD from University of California, San Diego where her research focused on machine learning and market-based algorithms for high performance computing systems. She was voted the 2015 Professor of the Year by Stanford Society of Women Engineers.

Faculty Host: Dr. Michelle Strout


Tuesday, January 15, 2019  - 11:00am - Gould Simpson (GS) 906

Speaker: Michael Gleicher, Ph.D.

Title: "Interpreting Embeddings with Comparison"

Abstract: Vector embeddings place objects (e.g., documents or words) in a vector space such that similar objects are close. Embeddings can abstract information from data collections and are widely used in fields such as machine learning and natural language processing. However, embeddings are challenging to interpret, which limits their use as a tool for understanding the underlying data or the performance of the methods that construct them. In this talk, I will survey our work in building visualization tools that address challenges in interpreting embeddings. I will use the idea of comparison as a strategy for designing data analysis tools. I will introduce our framework for thinking about comparison, and show how we used this framework in the design of tools for embedding challenges. I will provide examples of tools for examining and comparing both document embeddings (e.g., topic models) and word vector embeddings.

Bio: Michael Gleicher is a Professor in the Department of Computer Sciences at the University of Wisconsin, Madison. Prof. Gleicher is founder of the Department's Visual Computing Group. His research interests span the range of visual computing, including data visualization, robotics, image and video processing tools, virtual reality, and character animation. His current foci are human-data interaction and human-robot interaction. Prior to joining the university, Prof. Gleicher was a researcher at The Autodesk Vision Technology Center and in Apple Computer's Advanced Technology Group. He earned his Ph.D. in Computer Science from Carnegie Mellon University, and holds a B.S.E. in Electrical Engineering from Duke University. In 2013-2014, he was a visiting researcher at INRIA Rhone-Alpes. Prof. Gleicher is an ACM Distinguished Scientist.

Faculty Host: Dr. Carlos Scheidegger


Wednesday, December 05, 2018  - 11:00am - Gould Simpson (GS) 701

Speaker: Patrick Verga, Ph.D. Candidate

Title: "Extracting and Embedding Entities, Types, and Relations"

Abstract: Over the past five years, deep learning has shown remarkable success on shallow natural language processing (NLP) tasks and in perceptual domains like vision. This has in large part been driven by models that largely ignore explicit representations of knowledge in favor of unconstrained, end-to-end learning over massive supervised training sets. On shallow perceptual tasks this is typically adequate, but as we seek to develop methods for true language understanding, knowledge representation and reasoning will be essential, and the question of how best to represent and acquire knowledge remains open.

In this talk I will present methods for incorporating powerful neural network models with rich structured information to improve the representations of entities and their relations while extracting new knowledge from raw unstructured text. Symbolic knowledge graphs over fixed, human-defined schema encode facts about entities and their relations but are brittle, lack specificity, and are highly incomplete. Universal schema (US) addresses all of these issues by learning a latent schema over both existing structured resources and unstructured free text data, embedding them jointly within a shared space. We first improve generalization in US by (1) representing text using a compositional neural encoder, leading to substantially improved recall and enabling zero-shot relation extraction in a language with no training data, and (2) representing entities and entity pairs as query-specific aggregations over observed textual evidence rather than a static global embedding. We also improve accuracy in fine-grained entity typing and linking by injecting additional structure into our embedding space, enforcing a hypernym hierarchy over entities and types. Finally, we present a model for extracting all entities and relations simultaneously over full paragraphs, improving biological relation extraction by allowing for finer-grained encodings and extraction of long range cross-sentence relations.

Bio: Patrick Verga is a final-year PhD candidate in the College of Information and Computer Sciences at UMass Amherst, advised by Andrew McCallum. His research contributes to knowledge representation and reasoning, with a focus on large knowledge base construction from unstructured text, with applications to general domain, commonsense, and biomedicine. Pat previously interned at Google and the Chan Zuckerberg Initiative and received a best paper award at EMNLP 2018. Over the past several years he has advised multiple M.S. and junior PhD students, resulting in published research in fine-grained entity typing and linking, unsupervised parsing, and partially labeled named entity extraction. He holds M.S. and B.A. degrees in computer science as well as a B.S. in neuroscience.

Faculty Host: Dr. Kobus Barnard


Tuesday, December 04, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Emma Strubell, Ph.D. Candidate

Title: "Neural Network Architectures for Efficient and Robust NLP"

Abstract: NLP has come of age. For example, semantic role labeling (SRL), which automatically annotates sentences with a labeled graph representing "who" did "what" to "whom," has in the past ten years seen a nearly 40% reduction in error, bringing it to useful accuracy. As a result, hordes of practitioners now want to deploy NLP systems on billions of documents across many domains. However, state-of-the-art NLP systems are typically optimized for neither cross-domain robustness nor computational efficiency.

In this talk I will present two new methods to facilitate fast, accurate and robust NLP. First, I will describe Iterated Dilated Convolutional Neural Networks (ID-CNNs, EMNLP 2017), a faster alternative to bidirectional LSTMs for sequence labeling, which in comparison to traditional CNNs have better capacity for large context and structured prediction. Unlike LSTMs whose sequential processing on sentences of length N requires O(N) time even in the face of GPU parallelism, ID-CNNs permit fixed-depth convolutions to run in parallel across entire documents. They embody a distinct combination of network structure, parameter sharing and training procedures that enable dramatic 14-20x test-time speedups while retaining accuracy comparable to the Bi-LSTM-CRF. Second, I will present Linguistically-Informed Self-Attention (LISA, EMNLP 2018 Best Long Paper), a neural network model that combines multi-head self-attention with multi-task learning across dependency parsing, part-of-speech tagging, predicate detection and SRL. Unlike previous models which require significant pre-processing to prepare syntactic features, LISA can incorporate syntax using merely raw tokens as input, encoding the sequence only once to simultaneously perform parsing, predicate detection and role labeling for all predicates. Syntax is incorporated through the attention mechanism, by training one of the attention heads to focus on syntactic parents for each token. We show that incorporating linguistic structure in this way leads to substantial improvements over the previous state-of-the-art (syntax-free) neural network models for SRL, especially when evaluating out-of-domain, where LISA obtains nearly 10% reduction in error while also providing speed advantages. 

Bio: Emma Strubell is a final-year PhD candidate in the College of Information and Computer Sciences at UMass Amherst, advised by Andrew McCallum. Her research aims to provide fast, accurate, and robust natural language processing to the diversity of academic and industrial investigators eager to pull insight and decision support from massive text data in many domains. Toward this end she works at the intersection of natural language understanding, machine learning, and deep learning methods cognizant of modern tensor processing hardware. She has applied her methods to scientific knowledge bases in collaboration with the Chan Zuckerberg Initiative, and to advanced materials synthesis in collaboration with faculty at MIT. Emma has interned as a research scientist at Amazon and Google and received the IBM PhD Fellowship Award. She is also an active advocate for women in computer science, serving as leader of the UMass CS Women's group where she co-organized and won grants to support cross-cultural peer mentoring, conference travel grants for women, and technical workshops. Her research has been recognized with best paper awards at ACL 2015 and EMNLP 2018.

Faculty Host: Dr. Kobus Barnard


Thursday, November 29, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Paul Medvedev, Ph.D.

Title: "Assembly of Big Genomic Data"

Abstract: As genome sequencing technologies continue to facilitate the generation of large datasets, developing scalable algorithms has come to the forefront as a crucial step in analyzing these datasets. In this talk, I will discuss several recent advances, with a focus on the problem of reconstructing a genome from a set of reads (genome assembly). I will describe low-memory and scalable algorithms for automatic parameter selection and de Bruijn graph compaction, recently implemented in two tools, KmerGenie and bcalm. I will also present recent advances in the theoretical foundations of genome assemblers.

Bio: Paul Medvedev is an Associate Professor in the Department of Computer Science and Engineering and the Department of Biochemistry and Molecular Biology and the Director of the Center for Computational Biology and Bioinformatics at the Pennsylvania State University. His research focus is on developing computer science techniques for analysis of biological data and on answering fundamental biological questions using such methods. Prior to joining Penn State in 2012, he was a postdoc at the University of California, San Diego and a visiting scholar at the Oregon Health & Science University and the University of Bielefeld. He received his Ph.D. from the University of Toronto in 2010, his M.Sc. from the University of Southern Denmark in 2004, and his B.S. from the University of California, Los Angeles in 2002.

Faculty Host: Dr. John Kececioglu


Tuesday, November 27, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Weihao Kong, Ph.D. Candidate

Title: "The Surprising Power of Little Data"

Abstract: Despite the rapid growth of the size of our datasets, the inherent complexity of the problems we are solving is also growing, if not at an even faster rate. This prompts the question of how to infer the most information from the available data. 

I will discuss several examples of my research that reveal a surprising ability to extract accurate information from modest amounts of data. The first setting that I discuss considers data provided by a large number of heterogeneous individuals, and we show that the empirical distribution of the data can be significantly "de-noised". The second setting considers estimating the intrinsic dimensionality of a dataset, in the sublinear sample regime where the empirical distribution of the data is misleading. The final portion of my talk focuses on estimating "learnability": given too little data to learn an accurate prediction model, we can accurately estimate the value of collecting more data. Specifically, for some natural model classes, we can estimate the performance of the best model in the class, given too little data to find any model in the class that would achieve good prediction error. In most of these settings, our algorithms are provably information-theoretically optimal and are also highly practical.

Bio: Weihao Kong is a Ph.D. student from the Computer Science Department at Stanford University. He received his B.S. from Shanghai Jiao Tong University in 2013. His research interests span machine learning, high-dimensional statistics and theoretical computer science.

Faculty Host: Dr. Carlos Scheidegger


Tuesday, November 15, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Alfred Z. Spector, Ph.D.

Title: "Research Challenges in Computer Science"

Abstract: The trillion-fold increase in the capability of computation over the past 60 years, when coupled with global connectivity and vast amounts of available data, makes for a vibrant field that will continue to grow and provide many research and employment opportunities for computer scientists. In this talk, I describe a plethora of research challenges, including but not limited to the grand challenge problems of Artificial Intelligence. I’ll discuss research objectives relating to scalability, usability, security, robustness, knowledge representation and inferencing, application areas such as healthcare and education, and more. I’ll discuss the breadth of challenges and the diversities of talent that will be required to meet them. I’ll conclude with some example approaches to these problems from my experience leading research teams in academia and industry.

Bio: Alfred Spector is Chief Technology Officer at Two Sigma, a firm dedicated to using information to undertake many forms of economic optimization. Dr. Spector's career has led him from innovation in large scale, networked computing systems (at Stanford, CMU, and his company, Transarc) to broad research leadership: eight years leading Google Research and five years leading IBM Software Research. Recently, Spector has lectured widely on the growing importance of computer science across all disciplines (CS+X) and on the Societal Implications of Data Science. He received an AB in Applied Mathematics from Harvard and a Ph.D. in Computer Science from Stanford. He is a Fellow of the ACM and IEEE, and a member of the National Academy of Engineering and the American Academy of Arts and Sciences. Dr. Spector won the 2001 IEEE Kanai Award for Distributed Computing and was co-awarded the 2016 ACM Software Systems Award.

Faculty Host: Dr. Tom Fleming, Astronomer and Senior Lecturer in Astronomy


Tuesday, November 06, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Hong Hu, Ph.D.

Title: "Regaining Initiative in the Eternal War in Memory"

Abstract: Memory corruption bugs, like buffer overflow or use-after-free, enable attackers to manipulate a victim's memory to do their bidding. Due to the lack of a complete and efficient bug detection system, attackers will still have exploitable memory bugs in the near future. Unfortunately, in the current competition between attackers and defenders, the latter always act passively, after new attacks and bypasses happen, leading to ineffective protection.

In this talk, I will present two of my works that help regain the initiative in the competition on memory bugs. In the first work, we develop a control-flow protection mechanism that prevents any control-flow hijacking attacks, with the purpose of ending the long competition between control-flow integrity proposals and bypasses. In the second work, we propose a new attack methodology to explore the expressiveness of data-only attacks. We show that without corrupting any control flow, attackers are able to cause severe damage, even achieving Turing-complete computing in the victim's memory. These works show that partial memory safety cannot guarantee program security, and we should spend more effort on developing efficient full memory safety mechanisms.

Bio: Dr. Hong Hu is a postdoctoral fellow in the School of Computer Science, College of Computing, at the Georgia Institute of Technology. His main research area is system security, focusing on detecting memory errors in C/C++ programs, exploring new attack vectors and developing defense mechanisms to prevent exploits. His work has appeared in top venues in system security (USENIX Security, IEEE Security and Privacy, CCS, ICECCS and ESORICS). He received the Best Paper Award from ICECCS 2014. He obtained his Ph.D. degree in computer science from the National University of Singapore in 2016, and his B.E. degree in information security from the Huazhong University of Science and Technology in 2011.

Faculty Host: Dr. Saumya Debray


Thursday, October 04, 2018  - 11:00am - Gould Simpson (GS) 906

Speaker: Daniel Fried, Ph.D. Candidate

Title: "Pragmatic Models for Generating and Following Grounded Instructions"

Abstract: A system that interacts with people in natural language predicts what it should say -- why not also predict how the person listening will react? We describe methods for generating and following natural language instructions by incorporating pragmatics: explicitly modeling people and world contexts. Our pragmatics-enabled models reason about how listeners will carry out instructions, and reason counterfactually about why speakers produced the instructions they did. We find that this reasoning procedure improves state-of-the-art listener models (at correctly following human instructions) and speaker models (at generating instructions correctly interpretable by humans) for sequential tasks in diverse settings, including navigating through real-world indoor environments.

Bio: Daniel Fried is a PhD student at UC Berkeley working on natural language processing and machine learning, with a focus on grounded semantics and structured prediction. Previously, he received a BS from the University of Arizona and an MPhil from the University of Cambridge. His work has been supported by a Churchill Scholarship, NDSEG Fellowship, Huawei / Berkeley AI Fellowship, and Tencent Fellowship.

Faculty Host: Dr. Stephen Kobourov


Tuesday, August 14, 2018  - 11:00am - Gould Simpson (GS) 701

Speaker: Ahyoung Choi, Ph.D.

Title: "Health Sensing by Wearables and Its Application to Obesity Management and Vital Sign Monitoring"

Abstract: Precision medicine is an emerging approach that focuses on prevention and treatment considering individual differences in genes and lifestyles, rather than on diagnosing disease in a uniform way by population-based statistics and science. This approach accelerates the market for health and wellness monitoring devices and the integration of their data with traditional medical records from a big-data viewpoint. For example, GearFit, Fitbit, Jawbone, G Watch Urbane and other wearable devices monitor daily exercise activity, sleep quality, and dietary habits for wellness monitoring. Portable blood pressure monitors and patch-type ECG measurement sensors collect health-related data in an unobtrusive way in daily life for health monitoring. However, such health-sensing wearable devices and methods have not yet been widely used as a substitute for digital healthcare because of low usability as well as low accuracy. This talk introduces health sensing and analysis methods using wearable devices in everyday life while ensuring usability in obesity management and vital sign monitoring.

Bio: Ahyoung Choi is currently an assistant professor in the department of software at Gachon University (Seongnam, Republic of Korea). She received her M.S. and Ph.D. in the department of information and communications at Gwangju Institute of Science and Technology (Gwangju, Republic of Korea) in 2005 and 2011, respectively. She was a visiting scholar at the Institute for Creative Technologies at the University of Southern California (CA, USA) from 2011 to 2012 and worked at Samsung Electronics (Suwon, Republic of Korea) from 2012 to 2016. Her research interests include physiological signal processing and human-computer interaction and its application to mobile healthcare systems.

Faculty Host: Dr. Stephen Kobourov