
Allen School faculty honored by the Association for Computing Machinery for advancing probabilistic robotics, internet-scale systems, and more

Association for Computing Machinery logo

Allen School professors Dieter Fox and Arvind Krishnamurthy recently attained the status of Fellows of the Association for Computing Machinery in honor of their influential contributions to robotics and computer vision and to systems and networking, respectively. Each year, the ACM bestows the designation of Fellow on a select group of members to recognize their impact through research and service based on nominations submitted by their peers in the computing community.

According to ACM President Gabriele Kotsis, in 2020 the organization received a record number of nominations from around the world aiming to highlight individuals who have made “pivotal contributions to technologies that are transforming whole industries, as well as our personal lives.” The ACM’s recognition of Krishnamurthy and Fox brings the number of Allen School faculty members elevated to Fellow to 26. Allen School affiliate professor Meredith Ringel Morris, an expert in human-computer interaction and accessibility at Microsoft Research, and former Allen School professor Steve Gribble, now a distinguished engineer focused on networked systems at Google, joined Fox and Krishnamurthy among the 95 researchers worldwide to be recognized for their achievements in the new class of Fellows.

Dieter Fox

Portrait of Dieter Fox

Dieter Fox was elevated to the status of ACM Fellow for “contributions to probabilistic state estimation, RGB-D perception, and learning for robotics and computer vision.” Fox’s work, which spans roughly 25 years, has had an influential impact across multiple areas, opened up new avenues of research, and helped establish the University of Washington as a leader in robotics and artificial intelligence research.

Fox currently serves as director of the Allen School’s Robotics and State Estimation Laboratory and senior director of robotics research at NVIDIA. He joined the UW faculty in 2000 after earning his Ph.D. from the University of Bonn in Germany and completing a postdoc at Carnegie Mellon University. Fox was among the pioneers of probabilistic robotics, which aims to address problems in perception and control to endow robots with the ability to interact with people and their environment in an intelligent way, and he co-authored a textbook of the same name.

Among Fox’s contributions is a series of novel techniques for Bayesian state estimation that have resulted in significant advancements across a variety of domains, including object manipulation, human activity recognition, and localization and mapping. For the latter — one of the fundamental problems in robotics — Fox’s early work on Markov localization in dynamic environments is considered to be a significant milestone. He and his colleagues built on that work with the introduction of Monte-Carlo Localization, a technique that broke new ground with its use of sampling-based methods that proved to be more accurate, more computationally efficient, and easier to implement than previous approaches. MCL subsequently became the preferred approach in the field, and today localization is widely regarded as a solved problem in robotics.
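
To give a flavor of the sampling-based idea behind Monte Carlo Localization, here is a minimal one-dimensional particle filter sketch. It is purely illustrative: the corridor length, motion step, and noise values are made up, and the motion and sensor models are simple placeholders rather than the models used in Fox’s work.

```python
import math
import random

NUM_PARTICLES = 500

def motion_update(particles, move, noise=0.1):
    """Propagate each particle through a noisy motion model."""
    return [p + move + random.gauss(0, noise) for p in particles]

def sensor_weight(particle, measurement, noise=0.2):
    """Likelihood of the range measurement given the particle's pose."""
    return math.exp(-((particle - measurement) ** 2) / (2 * noise ** 2))

def resample(particles, weights):
    """Draw a new particle set in proportion to the weights."""
    return random.choices(particles, weights=weights, k=len(particles))

# The robot starts somewhere in a 10 m corridor; particles encode our belief.
particles = [random.uniform(0, 10) for _ in range(NUM_PARTICLES)]

true_pose = 2.0
for step in range(10):
    true_pose += 0.5                                  # the robot moves 0.5 m
    measurement = true_pose + random.gauss(0, 0.2)    # noisy range reading
    particles = motion_update(particles, 0.5)
    weights = [sensor_weight(p, measurement) for p in particles]
    particles = resample(particles, weights)

estimate = sum(particles) / len(particles)
print(f"true pose: {true_pose:.2f} m, estimated pose: {estimate:.2f} m")
```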

In another series of firsts, Fox contributed to new systems for achieving robust 3D-mapping capabilities using RGB-D cameras and advancing robot learning by incorporating model-based information into deep learning approaches. In a paper that combined insights from robotics and computer vision, Fox and colleagues in the Allen School’s Graphics & Imaging Laboratory (GRAIL) presented DynamicFusion, the first dense simultaneous localization and mapping (SLAM) system capable of producing detailed and complete reconstructions of subjects in motion, and in real time. Fox and collaborators subsequently leveraged that capability to automate the generation of robust visual training data from RGB-D video in order to take advantage of new deep learning techniques. The resulting framework enabled robots to learn rich visual features of objects and people in their environment in a self-supervised way. Self-supervised learning has since emerged as a highly active area of robotics research.

Over the course of his career, Fox has earned multiple Best Paper and Test of Time Awards at the preeminent research conferences in artificial intelligence, robotics, computer vision, and ubiquitous computing. Prior to his election as a Fellow of the ACM, Fox attained the status of Fellow of both the Association for the Advancement of Artificial Intelligence (AAAI) and the Institute of Electrical and Electronics Engineers (IEEE). Last year, Fox earned the IEEE Robotics & Automation Society’s RAS Pioneer Award in acknowledgment of his groundbreaking contributions to state estimation, robot perception and machine learning as well as his efforts to bridge academic and industrial research through his many collaborations with Intel, NVIDIA, and other partners.

Arvind Krishnamurthy

Portrait of Arvind Krishnamurthy

Arvind Krishnamurthy was named a Fellow of the ACM for “contributions to networks and distributed computer systems.” Krishnamurthy’s research advances state-of-the-art approaches for building robust and efficient computer systems to support data center operations and internet-scale computing. His work encompasses topics such as internet measurement and reliability, peer-to-peer data sharing, data center performance, network routing, and systems for bridging the gap between machine learning and hardware innovation. 

After earning his Ph.D. at the University of California, Berkeley in 1999, Krishnamurthy began his career as a faculty member at Yale University. Since joining the Allen School faculty in 2005, Krishnamurthy has focused on advancing new network primitives, protocols and services to make the internet more reliable and resilient and to optimize distributed and networked systems inside data centers. Krishnamurthy’s work has earned multiple awards at the preeminent conferences in operating and networked systems, including five Best Paper or Best Student Paper designations at the Symposium on Networked Systems Design and Implementation (NSDI).

One of Krishnamurthy’s contributions to be singled out by NSDI was Reverse Traceroute, the first tool for diagnosing internet performance issues that expanded upon existing capabilities for tracing packets from source to destination to include their asymmetric return paths. The tool gained popularity for troubleshooting performance problems within corporate internal networks as well as on the internet. Three years later, Krishnamurthy and his collaborators presented their award-winning paper describing F10, a fault-tolerant network designed explicitly for the data-center setting to improve the reliability and performance of applications in the cloud. Consisting of a novel network topology and failover protocols designed to cascade and complement each other, F10 proved capable of reestablishing connectivity and load balance almost instantaneously, even in the presence of multiple failures.

Krishnamurthy was also a member of the team behind a novel operating system designed to address recent application and hardware trends that are hampered by the traditional operating-system model. In proposing Arrakis, which earned the Best Paper Award at the 2014 conference on Operating Systems Design and Implementation (OSDI), Krishnamurthy and his collaborators re-engineered the OS kernel to provide network and disk protection without requiring mediation of every operation. As a result, most applications are able to bypass the kernel entirely in favor of direct access to virtualized input/output (I/O) devices. By reimagining the traditional role of the OS kernel, the researchers demonstrated that high performance and protection could go hand in hand.

More recently, Krishnamurthy co-led a team of researchers that created TVM, a framework that enabled researchers and practitioners to take advantage of emerging machine learning capabilities — and their attendant productivity gains — by making it easy to deploy the latest deep learning applications on a range of devices without sacrificing battery power or speed. Krishnamurthy and colleagues subsequently earned a Best Paper from IEEE Micro for introducing an extension to TVM, the Versatile Tensor Accelerator (VTA), that offered a customizable deep learning architecture designed to be extensible in the face of evolving workloads while preserving the efficiency gains achieved by hardware specialization. Together, TVM and VTA provided the blueprint for an end-to-end deep learning system for supporting experimentation, optimization, and hardware-software co-design. The team later transitioned TVM to Apache and launched a related startup company, OctoML, for which Krishnamurthy serves as an advisor.

Meredith Ringel Morris

Portrait of Meredith Ringel-Morris

Allen School affiliate professor Meredith “Merrie” Ringel Morris is a senior principal researcher and the research area manager for Interaction, Accessibility and Mixed Reality at Microsoft Research, where she founded the Ability Research Group. An internationally renowned researcher with expertise spanning collaborative technologies, gesture interfaces, social media, crowdsourcing, accessible technologies, universal design, human-centered AI, and more, Morris was named a Fellow of the ACM for “contributions to human-computer interaction, information retrieval, computer-supported cooperative work and accessibility.”

Morris first joined Microsoft Research in 2006 after earning her Ph.D. from Stanford University. She holds more than 20 U.S. patents, and her work has influenced many of Microsoft’s products and services in her 15 years with the company. Morris has authored or co-authored more than 100 research publications, including multiple papers that have been recognized with Best Paper Awards or Lasting Impact Awards at major conferences in the field. Last year, Morris was elected to the CHI Academy by the ACM Special Interest Group on Computer-Human Interaction (SIGCHI) in recognition of her many and varied contributions to HCI research, including establishing the field of collaborative web search, initiating the study of friendsourced information seeking, advancing surface computing and gesture design, and furthering research into accessible social media and communications technologies. Previously, she was recognized as one of Technology Review’s “Innovators under 35” for her work on SearchTogether, a browser plugin that enables collaborative web search. 

Steve Gribble

Portrait of Steve Gribble

Former professor Steve Gribble, who spent 13 years on the Allen School faculty before leaving for industry, was named a Fellow of the ACM for “contributions to virtualization technology across clusters, servers, and networks.” Now a distinguished engineer at Google, Gribble designs, builds, and deploys host-side networking systems, I/O offload hardware infrastructure for Google’s cloud, and software-defined networking (SDN) systems that make the company’s planetary-scale networks available, debuggable, and safe to operate.

Gribble joined the UW faculty in 2000 after earning his Ph.D. from the University of California, Berkeley. Together with Allen School colleagues, in 2006 he co-founded Skytap, a venture-backed startup company that provides cloud-based software development, test and deployment platforms. Gribble has authored or co-authored more than 70 research publications, including seven that earned Best Paper or Best Student Paper Awards at major systems conferences. During his time at the Allen School, Gribble received an Alfred P. Sloan Research Fellowship and a National Science Foundation CAREER Award for his work on the design and operation of robust, scalable internet infrastructure and services, mobile computing, virtual machine monitors, operating systems, and networks.

The new Fellows will be formally inducted at the ACM awards banquet in June. Learn more about the 2020 class of ACM Fellows here.

Congratulations to Dieter, Arvind, Merrie and Steve!

February 17, 2021

Computing Research Association recognizes undergraduates advancing data science for mental health, commonsense reasoning, mobile health sensing, and more

Joy He-Yueya applies data science techniques to measures of patient behavior to assess how they might predict the onset of schizophrenia symptoms. Meanwhile, Ximing Lu uses machine learning to improve cancer diagnosis and explores how neural language models can advance commonsense reasoning. Parker Ruth builds mobile sensing systems to detect and monitor a variety of health conditions, while Jenny Liang develops programming techniques to support developer education and collaboration. And Emily Bascom contributes to research aimed at promoting user privacy and improving patient care.

For producing results with the potential for real-world impact, each of these University of Washington students — three nominated by the Allen School, two by the Information School — recently earned national recognition as part of the Computing Research Association’s 2021 Outstanding Undergraduate Researcher Awards. Each year, the CRA competition highlights a select group of undergraduates at universities and colleges across North America who are already advancing the field of computing and making an impact through their work. The achievements of the five outstanding student researchers honored in this year’s competition are a testament to UW’s commitment to undergraduate research; they are also proof that it’s never too early to begin using computing in the service of social good.

Joy He-Yueya (Awardee – Allen School)

Joy He-Yueya portrait

CRA award recipient Joy He-Yueya is a senior majoring in computer science in the Allen School who has engaged in a variety of research projects related to health and education. Last year, He-Yueya began working with professor Tim Althoff, leader of the Allen School’s Behavioral Data Science Group, and UW Medicine faculty Benjamin Buck and Dror Ben-Zeev on a project seeking to mine the vast quantities of data generated by passively sensed measures of behavioral stability to support mental health. Their goal was to use data science techniques to understand the relationship between patients’ routines and the onset of schizophrenia symptoms. He-Yueya, who took the lead on data preparation, analysis and visualization as well as algorithmic development for the project, was first author of the paper describing the team’s findings that recently appeared in the journal npj Schizophrenia.

“What sets Joy apart as a student researcher is her independence to lead a research project herself and to collaborate with clinical researchers to connect innovations in computing and measurement to clinical goals,” said Althoff. “She was also very impressive at handling the complexity of a project that involved significant experimentation and seeing a project through from the first ideas to writing and to publication.”

He-Yueya recently contributed to a project at the Max Planck Institute for Software Systems in Saarbrücken, Germany that applies reinforcement learning to generate personalized curricula for students learning to code. She also has been working with researchers at Seattle-based Giving Tech Labs to develop methods for identifying relationships between voice and emotions and between voice and aging. In addition to her research, He-Yueya has served as a teaching assistant for the Allen School’s Introduction to Algorithms and Data Structures and Parallelism courses and has volunteered her time to a number of tutoring and peer mentoring roles — including leading workshops to help her fellow undergraduates get their own start in research.

He-Yueya’s entrée to academic research was working with iSchool professor Katie Davis in the Digital Youth Lab, where she focused on digital incentives for students to pursue science and engineering-related education. She earned a Mary Gates Research Scholarship from UW last year for her work. He-Yueya plans to pursue a Ph.D. following her graduation from the Allen School next spring.

Ximing Lu (Runner-up – Allen School)

Ximing Lu portrait

Ximing Lu, who is majoring in computer science and statistics, was named a runner-up in the CRA competition for her work in machine learning and natural language processing. In less than three years, Lu has contributed to four major papers in submission — three of them as lead author. Her first foray into undergraduate research was working on a project with professor Linda Shapiro, who holds a joint appointment in the Allen School and Department of Electrical & Computer Engineering, that applies machine learning to improve the speed and accuracy of breast cancer diagnosis by reducing the uncertainty that stems from subjective human interpretation. The system they designed, Holistic ATtention Network (HATNet), is capable of learning representations from clinically relevant tissue structures without explicit supervision to classify gigapixel-sized biopsy images with the same level of accuracy as human pathologists.

Since last year, Lu has collaborated with Allen School professor Yejin Choi and colleagues in the Allen Institute for AI’s MOSAIC group on multiple projects seeking to advance the state of the art in natural language processing and visual commonsense reasoning. Among Lu’s contributions is TuBERT, a new multi-modal neural network capable of commonsense reasoning about the temporal relationship between visual events using a combination of YouTube video content and clues from their accompanying text narratives. Since its introduction, TuBERT has achieved state-of-the-art results on multiple commonsense reasoning tasks by outperforming substantially larger, commercially funded neural networks. Lu has also worked on Reflective Decoding, an approach for enabling pre-trained language models to complete paraphrasing and abductive text-infilling tasks without supervision, and the NeuroLogic decoding algorithm for controlling neural text generation models through logical constraints.

“Ximing is one of the most creative and innovative undergraduate students I have had the pleasure to work with,” said Choi. “She has an impressive ability to rapidly synthesize new technical ideas based on seemingly disconnected pieces. Everyone in my lab has been eager to collaborate with her.”

Last fall, Lu received the Lisa Simonyi Prize honoring an Allen School student who exemplifies excellence, leadership and diversity. She was also named a Levinson Emerging Scholar by the UW in recognition of her accomplishments in research. After graduation, Lu plans to continue her studies next fall as a student in the Allen School’s fifth-year master’s program.

Parker Ruth (Finalist – Allen School)

Parker Ruth portrait

Parker Ruth, a computer engineering and bioengineering major advised by professor Shwetak Patel in the Allen School’s UbiComp Lab, was named a finalist by CRA for his cross-disciplinary work on mobile health sensing technologies and computational tools for supporting population health. During his more than three years as an undergraduate researcher, Ruth has contributed to multiple projects aimed at enabling early identification and monitoring of symptoms and risk factors associated with a variety of  medical conditions. His efforts have included the development of non-invasive, smartphone-based tools for measuring bone density to screen for osteoporosis, tracking cardiovascular disease through real-time measurement of pulse transit time, and detecting cough to facilitate diagnosis and monitoring of respiratory illness.

Most recently, Ruth has led the design of a smartphone-based exercise sensing system that employs ultrasonic sonar to measure physical activity as part of the Exercise Rx initiative in collaboration with The Sports Institute at UW Medicine. He is also developing algorithms to screen for risk of stroke by measuring blood flow in videos of the face. In addition, Ruth has contributed to the development of a wearable pulse sensing system for detecting a rare but serious cardiovascular condition known as postural orthostatic tachycardia syndrome. In response to the current pandemic, Ruth has worked on environmental sampling and viral detection protocols for screening air filtration systems in public transit and, in collaboration with bioengineering professor Barry Lutz, built image processing software that powers a smartphone-based tool for streamlining molecular assays for the virus in order to speed up diagnosis.

“Parker is advancing how we think about digital health and how commodity devices can play a role in democratizing health care and increasing access for everyone. He has demonstrated unprecedented work ethic, creativity, rigor, and a unique ability to present his work to a general audience,” said Patel. “He already shows the maturity of a graduate student and the capacity to define broad research agendas. On top of that, he is the most humble and selfless person you will ever meet.”

Ruth previously was named a Goldwater Scholar and recognized as a member of the Husky 100. He is a student in the University’s Interdisciplinary Honors Program and Lavin Entrepreneurship Honors Program and has been active in outreach to K-12 students, including helping to oversee the UbiComp Lab’s high school mentorship program.

Jenny Liang (Honorable Mention – iSchool)

Jenny Liang portrait

Jenny Liang is well-known to the Allen School community, as she majors in both computer science and informatics. Her research spans software engineering, human-computer interaction, and applied machine learning. Liang earned an Honorable Mention from the CRA for her work with iSchool professor and Allen School adjunct professor Amy Ko in the Code & Cognition Lab.

Liang collaborated with Ko and partners at George Mason University on the development of HowTooDev, a searchable knowledge base of strategies for solving hard programming problems that has the potential to transform programmer productivity and reshape computer science education. Liang’s contributions to the project included the development and testing of multiple search-interface prototypes and a classification system for various programming activities. She combined the latter with semantic text search that leverages natural-language strategy notations to build the front and back end of a robust strategy search engine.

“Jenny is a force,” said Ko. “She is the kind of force that we don’t find often in academia — the kind that pushes the boundaries of our knowledge, and leads.”

Previously, Liang worked with postdoc Spencer Sevilla and professor Kurtis Heimerl in the Allen School’s Information and Communication Technology for Development (ICTD) Lab on the development of LTE-based community-owned networks to connect rural and developing communities to the internet. Liang co-authored a paper presenting the team’s solution, dubbed CoLTE, that appeared at the 25th International Conference on Mobile Computing and Networking (Mobicom 2019). The recipient of the 2020 Allen AI Outstanding Engineer Scholarship for Women and Underrepresented Minorities, since last summer Liang has been an intern with AI2’s MOSAIC team working on toxic language classification. She previously was honored with the Allen School’s Undergraduate Service Award and recognized as a member of the 2020 class of the Husky 100.

Emily Bascom (Honorable Mention – iSchool)

Emily Bascom portrait

Informatics major Emily Bascom earned an Honorable Mention from CRA for her work on user privacy and information technology tools for improving patient outcomes. Bascom, who is pursuing a concentration in human-computer interaction, spent two years working with iSchool professor Alexis Hiniker in the User Empowerment Lab. There, she focused on a project examining the privacy risks associated with ubiquitous audio recording capabilities of smartphones, smart speakers, and other devices. Her contributions included helping to design the study protocol, leading design workshops with study participants, and analyzing data generated by design sessions and interviews. 

“It was apparent throughout the project that Emily is a very talented scholar with an exciting career ahead of her,” said Hiniker, who is also an adjunct professor in the Allen School. “Her design insights and intellectual contributions far exceeded my expectations, and I can’t wait to see her translate those kinds of contributions into social change in the future.”

Bascom also collaborated with iSchool professor Wanda Pratt in the iMed research group on a project to understand how best to support patients and caregivers in acting as safeguards for hospital care — including improving communication between providers and patients to reduce medical errors. The researchers developed a tool, NURI, that enables patients and caregivers to record audio and semi-automatically transcribe their interactions with physicians and help them to understand the information they were given during those interactions. Bascom’s contributions included qualitative analysis of the user studies and preparation of the manuscript detailing the team’s findings and related work. She subsequently contributed to all aspects of a project led by Dr. Ari Pollack of UW Medicine and Seattle Children’s Hospital to develop tools to support pediatric kidney transplant patients, including protocol development, qualitative data analysis, and manuscript preparation.

Read more about Liang’s and Bascom’s work in a related iSchool story here.

Congratulations to these five exceptional researchers on their achievements!

January 15, 2021

“You are not a drop in the ocean, you are the entire ocean in a drop”: New Allen School scholarship will turn a family’s loss into students’ dreams fulfilled

Leo Maddox Schneider smiling under a multi-colored canopy against a vivid blue sky
Leo Maddox Schneider: July 7, 2005 – January 12, 2019

As a student at Seattle’s Hamilton International Middle School, Leo Maddox Schneider demonstrated early mastery of mathematics and languages, was an avid gamer and athlete, and carved out a reputation as a budding conservationist. Enthusiastic about learning from an early age, Leo had already taken to heart his mother Sylvia Bolton’s advice to find something that he loved and was passionate about and to make that his profession. As she relayed to her son at the time, “it will bring fulfillment and a lot of happiness.”

What Leo loved was computer coding and Lego design; what he was passionate about were environmental causes. He might have pursued both at the University of Washington if not for the injuries he sustained in an automobile accident. Four and a half months later, on January 12, 2019, Leo passed away from those injuries and related complications. Nearly two years after that tragic loss, the foundation established by Leo’s mother to honor her son’s memory will give Allen School students the opportunity to fulfill their own dreams and carry on his legacy through the Leo Maddox Foundation Scholarship in Computer Science & Engineering.

“Leo loved computer science,” Bolton explained. “He and his friend Lennox shared a dream of attending a university that excelled in computer science so they could build their own company and make a difference in the world.”

Even at the tender age of 13, Leo was already well on his way toward making that difference. He forged enduring friendships with Lennox and Judson while playing Minecraft and Fortnite, which helped spark his interest in coding. He was already three years ahead of his grade level in mathematics and conversant in both Spanish and Bulgarian. His enthusiasm for the outdoors led Leo to champion environmental causes; he once convinced his mother to enter into one of their “non-negotiable” agreements permitting him to collect garbage for recycling. (Another of their non-negotiable agreements stipulated that he would eat his vegetables at dinner.) Leo was particularly passionate about the ocean, learning to swim with dolphins and developing a love of boat building craftsmanship inspired in part by his mother’s work as a luxury yacht designer.

“Everyone knew Leo as having a big, sweet soul and people just loved him. Losing him turned our world upside down into complete darkness,” recalled Bolton. “But we do not want the tragedy of Leo’s passing to define him. Leo was and will always be remembered as the smart, kind and compassionate kid who was gifted at math and science, loved the outdoors, and was a friend to many. With so much life ahead of him.”

To that end, Bolton established the Leo Maddox Foundation as a way to ensure that Leo’s legacy and aspirations for the future would live on in others. The Foundation supports a variety of initiatives designed to help promising young students with financial need to fully achieve their academic and creative potential, from assisting Rainier Scholars to go to college, to “Love, Leo” genius grants inspired by their namesake’s creative, can-do approach to solving problems he saw in the world. The new Leo Maddox Foundation Scholarship in Computer Science & Engineering will support Allen School undergraduate students in covering the cost of tuition and other educational expenses based on academic merit and financial need.

UW Huskies football player makes the "Dubs up!" sign with his fingers, with his arm around Leo Maddox Schneider

“We are heartbroken that Leo will never get the chance to apply to the Allen School and our hearts and prayers are with his family. We are deeply appreciative of the scholarship established by the Foundation in his name,” said professor Magdalena Balazinska, director of the Allen School. “This scholarship will touch many lives. It will promote the success of many talented students who need support to fulfill their dreams.”

In deference to her son’s twin loves, in addition to the Allen School scholarship Bolton also created the Leo Maddox Foundation Scholarship in Oceanography to support students in the School of Oceanography engaged in climate-related studies. The university’s preeminence in both disciplines and focus on student support convinced the Foundation to entrust it with Leo’s memory.

“As important as it is for the Leo Maddox Foundation to support young adults, it is equally important that we do so with the leaders in both fields,” said Vivian Ho, creator of the Leo Maddox Foundation. “In conducting our due diligence, it was clear that the University of Washington had a lot to offer in both areas of study and in shaping support for student scholarships. They created the perfect vehicles for our founder, Sylvia Bolton, to make the impactful difference she was seeking for Leo’s legacy.”

Learn more about the Leo Maddox Foundation Scholarships here and Leo’s life and legacy here.

December 8, 2020

Uncovering secrets of the “black box”: Pedro Domingos, author of “The Master Algorithm,” shares new work examining the inner workings of deep learning models

Pedro Domingos leaning on a stairwell railing
Pedro Domingos in the Paul G. Allen Center for Computer Science & Engineering. In his latest work, Domingos lifts the lid on the black box of deep learning. Dennis Wise/University of Washington

Deep learning has been immensely successful in recent years, spawning a lot of hope and generating a lot of hype, but no one has really understood why it works. The prevailing wisdom has been that deep learning is capable of discovering new representations of the data, rather than relying on hand-coded features like other learning algorithms do. But because deep networks are black boxes — what Allen School professor emeritus Pedro Domingos describes as “an opaque mess of connections and weights” — how that discovery actually happens is anyone’s guess.

Until now, that is. In a new paper posted on the preprint repository arXiv, Domingos gives us a peek inside that black box and reveals what is — and just as importantly, what isn’t — going on inside. Read on for a Q&A with Domingos on his latest findings, what they mean for our understanding of how deep learning actually works, and the implications for researchers’ quest for a “master algorithm” to unify all of machine learning.

You lifted the lid off the so-called black box of deep networks, and what did you find?

Pedro Domingos: In short, I found that deep networks are not as unintelligible as we thought, but neither are they as revolutionary as we thought. Deep networks are learned by the backpropagation algorithm, an efficient implementation for neural networks of the general gradient descent algorithm that repeatedly tweaks the network’s weights to make its output for each training input better match the true output. That process helps the model learn to label an image of a dog as a dog, and not as a cat or as a chair, for instance. This paper shows that all gradient descent does is memorize the training examples, and then make predictions about new examples by comparing them with the training ones. This is actually a very old and simple type of learning, called similarity-based learning, that goes back to the 1950s. It was a bit of a shock to discover that, more than half a century later, that’s all that is going on in deep learning!
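
As a rough illustration of the “repeated tweaks” Domingos describes, here is a minimal gradient descent loop on a one-parameter model with squared error. The training data and learning rate are invented for illustration; this is not the deep networks analyzed in the paper, only the basic update rule they share.

```python
# Gradient descent on a one-parameter model y = w * x with squared error.
training_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, true output)

w = 0.0     # initial weight
lr = 0.05   # learning rate

for epoch in range(200):
    for x, y_true in training_data:
        y_pred = w * x
        grad = 2 * (y_pred - y_true) * x   # d/dw of (w * x - y_true)**2
        w -= lr * grad                     # tweak the weight downhill

print(f"learned weight: {w:.3f}")  # close to 2, the slope of the toy data
```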

Deep learning has been the subject of a lot of hype. How do you think your colleagues will respond to these findings?

PD: Critics of deep learning, of which there are many, may see these results as showing that deep learning has been greatly oversold. After all, what it does is, at heart, not very different from what 50-year-old algorithms do — and that’s hardly a recipe for solving AI! The whole idea that deep learning discovers new representations of the data, rather than relying on hand-coded features like previous methods, now looks somewhat questionable — even though it has been deep learning’s main selling point.

Conversely, some researchers and fans of deep learning may be reluctant to accept this result, or at least some of its consequences, because it goes against some of their deepest beliefs (no pun intended). But a theorem is a theorem. In any case, my goal was not to criticize deep learning, which I’ve been working in since before it became popular, but to understand it better. I think that, ultimately, this greater understanding will be very beneficial for both research and applications in this area. So my hope is that deep learning fans will embrace these results.

So it’s a good news/bad news scenario for the field? 

PD: That’s right. In “The Master Algorithm,” I explain that when a new technology is as pervasive and game-changing as machine learning has become, it’s not wise to let it remain a black box. Whether you’re a consumer influenced by recommendation algorithms on Amazon, or a computer scientist building the latest machine learning model, you can’t control what you don’t understand. Knowing how deep networks learn gives us that greater measure of control.

So, the good news is that it is now going to be much easier for us to understand what a deep network is doing. Among other things, the fact that deep networks are just similarity-based algorithms finally helps to explain their brittleness, whereby changing an example just slightly can cause the network to make absurd predictions. Up until now, it has puzzled us why a minor tweak would, for example, lead a deep network to suddenly start labeling a car as an ostrich. If you’re training a model for a self-driving car, you probably don’t want to hit either, but for multiple reasons — not least, the predictability of what an oncoming car might do compared to an oncoming ostrich — I would like the vehicle I’m riding in to be able to tell the difference. 

But these findings could be considered bad news in the sense that it’s clear there is not much representation learning going on inside these networks, and certainly not as much as we hoped or even assumed. How to do that remains a largely unsolved problem for our field.

If they are essentially doing 1950s-style learning, why would we continue to use deep networks?

PD: Compared to previous similarity-based algorithms such as kernel machines, which were the dominant approach prior to the emergence of deep learning, deep networks have a number of important advantages. 

One is that they allow incorporating bits of knowledge of the target function into the similarity measure — the kernel — via the network architecture. This is advantageous because the more knowledge you incorporate, the faster and better you can learn. This is a consequence of what we call the “no free lunch” theorem in machine learning: if you have no a priori knowledge, you can’t learn anything from data besides memorizing it. For example, convolutional neural networks, which launched the deep learning revolution by achieving unprecedented accuracy on image recognition problems, differ from “plain vanilla” neural networks in that they incorporate the knowledge that objects are the same no matter where in the image they appear. This is how humans learn, by building on the knowledge they already have. If you know how to read, then you can learn about science much faster by reading textbooks than by rediscovering physics and biology from scratch.
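
To make the convolutional example concrete, the sketch below (using PyTorch, with layer sizes chosen arbitrarily for illustration) contrasts a convolutional layer, whose weights are shared across every image position, with a fully connected layer that learns a separate weight for each position. The parameter counts show how much prior knowledge the architecture encodes.

```python
import torch.nn as nn

# A 3x3 convolution applied to a 1-channel 28x28 image produces an
# 8-channel 26x26 output; the dense layer below is sized to match.
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)
dense = nn.Linear(in_features=28 * 28, out_features=8 * 26 * 26)

def count_params(layer):
    return sum(p.numel() for p in layer.parameters())

print(f"conv parameters:  {count_params(conv):,}")    # 80
print(f"dense parameters: {count_params(dense):,}")   # about 4.2 million
```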

Another advantage to deep networks is that they can bring distant examples together into the same region, which makes learning more complex functions easier. And through superposition, they’re much more efficient at storing and matching examples than other similarity-based approaches.

Can you describe superposition for those of us who are not machine learning experts?

PD: Yes, but we’ll have to do some math. The weights produced by backpropagation contain a superposition of the training examples. That is, the examples are mapped into the space of variations of the function being learned and then added up. As a simple analogy, if you want to compute 3 x 5 + 3 x 7 + 3 x 9, it would be more efficient to instead compute 3 x (5 + 7 + 9) = 3 x 21. The 5, 7 and 9 are now “superposed” in the 21, but the result is still the same as if you separately multiplied each by 3 and then added the results.
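
Written out as a quick check, the analogy looks like this:

```python
# The distributive-law analogy from the answer, written out explicitly.
separate = 3 * 5 + 3 * 7 + 3 * 9    # handle each example, then add
superposed = 3 * (5 + 7 + 9)        # "superpose" 5, 7 and 9 into 21 first
assert separate == superposed == 63
print(separate, superposed)         # 63 63
```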

The practical result is that deep networks are able to speed up learning and inference, making them more efficient, while reducing the amount of computer memory needed to store the examples. For instance, if you have a million images, each with a million pixels, you would need on the order of terabytes to store them. But with superposition, you only need an amount of storage on the order of the number of weights in the network, which is typically much smaller. And then, if you want to predict what a new image contains, such as a cat, you need to cycle through all of those training images and compare them with the new one. That can take a long time. With superposition, you just have to pass the image through the network once. That takes much less time to execute. It’s the same with answering questions based on text; without superposition, you’d have to store and look through the corpus, instead of a compact summary of it.

So your findings will help to improve deep learning models?

PD: That’s the idea. Now that we understand what is happening when the aforementioned car suddenly becomes an ostrich, we should be able to account for that brittleness in the models. If we think of a learned model as a piece of cheese and the failure regions as holes in that cheese, we now understand better where those holes are, and what their shape and size is. Using this knowledge, we can actively figure out where we need new data or adjustments to the model to fix the holes. We should also improve our ability to defend against attacks that cause deep networks to misclassify images by tweaking some pixels such that they cause the network to fall into one of those holes. An example would be attempts to fool self-driving cars into misrecognizing traffic signs.

What are the implications of your latest results in the search for the master algorithm?

PD: These findings represent a big step forward in unifying the five major machine learning paradigms I described in my book, which is our best hope for arriving at that universal learner, what I call the “master algorithm.” We now know that all learning algorithms based on gradient descent — including but not limited to deep networks — are similarity-based learners. This fact serves to unify three of the five paradigms: neural, probabilistic, and similarity-based learning. Tantalizingly, it may also be extensible to the remaining two, symbolic and genetic learning.

Given your findings, what’s next for deep learning? Where does the field go from here?

PD: I think deep learning researchers have become too reliant on backpropagation as the near-universal learning algorithm. Now that we know how limited backprop is in terms of the representations it can discover, we need to look for better learning algorithms! I’ve done some work in this direction, using combinatorial optimization to learn deep networks. We can also take inspiration from other fields, such as neuroscience, psychology, and evolutionary biology. Or, if we decide that representation learning is not so important after all — which would be a 180-degree change — we can look for other algorithms that can form superpositions of the examples and that are compact and generalize well.

Now that we know better, we can do better.

Read the paper on arXiv here.

December 2, 2020

Allen School’s Pedro Domingos and Daniel Weld elected Fellows of the American Association for the Advancement of Science

The American Association for the Advancement of Science, the world’s largest general scientific society, has named Allen School professor emeritus Pedro Domingos and professor Daniel Weld among its class of 2020 AAAS Fellows honoring members whose scientifically or socially distinguished efforts have advanced science or its applications. Both Domingos and Weld were elected Fellows in the organization’s Information, Computing, and Communication section for their significant impact in artificial intelligence and machine learning research.

Pedro Domingos

Pedro Domingos portrait

Domingos was honored by the AAAS for wide-ranging contributions in AI spanning more than two decades and 200 technical publications aimed at making it easier for machines to discover new knowledge, learn from experience, and extract meaning from data with little or no help from people. Prominent among these, in the eyes of his AAAS peers, was his introduction of Markov logic networks unifying logical and probabilistic reasoning. He and collaborator Matthew Richardson (Ph.D., ‘04) were, in fact, the first to coin the term Markov logic networks (MLN) when they presented their simple yet efficient approach that combined first-order logic and probabilistic graphical models to support inference and learning.

Domingos’ work has resulted in several other firsts that represented significant leaps forward for the field. He again applied Markov logic to good effect to produce the first unsupervised approach to semantic parsing — a key method by which machines extract knowledge from text and speech and a foundation of machine learning and natural language processing — in collaboration with then-student Hoifung Poon (Ph.D., ‘11). Later, Domingos worked with graduate student Austin Webb (M.S., ‘13) on Tractable Markov Logic (TML), the first non-trivially tractable first-order probabilistic language that suggested efficient first-order probabilistic inference could be feasible on a larger scale.

Domingos also helped launch a new branch of AI research focused on adversarial learning through his work with a team of students on the first algorithm to automate the process of adversarial classification, which enabled data mining systems to adapt in the face of evolving adversarial attacks in a rapid and cost-effective way. Among his other contributions was the Very Fast Decision Tree learner (VFDT) for mining high-speed data streams, which retained its status as the fastest such tool available for 15 years after Domingos and Geoff Hulten (Ph.D., ‘05) first introduced it.

In line with the AAAS’ mission to engage the public in science, in 2015 Domingos published The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Geared to the expert and layperson alike, the book offers a comprehensive exploration of how machine learning technologies influence nearly every aspect of people’s lives — from what ads and social posts they see online, to what route their navigation system dictates for their commute, to what movie a streaming service suggests they should watch next. It also serves as a primer on the various schools of thought, or “tribes,” in the machine learning field that are on a quest to find the master algorithm capable of deriving all the world’s knowledge from data.

Prior to this latest honor, Domingos was elected a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) and earned two of the highest accolades in data science and AI: the SIGKDD Innovation Award from the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining, and the IJCAI John McCarthy Award from the International Joint Conference on Artificial Intelligence.

Daniel S. Weld

Daniel Weld portrait

AAAS recognized Weld for distinguished contributions in automated planning, software agents, crowdsourcing, and internet information extraction during a research career that spans more than 30 years. As leader of the UW’s Lab for Human-AI Interaction, Weld seeks to combine human and machine intelligence to accomplish more than either could on their own. To that end, he and his team focus on explainable machine learning, intelligible and trustworthy AI, and human-AI team architectures to enable people to better understand and control AI-driven tools, assistants, and systems.

Weld has focused much of his career on advanced intelligent user interfaces for enabling more seamless human-machine interaction. Prominent among these is SUPPLE, a system he developed with Krzysztof Gajos (Ph.D., ‘08) that dynamically and optimally renders user interfaces based on device characteristics and usage patterns while minimizing user effort. Recognizing the potential for that work to improve the accessibility of online tools for people with disabilities, the duo subsequently teamed up with UW Information School professor and Allen School adjunct professor Jacob Wobbrock to extend SUPPLE’s customization to account for a user’s physical capabilities as well.

Another barrier that Weld has sought to overcome is the amount of human effort required to organize and maintain the very large datasets that power AI applications. To expedite the process, researchers turned to crowdsourcing, but the sheer size and ever-changing nature of the datasets still made it labor-intensive. Weld, along with Jonathan Bragg (Ph.D., ‘18) and affiliate faculty member Mausam (Ph.D., ‘07), created Deluge to optimize the process of multi-label classification that significantly reduced the amount of labor required compared to the previous state of the art without sacrificing quality. Quality control is a major theme of Weld’s work in this area, which has yielded new tools such as Sprout for improving task design, MicroTalk and Cicero for augmenting decision-making, and Gated Instruction for more accurate relation extraction.

In addition to his technical contributions, AAAS also cited Weld’s impact via the commercialization of new AI technologies. During his tenure on the UW faculty, he co-founded multiple venture-backed companies based on his research: Netbot Inc., creator of the first online comparison shopping engine that was acquired by Excite; AdRelevance, an early provider of tools for monitoring online advertising data that was acquired by Nielsen Netratings; and Nimble Technology, a provider of business intelligence software that was acquired by Actuate. Weld has since gone from founder to funder as a venture partner and member of the Technology Advisory Board at Madrona Venture Group.

Weld, who holds the Thomas J. Cable/WRF Professorship, presently splits his time between the Allen School, Madrona, and the Allen Institute for Artificial Intelligence (AI2), where he directs the Semantic Scholar research group focused on the development of AI-powered research tools to help scientists overcome information overload and extract useful knowledge from the vast and ever-growing trove of scholarly literature. Prior to this latest recognition by AAAS, Weld was elected a Fellow of both the AAAI and the ACM. He is the author of roughly 200 technical papers and two books on AI on the theories of comparative analysis and planning-based information agents, respectively.

Domingos and Weld are among four UW faculty members elected as AAAS Fellows this year. They are joined by Eberhard Fetz, a professor in the Department of Physiology & Biophysics and DXARTS who was honored in the Neuroscience section for his contributions to understanding the role of the cerebral cortex in controlling ocular and forelimb movements as well as motor circuit plasticity, and Daniel Raftery, a professor in UW Medicine’s Department of Anesthesiology and Pain Medicine who was honored in the Chemistry section for his contributions in the fields of metabolomics and nuclear magnetic resonance, including advanced analytical methods for biomarker discovery and cancer diagnosis.

Read the AAAS announcement here and the UW News story here.

Congratulations to Pedro, Dan, and all of the honorees!

November 24, 2020

Porcupine molecular tagging scheme offers a sharp contrast to conventional inventory control systems

MISL logo

Many people have had the experience of being poked in the back by those annoying plastic tags while trying on clothes in a store. That is just one example of radio frequency identification (RFID) technology, which has become a mainstay not just in retail but also in manufacturing, logistics, transportation, health care, and more. And who wouldn’t recognize the series of black and white lines comprising that old grocery-store standby, the scannable barcode? That invention — which originally dates back to the 1950s — eventually gave rise to the QR code, whose pixel patterns serve as a bridge between physical and digital content in the smartphone era.

Despite their near ubiquity, these object tagging systems have their shortcomings: they may be too large or inflexible for certain applications, they are easily damaged or removed, and they may be impractical to apply in high quantities. But recent advancements in DNA-based data storage and computation offer new possibilities for creating a tagging system that is smaller and lighter than conventional methods.

That’s the point of Porcupine, a new molecular tagging system introduced by University of Washington and Microsoft researchers that can be programmed and read within seconds using a portable nanopore device. In a new paper published in Nature Communications, the team in the Molecular Information Systems Laboratory (MISL) describes how dehydrated strands of synthetic DNA can take the place of bulky plastic or printed barcodes. Building on recent developments in nanopore-based DNA sequencing technologies and raw signal processing tools, the team’s inexpensive and user-friendly design eschews the need for access to specialized labs and equipment.

“Molecular tagging is not a new idea, but existing methods are still complicated and require access to a lab, which rules out many real-world scenarios,” said lead author Kathryn Doroschak, a Ph.D. student in the Allen School. “We designed the first portable, end-to-end molecular tagging system that enables rapid, on-demand encoding and decoding at scale, and which is more accessible than existing molecular tagging methods.”

Diagram of steps in Porcupine tagging system

Instead of radio waves or printed lines, the Porcupine tagging scheme relies on a set of distinct DNA strands called molbits — short for molecular bits — that incorporate highly separable nanopore signals to ease later readout. Each individual molbit comprises one of 96 unique barcode sequences combined with a longer DNA fragment selected from a set of predetermined sequence lengths. Under the Porcupine system, the binary 0s and 1s of a digital tag are signified by the presence or absence of each of the 96 molbits.
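
As a rough illustration of that presence/absence encoding, the sketch below maps a short text tag onto a 96-entry bit vector and then to the set of barcodes to include in the tag mixture. The helper functions and the character-to-bit mapping are hypothetical and chosen only for illustration; this is not the MISL encoding pipeline.

```python
NUM_MOLBITS = 96

def tag_to_bits(tag):
    """Turn a short ASCII tag into a 96-entry presence/absence vector."""
    bits = []
    for ch in tag:
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    bits = bits[:NUM_MOLBITS]
    return bits + [0] * (NUM_MOLBITS - len(bits))   # pad with absent molbits

def bits_to_molbits(bits):
    """A set bit means 'include barcode i in the tag mixture'."""
    return {i for i, b in enumerate(bits) if b == 1}

included = bits_to_molbits(tag_to_bits("MISL"))
print(f"{len(included)} of {NUM_MOLBITS} molbits present:", sorted(included))
```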

“We wanted to prove the concept while achieving a high rate of accuracy, hence the initial 96 barcodes, but we intentionally designed our system to be modular and extensible,” explained MISL co-director Karin Strauss, senior principal research manager at Microsoft Research and affiliate professor in the Allen School. “With these initial barcodes, Porcupine can produce roughly 4.2 billion unique tags using basic laboratory equipment without compromising reliability upon readout.”

Although DNA is notoriously expensive to read and write, Porcupine gets around this by presynthesizing the fragments of DNA. In addition to lowering the cost, this approach has the added advantage of enabling users to arbitrarily mix existing strands to quickly and easily create new tags. The molbits are prepared for readout during initial tag assembly and then dehydrated to extend shelf life of the tags. This approach protects against contamination from other DNA present in the environment while simultaneously reducing readout time later.

Another advantage of the Porcupine system is that molbits are extremely tiny, measuring only a few hundred nanometers in length. In practical terms, this means each molecular tag is small enough to fit over a billion copies within one square millimeter of an object’s surface. This makes them ideal for keeping tabs on small items or flexible surfaces that aren’t suited to conventional tagging methods. Invisible to the naked eye, the nanoscale form factor also adds another layer of security compared to conventional tags.

The Porcupine team: (top, from left) Kathryn Doroschak, Karen Zhang, Melissa Queen, Aishwarya Mandyam; (bottom, from left) Karin Strauss, Luis Ceze, Jeff Nivala

“Unlike existing inventory control methods, DNA tags can’t be detected by sight or touch. Practically speaking, this means they are difficult to tamper with,” explained co-author Jeff Nivala, a research scientist at the Allen School. “This makes them ideal for tracking high-value items and separating legitimate goods from forgeries. A system like Porcupine could also be used to track important documents. For example, you could envision molecular tagging being used to track voters’ ballots and prevent tampering in future elections.”

To read the data in a Porcupine tag, a user rehydrates the tag and runs it through a portable Oxford Nanopore Technologies’ MinION device. To demonstrate, the researchers encoded and then decoded their lab acronym, “MISL,” reliably and within a few seconds using the Porcupine system. As advancements in nanopore technologies make them increasingly affordable, the team believes molecular tagging could become an increasingly attractive option in a variety of real-world settings.

“Porcupine is one more exciting example of a hybrid molecular-electronic system, combining molecular engineering, new sensing technology and machine learning to enable new applications,” said Allen School professor and MISL co-director Luis Ceze.

In addition to Ceze, Doroschak, Nivala and Strauss, contributors to the project include Allen School undergraduate Karen Zhang, master’s student Aishwarya Mandyam, and Ph.D. student Melissa Queen. This research was funded in part by the Defense Advanced Research Projects Agency (DARPA) under its Molecular Informatics Program and gifts from Microsoft.

Read the paper in Nature Communications here.

November 3, 2020

Allen School professor Yin Tat Lee earns Packard Fellowship to advance the fundamentals of modern computing

Yin Tat Lee, a professor in the Allen School’s Theory of Computation group and visiting researcher at Microsoft Research, has earned a Packard Fellowship for Science and Engineering for his work on faster optimization algorithms that are fundamental to the theory and practice of computing and many other fields, from mathematics and statistics, to economics and operations research. Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows. 

“In a year when we are confronted by the devastating impacts of a global pandemic, racial injustice, and climate change, these 20 scientists and engineers offer us a ray of hope for the future,” Frances Arnold, Packard Fellowships Advisory Panel Chair and 2018 Nobel Laureate in Chemistry, said in a press release. “Through their research, creativity, and mentorship to their students and labs, these young leaders will help equip us all to better understand and address the problems we face.”

Lee’s creative approach to addressing fundamental problems in computer science became apparent during his time as a Ph.D. student at MIT, where he earned the George M. Sprowls Award for outstanding doctoral thesis for advancing state-of-the-art solutions to important problems in linear programming, convex programming, and maximum flow. Lee’s philosophy toward research hinges on a departure from the conventional approach taken by many theory researchers, who tend to view problems in continuous optimization and in combinatorial, or discrete, optimization in isolation. Among his earliest successes was a new interior point method for solving general linear programs that produced the first significant improvement in the running time of linear programming in more than two decades — a development that earned him and his collaborators both the Best Student Paper Award and a Best Paper Award at the IEEE Symposium on Foundations of Computer Science (FOCS 2014). Around that same time, Lee also contributed to a new approximate solution to the maximum flow problem in near-linear time, for which he and the team were recognized with a Best Paper Award at the ACM-SIAM Symposium on Discrete Algorithms (SODA 2014). The following year, Lee and his colleagues once again received a Best Paper Award at FOCS, this time for unveiling a faster cutting plane method for solving convex optimization problems in near-cubic time.

Since his arrival at the University of Washington in 2017, Lee has continued to show his eagerness to apply techniques from one area of theoretical computer science to another in unexpected ways — often to great effect. 

“Even at this early stage in his career, Yin Tat is regarded as a revolutionary figure in convex optimization and its applications in combinatorial optimization and machine learning,” observed his Allen School colleague James Lee. “He often picks up new technical tools as if they were second nature and then applies them in remarkable and unexpected ways. But it’s at least as surprising when he uses standard tools and still manages to break new ground on long-standing open problems!”

One of those problems involved the question of how to optimize non-smooth convex functions in distributed networks to enable the efficient deployment of machine learning applications that rely on massive datasets. Researchers had already made progress in optimizing the trade-offs between computation and communication time for smooth and strongly convex functions in such networks; Lee and his collaborators were the first to extend a similar theoretical analysis to non-smooth convex functions. The outcome was a pair of new algorithms capable of achieving optimal convergence rates for this more challenging class of functions — and yet another Best Paper Award for Lee, this time from the flagship venue for developments in machine learning research, the Conference on Neural Information Processing Systems (NeurIPS 2018).
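For readers unfamiliar with the distinction, a non-smooth convex function, such as the L1 norm, has points where no gradient exists, so algorithms fall back on subgradients and typically converge more slowly than in the smooth case. The sketch below illustrates that problem class with a basic subgradient method; it is a generic illustration only, not the distributed algorithms with optimal convergence rates described above.

```python
import numpy as np

# Minimal sketch of the subgradient method on the non-smooth convex objective
# f(x) = ||x||_1. Illustrative only; not the distributed algorithms with
# optimal computation/communication trade-offs referenced in the article.

def subgradient_l1(x: np.ndarray) -> np.ndarray:
    """A valid subgradient of the (non-differentiable) L1 norm."""
    return np.sign(x)

def subgradient_descent(x0: np.ndarray, steps: int = 1000) -> np.ndarray:
    x, best = x0.copy(), x0.copy()
    for t in range(1, steps + 1):
        x = x - (1.0 / np.sqrt(t)) * subgradient_l1(x)  # diminishing step size
        if np.abs(x).sum() < np.abs(best).sum():
            best = x.copy()  # track the best iterate, as is standard here
    return best

print(subgradient_descent(np.array([3.0, -2.0, 5.0])))  # converges toward the minimizer 0
```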

Since then, Lee’s contributions have included the first algorithm capable of solving dense bipartite matching in nearly linear time, and a new framework that, for the first time, solves linear programs as fast as linear systems. The latter work incorporates new techniques that are extensible to a broader class of convex optimization problems.

Having earned a reputation as a prolific researcher — he once set a record for the total number of papers from the same author accepted at one of the top theory conferences, the ACM Symposium on Theory of Computing (STOC), in one year — Lee also has received numerous accolades for the quality and impact of his work. These include a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, a National Science Foundation CAREER Award, and the A.W. Tucker Prize from the Mathematical Optimization Society.

“Convex optimization is the workhorse that powers much of modern machine learning, and therefore, modern computing. Yin Tat is not only a pivotal figure in the theory that underpins our field, but also one of the brightest young stars in all of computer science,” said Magdalena Balazinska, professor and director of the Allen School. “Combined with his boundless curiosity and passion for collaboration, Yin Tat’s depth of knowledge and technical skill hold the promise for many future breakthroughs. We are extremely proud to have him as a member of the Allen School faculty.”

Lee is the fifth Allen School faculty member to be recognized by the Packard Foundation. As one of the largest nongovernmental fellowships in the country supporting science and engineering research, the Packard Fellowship provides $875,000 over five years to each recipient to grant them the freedom and flexibility to pursue big ideas.

Read the Packard Foundation announcement here.

Congratulations, Yin Tat!

October 15, 2020

Allen School, UCLA and NTT Research cryptographers solve decades-old problem by proving the security of indistinguishability obfuscation

Allen School professor Rachel Lin helped solve a decades-old problem of how to prove the security of indistinguishability obfuscation (iO)

Over the past 20 years, indistinguishability obfuscation (iO) has emerged as a potentially powerful cryptographic method for securing computer programs by making them unintelligible to would-be hackers while retaining their functionality. While the mathematical foundation of this approach was formalized back in 2001 and has spawned more than 100 papers on the subject, much of the follow-up work relied upon new hardness assumptions specific to each application — assumptions that, in some cases, have been broken through subsequent cryptanalysis. Since then, researchers have been stymied in their efforts to achieve provable security guarantees for iO from well-studied hardness assumptions, leaving the concept of iO security on shaky ground.

That is, until now. In a new paper recently posted on public archives, a team that includes University of California, Los Angeles graduate student and NTT Research intern Aayush Jain; Allen School professor Huijia (Rachel) Lin; and professor Amit Sahai, director of the Center for Encrypted Functionalities at UCLA, has produced a theoretical breakthrough that, as the authors describe it, finally puts iO on terra firma. In their paper “Indistinguishability Obfuscation from Well-Founded Assumptions,” the authors show, for the first time, that provably secure iO can be constructed from the subexponential hardness of four well-founded assumptions, all of which have a long history of study well-rooted in complexity, coding, and number theory: Symmetric External Diffie-Hellman (SXDH) on pairing groups, Learning with Errors (LWE), Learning Parity with Noise (LPN) over large fields, and a Boolean Pseudo-Random Generator (PRG) that is very simple to compute.

Previous work on this topic has established that, to achieve iO, it is sufficient to assume LWE, SXDH, PRG in NC0 — a very simple model of computation in which every output bit depends on a constant number of input bits — and one other object. That object, in this case, is a structured-seed PRG (sPRG) with polynomial stretch and special efficiency properties, the seed of which consists of both a public and a private part. The sPRG is designed to maintain its pseudo-randomness even when an adversary can see the public seed as well as the output of the sPRG. One of the key contributions from the team’s paper is a new and simple way to leverage LPN over fields and PRG in NC0 to build an sPRG for this purpose.
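As an informal illustration of what a “PRG in NC0” looks like, the toy generator below computes every output bit from a fixed, constant number of seed bits. The locality, predicate, and stretch are arbitrary choices for exposition, carry no security guarantee, and should not be mistaken for the structured-seed PRG or any other component of the paper’s construction.

```python
import random

# Toy "local" pseudo-random generator: every output bit depends on only a
# constant number of seed bits (the defining property of a PRG in NC0).
# The predicate, locality, and stretch below are arbitrary illustrative
# choices and are NOT cryptographically secure.

LOCALITY = 5  # each output bit reads exactly 5 seed bits

def local_prg(seed: list[int], output_len: int, wiring_seed: int = 0) -> list[int]:
    picker = random.Random(wiring_seed)  # fixed, public wiring of inputs to outputs
    out = []
    for _ in range(output_len):
        i1, i2, i3, i4, i5 = picker.sample(range(len(seed)), LOCALITY)
        # XOR of three bits plus one AND term (a Goldreich-style predicate shape)
        out.append(seed[i1] ^ seed[i2] ^ seed[i3] ^ (seed[i4] & seed[i5]))
    return out

seed = [random.getrandbits(1) for _ in range(128)]
print(local_prg(seed, 256))  # polynomial stretch: 128 seed bits -> 256 output bits
```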

Co-authors Aayush Jain (left) and Amit Sahai

“I am excited that iO can now be based on well-founded assumptions,” said Lin. “This work was the result of an amazing collaboration with Aayush Jain and Amit Sahai, spanning more than two years of effort.

“The next step is further pushing the envelope and constructing iO from weaker assumptions,” she explained. “At the same time, we shall try to improve the efficiency of the solutions, which at the moment is very far from being practical.”

This is not the first time Lin has contributed to significant advancements in iO in a quest to bring one of the most advanced cryptographic objects into the mainstream. In previous work, she established a connection between iO and PRG to prove that constant-degree, rather than high-degree, multilinear maps are sufficient for obfuscating programs. She subsequently refined that work with Allen School colleague Stefano Tessaro to reduce the degree of multilinear maps required to construct iO from more than 30 to just three. 

More recently, Lin worked with Jain and Sahai, along with UC Santa Barbara professor Prabhanjan Ananth and then-postdoc Christian Matt, on a new method for constructing iO without multilinear maps through the use of certain pseudo-random generators with special properties, formalized as Pseudo Flawed-smudging Generators (PFG) or perturbation resilient generators (ΔRG). In separate papers, Lin and her co-authors introduced partially hiding functional encryption for constant-degree polynomials or even branching programs, based only on the SXDH assumption over bilinear groups. Though these works still relied on new assumptions to achieve iO, they offered useful tools and ideas that paved the way for the recent construction.

Lin, Jain and Sahai aim to build on their latest breakthrough to make the solution more efficient so that it works not just on paper but also in real-world applications.

“These are ambitious goals that will need the joint effort from the entire cryptography community. I look forward to working on these questions and being part of the effort,” Lin concluded.

Read the research paper here, and the press release here. Read a related Quanta magazine article here.

October 12, 2020

Garbage in, garbage out: Allen School and AI2 researchers examine how toxic online content can lead natural language models astray

Metal garbage can in front of brick wall
Photo credit: Pete Willis on Unsplash

In the spring of 2016, social media users turned a friendly online chatbot named Tay — a seemingly innocuous experiment by Microsoft in which the company invited the public to engage with its work in conversational learning — into a racist, misogynistic potty mouth that the company was compelled to take offline the very same day it launched. Two years later, Google released its Smart Compose tool for Gmail, a feature designed to make drafting emails more efficient by suggesting how to complete partially typed sentences. Smart Compose, too, had an unfortunate tendency to suggest a bias toward men, leading the company to eschew the use of gendered pronouns altogether.

These and other examples serve as a stark illustration of that old computing adage “garbage in, garbage out,” acknowledging that a program’s outputs can only be as good as its inputs. Now, thanks to a team of researchers at the Allen School and Allen Institute for Artificial Intelligence (AI2), there is a methodology for examining just how trashy some of those inputs might be when it comes to pretrained neural language models — and how this causes the models themselves to degenerate into purveyors of toxic content. 

The problem, as Allen School master’s student Samuel Gehman (B.S., ’19) explains, is that not all web text is created equal.

“The massive trove of text on the web is an efficient way to train a model to produce coherent, human-like text of its own. But as anyone who has spent time on Reddit or in the comments section of a news article can tell you, plenty of web content is inaccurate or downright offensive,” noted Gehman. “Unfortunately, this means that in addition to higher quality, more factually reliable data drawn from news sites and similar sources, these models also take their cues from low-quality or controversial sources. And that can lead them to churn out low-quality, controversial content.”

The team analyzed how many tries it would take for popular language models to produce toxic content and found that most have at least one problematic generation in 100 tries.

Gehman and the team set out to measure how easily popular neural language models such as GPT-1, GPT-2, and CTRL would begin to generate problematic outputs. The researchers evaluated the models using a testbed they created called RealToxicityPrompts, which contains 100,000 naturally occurring English-language prompts, i.e., sentence prefixes, that models must complete. What they discovered was that all three were prone to toxic degeneration even with seemingly innocuous prompts; the models began generating toxic content within 100 generations and exceeded expected maximum toxicity levels within 1,000 generations.
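Conceptually, the evaluation works prompt by prompt: sample a batch of continuations, score each one for toxicity, and record both the worst score observed and whether any continuation crossed a toxicity threshold. The sketch below outlines that loop; the generate and toxicity_score helpers are hypothetical placeholders for a language model and a toxicity classifier, and the sample count and 0.5 threshold are common conventions used here for illustration.

```python
# Sketch of a per-prompt evaluation loop: sample k continuations per prompt,
# score each for toxicity, then report (a) the expected maximum toxicity and
# (b) the empirical probability of at least one toxic continuation.
# `generate` and `toxicity_score` are hypothetical placeholders for a neural
# language model and a toxicity classifier, respectively.

def generate(prompt: str) -> str:
    raise NotImplementedError  # e.g., sample a continuation from a language model

def toxicity_score(text: str) -> float:
    raise NotImplementedError  # e.g., a classifier returning a score in [0, 1]

def evaluate_prompt(prompt: str, k: int = 25, threshold: float = 0.5):
    scores = [toxicity_score(generate(prompt)) for _ in range(k)]
    max_toxicity = max(scores)                            # worst continuation seen
    toxic_at_least_once = any(s >= threshold for s in scores)
    return max_toxicity, toxic_at_least_once

def evaluate_benchmark(prompts: list[str], k: int = 25):
    results = [evaluate_prompt(p, k) for p in prompts]
    expected_max_toxicity = sum(m for m, _ in results) / len(results)
    toxicity_probability = sum(t for _, t in results) / len(results)
    return expected_max_toxicity, toxicity_probability    # averaged over the prompt set
```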

The team — which includes lead author Gehman, Ph.D. students Suchin Gururangan and Maarten Sap, and Allen School professors and AI2 researchers Yejin Choi and Noah Smith — published its findings in a paper due to appear in Findings of the Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2020).

“We found that if just 4% of your training data is what we would call ‘highly toxic,’ that’s enough to make these models produce toxic content, and to do so rather quickly,” explained Gururangan. “Our research also indicates that existing techniques that could prevent such behavior are not effective enough to safely release these models into the wild.”

One common countermeasure, simply declaring certain words or topics off-limits, can in fact backfire in unexpected ways, which brings us back around to Tay — or rather, to Tay’s younger “sibling,” Zo. When Microsoft attempted to rectify the elder chatbot’s propensity for going on racist rants, it scrubbed Zo clean of any hint of political incorrectness. The result was a chatbot that refused to discuss any topic suggestive of religion or politics, such as the time a reporter simply mentioned that they live in Iraq and wear a hijab. When the conversation steered toward such topics, Zo’s responses would become agitated; if pressed, the chatbot might terminate the conversation altogether.

As an alternative to making certain words or topics automatically off-limits — a straightforward solution but one that lacked nuance, as evidenced by Zo’s refusal to discuss subjects that her filters deemed controversial whether they were or not — Gururangan and his collaborators explored how the use of steering methods such as the fine-tuning of a model with the help of non-toxic data might alleviate the problem. They found that domain-adaptive pre-training (DAPT), vocabulary shifting, and PPLM decoding showed the most promise for reducing toxicity. But it turns out that even the most effective steering methods have their drawbacks: in addition to being computationally and data intensive, they could only reduce, not prevent, neural toxic degeneration of a tested model.
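Of the steering methods named above, domain-adaptive pretraining is the easiest to picture: filter the corpus down to its least toxic documents and continue pretraining the model on that subset. The sketch below outlines that idea with hypothetical toxicity_score and continue_pretraining helpers; the filtering cutoff is an arbitrary illustrative choice, and the sketch omits the data and compute costs that make these methods expensive in practice.

```python
# Sketch of domain-adaptive pretraining (DAPT) toward non-toxic text:
# keep only the least-toxic documents in the corpus, then continue
# pretraining the language model on that filtered subset.
# `toxicity_score` and `continue_pretraining` are hypothetical placeholders,
# and the keep_fraction cutoff is an illustrative choice.

def toxicity_score(document: str) -> float:
    raise NotImplementedError  # classifier score in [0, 1]

def continue_pretraining(model, documents: list[str]):
    raise NotImplementedError  # standard language-model fine-tuning loop

def dapt_nontoxic(model, corpus: list[str], keep_fraction: float = 0.02):
    scored = sorted(corpus, key=toxicity_score)      # least toxic documents first
    cutoff = max(1, int(len(scored) * keep_fraction))
    continue_pretraining(model, scored[:cutoff])     # continue pretraining on them
    return model
```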

The Allen School and AI2 team behind RealToxicityPrompts, top row from left: Samuel Gehman, Suchin Gururangan, and Maarten Sap; bottom row from left: Yejin Choi and Noah Smith

Having evaluated more conventional approaches and found them lacking, the team is encouraging an entirely new paradigm when it comes to pretraining modern NLP systems. The new framework calls for greater care in the selection of data sources and more transparency around said sources, including public release of original text, source URLs, and other information that would enable a more thorough analysis of these datasets. It also encourages researchers to incorporate value-sensitive or participatory design principles when crafting their models.

“While fine-tuning is preferable to the blunt-instrument approach of simply banning certain words, even the best steering methods can still go awry,” explained Sap. “No method is foolproof, and attempts to clean up a model can have the unintended consequence of shutting down legitimate discourse or failing to consider language within relevant cultural contexts. We think the way forward is to ensure that these models are more transparent and human-centered, and also reflect what we refer to as algorithmic cultural competency.”

Learn more by visiting the RealToxicityPrompts project page here, and read the research paper here. Check out the AI2 blog post here, and a related Fortune article here.

September 29, 2020

Allen School’s Joseph Jaeger and Cornell Tech’s Nirvan Tyagi honored at CRYPTO 2020 for advancing new framework for analyzing multi-user security

Joseph Jaeger (left) and Nirvan Tyagi

Allen School postdoctoral researcher Joseph Jaeger and visiting researcher Nirvan Tyagi, a Ph.D. student at Cornell Tech, received the Best Paper by Early Career Researchers Award at the 40th Annual International Cryptology Conference (Crypto 2020) organized by the International Association for Cryptologic Research (IACR). Jaeger and Tyagi, who have been working with professor Stefano Tessaro of the Allen School’s Theory and Cryptography groups, earned the award for presenting a new approach to proving multi-user security in “Handling Adaptive Compromise for Practical Encryption Schemes.” 

Jaeger and Tyagi set out to explore a classic problem in cryptography: How can the security of multi-party communication be assured in cases where an adversary is able to adaptively compromise the security of particular parties? In their winning paper, the authors aim to answer this question by presenting a new, extensible framework enabling formal analyses of multi-user security of encryption schemes and pseudorandom functions in cases where adversaries are able to adaptively compromise user keys. To incorporate an adversary’s ability to perform adaptive compromise, they expanded upon existing simulation-based, property-based security definitions to yield new definitions for simulation-based security under adaptive corruption in chosen plaintext attack (SIM-AC-CPA) and chosen ciphertext attack (SIM-AC-CCA) scenarios. Jaeger and Tyagi also introduced a new security notion for pseudorandom functions (SIM-AC-PRF), to simulate adaptive compromise for one of the basic building blocks of symmetric encryption schemes. This enabled the duo to pursue a modular approach that reduces the complexity of the ideal model analysis by breaking it into multiple steps and splitting it from the analysis of the high-level protocol — breaking from tradition in the process.
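To give a flavor of the setting, though not of the paper’s formal SIM-AC definitions, the sketch below models a multi-user encryption game in which an adversary can interleave encryption queries with adaptive “corrupt” queries that reveal a chosen user’s key. The Fernet scheme and the game’s interface are illustrative stand-ins; the actual definitions are simulation-based and compare this kind of real-world experiment against an ideal one equipped with a simulator.

```python
from cryptography.fernet import Fernet

# Informal sketch of the multi-user, adaptive-compromise setting: the adversary
# may ask any user to encrypt messages and may adaptively "corrupt" users to
# learn their keys. This illustrates the attack model only; the SIM-AC-CPA/CCA
# definitions in the paper are simulation-based and far more precise.

class MultiUserGame:
    def __init__(self, num_users: int):
        self.keys = [Fernet.generate_key() for _ in range(num_users)]
        self.corrupted: set[int] = set()

    def encrypt(self, user: int, message: bytes) -> bytes:
        """Encryption oracle, available for every user, corrupted or not."""
        return Fernet(self.keys[user]).encrypt(message)

    def corrupt(self, user: int) -> bytes:
        """Adaptive compromise: the adversary learns this user's key."""
        self.corrupted.add(user)
        return self.keys[user]

# A toy adversary: observe a ciphertext, then adaptively choose whom to corrupt.
game = MultiUserGame(num_users=3)
ct = game.encrypt(1, b"meet at noon")
leaked_key = game.corrupt(1)            # adaptive compromise of user 1
print(Fernet(leaked_key).decrypt(ct))   # the compromised user's data is exposed
```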

“Traditional approaches to formal security analysis are not sufficient to prove confidentiality in the face of adaptive compromise, and prior attempts to address this gap have been shown to be impractical and error-prone,” explained Jaeger. “By employing idealized primitives combined with a modular approach, we avoid the pitfalls associated with those methods. Our framework and definitions can be used to prove adaptive security in a variety of well-studied models, and they are easily applied to a variety of practical encryption schemes employed in real-world settings.”

One of the schemes for which they generated a positive proof was BurnBox, a system that enables users to temporarily revoke access from their devices to files stored in the cloud to preserve their privacy during compelled-access searches — for example, when an agent at a border crossing compels a traveler to unlock a laptop or smartphone to view its contents. In another analysis, the authors applied their framework to prove the security of a commonly used searchable symmetric encryption scheme for preserving the confidentiality of data and associated searches stored in the cloud. In both of the aforementioned examples, Jaeger and Tyagi showed that their approach produced simpler proofs while avoiding bugs contained in previous analyses. They also discussed how their framework could be extended beyond randomized symmetric encryption schemes currently in use to more modern nonce-based encryption — suggesting that their techniques will remain relevant and practical as the use of newer security schemes becomes more widespread.

“Joseph and Nirvan’s work fills an important void in the cryptographic literature and, surprisingly, identifies important aspects in assessing the security of real-world cryptographic systems that have been overlooked,” said Tessaro. “It also defines new security metrics according to which cryptographic systems ought to be assessed, and I can already envision several avenues of future research.”

Read the full research paper here.

Congratulations to Joseph and Nirvan!

August 31, 2020
