
Allen School professor Yin Tat Lee earns Packard Fellowship to advance the fundamentals of modern computing

Yin Tat Lee, a professor in the Allen School’s Theory of Computation group and visiting researcher at Microsoft Research, has earned a Packard Fellowship for Science and Engineering for his work on faster optimization algorithms that are fundamental to the theory and practice of computing and many other fields, from mathematics and statistics to economics and operations research. Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows.

“In a year when we are confronted by the devastating impacts of a global pandemic, racial injustice, and climate change, these 20 scientists and engineers offer us a ray of hope for the future,” Frances Arnold, Packard Fellowships Advisory Panel Chair and 2018 Nobel Laureate in Chemistry, said in a press release. “Through their research, creativity, and mentorship to their students and labs, these young leaders will help equip us all to better understand and address the problems we face.”

Lee’s creative approach to addressing fundamental problems in computer science became apparent during his time as a Ph.D. student at MIT, where he earned the George M. Sprowls Award for outstanding doctoral thesis for advancing state-of-the-art solutions to important problems in linear programming, convex programming, and maximum flow. Lee’s philosophy toward research hinges on a departure from the conventional approach taken by many theory researchers, who tend to view problems in continuous optimization and in combinatorial, or discrete, optimization in isolation. Among his earliest successes was a new interior point method for solving general linear programs that produced the first significant improvement in the running time of linear programming in more than two decades — a development that earned him and his collaborators both the Best Student Paper Award and a Best Paper Award at the IEEE Symposium on Foundations of Computer Science (FOCS 2014). Around that same time, Lee also contributed to a new approximate solution to the maximum flow problem in near-linear time, for which he and the team were recognized with a Best Paper Award at the ACM-SIAM Symposium on Discrete Algorithms (SODA 2014). The following year, Lee and his colleagues once again received a Best Paper Award at FOCS, this time for unveiling a faster cutting plane method for solving convex optimization problems in near-cubic time.

Since his arrival at the University of Washington in 2017, Lee has continued to show his eagerness to apply techniques from one area of theoretical computer science to another in unexpected ways — often to great effect. 

“Even at this early stage in his career, Yin Tat is regarded as a revolutionary figure in convex optimization and its applications in combinatorial optimization and machine learning,” observed his Allen School colleague James Lee. “He often picks up new technical tools as if they were second nature and then applies them in remarkable and unexpected ways. But it’s at least as surprising when he uses standard tools and still manages to break new ground on long-standing open problems!”

One of those problems involved the question of how to optimize non-smooth convex functions in distributed networks to enable the efficient deployment of machine learning applications that rely on massive datasets. Researchers had already made progress in optimizing the trade-offs between computation and communication time for smooth and strongly convex functions in such networks; Lee and his collaborators were the first to extend a similar theoretical analysis to non-smooth convex functions. The outcome was a pair of new algorithms capable of achieving optimal convergence rates for this more challenging class of functions — and yet another Best Paper Award for Lee, this time from the flagship venue for developments in machine learning research, the Conference on Neural Information Processing Systems (NeurIPS 2018).

Since then, Lee’s contributions have included the first algorithm capable of solving dense bipartite matching in nearly linear time, and a new framework for solving linear programs as fast as linear systems for the first time. The latter work incorporates new techniques that are extensible to a broader class of convex optimization problems.

Having earned a reputation as a prolific researcher — he once set a record for the total number of papers from the same author accepted at one of the top theory conferences, the ACM Symposium on Theory of Computing (STOC), in one year — Lee also has received numerous accolades for the quality and impact of his work. These include a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, a National Science Foundation CAREER Award, and the A.W. Tucker Prize from the Mathematical Optimization Society.

“Convex optimization is the workhorse that powers much of modern machine learning, and therefore, modern computing. Yin Tat is not only a pivotal figure in the theory that underpins our field, but also one of the brightest young stars in all of computer science,” said Magdalena Balazinska, professor and director of the Allen School. “Combined with his boundless curiosity and passion for collaboration, Yin Tat’s depth of knowledge and technical skill hold the promise for many future breakthroughs. We are extremely proud to have him as a member of the Allen School faculty.”

Lee is the fifth Allen School faculty member to be recognized by the Packard Foundation. As one of the largest nongovernmental fellowships in the country supporting science and engineering research, the Packard Fellowship provides $875,000 over five years to each recipient to grant them the freedom and flexibility to pursue big ideas.

Read the Packard Foundation announcement here.

Congratulations, Yin Tat!

October 15, 2020

Allen School, UCLA and NTT Research cryptographers solve decades-old problem by proving the security of indistinguishability obfuscation

Allen School professor Rachel Lin helped solve a decades-old problem of how to prove the security of indistinguishability obfuscation (iO)

Over the past 20 years, indistinguishability obfuscation (iO) has emerged as a potentially powerful cryptographic method for securing computer programs by making them unintelligible to would-be hackers while retaining their functionality. While the mathematical foundation of this approach was formalized back in 2001 and has spawned more than 100 papers on the subject, much of the follow-up work relied upon new hardness assumptions specific to each application — assumptions that, in some cases, have been broken through subsequent cryptanalysis. Since then, researchers have been stymied in their efforts to achieve provable security guarantees for iO from well-studied hardness assumptions, leaving the concept of iO security on shaky ground.

That is, until now. In a new paper recently posted on public archives, a team that includes University of California, Los Angeles graduate student and NTT Research intern Aayush Jain; Allen School professor Huijia (Rachel) Lin; and professor Amit Sahai, director of the Center for Encrypted Functionalities at UCLA, has produced a theoretical breakthrough that, as the authors describe it, finally puts iO on terra firma. In their paper “Indistinguishability Obfuscation from Well-Founded Assumptions,” the authors show, for the first time, that provably secure iO can be constructed from the subexponential hardness of four well-founded assumptions, all of which have a long history of study well-rooted in complexity, coding, and number theory: Symmetric External Diffie-Hellman (SXDH) on pairing groups, Learning with Errors (LWE), Learning Parity with Noise (LPN) over large fields, and a Boolean Pseudo-Random Generator (PRG) that is very simple to compute.

Previous work on this topic has established that, to achieve iO, it is sufficient to assume LWE, SXDH, PRG in NC0 — a very simple model of computation in which every output bit depends on a constant number of input bits — and one other object. That object, in this case, is a structured-seed PRG (sPRG) with polynomial stretch and special efficiency properties, the seed of which consists of both a public and a private part. The sPRG is designed to maintain its pseudo-randomness even when an adversary can see the public seed as well as the output of the sPRG. One of the key contributions from the team’s paper is a new and simple way to leverage LPN over fields and PRG in NC0 to build an sPRG for this purpose.
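The defining property of a PRG in NC0 can be pictured concretely: for each output bit, fix a small set of seed positions and a constant-size predicate over them. The toy sketch below illustrates that locality structure with a 5-local XOR-AND predicate; the wiring and predicate here are arbitrary choices for demonstration, not a secure construction and not the specific generator analyzed in the paper.

```python
import random

def local_prg(seed_bits, out_len, locality=5, wiring_seed=0):
    """Toy local PRG: every output bit is a fixed constant-size predicate
    of `locality` seed bits. Purely illustrative -- NOT cryptographically
    secure and not the construction from the paper."""
    rng = random.Random(wiring_seed)  # fixes the public wiring of seed bits to output bits
    n = len(seed_bits)
    out = []
    for _ in range(out_len):
        idx = rng.sample(range(n), locality)      # which seed positions feed this output bit
        a, b, c, d, e = (seed_bits[i] for i in idx)
        out.append(a ^ b ^ c ^ (d & e))           # a 5-local "XOR-AND" style predicate
    return out

seed = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
stretch = local_prg(seed, out_len=20)  # polynomial stretch: 10 seed bits -> 20 output bits
```

Because the wiring is fixed in advance, each output bit reads only a constant number of seed bits, which is exactly what makes the model of computation so simple.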

Co-authors Aayush Jain (left) and Amit Sahai

“I am excited that iO can now be based on well-founded assumptions,” said Lin. “This work was the result of an amazing collaboration with Aayush Jain and Amit Sahai, spanning over more than two years of effort.

“The next step is further pushing the envelope and constructing iO from weaker assumptions,” she explained. “At the same time, we shall try to improve the efficiency of the solutions, which at the moment is very far from being practical.”

This is not the first time Lin has contributed to significant advancements in iO in a quest to bring one of the most advanced cryptographic objects into the mainstream. In previous work, she established a connection between iO and PRG to prove that constant-degree, rather than high-degree, multilinear maps are sufficient for obfuscating programs. She subsequently refined that work with Allen School colleague Stefano Tessaro to reduce the degree of multilinear maps required to construct iO from more than 30 to just three. 

More recently, Lin worked with Jain, Sahai, and UC Santa Barbara professor Prabhanjan Ananth and then-postdoc Christian Matt on a new method for constructing iO without multilinear maps through the use of certain pseudo-random generators with special properties, formalized as Pseudo Flawed-smudging Generators (PFG) or perturbation resilient generators (ΔRG). In separate papers, Lin and her co-authors introduced partially hiding functional encryption for constant-degree polynomials or even branching programs, based only on the SXDH assumption over bilinear groups. Though these works still relied on new assumptions in order to achieve iO, they offered useful tools and ideas that paved the way to the recent new construction.

Lin, Jain and Sahai aim to build on their latest breakthrough to make the solution more efficient so that it works not just on paper but also in real-world applications.

“These are ambitious goals that will need the joint effort from the entire cryptography community. I look forward to working on these questions and being part of the effort,” Lin concluded.

Read the research paper here, and the press release here.

October 12, 2020

Manaswi Saha wins 2020 Google Fellowship for advancing computing research with social impact

Manaswi Saha, a Ph.D. student working with Allen School professor Jon Froehlich, has been named a 2020 Google Ph.D. Fellow for her work in human-computer interaction focused on assistive technologies and artificial intelligence for social good. Her research focuses on collecting data and building tools that can improve the understanding of urban accessibility and serve as a mechanism for advocacy, urban planning, and policymaking.

Saha, who is one of 53 students throughout the world to be selected for a Google Fellowship, will use those tools to fill an information gap between citizens, local government, and other stakeholders by showing where sidewalk improvements need to be made to make them accessible to all.

“Since the beginning of my academic career, my research interests have been towards socially impactful projects. Public service, especially for underrepresented communities, runs in my family,” Saha said. “The driving force for the work I do stems from my role model, my father, who dedicated his life towards rural and agricultural development in India. His selfless efforts inspired me to explore how technology can be used for the betterment of society. With this goal in mind, I set out to do my Ph.D. with a focus on high-value social problems.”

Saha works with Froehlich in the Makeability Lab on one of its flagship ventures, Project Sidewalk. The project has two goals: to develop and study data collection methods for acquiring street-level accessibility information using crowdsourcing, machine learning, and online map imagery and to design and develop navigation and map tools for accessibility. 

To start, Saha led the pilot deployment of the Project Sidewalk tool for data collection in Washington, D.C. During the 18-month study, 800 volunteer crowdworkers virtually walked the city’s streets using Google Street View and remotely reported on pedestrian-related accessibility problems such as missing curb ramps, cracked sidewalks, missing sidewalks, and obstacles. Saha was the lead author on the paper presenting the team’s work, Project Sidewalk: A Web-based Crowdsourcing Tool for Collecting Sidewalk Accessibility Data At Scale, which earned a Best Paper Award at CHI 2019.

“Because Project Sidewalk is publicly deployed, the tool must work robustly — so code is carefully tested and reviewed — a somewhat slow and arduous process, particularly for academics used to building fast research prototypes,” Froehlich said. “Project Sidewalk is not an easy project; however, Manaswi performed admirably. As lead graduate student, Manaswi helped co-manage the team of students, ideate, design, and implement new features, brainstorm research questions and corresponding study protocols, and help execute the studies themselves.”

The initial dataset is now complete, and data is currently being gathered in Seattle; Newberg, Oregon; Columbus, Ohio; and Mexico City and San Pedro Garza García in Mexico. Building on that data, Saha conducted a formative study to understand stakeholders’ visualization needs. She is using what she learned to build an interactive web visualization tool that will answer questions about accessibility for a variety of stakeholders, including people with mobility disabilities, caregivers, local government officials, policymakers, and accessibility advocates. The tool will allow cities to see where they need to allocate resources to resolve the lack of accessibility. Saha won an Amazon Catalyst Award last year to help fund this research.

Additionally, Saha has conducted research that goes beyond the physical barriers challenging people with disabilities to also understand the socio-political challenges that impede accessible infrastructure development. She will publish and present a paper detailing her findings later this month at the 23rd ACM Conference on Computer-Supported Cooperative Work and Social Computing.

Saha also authored a paper that appeared at last year’s ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2019) on shortcomings in GPS wayfinding tools that lead to problems for visually impaired pedestrians. Saha and her collaborators found that while users can get to a desired vicinity, they often struggle to find the exact location because the GPS tools are not specific enough. The paper, which Saha worked on while an intern at Microsoft Research, addressed this challenge, along with exploring implications for future systems to support precise navigation for people with visual impairments.

In addition to being a student and researcher, Saha is a teaching assistant (CSE482A: Capstone Software Design to Empower Underserved Populations; CSE599H: Crowdsourcing, Citizen Science, and Large-scale Online Experimentation; CSE599S: The Future of Access Technologies; CSE441: Advanced HCI: Advanced User Interface Design, Prototyping, And Evaluation; CSE440: Introduction to HCI), a volunteer (CHI ‘16, CHI ‘18 and Girls Who Code at Adobe) and a mentor to undergraduate researchers. 

Since 2009, the Google Ph.D. Fellowship program has recognized and supported exceptional graduate students working in core and emerging areas of computer science. Previous Allen School recipients include Hao Peng (2019), Joseph Redmon (2018), Tianqi Chen and Arvind Satyanarayan (2016), Aaron Parks and Kyle Rector (2015), and Robert Gens and Vincent Liu (2014). Learn more about the 2020 Google Fellowships here.

Congratulations, Manaswi — and thanks for all of your contributions to the Allen School community! 

October 7, 2020

Garbage in, garbage out: Allen School and AI2 researchers examine how toxic online content can lead natural language models astray

Metal garbage can in front of brick wall
Photo credit: Pete Willis on Unsplash

In the spring of 2016, social media users turned a friendly online chatbot named Tay — a seemingly innocuous experiment in which Microsoft invited the public to engage with its work in conversational learning — into a racist, misogynistic potty mouth that the company was compelled to take offline the very same day it launched. Two years later, Google released its Smart Compose tool for Gmail, a feature designed to make drafting emails more efficient by suggesting how to complete partially typed sentences. The feature also had an unfortunate tendency to suggest a bias towards men, leading the company to eschew the use of gendered pronouns altogether.

These and other examples serve as a stark illustration of that old computing adage “garbage in, garbage out,” acknowledging that a program’s outputs can only be as good as its inputs. Now, thanks to a team of researchers at the Allen School and Allen Institute for Artificial Intelligence (AI2), there is a methodology for examining just how trashy some of those inputs might be when it comes to pretrained neural language models — and how this causes the models themselves to degenerate into purveyors of toxic content. 

The problem, as Allen School Master’s student Samuel Gehman (B.S., ‘19) explains, is that not all web text is created equal.

“The massive trove of text on the web is an efficient way to train a model to produce coherent, human-like text of its own. But as anyone who has spent time on Reddit or in the comments section of a news article can tell you, plenty of web content is inaccurate or downright offensive,” noted Gehman. “Unfortunately, this means that in addition to higher quality, more factually reliable data drawn from news sites and similar sources, these models also take their cues from low-quality or controversial sources. And that can lead them to churn out low-quality, controversial content.”

The team analyzed how many tries it would take for popular language models to produce toxic content and found that most have at least one problematic generation in 100 tries.

Gehman and the team set out to measure how easily popular neural language models such as GPT-1, GPT-2, and CTRL would begin to generate problematic outputs. The researchers evaluated the models using a testbed they created called RealToxicityPrompts, which contains 100,000 naturally occurring English-language prompts, i.e., sentence prefixes, that models have to finish. What they discovered was that all three were prone to toxic degeneration even with seemingly innocuous prompts: the models began generating toxic content within 100 generations, and exceeded expected maximum toxicity levels within 1,000 generations.
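The evaluation recipe (feed each prompt to a model, sample multiple continuations, and score each one with a toxicity classifier) can be sketched as follows. This is a minimal illustration, not the paper’s pipeline: `generate` and `toxicity_score` are stand-in stubs for a real language model and a toxicity classifier such as the Perspective API, and the 0.5 toxicity cutoff is an assumed convention.

```python
def evaluate_prompts(prompts, generate, toxicity_score, k=25):
    """For each prompt, sample k continuations and score each one.
    Reports the per-prompt maximum toxicity, plus two aggregates:
    expected maximum toxicity (mean over prompts of the per-prompt max)
    and the empirical probability of at least one toxic generation."""
    per_prompt = []
    for prompt in prompts:
        scores = [toxicity_score(generate(prompt)) for _ in range(k)]
        per_prompt.append({
            "prompt": prompt,
            "max_toxicity": max(scores),
            "any_toxic": any(s >= 0.5 for s in scores),  # 0.5 = assumed toxicity cutoff
        })
    expected_max = sum(p["max_toxicity"] for p in per_prompt) / len(per_prompt)
    prob_toxic = sum(p["any_toxic"] for p in per_prompt) / len(per_prompt)
    return per_prompt, expected_max, prob_toxic

# Stand-in stubs for illustration only; a real run would call a neural
# language model and a toxicity classifier:
stub_generate = lambda prompt: prompt + " and then things went fine."
stub_score = lambda text: 0.6 if text.startswith("You people") else 0.1

per_prompt, expected_max, prob_toxic = evaluate_prompts(
    ["The weather today is", "You people always"], stub_generate, stub_score, k=5)
```

Sampling many continuations per prompt is what surfaces degeneration that a single greedy generation would miss.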

The team — which includes lead author Gehman, Ph.D. students Suchin Gururangan and Maarten Sap, and Allen School professors and AI2 researchers Yejin Choi and Noah Smith — published its findings in a paper due to appear at the Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2020).

“We found that if just 4% of your training data is what we would call ‘highly toxic,’ that’s enough to make these models produce toxic content, and to do so rather quickly,” explained Gururangan. “Our research also indicates that existing techniques that could prevent such behavior are not effective enough to safely release these models into the wild.”

That approach, in fact, can backfire in unexpected ways, which brings us back around to Tay — or rather, Tay’s younger “sibling,” Zo. When Microsoft attempted to rectify the elder chatbot’s propensity for going on racist rants, it scrubbed Zo clean of any hint of political incorrectness. The result was a chatbot that refused to discuss any topic suggestive of religion or politics — even when a reporter simply mentioned that they live in Iraq and wear a hijab. When the conversation steered towards such topics, Zo would become agitated; if pressed, the chatbot might terminate the conversation altogether.

As an alternative to making certain words or topics automatically off-limits — a straightforward solution but one that lacked nuance, as evidenced by Zo’s refusal to discuss subjects that her filters deemed controversial whether they were or not — Gururangan and his collaborators explored how the use of steering methods such as the fine-tuning of a model with the help of non-toxic data might alleviate the problem. They found that domain-adaptive pre-training (DAPT), vocabulary shifting, and PPLM decoding showed the most promise for reducing toxicity. But it turns out that even the most effective steering methods have their drawbacks: in addition to being computationally and data intensive, they could only reduce, not prevent, neural toxic degeneration of a tested model.

The Allen School and AI2 team behind RealToxicityPrompts, top row from left: Samuel Gehman, Suchin Gururangan, and Maarten Sap; bottom row from left: Yejin Choi and Noah Smith

Having evaluated more conventional approaches and found them lacking, the team is encouraging an entirely new paradigm when it comes to pretraining modern NLP systems. The new framework calls for greater care in the selection of data sources and more transparency around said sources, including public release of original text, source URLs, and other information that would enable a more thorough analysis of these datasets. It also encourages researchers to incorporate value-sensitive or participatory design principles when crafting their models.

“While fine-tuning is preferable to the blunt-instrument approach of simply banning certain words, even the best steering methods can still go awry,” explained Sap. “No method is foolproof, and attempts to clean up a model can have the unintended consequence of shutting down legitimate discourse or failing to consider language within relevant cultural contexts. We think the way forward is to ensure that these models are more transparent and human-centered, and also reflect what we refer to as algorithmic cultural competency.”

Learn more by visiting the RealToxicityPrompts project page here, and read the research paper here. Check out the AI2 blog post here, and a related Fortune article here.

September 29, 2020

Vivek Jayaram and John Thickstun win 2020 Qualcomm Innovation Fellowship for their work in source separation

Vivek Jayaram (left) and John Thickstun

Allen School Ph.D. students Vivek Jayaram and John Thickstun have been named 2020 Qualcomm Innovation Fellows for their work in signal processing, computer vision and machine learning using the latest in generative modeling to improve source separation. In their paper, “Source Separation with Deep Generative Priors,” published at the 2020 International Conference on Machine Learning, the team addresses perceptible artifacts that are often found in source separation algorithms. Jayaram and Thickstun are one of only 13 teams to receive a fellowship out of more than 45 finalists across North America. 

Thickstun and Jayaram have been working with their advisors, Allen School professors Sham Kakade, Steve Seitz, and Ira Kemelmacher-Shlizerman and adjunct faculty member Zaid Harchaoui, a professor in the UW Department of Statistics, on this research. Potential applications include separating reflections from an image, the voices of multiple speakers or instruments from an audio recording, brain signals in an EEG, and overlapping signals in telecommunication technologies such as Code Division Multiple Access (CDMA). The team’s work introduces a new algorithmic idea for solving source separation problems using a Bayesian approach.

“In contrast to source separation models, modern generative models are largely free of artifacts,” said Thickstun. “Generative models continue to improve and one goal of our proposal is to find a way to use the latest advances in generative modeling to improve source separation results.” 

A cutting-edge generative model is a powerful tool for source separation and can be applied across different data domains. Using the Bayesian approach and Langevin dynamics, Thickstun and Jayaram can decouple the source separation problem from the generative model, achieving state-of-the-art performance for separating low-resolution images.
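The decoupling idea can be made concrete in a toy setting: run Langevin dynamics on the candidate components under their priors, with a Gaussian likelihood softly enforcing that the components sum to the observed mixture. The sketch below is a minimal illustration under assumed simple Gaussian priors standing in for deep generative models; it is not the authors’ implementation.

```python
import numpy as np

def langevin_separate(mixture, mu1, mu2, steps=8000, step=5e-3, sigma=0.1, seed=0):
    """Toy source separation via Langevin dynamics: sample (x1, x2) from
    p(x1) p(x2) N(mixture | x1 + x2, sigma^2 I), where the priors are
    simple Gaussians N(mu_i, I) standing in for deep generative priors.
    Returns the average of post-burn-in samples as a posterior-mean estimate."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(mixture.shape)
    x2 = rng.standard_normal(mixture.shape)
    burn = steps // 2
    s1 = np.zeros_like(mixture)
    s2 = np.zeros_like(mixture)
    for t in range(steps):
        resid = (mixture - x1 - x2) / sigma**2   # gradient of the log-likelihood term
        g1 = -(x1 - mu1) + resid                 # grad log p(x1) + grad log-likelihood
        g2 = -(x2 - mu2) + resid
        noise = np.sqrt(2 * step)
        x1 = x1 + step * g1 + noise * rng.standard_normal(x1.shape)
        x2 = x2 + step * g2 + noise * rng.standard_normal(x2.shape)
        if t >= burn:                            # average samples after burn-in
            s1 += x1
            s2 += x2
    return s1 / (steps - burn), s2 / (steps - burn)

# Observe only the sum of two "sources" with different prior means:
mu1, mu2 = np.full(4, 2.0), np.full(4, -1.0)
mixture = mu1 + mu2
est1, est2 = langevin_separate(mixture, mu1, mu2)
```

Because only the gradient of each log prior is needed, the separation procedure never has to be retrained when the number of components or the mixing changes, which mirrors the decoupling the team describes.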

By combining images and using the algorithm to then separate each of them, the team was able to illustrate how their theory works.

“Our algorithm works on mixtures of any number of components without retraining,” Jayaram said. “The only training is a generative model of the original images themselves; we never train it on mixtures of a fixed number of sources.”

Audio separation proved to be more challenging, but the two implemented stochastic gradient Langevin dynamics to speed up the process and make it more practical. Their approach can be adapted to tackle many different kinds of optimization problems by modifying the reconstruction objective.

“John and Vivek’s work takes a fundamentally new and promising approach that leverages the power of deep networks to help separate out signals,” Kakade said. “The reason this approach is so exciting is that deep learning methods have already demonstrated remarkable abilities to model distributions, and their work looks to harness these models for the classical signal processing problem of source separation.”

Jayaram and Thickstun have each published additional papers: one at the 2020 IEEE Conference on Computer Vision and Pattern Recognition (Jayaram), which focuses on background matting in images, and one at the 2019 International Society for Music Information Retrieval conference (Thickstun), which investigates end-to-end learnable models for attributing composers to musical scores.

Since 2009, the Qualcomm Innovation Fellowship program has recognized and supported innovative graduate students across a broad range of technical research areas. Previous Allen School recipients include Vincent Lee and Max Willsey (2017), Hanchuan Li and Alex Mariakakis (2016), Carlo del Mundo and Vincent Lee (2015), Vincent Liu and Vamsi Talla (2014) and Adrian Sampson and Thierry Moreau (2013). Learn more about the 2020 Qualcomm Fellows here.

Congratulations, Vivek and John! 

September 23, 2020

Allen School’s Jenny Liang combines compassion with technology for social good

Jenny Liang

Our latest Allen School student spotlight features Jenny Liang, a Kenmore, Washington native and recent UW graduate who majored in computer science and informatics. Liang was named among the Husky 100 and earned the Allen School’s Undergraduate Service Award for her leadership, compassion, and focus on developing technology for social good in her work with the Information and Communication Technology for Development Lab (ICTD). 

This summer, Liang started an internship at the Allen Institute for Artificial Intelligence (AI2) after being awarded the 2020 Allen AI Outstanding Engineer Scholarship for Women and Underrepresented Minorities. The scholarship is designed to encourage diversity and equity in the field of artificial intelligence while strengthening the connection between academic and industry research. She previously held internships at Microsoft, Apple and Uber.

Allen School: Congratulations on the AI2 Scholarship! What makes this scholarship special, and who should apply?

Jenny Liang: The AI2 scholarship is an opportunity for folks in underrepresented communities in technology. Its aim is to combat the lack of representation currently seen in the tech industry and academia. As part of the scholarship, students receive one year of tuition covered by AI2 along with an internship. The winners have coincidentally been women in the past couple of years, but I’d like to emphasize that anyone who identifies with any underrepresented identity is qualified to apply. I would encourage any Allen School student who belongs to any minoritized identity to take advantage of this opportunity. 

Allen School: How has the experience been? 

JL: It’s been a positive career-changing experience, and my time at AI2 has been really awesome so far. I currently work on the MOSAIC team headed by the Allen School’s own Yejin Choi, where I’m building a research demo using cutting-edge computer vision and natural language processing (NLP) models. The exciting thing is it’ll be released soon to the public. I’m also dabbling with conducting my own research on building NLP models to detect social bias in language, as well as interpreting the predictions of these models. The goal of this is to illuminate how and why these models behave the way they do, and whether they can be improved to be more than just black boxes that predict complex phenomena in natural language. This provides more context in how these models interact with society, which ultimately has real-life consequences on people. Both the engineering and research aspects of this internship are all very new and challenging experiences for me. It’s been my first time working with computer vision and NLP deep learning models, which has given me a new perspective into challenges that developers face. I feel like this has pushed me to learn and adapt as a budding researcher, and provided me with lots of tools and skill sets I’ll be using in the future.

Allen School: What initially interested you in computer science and informatics? 

JL: At the time of choosing my major, I loved software engineering, and I still do. This meant I was interested in both the theoretical and applications of technology. The theory is so important to understand what makes technology systems work and why. But, understanding how technologies are applied is equally important in building software that is usable and performant and serves people in fair and ethical ways. To me, the Allen School taught me the theoretical foundations of computer science, while the iSchool provided the ability to build technology applications. Being in both CSE and INFO has allowed me to become a well-rounded technologist, where I can both build technologies quickly but also understand the complicated theoretical underpinnings of these systems.

Allen School: You have had a lot of industry experience with your internships. Do you plan to continue on that path to a career in industry?

JL: In the past year, I’ve decided to switch to academia after working in the industry. So I am applying to Ph.D. programs this fall. I’ve always enjoyed software engineering, but after a while, I found the engineering work I did in industry personally unfulfilling since I wanted to learn the fundamental properties of what makes software “good” or “bad” and why, especially as software scales. I didn’t think my trajectory in industry would quite allow me to gain that expertise because of its focus on building new technologies. Thanks to some outstanding and involved mentorship from iSchool professor Amy Ko, postdoc Spencer Sevilla in the Allen School’s ICTD Lab, and AI2 researchers Chandra Bhagavatula and Swabha Swayamdipta, I’ve been slowly convinced that academia is the space for me to do that.

Allen School: What is the best thing about being a student in the Allen School?

JL: To me, the best thing is the breadth of high-quality opportunities this school has to offer. I’m really grateful and feel so privileged for the opportunities I’ve been given because I’m in CSE. For the past five years, I’ve known I wanted to work with technology after I taught myself to code my freshman year and totally loved it. What has not been clear is how and in what capacity I’d like to do that. Because of the many opportunities the Allen School provides, I’ve really been able to find my own fulfilling niche in tech. I’m really fortunate to have developed my career as a software engineer, but also to be able to pivot quickly toward a career in academia. Due to the school’s industry connections, I’ve been able to work on the world’s largest technology systems and with the best engineers; thanks to the opportunities I’ve had to do undergraduate research, serve as a TA, and take graduate-level courses, I’ve gotten a taste of what it’s like to be a Ph.D. student and really enjoyed it. Most importantly, though, my connections with the school’s faculty and staff have supported so much of my growth, and I would be nowhere without them.

Allen School: What interested you in becoming a member of the Allen School’s Student Advisory Council and continuing to serve in it? 

JL: Working with the SAC has been important to my CSE experience. I struggled a lot my first several years at UW with my mental health, which also compromised my academics for a while. Without the support of my friends and professors in the Allen School, I would not be the same person I am today. Being involved with SAC is my way of giving back to the community that supported me, as well as deriving meaning from my painful experiences. Because I understand what it’s like to struggle while being a CSE student, I’m committed to finding the ways in which CSE as a system could improve in supporting the undergraduate experience and advocating for change.

I’ve stayed with SAC because through listening to my peers’ diverse experiences and struggles, I’ve realized this work really matters. Although change can be slow-going and allyship is hard, the work we do allows students to be more successful academically, builds community within the Allen School, and creates a welcoming environment where everyone can thrive. This has been especially important in response to the tumultuous current events, and I’m really proud of how all the other student groups are committed to this mission too.

Thank you for all that you’ve done for the Allen School and UW community, Jenny — and good luck with those grad school applications! 

September 2, 2020

UW earns NSF grant to lead creation of Institute for Foundations of Data Science

Researchers at the University of Washington will lead a team of universities in creating the Institute for Foundations of Data Science (IFDS) to tackle important theoretical and technical questions in the field. Supported by a $12.5 million, five-year grant from the National Science Foundation, IFDS is one of two institutes nationwide to receive funding from the latest phase of the agency’s Transdisciplinary Research in Principles of Data Science (TRIPODS) program.

The IFDS — a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago — brings together theoretical computer scientists, mathematicians and statisticians to continue the study of complex and entrenched problems in data science. Together, they aim to create scalable and robust algorithmic tools that can keep pace with the unprecedented growth of large datasets, ultimately accelerating the pace of science and engineering.

“As data science is increasingly incorporated in all facets of our lives, its success is uncovering pressing challenges that call for new theories,” said lead principal investigator Maryam Fazel, a professor in the UW Department of Electrical & Computer Engineering and adjunct professor in the Allen School, in a UW News release. “We need the expertise of all core disciplines to understand the mysteries and to address the pitfalls of data science and artificial intelligence algorithms.”

In 2017, Fazel and professor Sham Kakade, who holds a joint appointment in the Allen School and Department of Statistics, led the UW’s successful proposal to establish the Algorithmic Foundations of Data Science Institute (ADSI), supported by a $1.5 million award from NSF’s TRIPODS program. Three UW teams subsequently earned TRIPODS-X grants in 2018, designed to expand their original work to broader areas of science, engineering and mathematics.

“Phase II is a scaled-up version of phase one,” said co-principal investigator Yin Tat Lee, a professor in the Allen School’s Theory group. “It involves more students and more PIs, which will foster more collaborations between faculty in different areas. It also supports us in visiting other TRIPODS partners, to get new kinds of research going.”

The UW team, top row from left: Maryam Fazel, Zaid Harchaoui and Kevin Jamieson; bottom row from left: Dmitriy Drusvyatskiy, Abel Rodriguez and Yin Tat Lee. University of Washington

UW researchers have already hosted workshops and hackathons to recruit more diverse participants to the field. Their continued mission is to improve accuracy and decrease bias in algorithmic decision-making processes, as well as to develop methods for coping with ever-changing data that may be corrupted by noise or even malicious intent.

“One thrust of the TRIPODS phase II program that I’m particularly excited about aims to advance the understanding and practice of closed-loop learning. In this paradigm of data science, data collection and inference feed off of each other so that inferences on past data inform what data should be collected next,” said co-principal investigator and Allen School professor Kevin Jamieson. “A well-executed closed-loop learning protocol can often accomplish data science tasks using just a small fraction of the data necessary for traditional methods. This can accelerate the discovery of novel materials or medicines, and our ability to learn useful machine learning models to predict things like health outcomes from a given treatment.”

Jamieson added that even small advances in collecting data more efficiently and intelligently could have huge aggregate benefits across many fields, because the algorithms the IFDS develops are broadly applicable across disciplines.

“While the last decade has enjoyed an embrace and democratization of machine learning and AI across fields like biology, chemistry, and even physics, tools for helping these practitioners collect datasets are essentially nonexistent,” he said. “One of the aims of the closed-loop data thrust is to develop these tools.”
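Jamieson’s description of closed-loop learning can be sketched in a few lines. The example below is an illustrative uncertainty-sampling loop, not IFDS code; the threshold model, toy labels, and function names are all assumptions made for the sake of the sketch.

```python
import math
import random

def uncertainty(score):
    # A score in (0, 1) is most uncertain near 0.5.
    return 1.0 - abs(score - 0.5) * 2

def closed_loop(pool, label_fn, budget, fit, predict):
    """Alternate between inference and collection: fit a model on the
    labels gathered so far, then label the point the model is least sure of."""
    pool = list(pool)
    first = pool.pop(random.randrange(len(pool)))  # seed with one random label
    labeled = [(first, label_fn(first))]
    for _ in range(budget - 1):
        model = fit(labeled)
        idx = max(range(len(pool)),
                  key=lambda i: uncertainty(predict(model, pool[i])))
        x = pool.pop(idx)  # past inferences choose the next data point
        labeled.append((x, label_fn(x)))
    return labeled

# Toy task: recover an unknown decision boundary (here, x >= 10) on 0..19.
def fit(labeled):
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    return (max(zeros, default=0) + min(ones, default=20)) / 2  # threshold

def predict(threshold, x):
    return 1 / (1 + math.exp(-(x - threshold)))  # soft score in (0, 1)

random.seed(0)
labeled = closed_loop(range(20), lambda x: int(x >= 10), budget=8,
                      fit=fit, predict=predict)
# Eight adaptively chosen labels pin the boundary down near 9.5,
# far fewer than labeling all twenty points would require.
```

Uncertainty sampling here effectively performs a binary search for the boundary, which is why it needs only a fraction of the labels a fixed, pre-collected dataset would.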

In addition to Fazel, Kakade, Lee and Jamieson, participants in the UW IFDS include Dmitriy Drusvyatskiy, a professor in the UW Department of Mathematics; Zaid Harchaoui, a professor of statistics and adjunct faculty member in the Allen School; and Abel Rodriguez, who recently arrived from UC Santa Cruz to serve as diversity liaison for the IFDS.

Read the NSF’s announcement here and the UW News release here.

September 1, 2020

Allen School’s Joseph Jaeger and Cornell Tech’s Nirvan Tyagi honored at CRYPTO 2020 for advancing new framework for analyzing multi-user security

Joseph Jaeger (left) and Nirvan Tyagi

Allen School postdoctoral researcher Joseph Jaeger and visiting researcher Nirvan Tyagi, a Ph.D. student at Cornell Tech, received the Best Paper by Early Career Researchers Award at the 40th Annual International Cryptology Conference (Crypto 2020) organized by the International Association for Cryptologic Research (IACR). Jaeger and Tyagi, who have been working with professor Stefano Tessaro of the Allen School’s Theory and Cryptography groups, earned the award for presenting a new approach to proving multi-user security in “Handling Adaptive Compromise for Practical Encryption Schemes.” 

Jaeger and Tyagi set out to explore a classic problem in cryptography: How can the security of multi-party communication be assured in cases where an adversary is able to adaptively compromise the security of particular parties? In their winning paper, the authors aim to answer this question by presenting a new, extensible framework enabling formal analyses of multi-user security of encryption schemes and pseudorandom functions in cases where adversaries are able to adaptively compromise user keys. To incorporate an adversary’s ability to perform adaptive compromise, they expanded upon existing simulation-based, property-based security definitions to yield new definitions for simulation-based security under adaptive corruption in chosen plaintext attack (SIM-AC-CPA) and chosen ciphertext attack (SIM-AC-CCA) scenarios. Jaeger and Tyagi also introduced a new security notion for pseudorandom functions (SIM-AC-PRF), to simulate adaptive compromise for one of the basic building blocks of symmetric encryption schemes. This enabled the duo to pursue a modular approach that reduces the complexity of the ideal model analysis by breaking it into multiple steps and splitting it from the analysis of the high-level protocol — breaking from tradition in the process.

“Traditional approaches to formal security analysis are not sufficient to prove confidentiality in the face of adaptive compromise, and prior attempts to address this gap have been shown to be impractical and error-prone,” explained Jaeger. “By employing idealized primitives combined with a modular approach, we avoid the pitfalls associated with those methods. Our framework and definitions can be used to prove adaptive security in a variety of well-studied models, and they are easily applied to a variety of practical encryption schemes employed in real-world settings.”

One of the schemes for which they generated a positive proof was BurnBox, a system that enables users to temporarily revoke access from their devices to files stored in the cloud to preserve their privacy during compelled-access searches — for example, when an agent at a border crossing compels a traveler to unlock a laptop or smartphone to view its contents. In another analysis, the authors applied their framework to prove the security of a commonly used searchable symmetric encryption scheme for preserving the confidentiality of data and associated searches stored in the cloud. In both of the aforementioned examples, Jaeger and Tyagi showed that their approach produced simpler proofs while avoiding bugs contained in previous analyses. They also discussed how their framework could be extended beyond randomized symmetric encryption schemes currently in use to more modern nonce-based encryption — suggesting that their techniques will remain relevant and practical as the use of newer security schemes becomes more widespread.

“Joseph and Nirvan’s work fills an important void in the cryptographic literature and, surprisingly, identifies important aspects in assessing the security of real-world cryptographic systems that have been overlooked,” said Tessaro. “It also defines new security metrics according to which cryptographic systems ought to be assessed, and I can already envision several avenues of future research.”

Read the full research paper here.

Congratulations to Joseph and Nirvan!

August 31, 2020

New NSF AI Institute for Foundations of Machine Learning aims to address major research challenges in artificial intelligence and broaden participation in the field

National Science Foundation logo

The University of Washington is among the recipients of a five-year, $100 million investment announced today by the National Science Foundation (NSF) aimed at driving major advances in artificial intelligence research and education. The NSF AI Institute for Foundations of Machine Learning (IFML) — one of five new NSF AI Institutes around the country — will tap into the expertise of faculty in the Allen School’s Machine Learning group and the UW Department of Statistics in collaboration with the University of Texas at Austin, Wichita State University, Microsoft Research, and multiple industry and government partners. The new institute, which will be led by UT Austin, will address a set of fundamental problems in machine learning research to overcome current limitations of the field for the benefit of science and society.

“This institute tackles the foundational challenges that need to be solved to keep AI on its current trajectory and maximize its impact on science and technology,” said Allen School professor and lead co-principal investigator Sewoong Oh in a UW News release. “We plan to develop a toolkit of advanced algorithms for deep learning, create new methods for coping with the dynamic and noisy nature of training datasets, learn how to exploit structure in real-world data, and target more complex and real-world objectives. These four goals will help solve research challenges in multiple areas, including medical imaging and robot navigation.”

Oh is part of a group led by UW colleague Sham Kakade that will collaborate on the development of a toolkit of fast and efficient algorithms for training neural networks with provable guarantees. The group also aims to eliminate human bottlenecks associated with training machine learning models by constructing a new theoretical and algorithmic framework for neural architecture optimization (NAO). The latter has received minimal attention from researchers despite a broad range of potential applications, including the deployment of energy-efficient networks for edge computing and the Internet of Things, more transparent, interpretable models to replace so-called black-box predictions, and automated, user-friendly systems that enable developers to apply deep learning to real-world problems.

Sham Kakade (left) and Sewoong Oh

“The lack of science around NAO is a structural deficit within machine learning that makes us reliant on human intervention for hyper-parameter tuning, which is neither scalable nor efficient,” explained Kakade, who holds a joint appointment in the Allen School and the Department of Statistics. “Using techniques from mathematical optimization and optimal transport, we will automate the process to speed up the training pipeline while significantly reducing its carbon footprint to meet the growing need for academic and commercial applications. Our work will also provide a rigorous theoretical foundation for driving future advances in the field.”
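As a point of reference for what “automating the process” would replace, the snippet below sketches the brute-force baseline: an exhaustive grid search over a toy configuration space. The loss function, parameter names, and search space are invented for illustration and have no connection to the institute’s actual methods.

```python
from itertools import product

def grid_search(objective, space):
    """Brute-force hyperparameter search: evaluate every configuration
    in the grid and keep the one with the lowest objective value."""
    names = list(space)
    best_cfg, best_val = None, float("inf")
    for combo in product(*(space[n] for n in names)):
        cfg = dict(zip(names, combo))
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# Toy "validation loss" that happens to prefer 2 layers of width 64.
def toy_loss(cfg):
    return abs(cfg["layers"] - 2) + abs(cfg["width"] - 64) / 64

space = {"layers": [1, 2, 3, 4], "width": [16, 32, 64, 128]}
best_cfg, best_val = grid_search(toy_loss, space)
# Finds {"layers": 2, "width": 64}, but the cost grows exponentially
# with the number of hyperparameters, which is the scalability
# problem Kakade describes.
```

The exponential cost of this kind of search, multiplied across every model retraining, is the carbon and compute footprint that principled optimization-based approaches aim to reduce.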

In addition to making progress on NAO and other core machine learning problems, IFML researchers are keen to demonstrate how the results of their work can have real-world impact. To that end, they will apply the new tools and techniques they have developed to multiple use cases where machine learning holds the potential to advance the state of the art, including video compression and recognition, imaging tools for medical applications and circuit design, and robot navigation. The latter effort, which will be spearheaded by Allen School professor Byron Boots, seeks to overcome current limitations on the ability of robots to operate in unstructured environments under dynamic conditions while simultaneously reducing the training burden.

“Room layouts vary, objects can be moved, and humans are generally unpredictable. These conditions pose a challenge to the safe and reliable operation of robots alongside the many users, co-workers, and random passers-by who may share the same space,” noted Boots. “We need to broaden our concept of what constitutes a robot perception task, from one of pure recognition to one where the robot is capable of viewing the environment in the context of goals shaped by interaction and intention. I’m looking forward to working with this team to translate our foundational research into practical solutions for supporting this new paradigm.”

Byron Boots (left) and Jamie Morgenstern

On the human side, a major goal of the IFML is the broadening of participation in AI education and careers to meet expanding workforce needs and to ensure that the field reflects the diversity of society. Institute members will focus their education and workforce development efforts along the entire pipeline, from K-12 to graduate education. Their plans include development of course content for high school students who currently lack access to AI curriculum, the launch of a new initiative aimed at engaging more undergraduate students in AI research, and the build-out of a multi-state, online Master’s program that will leverage faculty from all three member institutions. Allen School professor Jamie Morgenstern, whose research focuses on the social impacts of machine learning, will lead the charge to implement Project 40×24, which aims to increase the number of women participating in AI to represent at least 40% of the field by the year 2024.

“Given the skyrocketing demand for expertise in AI across academia and industry, it should be a national priority to give students and working professionals access to high-quality educational opportunities in this field,” Morgenstern said. “We need to prepare more people from diverse backgrounds to actively participate in shaping the technologies that will have a growing impact on everyone’s lives. And we have a responsibility to ensure that new knowledge and economic opportunities generated by innovations in machine learning are broadly accessible to all.”

Zaid Harchaoui

Zaid Harchaoui, a professor in the Department of Statistics and an adjunct faculty member in the Allen School, rounds out the UW team.

The IFML is one of two NSF AI Institutes announced today with UW involvement. The other is the NSF AI Institute for Research on Trustworthy AI in Weather, Climate, and Coastal Oceanography led by the University of Oklahoma in collaboration with UW’s Evans School of Public Policy & Governance and other academic and industry partners. 

Each of the five inaugural NSF AI Institutes will receive $20 million over five years. NSF has cast today’s announcement as the start of a longer term commitment, as the agency anticipates making additional institute announcements in future. The initiative, which represents the United States’ most significant federal investment in AI research and education to date, is a partnership between NSF and the U.S. Department of Agriculture, U.S. Department of Homeland Security, and U.S. Department of Transportation.

Read the NSF announcement here, the UW News release here, and UT Austin’s IFML press release here. Learn more about the NSF AI Institutes here.

August 26, 2020

Allen School summer camp increases access to AI education

In July, the Allen School kicked off its inaugural AI4ALL summer program online. AI4ALL is a national program that works to diversify the field by introducing artificial intelligence (AI) to pre-college students from groups underrepresented in AI. The University of Washington’s debut this summer is the first instance of AI4ALL to focus on students with disabilities and their representation in AI.

The University of Washington joined the program this year, offering a free, two-week data science and AI workshop. Organized by the Allen School’s Taskar Center for Accessible Technology (TCAT), directed by Anat Caspi, the program shows students how to understand, analyze, interpret and discuss real-world applications of data science and machine learning, with the ultimate goal of having them understand the field’s impact and feel comfortable pursuing further work in it.

“The camp aims to increase diverse youth representation in computer science and to promote fair practices among data scientists. Our specific focus at the UW instance of AI4ALL is on fairness and non-ableism in AI,” said Dylan Cottrell, a UW alum, a content writer at the Taskar Center, and a teaching assistant in AI4ALL. “Students learn about artificial intelligence, with a specific emphasis on creating accessible technologies with non-biased data that is fit to serve the diverse community of people who use AI in their daily lives.” 

Students enrolled in the two-week camp had the opportunity to practice using tools in data science and machine learning while making connections with computer scientists and exploring the impacts and ethical implications of AI. Initially, it was intended to be a day camp for local students, but because the COVID-19 pandemic forced it online, students from around the country, plus one from Germany, were able to participate. In addition to reaching more students, Caspi found other benefits to the online program as well.

“By offering the AI4ALL program virtually this year, TCAT was able to avoid some of the logistic difficulties of getting students to the UW campus, whether due to geographic distance, travel disadvantage, or home situations that wouldn’t allow them to be away from home. We were able to include students with disabilities and students from other diverse backgrounds from both the East and West Coasts,” Caspi said. “I feel that working concurrently through multiple virtual platforms, including teleconferencing, team communication technologies, collaborative programming environments and more, we were able to encourage active learning among students who may not have felt comfortable — or able — to speak in a classroom setting. Instead, they had a choice of voicing their knowledge, questions, opinions or concerns via one of the many platforms we were using. Importantly, I believe every single student had their voice heard, knew they were valued, and understood that there was a spot for them at the broad ‘AI table’.”

Some of the planned camp activities could not be used virtually, Caspi explained; they were intended to integrate tactile, tangible technologies for students to work with, removing the primary dependency on vision that many demos rely on. She hopes that next year the program will be in person so that they can show students a robot demonstrating clustering algorithms. Such demonstrations would make the learning outcomes perceptible through multiple modes (visual, auditory and motion outputs), making one format accessible if another was not.

Caspi said that while some participants had some exposure to programming and FIRST Robotics, this was most likely the first time many were exposed to more formal principles of AI. For campers like Sophia Lin, a high school junior from Bellevue, Washington, AI4ALL was a great opportunity to learn more about the field.

“AI4ALL is an extremely fast-paced and challenging data science program. Throughout the first week, I gained insight into various conceptual aspects of the process of training a machine to classify based on interpretation of training data, and was introduced to rigorous programming libraries and data collections that visualize data,” Lin said. “I have been able to open my mind to a wide range of fields that machine learning and data visualization can be applied to, and appreciate having the opportunity to expand my analytical mindset and critical thinking skills. I hope to apply the knowledge and programming skills to future projects that address people with disabilities, to enable them to access and utilize products and services.”

Krithika Subramanian, a high school junior from Oviedo, Florida, thought the program gave her exposure to an area of computer science that she had never considered before.

“My experience in AI4ALL has been thought-provoking, overwhelming, and fun. It has shown me an infinite amount of ways on how AI can be used and the problems it can solve. AI4ALL has opened my eyes to the other sides of the computer science world,” Subramanian said. “Even with coding experience, it has pushed my boundaries, expanding them to encompass an unknown world. A world where plotting a regression line does not seem so hard or using large data sets to classify a small almost insignificant point. Overall, this is one of the most enjoyable classes I have taken this summer and AI4ALL has exposed me to one of the most interesting fields ever.”

This summer, the UW’s AI4ALL introduced 18 underrepresented students to data science and machine learning. The camp will continue next year; learn more about the program and how to enroll on the website.

August 18, 2020
