
Porcupine molecular tagging scheme offers a sharp contrast to conventional inventory control systems


Many people have had the experience of being poked in the back by those annoying plastic tags while trying on clothes in a store. That is just one example of radio frequency identification (RFID) technology, which has become a mainstay not just in retail but also in manufacturing, logistics, transportation, health care, and more. And who wouldn’t recognize the series of black and white lines comprising that old grocery-store standby, the scannable barcode? That invention — which originally dates back to the 1950s — eventually gave rise to the QR code, whose pixel patterns serve as a bridge between physical and digital content in the smartphone era.

Despite their near ubiquity, these object tagging systems have their shortcomings: they may be too large or inflexible for certain applications, they are easily damaged or removed, and they may be impractical to apply in high quantities. But recent advancements in DNA-based data storage and computation offer new possibilities for creating a tagging system that is smaller and lighter than conventional methods.

That’s the point of Porcupine, a new molecular tagging system introduced by University of Washington and Microsoft researchers that can be programmed and read within seconds using a portable nanopore device. In a new paper published in Nature Communications, the team in the Molecular Information Systems Laboratory (MISL) describes how dehydrated strands of synthetic DNA can take the place of bulky plastic or printed barcodes. Building on recent developments in nanopore-based DNA sequencing technologies and raw signal processing tools, the team’s inexpensive and user-friendly design eschews the need for access to specialized labs and equipment.

“Molecular tagging is not a new idea, but existing methods are still complicated and require access to a lab, which rules out many real-world scenarios,” said lead author Kathryn Doroschak, a Ph.D. student in the Allen School. “We designed the first portable, end-to-end molecular tagging system that enables rapid, on-demand encoding and decoding at scale, and which is more accessible than existing molecular tagging methods.”

Diagram of steps in Porcupine tagging system

Instead of radio waves or printed lines, the Porcupine tagging scheme relies on a set of distinct DNA strands called molbits — short for molecular bits — that incorporate highly separable nanopore signals to ease later readout. Each individual molbit comprises one of 96 unique barcode sequences combined with a longer DNA fragment selected from a set of predetermined sequence lengths. Under the Porcupine system, the binary 0s and 1s of a digital tag are signified by the presence or absence of each of the 96 molbits.
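As a rough illustration of that presence/absence scheme, the sketch below treats a digital tag as a bit string in which bit i set to 1 means molbit i is mixed into the physical tag. This is an illustrative sketch only: the function names and the eight-bit example message are invented here, and the error correction and signal-level details the real system relies on for reliable readout are omitted.

NUM_MOLBITS = 96  # the initial Porcupine barcode set described in this article

def tag_to_molbits(bits):
    """Map a binary tag (up to 96 bits) to the set of molbit indices to include in the mix."""
    assert len(bits) <= NUM_MOLBITS
    return {i for i, b in enumerate(bits) if b == 1}

def molbits_to_tag(present, length):
    """Recover the binary tag from the set of molbits detected during nanopore readout."""
    return [1 if i in present else 0 for i in range(length)]

# Example: encode the 8-bit ASCII code for "M", the first letter of the lab's "MISL" demo message
tag_bits = [int(b) for b in format(ord("M"), "08b")]
molbit_set = tag_to_molbits(tag_bits)
assert molbits_to_tag(molbit_set, len(tag_bits)) == tag_bits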

“We wanted to prove the concept while achieving a high rate of accuracy, hence the initial 96 barcodes, but we intentionally designed our system to be modular and extensible,” explained MISL co-director Karin Strauss, senior principal research manager at Microsoft Research and affiliate professor in the Allen School. “With these initial barcodes, Porcupine can produce roughly 4.2 billion unique tags using basic laboratory equipment without compromising reliability upon readout.”

Although DNA is notoriously expensive to read and write, Porcupine gets around this by presynthesizing the fragments of DNA. In addition to lowering the cost, this approach has the added advantage of enabling users to arbitrarily mix existing strands to quickly and easily create new tags. The molbits are prepared for readout during initial tag assembly and then dehydrated to extend shelf life of the tags. This approach protects against contamination from other DNA present in the environment while simultaneously reducing readout time later.

Another advantage of the Porcupine system is that molbits are extremely tiny, measuring only a few hundred nanometers in length. In practical terms, this means each molecular tag is small enough to fit over a billion copies within one square millimeter of an object’s surface. This makes them ideal for keeping tabs on small items or flexible surfaces that aren’t suited to conventional tagging methods. Invisible to the naked eye, the nanoscale form factor also adds another layer of security compared to conventional tags.

The Porcupine team: (top, from left) Kathryn Doroschak, Karen Zhang, Melissa Queen, Aishwarya Mandyam; (bottom, from left) Karin Strauss, Luis Ceze, Jeff Nivala

“Unlike existing inventory control methods, DNA tags can’t be detected by sight or touch. Practically speaking, this means they are difficult to tamper with,” explained co-author Jeff Nivala, a research scientist at the Allen School. “This makes them ideal for tracking high-value items and separating legitimate goods from forgeries. A system like Porcupine could also be used to track important documents. For example, you could envision molecular tagging being used to track voters’ ballots and prevent tampering in future elections.”

To read the data in a Porcupine tag, a user rehydrates the tag and runs it through a portable MinION device from Oxford Nanopore Technologies. To demonstrate, the researchers encoded and then decoded their lab acronym, “MISL,” reliably and within a few seconds using the Porcupine system. As advancements in nanopore technologies make them increasingly affordable, the team believes molecular tagging could become an attractive option in a variety of real-world settings.

“Porcupine is one more exciting example of a hybrid molecular-electronic system, combining molecular engineering, new sensing technology and machine learning to enable new applications,” said Allen School professor and MISL co-director Luis Ceze.

In addition to Ceze, Doroschak, Nivala and Strauss, contributors to the project include Allen School undergraduate Karen Zhang, master’s student Aishwarya Mandyam, and Ph.D. student Melissa Queen. This research was funded in part by the Defense Advanced Research Projects Agency (DARPA) under its Molecular Informatics Program and gifts from Microsoft.

Read the paper in Nature Communications here.


Professor Jeffrey Heer and alumnus Dominik Moritz honored at IEEE VIS for outstanding contributions in data visualization

Heer (left) and Moritz

Professor Jeffrey Heer, who leads the Allen School’s Interactive Data Lab, and former student Dominik Moritz (Ph.D., ‘19), who was co-advised by Bill Howe, an Allen School adjunct professor and professor in the UW Information School, were each honored for their impact on interactive visualization research at IEEE VIS 2020 this week, the flagship conference in the field of visualization and visual analytics. Heer received the IEEE InfoVis 10-Year Test of Time Award for his 2010 paper, “Narrative Visualization: Telling Stories with Data,” while Moritz received the IEEE Visualization and Graphics Technical Committee (VGTC) Doctoral Dissertation Award for his thesis “Interactive Systems for Scalable Visualization and Analysis.”

In the winning Test of Time paper, Heer and co-author Edward Segel explored how visual data enhances journalistic storytelling and studied design strategies for narrative visualization. The paper helped to frame and advance research into the use of visualization for journalistic reporting and storytelling. Since then, it has been widely cited and influential in the fields of both visualization and data-driven journalism.

Fascinated by the growing use of visualizations in online journalism, Heer and Segel built a catalog of examples to identify distinct genres of narrative visualization. The two characterized differences in design and messaging, and found that many examples could have been made more dynamic with the help of more sophisticated online tools — including those that allow interactive exploration by the reader.

When the paper was originally published, Heer was a professor of computer science at Stanford University and Segel was a master’s student. Together, they created a comprehensive framework of design strategies for narrative visualization. 

“We wanted to better understand the innovative work of data journalists and designers whose insights we hoped to give further reach with our paper,” Heer said. “From the framework of our research, we found promising yet under-utilized approaches to integrating visualization with other media, and the potential for improved user interfaces for crafting data stories.”

Heer had already started to develop a series of robust tools for producing interactive visualizations on the web. As a graduate student, he helped to create Prefuse, one of the first software frameworks for information visualization, and Flare, a version of Prefuse built for Adobe Flash that was partly informed by his work in animated transitions. This latest research with Segel focused on a central concern in the design of narrative visualizations: the balance between author-driven elements that provide narrative structure and messaging, and reader-driven elements that enable interactive exploration and social sharing. This work helped to identify successful design practices that guided the development of new narrative visualization tools.

Since joining the Allen School faculty in 2013, Heer has worked on a suite of complementary tools for data analysis and visualization design built on Vega, a declarative language for producing interactive visualizations. These tools include Lyra, an interactive environment for generating customized visualizations, and Voyager, a recommendation-powered visualization browser. In 2017 he was recognized with the IEEE Visualization Technical Achievement Award and the ACM Grace Murray Hopper Award for his significant technical contributions early in his career.

Vega led to Vega-Lite, a project that earned Heer and Moritz — now a professor in the Human-Computer Interaction Institute at Carnegie Mellon — a Best Paper Award at InfoVis 2016 along with their collaborators. Vega-Lite is a high-level grammar for rapid and concise specification of interactive data visualizations. The goal was to enable non-programmers to create sophisticated visualizations that can be generated automatically. That project and others formed the basis of Moritz’s 2019 dissertation, which made a number of contributions spanning formal languages, automatic reasoning for visualization design, and novel approaches for scaling interactive visualization to massive datasets, for which he was honored at this year’s VIS conference.
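To give a flavor of what that grammar looks like in practice (an illustrative example only; the dataset, field names, and selection name below are invented rather than drawn from the papers), a complete bar chart with an interactive point selection can be written as a small declarative specification, shown here as a Python dictionary mirroring the underlying JSON:

vega_lite_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [
        {"category": "A", "amount": 28},
        {"category": "B", "amount": 55},
        {"category": "C", "amount": 43},
    ]},
    "mark": "bar",
    # a single declaration adds an interactive selection
    "params": [{"name": "pick", "select": "point"}],
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "amount", "type": "quantitative"},
        # selected bars stay fully opaque; everything else fades
        "opacity": {"condition": {"param": "pick", "value": 1.0}, "value": 0.4},
    },
}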

One of those contributions was Draco, an open-source, constraint-based system that formalized guidelines for visualization design and their application in visualization tools. The system, which earned Moritz and his colleagues a Best Paper Award at InfoVis 2018, offers a one-stop shop for researchers and practitioners to apply and test a set of accepted design principles and preferences and to make adjustments to their visualizations based on the results. To expand the application of user-friendly visualization tools to larger datasets, Moritz introduced Pangloss, which enables analysts to interactively explore approximate results pending completion of long-running queries. Pangloss generates visualizations based on samples while queries are ongoing, with the ability to detect and correct errors later. Moritz followed that up with Falcon, a web-based system that supports real-time exploration of billion-record datasets by enabling low-latency interactions across linked visualizations.

Moritz’s interactive systems for visualization and analysis have seen widespread adoption in the Python and JavaScript communities. 

Congratulations Jeff and Dominik! 


Inspired by personal experience, Allen School student and ACM-W leader Nayha Auradkar is working to advance accessibility and inclusion both in and out of the lab

Nayha Auradkar

Our first undergraduate student spotlight of the new academic year features Nayha Auradkar, a junior from Sammamish, Washington who is majoring in computer science with a minor in neural computation and engineering. Auradkar currently serves as chair of the University of Washington chapter of the Association for Computing Machinery for Women (ACM-W), working to cultivate a strong, supportive community of women in the Allen School. In her leadership role, she hopes to increase programming, engagement and awareness among the organization’s members while building relationships with other minority students pursuing education in technology-related fields. 

Allen School: Congratulations on becoming chair of ACM-W! What interested you in the position?

Nayha Auradkar: I believe that diversity in tech is essential for innovation and for building equitable communities. When we bring together people from various backgrounds, we gain new ideas and approaches when solving problems so that everyone, regardless of their background, benefits from technology. 

I joined ACM-W my freshman year because I am passionate about diversity in tech and ensuring that women of all backgrounds are able to reach their full potential. In my sophomore year, I served as the public relations officer of ACM-W. This year, I was elected as chair. It is an honor to be leading an organization with such an important and powerful mission. I love the community of strong, supportive, and inspiring women I have met along the way. 

Allen School: What are your goals for the group during your tenure as chair? 

NA: As chair, my goals are to lead and strengthen the ACM-W community by creating opportunities to enable more active involvement of CSE in ACM-W. We recently created a membership system to make ACM-W more of a close-knit affinity group, and we have incorporated regular member meetings and socials into our event timeline. Additionally, we are launching committees to create more leadership opportunities within ACM-W. 

Another goal I have is to recognize the intersectionality of being a woman in tech with other aspects of identity, such as disability and race. This is important to discuss when talking about diversity, so we don’t leave behind groups that are marginalized the most. We are working on incorporating intersectionality themes into our quarterly diversity discussion events.

Allen School: How are you handling the challenges of organizing remote meetings and programs during a pandemic?

NA: Building community remotely and welcoming new students is especially challenging, which is why we are working to create more ACM-W member events and activities like member meetings, social events, and committee leadership positions. 

Something I have really emphasized is making our events as accessible as possible. This is critically important, especially in a virtual world, and should be established as a norm. 

Allen School: Why did you choose to study computer science?

NA: I chose to study CS because of the positive impact it can have on so many different disciplines. I am particularly interested in using CS in fields like neural engineering as well as applying CS to create technology that supports accessibility and inclusion for people with disabilities.

Allen School: What do you enjoy most about the Allen School?

NA: The people! I know I can always walk into the labs and find a friendly face. The advisers are willing to talk to and support students in anything. I feel a strong sense of community in the Allen School.

I also enjoy the rigor of the classes and the incredible research opportunities. Faculty at the Allen School are leaders in their fields, and I am grateful for the opportunity to learn from such accomplished and passionate individuals.

Allen School: Speaking of research opportunities, what kind of projects are you working on in the Make4all Lab, and why did you choose that lab in particular?

NA: As someone with a disability, I am deeply passionate about creating technologies that support accessibility and inclusion. I have had a stutter, a neurological condition, for my whole life, and my experiences as a person who stutters have shaped my interests and have allowed me to have empathy for others with similar experiences. Working in the Make4all Lab, led by Professor Jennifer Mankoff, has given me experience in Human-Computer Interaction (HCI) studies and accessible technology. I hope to carry this knowledge to wherever I end up going, so I can have a positive impact on accessibility efforts.

The project I am currently working on with my mentors Kelly Mack and Megan Hofmann is an HCI study focusing on quantitative and qualitative analysis of features of personal protective equipment designed in response to the COVID-19 pandemic.

Allen School: You also had the opportunity to intern at JP Morgan Chase over the summer. How was that experience? 

NA: My intern team worked on JP Morgan’s Tech for Social Good team, and we built a web application to support a computer science education nonprofit. I loved having the opportunity to make a measurable impact on a nonprofit that supports underrepresented minorities in tech. I also worked with financial applications of technology, such as data visualization in investment banking and security in asset management. It was interesting to see so many applications of CS to areas I previously did not know much about. 

The Allen School and ACM-W community is lucky to have a thoughtful, inclusive leader like you, Nayha. Thanks for all that you do! 


Allen School’s Distinguished Lecture Series delves into the future of computing and its influence in our lives

Top row from left: Cory Doctorow, Scott Aaronson, Kunle Olukotun; bottom row from left: Brad Calder and Sarita Adve

Save the dates! Another exciting season of the Allen School’s Distinguished Lecture Series kicks off on Oct. 29. Join us to hear experts in technology activism, quantum computational supremacy, multicore processing, cloud infrastructure, and computer architecture.

All lectures, with the exception of the November 19th Lytle Lecture, will be live streamed on the Allen School’s YouTube channel at 3:30 p.m. Pacific Time on the presentation date.

Oct. 29: Cory Doctorow, journalist, activist, science-fiction author

Cory Doctorow, an author of science-fiction and non-fiction books, a journalist and technology activist, will deliver a talk on Oct. 29 about “Early-Onset Oppenheimers.” His presentation will highlight the liberatory power technology workers have in what they design, build and share with the public, and how to convince these workers to use their power to deliver the same liberation to their users, rather than confiscating their users’ freedom.

Doctorow, who created craphound.com, champions liberalizing copyright laws and co-founded the UK Open Rights Group. He works in digital rights management, file sharing and post-scarcity economics. He has written more than 20 books and has published an extensive collection of short stories and essays. He is a contributing writer to Wired magazine, The New York Times Sunday Magazine, The Globe and Mail, Asimov’s Science Fiction magazine, and the Boston Globe. Doctorow is the former European Director of the Electronic Frontier Foundation and is an MIT Media Lab research affiliate, a visiting professor of computer science at Open University, and a visiting professor of practice at the University of North Carolina’s School of Library and Information Science.

Nov. 19: Scott Aaronson, David J. Bruton Centennial Professor of Computer Science and founding director, Quantum Information Center at the University of Texas at Austin 

Scott Aaronson is the Department of Electrical and Computer Engineering’s 2020 Lytle Lecturer and will deliver a talk on Nov. 19 about “Quantum Computational Supremacy and its Applications.” He will discuss Google’s first-ever demonstration of quantum computational supremacy, achieved with Sycamore, a 53-qubit programmable superconducting chip. His talk will address questions about the chip, such as what problem Sycamore solved, how to verify the outputs using a classical computer, and how confident researchers are that the problem is classically hard — especially in light of subsequent counterclaims by IBM and others.

Aaronson focuses on the capabilities and limitations of quantum computers as well as computational complexity in general. His most recent work aims to demonstrate the quantum computing speedup that future technologies will help to create. In addition to his research, Aaronson writes about quantum computing for Scientific American, The New York Times, and on his own popular blog, Shtetl-Optimized. Prior to joining UT Austin, Aaronson spent nine years as a professor in electrical engineering and computer science at MIT. During that time he wrote his first book, Quantum Computing Since Democritus, about deep ideas in math, computer science and physics.

The Lytle Lecture will be broadcast via Zoom. When available, the link will be posted here.

Dec. 3: Kunle Olukotun, Cadence Design Systems Professor in the School of Engineering and professor of electrical engineering and computer science at Stanford University 

Known as the father of the multicore processor and the leader of the Stanford Hydra chip multiprocessor (CMP) research project, Kunle Olukotun has been a trailblazer in processor design. He founded Afara Websystems, a company that built high-throughput, low-power multicore processors for server systems, saving power and space in data centers; the company was subsequently acquired by Sun Microsystems. Olukotun is actively involved in research in computer architecture, parallel programming environments and scalable parallel systems, and he currently co-leads the Transactional Coherence and Consistency project aimed at making parallel programming accessible to average programmers. He has designed multicore CPUs and GPUs, transactional memory technology, and domain-specific languages and programming models. He also directs the Stanford Pervasive Parallelism Lab (PPL), which focuses on the use of heterogeneous parallelism in all application areas using domain-specific languages.

Dec. 10: Brad Calder, vice president of Product and Engineering of Technical Infrastructure and Cloud at Google

Allen School alumnus (B.S. ‘91) Brad Calder is the Vice President of Product and Engineering of Technical Infrastructure and Cloud at Google. There, he oversees compute, networking, storage, database, and data analytics services to provide customers more ways to connect to Google’s cloud computing services. Prior to joining Google, Calder was a vice president of engineering at Microsoft Azure and was on the founding team that started Azure in 2006. Before that, Calder was a tenured professor in the University of California, San Diego’s Department of Computer Science and Engineering, where he published over 100 papers in the areas of systems, architecture and compilers and co-directed the High Performance Processor Architecture and Compilation Lab.

Feb. 11: Sarita Adve, Richard T. Cheng Professor of Computer Science at the University of Illinois at Urbana-Champaign

With research interests that span computer architecture, programming languages, operating systems and applications, Sarita Adve has devoted her career to advancing innovation at the hardware-software interface. Adve co-developed the memory models for the C++ and Java programming languages, based on her work in data-race-free (DRF) models, and has made significant contributions to cache coherence, hardware reliability, and the exploitation of instruction-level parallelism (ILP) for memory-level parallelism. She also led the design of one of the first systems to implement cross-layer energy management as well as the development of the widely used RSIM architecture simulator. Her current research focuses on scalable system specialization and approximate computing. Adve was the first woman of South Asian origin to be named a fellow of the Association for Computing Machinery, and the first woman to earn a career award for computer architecture research when she received the ACM SIGARCH Maurice Wilkes Award.

For more details and future updates, be sure to check out our Distinguished Lecture Series page. And please plan to join us online!


Allen School professor Yin Tat Lee earns Packard Fellowship to advance the fundamentals of modern computing

Yin Tat Lee, a professor in the Allen School’s Theory of Computation group and visiting researcher at Microsoft Research, has earned a Packard Fellowship for Science and Engineering for his work on faster optimization algorithms that are fundamental to the theory and practice of computing and many other fields, from mathematics and statistics, to economics and operations research. Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows. 

“In a year when we are confronted by the devastating impacts of a global pandemic, racial injustice, and climate change, these 20 scientists and engineers offer us a ray of hope for the future,” Frances Arnold, Packard Fellowships Advisory Panel Chair and 2018 Nobel Laureate in Chemistry, said in a press release. “Through their research, creativity, and mentorship to their students and labs, these young leaders will help equip us all to better understand and address the problems we face.”

Lee’s creative approach to addressing fundamental problems in computer science became apparent during his time as a Ph.D. student at MIT, where he earned the George M. Sprowls Award for outstanding doctoral thesis for advancing state-of-the-art solutions to important problems in linear programming, convex programming, and maximum flow. Lee’s philosophy toward research hinges on a departure from the conventional approach taken by many theory researchers, who tend to view problems in continuous optimization and in combinatorial, or discrete, optimization in isolation. Among his earliest successes was a new interior point method for solving general linear programs that produced the first significant improvement in the running time of linear programming in more than two decades — a development that earned him and his collaborators both the Best Student Paper Award and a Best Paper Award at the IEEE Symposium on Foundations of Computer Science (FOCS 2014). Around that same time, Lee also contributed to a new approximate solution to the maximum flow problem in near-linear time, for which he and the team were recognized with a Best Paper Award at the ACM-SIAM Symposium on Discrete Algorithms (SODA 2014). The following year, Lee and his colleagues once again received a Best Paper Award at FOCS, this time for unveiling a faster cutting plane method for solving convex optimization problems in near-cubic time.

Since his arrival at the University of Washington in 2017, Lee has continued to show his eagerness to apply techniques from one area of theoretical computer science to another in unexpected ways — often to great effect. 

“Even at this early stage in his career, Yin Tat is regarded as a revolutionary figure in convex optimization and its applications in combinatorial optimization and machine learning,” observed his Allen School colleague James Lee. “He often picks up new technical tools as if they were second nature and then applies them in remarkable and unexpected ways. But it’s at least as surprising when he uses standard tools and still manages to break new ground on long-standing open problems!”

One of those problems involved the question of how to optimize non-smooth convex functions in distributed networks to enable the efficient deployment of machine learning applications that rely on massive datasets. Researchers had already made progress in optimizing the trade-offs between computation and communication time for smooth and strongly convex functions in such networks; Lee and his collaborators were the first to extend a similar theoretical analysis to non-smooth convex functions. The outcome was a pair of new algorithms capable of achieving optimal convergence rates for this more challenging class of functions — and yet another Best Paper Award for Lee, this time from the flagship venue for developments in machine learning research, the Conference on Neural Information Processing Systems (NeurIPS 2018).

Since then, Lee’s contributions have included the first algorithm capable of solving dense bipartite matching in nearly linear time, and a new framework for solving linear programs as fast as linear systems for the first time. The latter work incorporates new techniques that are extensible to a broader class of convex optimization problems.

Having earned a reputation as a prolific researcher — he once set a record for the total number of papers from the same author accepted at one of the top theory conferences, the ACM Symposium on Theory of Computing (STOC), in one year — Lee also has received numerous accolades for the quality and impact of his work. These include a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, a National Science Foundation CAREER Award, and the A.W. Tucker Prize from the Mathematical Optimization Society.

“Convex optimization is the workhorse that powers much of modern machine learning, and therefore, modern computing. Yin Tat is not only a pivotal figure in the theory that underpins our field, but also one of the brightest young stars in all of computer science,” said Magdalena Balazinska, professor and director of the Allen School. “Combined with his boundless curiosity and passion for collaboration, Yin Tat’s depth of knowledge and technical skill hold the promise for many future breakthroughs. We are extremely proud to have him as a member of the Allen School faculty.”

Lee is the fifth Allen School faculty member to be recognized by the Packard Foundation. As one of the largest nongovernmental fellowships in the country supporting science and engineering research, the Packard Fellowship provides $875,000 over five years to each recipient to grant them the freedom and flexibility to pursue big ideas.

Read the Packard Foundation announcement here.

Congratulations, Yin Tat!


Allen School, UCLA and NTT Research cryptographers solve decades-old problem by proving the security of indistinguishability obfuscation

Allen School professor Rachel Lin helped solve a decades-old problem of how to prove the security of indistinguishability obfuscation (iO)

Over the past 20 years, indistinguishability obfuscation (iO) has emerged as a potentially powerful cryptographic method for securing computer programs by making them unintelligible to would-be hackers while retaining their functionality. While the mathematical foundation of this approach was formalized back in 2001 and has spawned more than 100 papers on the subject, much of the follow-up work relied upon new hardness assumptions specific to each application — assumptions that, in some cases, have been broken through subsequent cryptanalysis. Since then, researchers have been stymied in their efforts to achieve provable security guarantees for iO from well-studied hardness assumptions, leaving the concept of iO security on shaky ground.

That is, until now. In a new paper recently posted on public archives, a team that includes University of California, Los Angeles graduate student and NTT Research intern Aayush Jain; Allen School professor Huijia (Rachel) Lin; and professor Amit Sahai, director of the Center for Encrypted Functionalities at UCLA, has produced a theoretical breakthrough that, as the authors describe it, finally puts iO on terra firma. In their paper “Indistinguishability Obfuscation from Well-Founded Assumptions,” the authors show, for the first time, that provably secure iO can be constructed from the subexponential hardness of four well-founded assumptions, all of which have a long history of study rooted in complexity, coding, and number theory: Symmetric External Diffie-Hellman (SXDH) on pairing groups, Learning with Errors (LWE), Learning Parity with Noise (LPN) over large fields, and a Boolean Pseudo-Random Generator (PRG) that is very simple to compute.

Previous work on this topic has established that, to achieve iO, it is sufficient to assume LWE, SXDH, PRG in NC0 — a very simple model of computation in which every output bit depends on a constant number of input bits — and one other object. That object, in this case, is a structured-seed PRG (sPRG) with polynomial stretch and special efficiency properties, the seed of which consists of both a public and a private part. The sPRG is designed to maintain its pseudo-randomness even when an adversary can see the public seed as well as the output of the sPRG. One of the key contributions from the team’s paper is a new and simple way to leverage LPN over fields and PRG in NC0 to build an sPRG for this purpose.
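To give a sense of what “every output bit depends on a constant number of input bits” means, here is a toy sketch of a local PRG in that spirit. It is not the construction from the paper: the five-tap XOR-AND predicate and the fixed public tap pattern are illustrative choices borrowed from the local-PRG literature, and no security claim is intended.

import random

def local_prg(seed, stretch, taps_rng):
    """Expand a secret 0/1 seed into `stretch` output bits; each output bit reads only 5 seed bits."""
    n = len(seed)
    out = []
    for _ in range(stretch):
        a, b, c, d, e = (taps_rng.randrange(n) for _ in range(5))  # constant locality: 5 taps per bit
        out.append(seed[a] ^ seed[b] ^ seed[c] ^ (seed[d] & seed[e]))  # simple fixed predicate
    return out

seed = [random.randint(0, 1) for _ in range(64)]   # secret seed
taps = random.Random(0)                            # public, fixed tap pattern
stream = local_prg(seed, 128, taps)                # polynomial stretch: 128 output bits from 64 seed bits

A structured-seed PRG extends this picture by splitting the seed into a public part and a private part while preserving pseudo-randomness, which is the object the new paper builds from LPN over fields and a PRG in NC0.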

Co-authors Aayush Jain (left) and Amit Sahai

“I am excited that iO can now be based on well-founded assumptions,” said Lin. “This work was the result of an amazing collaboration with Aayush Jain and Amit Sahai, spanning more than two years of effort.

“The next step is further pushing the envelope and constructing iO from weaker assumptions,” she explained. “At the same time, we shall try to improve the efficiency of the solutions, which at the moment is very far from being practical.”

This is not the first time Lin has contributed to significant advancements in iO in a quest to bring one of the most advanced cryptographic objects into the mainstream. In previous work, she established a connection between iO and PRG to prove that constant-degree, rather than high-degree, multilinear maps are sufficient for obfuscating programs. She subsequently refined that work with Allen School colleague Stefano Tessaro to reduce the degree of multilinear maps required to construct iO from more than 30 to just three. 

More recently, Lin worked with Jain, Sahai, and UC Santa Barbara professor Prabhanjan Ananth and then-postdoc Christian Matt on a new method for constructing iO without multilinear maps through the use of certain pseudo-random generators with special properties, formalized as Pseudo Flawed-smudging Generators (PFG) or perturbation resilient generators (ΔRG). In separate papers, Lin and her co-authors introduced partially hiding functional encryption for constant-degree polynomials or even branching programs, based only on the SXDH assumption over bilinear groups. Though these works still relied on new assumptions in order to achieve iO, they offered useful tools and ideas that paved the way to the recent new construction.

Lin, Jain and Sahai aim to build on their latest breakthrough to make the solution more efficient so that it works not just on paper but also in real-world applications.

“These are ambitious goals that will need the joint effort from the entire cryptography community. I look forward to working on these questions and being part of the effort,” Lin concluded.

Read the research paper here, and the press release here. Read a related Quanta magazine article here.


Manaswi Saha wins 2020 Google Fellowship for advancing computing research with social impact

Manaswi Saha, a Ph.D. student working with Allen School professor Jon Froehlich, has been named a 2020 Google Ph.D. Fellow for her work in human-computer interaction focused on assistive technologies and artificial intelligence for social good. Her research focuses on collecting data and building tools that can improve the understanding of urban accessibility and serve as a mechanism for advocacy, urban planning, and policymaking.

Saha, who is one of 53 students throughout the world to be selected for a Google Fellowship, will use those tools to fill an informational gap between citizens, local government, and other stakeholders, showing where sidewalks need improvement to make them accessible to all.

“Since the beginning of my academic career, my research interests have been towards socially impactful projects. Public service, especially for underrepresented communities, runs in my family,” Saha said. “The driving force for the work I do stems from my role model, my father, who dedicated his life towards rural and agricultural development in India. His selfless efforts inspired me to explore how technology can be used for the betterment of society. With this goal in mind, I set out to do my Ph.D. with a focus on high-value social problems.”

Saha works with Froehlich in the Makeability Lab on one of its flagship ventures, Project Sidewalk. The project has two goals: to develop and study data collection methods for acquiring street-level accessibility information using crowdsourcing, machine learning, and online map imagery and to design and develop navigation and map tools for accessibility. 

To start, Saha led the pilot deployment of the Project Sidewalk tool for data collection in Washington, D.C. During the 18-month study, some 800 volunteer crowdworkers virtually walked city streets using Google Street View, collecting sidewalk accessibility labels and remotely reporting pedestrian-related accessibility problems such as missing curb ramps, cracked sidewalks, missing sidewalks and obstacles. Saha was the lead author on the paper presenting the team’s work, “Project Sidewalk: A Web-based Crowdsourcing Tool for Collecting Sidewalk Accessibility Data At Scale,” which earned a Best Paper Award at CHI 2019.

“Because Project Sidewalk is publicly deployed, the tool must work robustly — so code is carefully tested and reviewed — a somewhat slow and arduous process, particularly for academics used to building fast research prototypes,” Froehlich said. “Project Sidewalk is not an easy project; however, Manaswi performed admirably. As lead graduate student, Manaswi helped co-manage the team of students, ideate, design, and implement new features, brainstorm research questions and corresponding study protocols, and help execute the studies themselves.”

With the initial Washington, D.C. data collection complete — and data gathering currently underway in Seattle; Newberg, Oregon; Columbus, Ohio; and Mexico City and San Pedro Garza García in Mexico — Saha conducted a formative study to understand visualization needs. She is using what she learns to build an interactive web visualization tool that will answer questions about accessibility for a variety of stakeholders, including people with mobility disabilities, caregivers, local government officials, policymakers, and accessibility advocates. The tool will allow cities to see where they need to allocate resources to address gaps in accessibility. Saha won an Amazon Catalyst Award last year to help fund this research.

Additionally, Saha has conducted research that goes beyond the physical barriers facing people with disabilities to also examine the socio-political challenges that impede accessible infrastructure development. She will publish and present a paper detailing her findings later this month at the 23rd ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2020).

Saha also authored a paper that appeared at last year’s ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2019) on  shortcomings in GPS wayfinding tools that lead to problems for visually impaired pedestrians. Saha and her collaborators found that while users can get to a desired vicinity, they often struggle to find the exact location because the GPS tools are not specific enough. The paper, which Saha worked on while an intern at Microsoft Research, addressed this challenge, along with exploring implications for future systems to support precise navigation for people with visual impairments.

In addition to being a student and researcher, Saha is a teaching assistant (CSE482A: Capstone Software Design to Empower Underserved Populations; CSE599H: Crowdsourcing, Citizen Science, and Large-scale Online Experimentation; CSE599S: The Future of Access Technologies; CSE441: Advanced HCI: Advanced User Interface Design, Prototyping, And Evaluation; CSE440: Introduction to HCI), a volunteer (CHI ‘16, CHI ‘18 and Girls Who Code at Adobe) and a mentor to undergraduate researchers. 

Since 2009, the Google Ph.D. Fellowship program has recognized and supported exceptional graduate students working in core and emerging areas of computer science. Previous Allen School recipients include Hao Peng (2019), Joseph Redmon (2018), Tianqi Chen and Arvind Satyanarayan (2016), Aaron Parks and Kyle Rector (2015), and Robert Gens and Vincent Liu (2014). Learn more about the 2020 Google Fellowships here.

Congratulations, Manaswi — and thanks for all of your contributions to the Allen School community! 


Garbage in, garbage out: Allen School and AI2 researchers examine how toxic online content can lead natural language models astray

Metal garbage can in front of brick wall
Photo credit: Pete Willis on Unsplash

In the spring of 2016, social media users turned a friendly online chatbot named Tay — a seemingly innocuous experiment by Microsoft in which the company invited the public to engage with its work in conversational learning — into a racist, misogynistic potty mouth that the company was compelled to take offline the very same day that it launched. Two years later, Google released its Smart Compose tool for Gmail, a feature designed to make drafting emails more efficient by suggesting how to complete partially typed sentences. That feature had an unfortunate tendency to suggest a bias towards men, leading the company to eschew the use of gendered pronouns altogether.

These and other examples serve as a stark illustration of that old computing adage “garbage in, garbage out,” acknowledging that a program’s outputs can only be as good as its inputs. Now, thanks to a team of researchers at the Allen School and Allen Institute for Artificial Intelligence (AI2), there is a methodology for examining just how trashy some of those inputs might be when it comes to pretrained neural language models — and how this causes the models themselves to degenerate into purveyors of toxic content. 

The problem, as Allen School Master’s student Samuel Gehman (B.S., ‘19) explains, is that not all web text is created equal.

“The massive trove of text on the web is an efficient way to train a model to produce coherent, human-like text of its own. But as anyone who has spent time on Reddit or in the comments section of a news article can tell you, plenty of web content is inaccurate or downright offensive,” noted Gehman. “Unfortunately, this means that in addition to higher quality, more factually reliable data drawn from news sites and similar sources, these models also take their cues from low-quality or controversial sources. And that can lead them to churn out low-quality, controversial content.”

The team analyzed how many tries it would take for popular language models to produce toxic content and found that most have at least one problematic generation in 100 tries.

Gehman and the team set out to measure how easily popular neural language models such as GPT-1, GPT-2, and CTRL would begin to generate problematic outputs. The researchers evaluated the models using a testbed they created called RealToxicityPrompts, which contains 100,000 naturally occurring English-language prompts,  i.e., sentence prefixes, that models have to finish. What they discovered was that all three were prone to toxic degeneration even with seemingly innocuous prompts; the models began generating toxic content within 100 generations, and exceeded expected maximum toxicity levels within 1,000 generations.
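The flavor of that measurement can be sketched in a few lines. This is a simplified outline only: generate_continuation and toxicity_score are hypothetical stand-ins for a language model’s sampler and a toxicity classifier, and neither name comes from the RealToxicityPrompts code.

def expected_max_toxicity(prompts, generate_continuation, toxicity_score, k=25):
    """For each prompt, sample k continuations and keep the worst toxicity score,
    then average those per-prompt maxima over the whole prompt set."""
    per_prompt_maxima = []
    for prompt in prompts:
        scores = [toxicity_score(generate_continuation(prompt)) for _ in range(k)]
        per_prompt_maxima.append(max(scores))
    return sum(per_prompt_maxima) / len(per_prompt_maxima)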

The team — which includes lead author Gehman, Ph.D. students Suchin Gururangan and Maarten Sap, and Allen School professors and AI2 researchers Yejin Choi and Noah Smith — published its findings in a paper due to appear in Findings of the Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2020).

“We found that if just 4% of your training data is what we would call ‘highly toxic,’ that’s enough to make these models produce toxic content, and to do so rather quickly,” explained Gururangan. “Our research also indicates that existing techniques that could prevent such behavior are not effective enough to safely release these models into the wild.”

That approach, in fact, can backfire in unexpected ways, which brings us back around to Tay — or rather, Tay’s younger “sibling,” Zo. When Microsoft attempted to rectify the elder chatbot’s propensity for going on racist rants, it scrubbed Zo clean of any hint of political incorrectness. The result was a chatbot that refused to discuss any topic suggestive of religion or politics, such as the time a reporter simply mentioned that they live in Iraq and wear a hijab. When the conversation steered towards such topics, Zo’s response would become agitated; if pressed, the chatbot might terminate the conversation altogether.

As an alternative to making certain words or topics automatically off-limits — a straightforward solution but one that lacked nuance, as evidenced by Zo’s refusal to discuss subjects that her filters deemed controversial whether they were or not — Gururangan and his collaborators explored how the use of steering methods such as the fine-tuning of a model with the help of non-toxic data might alleviate the problem. They found that domain-adaptive pre-training (DAPT), vocabulary shifting, and PPLM decoding showed the most promise for reducing toxicity. But it turns out that even the most effective steering methods have their drawbacks: in addition to being computationally and data intensive, they could only reduce, not prevent, neural toxic degeneration of a tested model.
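To make the idea of steering concrete, here is a deliberately simplified sketch in the spirit of vocabulary shifting: nudge the model’s next-token distribution away from tokens associated with toxicity before sampling. The per-token toxicity weights and the scaling factor beta are assumptions for illustration; the actual technique learns these representations rather than relying on a fixed list.

import math

def shift_next_token_distribution(logits, toxicity_weight, beta=2.0):
    """Down-weight tokens associated with toxicity, then re-normalize into probabilities."""
    shifted = {tok: logit - beta * toxicity_weight.get(tok, 0.0) for tok, logit in logits.items()}
    z = sum(math.exp(v) for v in shifted.values())
    return {tok: math.exp(v) / z for tok, v in shifted.items()}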

The Allen School and AI2 team behind RealToxicityPrompts, top row from left: Samuel Gehman, Suchin Gururangan, and Maarten Sap; bottom row from left: Yejin Choi and Noah Smith

Having evaluated more conventional approaches and found them lacking, the team is encouraging an entirely new paradigm when it comes to pretraining modern NLP systems. The new framework calls for greater care in the selection of data sources and more transparency around said sources, including public release of original text, source URLs, and other information that would enable a more thorough analysis of these datasets. It also encourages researchers to incorporate value-sensitive or participatory design principles when crafting their models.

“While fine-tuning is preferable to the blunt-instrument approach of simply banning certain words, even the best steering methods can still go awry,” explained Sap. “No method is foolproof, and attempts to clean up a model can have the unintended consequence of shutting down legitimate discourse or failing to consider language within relevant cultural contexts. We think the way forward is to ensure that these models are more transparent and human-centered, and also reflect what we refer to as algorithmic cultural competency.”

Learn more by visiting the RealToxicityPrompts project page here, and read the research paper here. Check out the AI2 blog post here, and a related Fortune article here.


Vivek Jayaram and John Thickstun win 2020 Qualcomm Innovation Fellowship for their work in source separation

Vivek Jayaram (left) and John Thickstun

Allen School Ph.D. students Vivek Jayaram and John Thickstun have been named 2020 Qualcomm Innovation Fellows for their work in signal processing, computer vision and machine learning using the latest in generative modeling to improve source separation. In their paper, “Source Separation with Deep Generative Priors,” published at the 2020 International Conference on Machine Learning, the team addresses perceptible artifacts that are often found in source separation algorithms. Jayaram and Thickstun are one of only 13 teams to receive a fellowship out of more than 45 finalists across North America. 

Thickstun and Jayaram have been working with their advisors, Allen School professors Sham Kakade, Steve Seitz, and Ira Kemelmacher-Shlizerman and adjunct faculty member Zaid Harchaoui, a professor in the UW Department of Statistics, on this research. Potential applications include separating reflections from an image, the voices of multiple speakers or instruments from an audio recording, brain signals in an EEG, and interfering signals in telecommunication technologies such as Code Division Multiple Access (CDMA). The team’s work introduces a new algorithmic idea for solving source separation problems using a Bayesian approach.

“In contrast to source separation models, modern generative models are largely free of artifacts,” said Thickstun. “Generative models continue to improve and one goal of our proposal is to find a way to use the latest advances in generative modeling to improve source separation results.” 

Employing a cutting-edge generative model is a powerful tool for source separation and can be applied to different data domains. Using the Bayesian approach and Langevin dynamics, Thickstun and Jayaram can decouple the source separation problem from the generative model, achieving state-of-the-art performance for separation of low-resolution images.
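A toy version of that idea for a two-component mixture can be written as noisy gradient updates that balance each component’s generative prior against agreement with the observed mixture. This is a sketch under stated assumptions, not the authors’ implementation: score_fn stands in for the gradient of a pretrained generative model’s log-density, and the step sizes and noise scale are illustrative.

import numpy as np

def separate(mixture, score_fn, steps=1000, step_size=1e-3, sigma=0.1, seed=0):
    """Recover x1, x2 with x1 + x2 approximately equal to the mixture, via Langevin-style updates."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(mixture.shape)
    x2 = rng.standard_normal(mixture.shape)
    for _ in range(steps):
        for x in (x1, x2):
            residual = mixture - (x1 + x2)                # mismatch with the observed mixture
            grad = score_fn(x) + residual / sigma**2      # prior score plus mixture-likelihood term
            noise = np.sqrt(2 * step_size) * rng.standard_normal(x.shape)
            x += step_size * grad + noise                 # noisy (Langevin) gradient step
    return x1, x2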

By combining images and using the algorithm to then separate each of them, the team was able to illustrate how their theory works.

“Our algorithm works on mixtures of any number of components without retraining,” Jayaram said. “The only training is a generative model of the original images themselves; we never train it on mixtures of a fixed number of sources.”

Audio separation proved to be more challenging, but the two implemented stochastic gradient Langevin dynamics (SGLD) to speed up the process and make it more practical. Their approach can be adapted to many different kinds of optimization problems by modifying the reconstruction objective.

“John and Vivek’s work takes a fundamentally new and promising approach that leverages the power of deep networks to help separate out signals,” Kakade said. “The reason this approach is so exciting is that deep learning methods have already demonstrated remarkable abilities to model distributions, and their work looks to harness these models for the classical signal processing problem of source separation.”

Jayaram and Thickstun have each published additional related papers: Jayaram at the 2020 IEEE Conference on Computer Vision and Pattern Recognition, on background matting in images, and Thickstun at the 2019 International Society for Music Information Retrieval conference, on end-to-end learnable models for attributing composers to musical scores.

Since 2009, the Qualcomm Innovation Fellowship program has recognized and supported innovative graduate students across a broad range of technical research areas. Previous Allen School recipients include Vincent Lee and Max Willsey (2017), Hanchuan Li and Alex Mariakakis (2016), Carlo del Mundo and Vincent Lee (2015), Vincent Liu and Vamsi Talla (2014), and Adrian Sampson and Thierry Moreau (2013). Learn more about the 2020 Qualcomm Fellows here.

Congratulations, Vivek and John! 


Allen School’s Jenny Liang combines compassion with technology for social good

Jenny Liang

Our latest Allen School student spotlight features Jenny Liang, a Kenmore, Washington native and recent UW graduate who majored in computer science and informatics. Liang was named among the Husky 100 and earned the Allen School’s Undergraduate Service Award for her leadership, compassion, and focus on developing technology for social good in her work with the Information and Communication Technology for Development Lab (ICTD). 

This summer, Liang started an internship at the Allen Institute for Artificial Intelligence (AI2) after being awarded the 2020 Allen AI Outstanding Engineer Scholarship for Women and Underrepresented Minorities. The scholarship is designed to encourage diversity and equity in the field of artificial intelligence while strengthening the connection between academic and industry research. She previously held internships at Microsoft, Apple and Uber.

Allen School: Congratulations on the AI2 Scholarship! What makes this scholarship special, and who should apply?

Jenny Liang: The AI2 scholarship is an opportunity for folks in underrepresented communities in technology. Its aim is to combat the lack of representation currently seen in the tech industry and academia. As part of the scholarship, students receive one year of tuition covered by AI2 along with an internship. The winners have coincidentally been women in the past couple of years, but I’d like to emphasize that anyone who identifies with any underrepresented identity is qualified to apply. I would encourage any Allen School student who belongs to any minoritized identity to take advantage of this opportunity. 

Allen School: How has the experience been? 

JL: It’s been a positive career-changing experience, and my time at AI2 has been really awesome so far. I currently work on the MOSAIC team headed by the Allen School’s own Yejin Choi, where I’m building a research demo using cutting-edge computer vision and natural language processing (NLP) models. The exciting thing is it’ll be released soon to the public. I’m also dabbling with conducting my own research on building NLP models to detect social bias in language, as well as interpreting the predictions of these models. The goal of this is to illuminate how and why these models behave the way they do, and whether they can be improved to be more than just black boxes that predict complex phenomena in natural language. This provides more context in how these models interact with society, which ultimately has real-life consequences on people. Both the engineering and research aspects of this internship are all very new and challenging experiences for me. It’s been my first time working with computer vision and NLP deep learning models, which has given me a new perspective into challenges that developers face. I feel like this has pushed me to learn and adapt as a budding researcher, and provided me with lots of tools and skill sets I’ll be using in the future.

Allen School: What initially interested you in computer science and informatics? 

JL: At the time of choosing my major, I loved software engineering, and I still do. This meant I was interested in both the theory and the applications of technology. The theory is so important to understand what makes technology systems work and why. But understanding how technologies are applied is equally important in building software that is usable and performant and that serves people in fair and ethical ways. To me, the Allen School taught me the theoretical foundations of computer science, while the iSchool provided the ability to build technology applications. Being in both CSE and INFO has allowed me to become a well-rounded technologist, where I can build technologies quickly but also understand the complicated theoretical underpinnings of these systems.

Allen School: You have had a lot of industry experience with your internships. Do you plan to continue on that path to a career in industry?

JL: In the past year, I’ve decided to switch to academia after working in the industry. So I am applying to Ph.D. programs this fall. I’ve always enjoyed software engineering, but after a while, I found the engineering work I did in industry personally unfulfilling since I wanted to learn the fundamental properties of what makes software “good” or “bad” and why, especially as software scales. I didn’t think my trajectory in industry would quite allow me to gain that expertise because of its focus on building new technologies. Thanks to some outstanding and involved mentorship from iSchool professor Amy Ko, postdoc Spencer Sevilla in the Allen School’s ICTD Lab, and AI2 researchers Chandra Bhagavatula and Swabha Swayamdipta, I’ve been slowly convinced that academia is the space for me to do that.

Allen School: What is the best thing about being a student in the Allen School?

JL: To me, the best thing is the breadth of high-quality opportunities this school has to offer. I’m really grateful and feel so privileged for the opportunities I’ve been given because I’m in CSE. For the past five years, I’ve known I wanted to work with technology after I taught myself to code my freshman year and totally loved it. What has not been clear is how and to what capacity I’d like to do that. Because of the many opportunities the Allen School provides, I’ve really been able to find my own fulfilling niche in tech. I’m really fortunate to have developed my career as a software engineer, but also quickly pivot to a career in academia. Due to the school’s industry connections, I’ve been able to work on the world’s largest technology systems and with the best engineers; thanks to the opportunities I’ve had to do undergraduate research, serve as a TA, and take graduate-level courses, I’ve gotten a taste of what it’s like to be a Ph.D. student and really enjoyed it. Most importantly, though, my connections with the school’s faculty and staff have supported so much of my growth, and I would be nowhere without them.

Allen School: What interested you in becoming a member of the Allen School’s Student Advisory Council and continuing to serve in it? 

JL: Working with the SAC has been important to my CSE experience. I struggled a lot my first several years at UW with my mental health, which also compromised my academics for a while. Without the support of my friends and professors in the Allen School, I would not be the same person I am today. Being involved with SAC is my way of giving back to the community that supported me, as well as deriving meaning from my painful experiences. Because I understand what it’s like to struggle while being a CSE student, I’m committed to finding the ways in which CSE as a system could improve in supporting the undergraduate experience and advocating for change.

I’ve stayed with SAC because through listening to my peers’ diverse experiences and struggles, I’ve realized this work really matters. Although change can be slow-going and allyship is hard, the work we do allows students to be more successful academically, builds community within the Allen School, and creates a welcoming environment where everyone can thrive. This has been especially important in response to the tumultuous current events, and I’m really proud of how all the other student groups are committed to this mission too.

Thank you for all that you’ve done for the Allen School and UW community, Jenny — and good luck with those grad school applications! 

