In February, University of Washington student group Impact++ won one of the tracks at the Tech For Change Civic Tech (TFC) Hackathon held at Boston University. The hackathon tasked student teams with building creative solutions in the interest of changing public policy. This year’s competition included three tracks: education, election turnout and government policy rooted in social welfare.
It was the first time Impact++, which focuses on projects combining computer science and social good, has sent a team to the TFC Hackathon hosted by Boston University’s Spark program. The team consisted of: Vrishab Sathish Kumar, a senior studying computer science; Aditi Joshi, a junior majoring in computer science and minoring in environmental studies; Samuel Levy, a senior majoring in computer science and minoring in law, societies and justice; and Ian Mahoney, a senior majoring in computer science and minoring in history. Masia Wisdom, a sophomore studying computer engineering at Howard University, also joined the UW team during the first day after meeting the group at the event.
“The hackathon helped me understand that even without formal internship experience or past in-person hackathon experience, our training through the Allen School and Impact++ projects were truly translational to other, perhaps different settings,” Sathish Kumar said. “It was a full-circle experience to see a project come together through teamwork.”
The team’s project tackled the election turnout challenge. Called Vote Real, it provided a gamified platform in which users act as city council members. Through the platform, they could better understand both the bills being voted on and the intricacies of the policy-making process. Then users could see how their own city council members voted.
“Over time, this ‘closeness’ metric of how a user voted, opposed to how council members voted, will keep them in the loop,” Joshi said. “Instead of voting for representatives once a year and forgetting about it for the rest, the goal is to keep our leaders accountable.”
The team based its idea on BeReal, a social app gaining popularity among 18 to 24-year-olds. After multiple rounds of brainstorming, the group decided to focus on improving voter turnout in local elections, which historically have lower participation among younger voters.
“We recognized a gap here and wanted to build something to help around this issue,” Sathish Kumar said. “Since we are in the same shoes as our target audience, we thought about what mattered to us, what motivated us and the mediums that we thought were most effective in doing so.”
Since its formation in 2018, Impact++ has provided opportunities for students to gain hands-on experience and build connections with industry mentors through social-good projects. The student-run organization runs five to six annual projects, Sathish Kumar said, with support from mentors from local tech companies and startups.
Experiences like the TFC Hackathon, for instance, can broaden perspectives. For several of the team members, participating helped them think more deeply about technology’s role in society.
“I had not really thought too much about the topic of creating social and policy change through tech and computing before the TFC Hackathon,” Mahoney said. “Through the hackathon and our project in particular, I realized there are spaces in which technology can really have an impact in creating these changes.”
There was also time for fun. Less so for sleep. Between making presentation slides and games of Jeopardy and Kahoot, the hours flew by in a whirlwind of creating and camaraderie.
“In the morning, we were so delirious after staying up most of the night that we spent a solid 30 minutes crying with laughter,” Levy said. “None of us could figure out why.”
After more than 30 hours of hacking, it was the only answer that eluded the team.
When the novel coronavirus SARS-CoV-2 began sweeping across the globe, scientists raced to figure out how the virus infected human cells so they could halt the spread.
What if scientists had been able to simply type a description of the virus and its spike protein into a search bar and receive information on the angiotensin-converting enzyme 2 — colloquially known as the ACE2 receptor, through which the virus infects human cells — in return? And what if, in addition to identifying the mechanism of infection for similar proteins, this same search also returned potential drug candidates that are known to inhibit their ability to bind to the ACE2 receptor?
Biomedical research has yielded troves of data on protein function, cell types, gene expression and drug formulas that hold tremendous promise for assisting scientists in responding to novel diseases as well as fighting old foes such as Alzheimer’s, cancer and Parkinson’s. Historically, their ability to explore these massive datasets has been hampered by an outmoded model that relied on painstakingly annotated data, unique to each dataset, that precludes more open-ended exploration.
But that may be about to change. In a recent paper published in Nature Communications, Allen School researchers and their collaborators at Microsoft and Stanford University unveiled BioTranslator, the first multilingual translation framework for biomedical research. BioTranslator — a portmanteau of “biological” and “translator” — is a state-of-the-art, zero-shot classification tool for retrieving non-text biological data using free-form text descriptions.
“BioTranslator serves as a bridge connecting the various datasets and the biological modalities they contain together,” explained lead author Hanwen Xu, a Ph.D. student in the Allen School. “If you think about how people who speak different languages communicate, they need to translate to a common language to talk to each other. We borrowed this idea to create our model that can ‘talk’ to different biological data and translate them into a common language — in this case, text.”
The ability to perform text-based search across multiple biological databases breaks from conventional approaches that rely on controlled vocabularies (CVs). As the name implies, CVs come with some constraints. Once the original dataset has been created via the painstaking process of manual or automatic annotation according to a predefined set of terms, it is difficult to extend it to the analysis of new findings; meanwhile, the creation of new CVs is time-consuming and requires extensive domain knowledge to compose the data descriptions.
BioTranslator frees scientists from this rigidity by enabling them to search and retrieve biological data with the ease of free-form text. Allen School professor Sheng Wang, senior author of the paper, likens the shift to when the act of finding information online progressed from combing through predefined directories to being able to enter a search term into open-ended search engines like Google and Bing.
“The old Yahoo! directories relied on these hierarchical categories like ‘education,’ ‘health,’ ‘entertainment’ and so on. That meant that if I wanted to find something online 20 years ago, I couldn’t just enter search terms for anything I wanted; I had to know where to look,” said Wang. “Google changed that by introducing the concept of an intermediate layer that enables me to enter free text in its search bar and retrieve any website that matches my text. BioTranslator acts as that intermediate layer, but instead of websites, it retrieves biological data.”
Wang and Xu previously explored text-based search of biological data by developing ProTranslator, a bilingual framework for translating text to protein function. While ProTranslator is limited to proteins, BioTranslator is domain-agnostic, meaning it can pull from multiple modalities in response to a text-based input — and, as with the switch from old-school directories to modern search engines, the person querying the data no longer has to know where to look.
BioTranslator does not merely perform similarity search on existing CVs using text-based semantics; instead, it translates the user-generated text description into a biological data instance, such as a protein sequence, and then searches for similar instances across biological datasets. The framework is based on large-scale pretrained language models that have been fine-tuned using biomedical ontologies from a variety of related domains. Unlike other language models that are having a moment — ChatGPT comes to mind — BioTranslator isn’t limited to searching text but rather can pull from various data structures, including sequences, vectors and graphs. And because it’s bidirectional, BioTranslator not only can take text as input, but also generate text as output.
“Once BioTranslator converts the biological data to text, people can then plug that description into ChatGPT or a general search engine to find more information on the topic,” Xu noted.
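To illustrate the general idea in code, the sketch below shows zero-shot, text-to-protein retrieval through a shared embedding space, the principle BioTranslator builds on. It is a minimal sketch under stated assumptions: the `embed_text` and `embed_protein` functions are deterministic placeholders standing in for the fine-tuned language and protein encoders, and the sequences and names are toy values, not data from the paper.

```python
# A minimal sketch of zero-shot, text-to-biological-data retrieval via a shared
# embedding space. The encoders below are hash-based placeholders standing in
# for BioTranslator's fine-tuned language and protein models; the real system
# learns a joint space from biomedical ontologies.
import hashlib
import numpy as np

DIM = 64

def _placeholder_embed(item: str) -> np.ndarray:
    """Deterministic stand-in for a trained encoder."""
    seed = int.from_bytes(hashlib.sha1(item.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).normal(size=DIM)
    return v / np.linalg.norm(v)

def embed_text(description: str) -> np.ndarray:
    return _placeholder_embed("text:" + description)

def embed_protein(sequence: str) -> np.ndarray:
    return _placeholder_embed("protein:" + sequence)

def retrieve(query: str, proteins: dict, k: int = 2):
    """Rank protein sequences by cosine similarity to a free-form text query."""
    q = embed_text(query)
    scored = [(name, float(q @ embed_protein(seq))) for name, seq in proteins.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

if __name__ == "__main__":
    toy_proteins = {  # toy sequences for illustration only
        "ACE2_fragment": "MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKF",
        "Spike_RBD_fragment": "RVQPTESIVRFPNITNLCPFGEVFNATRFASV",
    }
    query = "receptor through which the spike protein enters human cells"
    for name, score in retrieve(query, toy_proteins):
        print(f"{name}: similarity {score:.3f}")
```

In the real framework, both encoders are trained so that a description and its matching biological instance land near each other, which is what makes free-form queries possible without annotating each dataset in advance.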
Xu and his colleagues developed BioTranslator using an unsupervised learning approach. Part of what makes BioTranslator unique is its ability to make predictions across multiple biological modalities without the benefit of paired data.
“We assessed BioTranslator’s performance on a selection of prediction tasks, spanning drug-target interaction, phenotype-gene association and phenotype-pathway association,” explained co-author and Allen School Ph.D. student Addie Woicik. “BioTranslator was able to predict the target gene for a drug using only the biological features of the drugs and phenotypes — no corresponding text descriptions — and without access to paired data between two of the non-text modalities. This sets it apart from supervised approaches like multiclass classification and logistic regression, which require paired data in training.”
BioTranslator outperformed both of those approaches in two out of the four tasks, and was better than the supervised approach that doesn’t use class features in the remaining two. In the team’s experiments, BioTranslator also successfully classified novel cell types and identified marker genes that were omitted from the training data. This indicates that BioTranslator can not only draw information from new or expanded datasets — no additional annotation or training required — but also contribute to the expansion of those datasets.
“The number of potential text and biological data pairings is approaching one million and counting,” Wang said. “BioTranslator has the potential to enhance scientists’ ability to respond quickly to the next novel virus, pinpoint the genetic markers for diseases, and identify new drug candidates for treating those diseases.”
Other co-authors on the paper are Allen School alum Hoifung Poon (Ph.D., ‘11), general manager at Microsoft Health Futures, and Dr. Russ Altman, the Kenneth Fong Professor of Bioengineering, Genetics, Medicine and Biomedical Data Science, with a courtesy appointment in Computer Science, at Stanford University. Next steps for the team include expanding the model beyond expertly written descriptions to accommodate more plain language and noisy text.
To say Anat Caspi’s mission is pedestrian in nature would be accurate to some degree. And yet, when looked at more closely, one realizes it’s anything but.
In 2015, the Allen School scientist was thinking about how to build a trip planner that everyone could use, similar to Google Maps but different in striking ways. Current tools didn’t account for various types of pedestrians and the terrain they confronted on a daily basis. What if there were barriers blocking the sidewalk? A steep incline listing to and fro? Stairs but no ramp?
“Map applications make very specific assumptions about the fact that if there’s a road built there, you can walk it,” Caspi said. “They’ll just give you a time estimate that’s a little bit lower than the car and call it done.”
But Caspi was just beginning. Artificial intelligence could only do so much. These tools were powerful, sure, but treated people as “slowly moving cars.” They lacked perspective, something with resolve and purpose, a clear-eyed intent. Something, perhaps, more human.
Nearly a decade later, Caspi’s quest continues. As the director of the Taskar Center for Accessible Technology (TCAT), and lead PI of the Transportation Data Equity Initiative, she spearheads an initiative that seeks to make cities smarter and safer for everyone. About one out of every six people worldwide experiences a significant disability, according to the World Health Organization, and many encounter spaces designed without them in mind.
“Which is unacceptable given that we now have the ability to convey real information,” she said. “It’s just about the political will to make these changes.”
TCAT has filled in those gaps and more. Several of its projects have gone from print to pavement to public initiative. For instance, AccessMap, an app providing accessible pedestrian route information customized to the user, garnered a large yet unanticipated fan base shortly after its release in 2017: parents pushing strollers.
Though originally designed for those with disabilities in mind, AccessMap quickly gained a following with groups whose transportation needs ran the gamut.
“I’m focused on accessibility,” Caspi said. “But as soon as we started putting this data out, it was clear that there were many other stakeholders who were interested.”
Besides those with tykes in tow, first responders also expressed interest after seeing the app’s potential for helping negotiate tricky areas during search and rescue — or for moving a stretcher to a patient. City planners saw the app’s utility for coordinating construction, assessing walkability and supporting transportation planning efforts.
AccessMap was the first act of OpenSidewalks, a TCAT team project that builds an ecosystem of tools — including machine learning and crowdsourcing — to map pedestrian pathways in a standardized, consistent manner. OpenSidewalks started as a Data Science for Social Good project with the UW eScience Institute and has since evolved into a global effort with key partners such as the U.S. Department of Transportation (USDOT), Microsoft and the Global Initiative for Inclusive Information and Communication Technologies (G3ict). The USDOT support, which comes through the ITS4US Deployment Program, is part of a larger initiative to create public infrastructure for multimodal, accessibility-first transportation data.
AI for Accessibility and Bing Maps at Microsoft also provided financial and infrastructure support for the project.
G3ict, a nonprofit with roots in the United Nations, partnered with TCAT on the shared mission of improving accessibility in cities from a transportation and mobility perspective. Prior to the partnership, G3ict had focused more on digital accessibility — procuring software to enable residents to pay utility bills online, for example.
“This was really their first time looking at the physical environment from the accessibility perspective,” Caspi said. “For us, most of our prior experience had been in the U.S., so the tools we had created before both for using model predictions and for collecting the data were U.S. specific. This really forced us to expand our thinking.”
Together, the organizations could reach further. G3ict brought in entities from municipalities around the world, providing greater access and scope to the project. TCAT, meanwhile, leveraged its expertise in mapping and data collection to take accessibility from the screen to the streets.
“We are super happy about the partnership,” Caspi said. “Without people on the ground, you really don’t have that kind of reach typically.”
In November, TCAT and G3ict won the Smart City Expo World Congress Living & Inclusion Award for their project, “AI for Inclusive Urban Sidewalks.” The project seeks to build an open AI network dedicated to improving pedestrian accessibility around the world.
TCAT has previously collaborated with 11 cities across the U.S. and is currently partnering with 10 other cities across the Americas, including São Paulo, Los Angeles, Quito, Gran Valparaiso and Santiago, with more planned for the future.
While working with the cities, Caspi has found that each has its own personality and set of challenges specific to its location. Quito, for instance, has been focused on greenery and access to nature. The team in Los Angeles has emphasized studying building footprints and how structures interact with sidewalk environments. Meanwhile, in São Paulo, officials are prioritizing more than 1.5 million square meters of sidewalk rehabilitation and reclassification, with the hope of improving safety and walkability in key areas across the city.
“We found as we built the data more, we could reach further in terms of whom this data was relevant for and how they were using it,” Caspi said. “Through efforts focused on eco-friendly cities, transportation and people being able to reach transit services, you’re supporting sustainability within those communities.”
Caspi added that the collaboration, both within TCAT and without, has been essential and has also surprised her in how it has grown and changed shape over the years. Whether working with transportation officers in local governments or on the ground with students collecting the data, she’s seen firsthand how these efforts can build upon themselves into something greater.
For instance, Ricky Zhang, a Ph.D. student in Electrical & Computer Engineering who worked on the team, uses computer vision models to infer the layout of pedestrian routes in the transportation system from aerial satellite imagery, street network image tiles and sidewalk annotations. His work was crucial to the project’s success, Caspi said.
“We hope to provide a data foundation for innovations in urban accessibility,” Zhang said. “The data can be used for accessibility assessment and pedestrian path network graph comparison at various scales.”
Eric Yeh developed the mobile version of the AccessMap web app while working with TCAT as an Allen School undergraduate. He saw the app’s potential for good, how routes in life or the everyday could branch out for the better.
“I originally joined TCAT because I was new to computer science,” said Yeh, now a master’s student studying computer science at the Allen School. “I wanted to gain programming experience while contributing to a project that would be meaningful to the community.”
The app lends itself to collaboration. Users can report sidewalk segments that are incorrectly labeled or are inaccessible, Yeh said, allowing for evolution and up-to-date accuracy. The team hopes to funnel these reports into a pipeline that automates the process.
It’s all part of a larger plan. Caspi outlined current work, including taking OpenSidewalks to a national data specification, akin to GTFS, that provides a consistent graph representation of pedestrians’ travel environments; democratizing data collection; improving tooling for data producers; and building APIs that make it easy to consume the data at scale, all while limiting subjective assessments of what counts as accessible. At the same time, the Taskar Center is pursuing non-technical tools, such as toolkits and workshop supports that help transportation planners discuss disability justice in their organizations and apply community-based participatory design to trip planning and accessibility-first data stewardship. The center is also working with advocacy groups and communities to assess their accessibility and hold officials to account through a clear understanding of what the infrastructure does and does not support regarding accessibility.
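As a rough illustration of how such a graph representation can drive customized routing, the sketch below weights sidewalk edges against a user profile, in the spirit of AccessMap and OpenSidewalks. The attribute names, thresholds and cost weights are illustrative assumptions, not the project’s actual data specification.

```python
# A rough sketch of profile-aware pedestrian routing over a sidewalk graph, in
# the spirit of AccessMap/OpenSidewalks. The edge attributes, thresholds and
# cost weights are illustrative assumptions, not the project's data schema.
import networkx as nx

def edge_cost(profile):
    """Build a weight function that hides impassable edges and penalizes
    steeper segments for the given user profile."""
    def cost(u, v, data):
        if data.get("stairs") and not profile["can_use_stairs"]:
            return None  # returning None marks the edge as impassable
        if abs(data.get("incline", 0.0)) > profile["max_incline"]:
            return None
        # Base cost is segment length; steeper segments cost proportionally more.
        return data["length"] * (1.0 + 10.0 * abs(data.get("incline", 0.0)))
    return cost

if __name__ == "__main__":
    G = nx.Graph()
    G.add_edge("A", "C", length=40, stairs=True)    # shortcut with stairs
    G.add_edge("A", "B", length=80, incline=0.02)
    G.add_edge("B", "C", length=60, incline=0.09)   # too steep for some users
    G.add_edge("B", "D", length=90, incline=0.01)
    G.add_edge("D", "C", length=70, incline=0.03)

    wheelchair_user = {"max_incline": 0.05, "can_use_stairs": False}
    print(nx.shortest_path(G, "A", "C", weight=edge_cost(wheelchair_user)))
    # -> ['A', 'B', 'D', 'C']: avoids both the stairs and the steep block
```

The point of the standardized graph is precisely that a single dataset can serve many such profiles, from stroller parents to first responders, by swapping in a different cost function.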
For Caspi, it’s been a humbling experience to see how far the project has come, and how much work there is left to do. She sees the Taskar Center as part of a greater effort in building a sense of community, wherever one might be in the world.
“These kinds of projects can bring people together to have a better mutual understanding,” she said. “What does it take to run a city? It’s hard, right? So much from the municipal side is trying to understand storytelling and the lived experience of people through the city. Efforts like these can soften the edges around where cities meet their people. To me, that’s been really instructive.”
According to the U.S. Centers for Disease Control, one out of every three adults in the United States has prediabetes, a condition marked by elevated blood sugar levels that could lead to the development of type 2 diabetes. The good news is that, if detected early, prediabetes can be reversed through lifestyle changes such as improved diet and exercise. The bad news? Eight out of 10 Americans with prediabetes don’t know that they have it, putting them at increased risk of developing diabetes as well as disease complications that include heart disease, kidney failure and vision loss.
Current screening methods typically involve a visit to a health care facility for laboratory testing and/or the use of a portable glucometer for at-home testing, meaning access and cost may be barriers to more widespread screening. But researchers at the University of Washington’s Paul G. Allen School of Computer Science & Engineering and UW Medicine may have found the sweet spot when it comes to increasing early detection of prediabetes. They developed GlucoScreen, a new system that leverages the capacitive touch sensing capabilities of any smartphone to measure blood glucose levels without the need for a separate reader. Their approach will make glucose testing less costly and more accessible — particularly for one-time screening of a large population.
The team describes GlucoScreen in a new paper published in the latest issue of the Proceedings of the Association for Computing Machinery on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT).
“In conventional screening, a person applies a drop of blood to a test strip, where the blood reacts chemically with the enzymes on the strip. A glucometer is used to analyze that reaction and deliver a blood glucose reading,” explained lead author Anandghan Waghmare, a Ph.D. student in the Allen School’s UbiComp Lab. “We took the same test strip and added inexpensive circuitry that communicates data generated by that reaction to any smartphone through simulated tapping on the screen. GlucoScreen then processes the data and displays the result right on the phone, alerting the person if they are at risk so they know to follow up with their physician.”
The GlucoScreen test strip samples the electrochemical reaction induced by the mixing of blood and enzymes as an amplitude along a curve at a rate of five times per second. The strip transmits this curve data to the phone encoded in a series of touches at variable speeds using a technique called pulse width modulation. “Pulse width” refers to the distance between peaks in the signal — in this case, the length between taps. Each pulse width represents a value along the curve; the greater the distance between taps for a particular value, the higher the amplitude associated with the electrochemical reaction on the strip.
“You communicate with your phone by tapping the screen with your finger,” said Waghmare. “That’s basically what the strip is doing, only instead of a single tap to produce a single action, it’s doing multiple taps at varying speeds. It’s comparable to how Morse code transmits information through tapping patterns.”
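To make the encoding concrete, here is a small sketch of how the tap timing could be decoded back into amplitude samples on the phone side. The calibration constants and the linear mapping are assumptions made for illustration, not GlucoScreen’s actual parameters.

```python
# Sketch of decoding a pulse-width-modulated tap sequence back into amplitude
# samples, as the GlucoScreen app does in principle. The calibration constants
# below are illustrative assumptions, not the system's actual parameters.

MIN_GAP_S = 0.05       # tap spacing that encodes the lowest amplitude (assumed)
MAX_GAP_S = 0.25       # tap spacing that encodes the highest amplitude (assumed)
MAX_AMPLITUDE = 100.0  # arbitrary full-scale value for the reaction curve

def decode_taps(tap_times):
    """Convert timestamps of simulated screen taps into amplitude samples.

    Each gap between consecutive taps is one pulse width; longer gaps map to
    higher amplitudes along the electrochemical reaction curve.
    """
    amplitudes = []
    for earlier, later in zip(tap_times, tap_times[1:]):
        gap = min(max(later - earlier, MIN_GAP_S), MAX_GAP_S)  # clamp to range
        fraction = (gap - MIN_GAP_S) / (MAX_GAP_S - MIN_GAP_S)
        amplitudes.append(fraction * MAX_AMPLITUDE)
    return amplitudes

if __name__ == "__main__":
    taps = [0.00, 0.06, 0.18, 0.38, 0.55, 0.63]  # toy tap timestamps (seconds)
    print([round(a, 1) for a in decode_taps(taps)])
```

The reconstructed curve, not any single value, is what the app’s machine learning model then turns into a blood glucose estimate.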
The advantage of this technique is that it does not require complicated electronic components, which minimizes the cost to manufacture the strip and the power required for it to operate compared to more conventional communication methods like Bluetooth and WiFi. All of the data processing and computation occurs on the phone, which simplifies the strip and further reduces the cost.
“The test strip doesn’t require batteries or a USB connection,” noted co-author Farshid Salemi Parizi, a former Ph.D. student in the UW Department of Electrical & Computer Engineering who is now a senior machine learning engineer at OctoML. “Instead, we incorporated photodiodes into our design so that the strip can draw what little power it needs for operation from the phone’s flash.”
The flash is automatically engaged by the GlucoScreen app, which walks the user through each step of the testing process. First, a user affixes each end of the test strip to the front and back of the phone as directed. Next, they prick their finger with a lancet, as they would in a conventional test, and apply a drop of blood to the biosensor attached to the test strip. After the data is transmitted from the strip to the phone, the app applies machine learning to analyze the data and calculate a blood glucose reading.
That stage of the process is similar to that performed on a commercial glucometer. What sets GlucoScreen apart, in addition to its novel touch technique, is its universality.
“Because we use the built-in capacitive touch screen that’s present in every smartphone, our solution can be easily adapted for widespread use. Additionally, our approach does not require low-level access to the capacitive touch data, so you don’t have to access the operating system to make GlucoScreen work,” explained co-author Jason Hoffman, a Ph.D. student in the Allen School. “We’ve designed it to be ‘plug and play.’ You don’t need to root the phone — in fact, you don’t need to do anything with the phone, other than install the app. Whatever model you have, it will work off the shelf.”
Hoffman and his colleagues evaluated their approach using a combination of in vitro and clinical testing. Due to the COVID-19 pandemic, they had to delay the latter until 2021 when, on a trip home to India, Waghmare connected with Dr. Shailesh Pitale at Dew Medicare and Trinity Hospital. Upon learning about the UW project, Dr. Pitale agreed to facilitate a clinical study involving 75 consenting patients who were already scheduled to have blood drawn for a laboratory blood glucose test. Using that laboratory test as the ground truth, Waghmare and the team evaluated GlucoScreen’s performance against that of a conventional strip and glucometer.
While the researchers stress that additional testing is needed, their early results suggest GlucoScreen’s accuracy is comparable to that of glucometer testing. Importantly, the system was shown to be accurate at the crucial threshold between a normal blood glucose level at or below 99 mg/dL, and prediabetes, defined as a blood glucose level between 100 and 125 mg/dL. Given the scarcity of training data they had to work with for the clinical testing model, the researchers posit that GlucoScreen’s performance will improve with more inputs.
According to co-author Dr. Matthew Thompson, given how common prediabetes as well as diabetes are globally, this type of technology has the potential to change clinical care.
“One of the barriers I see in my clinical practice is that many patients can’t afford to test themselves, as glucometers and their test strips are too expensive. And, it’s usually the people who most need their glucose tested who face the biggest barriers,” said Thompson, a family physician and professor in the UW Department of Family Medicine and Department of Global Health. “Given how many of my patients use smartphones now, a system like GlucoScreen could really transform our ability to screen and monitor people with prediabetes and even diabetes.”
GlucoScreen is presently a research prototype; additional user-focused and clinical studies, along with alterations to how test strips are manufactured and packaged, would be required before the system could be made widely available. According to senior author Shwetak Patel, the Washington Research Foundation Entrepreneurship Endowed Professor in Computer Science & Engineering and Electrical & Computer Engineering at the UW, the project demonstrates how we have only begun to tap into the potential of smartphones as a health screening tool.
“Now that we’ve shown we can build electrochemical assays that can work with a smartphone instead of a dedicated reader, you can imagine extending this approach to expand screening for other conditions,” Patel said.
Yuntao Wang, a research professor at Tsinghua University and former visiting professor at the Allen School, is also a co-author of the paper. This research was funded in part by the Bill & Melinda Gates Foundation.
A little more than two decades ago, University of Washington professor Georg Seelig began planting the seeds of a career in theoretical physics, seeking elegant solutions to the mysteries of the natural world. Last month, Seelig, a faculty member in the Allen School and Department of Electrical & Computer Engineering, was hailed as the “DNA Computer Scientist of the Year” by the International Society for Nanoscale Science, Computation and Engineering (ISNSCE), which named him the winner of the 2023 Rozenberg Tulip Award in recognition of his leadership and original contributions that have advanced the field of DNA computing.
“It’s wonderful to get this recognition from my community,” Seelig said. “The field has grown quite a bit since the beginning but remains very collaborative and collegial.”
Seelig’s work with DNA strand displacement, scalable DNA data storage and retrieval, and technologies for single-cell sequencing and analysis of gene regulation has helped push the frontiers of molecular programming. For instance, he pioneered adapting strand displacement technology to living cells. Prior to his work, inputs to the circuits were synthesized chemically and not produced inside a cellular environment.
“This brings up a whole range of different challenges because the interior of cells is an infinitely more complex environment than a test tube with a bit of salt water,” Seelig said. “Cells are full of proteins that destroy foreign DNA and other molecules that sequester it in different subcellular compartments.”
Now a leader in the field, Seelig said a turning point for him came early on in his academic journey. Before his internship at Bell Laboratories, he had trained as a theoretical physicist. He didn’t think of himself as a practitioner.
But his perspective changed after meeting Bernard Yurke, a physicist at Bell who was building a synthetic molecular motor that could revolutionize the field. Dubbed “molecular tweezers” for its pincer-like mimicry, the motor could be switched between an open and a closed configuration by adding two more synthetic DNA strands.
The work struck Seelig with its simplicity — with just a few tweaks, scientists could, quite literally, bend the building blocks of life to their liking.
“The idea seemed both almost trivial,” he said, “and incredibly brilliant.”
That brilliance has followed him throughout his career. Since joining the UW faculty of the Allen School and the UW Department of Electrical & Computer Engineering in 2008, Seelig has continued to make the magical actual and sleight of hand scientific.
Seelig remembers how he grew after his experience at Bell Labs. After completing his doctorate at the University of Geneva, the Swiss scientist dove further into experimental work as a postdoc at the California Institute of Technology. There, he and Yurke joined MacArthur Fellow Erik Winfree’s lab, collaborating with some of the brightest minds in molecular engineering. Like Yurke before him, Winfree, a leading researcher in the field, mentored Seelig and fostered his potential.
“It wasn’t long after he joined my lab that I began to think of him as a rock star of science,” Winfree said. “Sometimes more Serge Gainsbourg, sometimes more Richard Ashcroft, sometimes more John Prine, but always undeniably Georg Seelig.”
Together with David Soloveichik, a graduate student in the lab at the time, and David Yu Zhang, then an undergraduate, Seelig invented DNA strand displacement circuits, which allowed scientists to control the forces behind synthetic DNA devices. Being able to program the foundations of existence, to maneuver its scaffolding to one’s will, brought with it new questions as well as tantalizing possibilities.
What if these reactions could target cancer cells via smart therapeutics? Could the reactions be sped up or slowed down? In DNA’s twists and turns, can the plot of a human life change for the better?
“It was a remarkably creative interaction, blending motivation from biophysics, biotechnology, theoretical computer science, the origin of life, electrical engineering, chemistry and molecular biology, and it resulted in several papers that had an enormous impact on the field,” Winfree said. “Georg’s vision, leadership, perseverance and exquisite experimental skills made the magic real and undeniable.”
The challenge of making “magic” feeds his curiosity, which Winfree likened to an artist’s muse. As head of the Seelig Lab for Synthetic Biology and a member of the Molecular Information Systems Laboratory, Seelig has now become a mentor himself, teaching the next generation of scientists to keep hunting for answers among the helices.
“When he picks up the tune of a beautiful idea, he is unstoppable in crafting it into a compelling song,” Winfree said. “It’s been great how, after coming to UW, he has released album after album of hits.”
Those first “hits” were scrawled across whiteboards at Caltech. Seelig remembers poring over them with his collaborators, searching for that elegant solution, for theory to materialize into practice.
To the group’s surprise, their effort paid off more quickly than expected. For Seelig, it foreshadowed things to come.
“Shortly afterwards, we tested the idea experimentally,” Seelig said of inventing DNA strand displacement circuits. “It worked on the first try.”
Data centers account for about 2% of total electricity use in the U.S., according to the U.S. Office of Energy Efficiency and Renewable Energy, consuming 10 to 50 times the energy per floor space of a typical commercial office building. Meanwhile, advances in distributed computing have spurred innovation with the use of large, intensive applications — but at a high cost in terms of energy consumption and environmental impact.
A pair of Allen School professors will contribute to a multi-university effort focused on tackling these challenges in the distributed computing landscape. Arvind Krishnamurthy and Michael Taylor will lend their expertise to the ACE Center for Evolvable Computing, which will foster the development of computing technologies that improve the performance of microelectronics and semiconductors.
Funded by a $31.5 million grant from the Joint University Microelectronics Program 2.0 (JUMP 2.0), the ACE Center will advance distributed computing technology — from cloud-based datacenters to edge nodes — and further innovation in the semiconductor industry. Led by the University of Illinois Urbana-Champaign and with additional funds from partnering institutions, the ACE Center will have a total budget of $39.6 million over five years.
“Computation is becoming increasingly planet-scale, which means not only that energy efficiency is becoming more and more critical for environmental reasons, but that we need to rethink how computation is done so that we can efficiently orchestrate computations spread across many chips distributed around the planet,” Taylor said. “This center is organizing some of the best and brightest minds across the fields of computer architecture, distributed systems and hardware design so that we may come up with innovative solutions.”
Krishnamurthy, the Short-Dooley Professor in the Allen School, is an investigator on the “Distributed Evolvable Memory and Storage” theme. His research focuses on building effective and robust computer systems, both in terms of data centers and Internet-scale systems. The ACE Center is not the only forward-looking initiative that is benefiting from Krishnamurthy’s expertise; he is also co-director of the Center for the Future of Cloud Infrastructure (FOCI) at the Allen School, which was announced last year.
“We are seeing an explosion of innovations in computer architecture, with a continuous stream of innovations in accelerators, programmable networks and storage,” Krishnamurthy said. “One key goal of this center is how to make effective use of this hardware and how to organize them in large distributed systems necessary to support demanding applications such as machine learning and data processing.”
Taylor, who leads the Bespoke Silicon Group at the Allen School, is an investigator in the “Heterogeneous Computing Platforms” theme. He’ll act as a fulcrum for research directions and guide a talented team of graduate students in designing distributed, energy-efficient accelerator chips that can better adapt to ever-changing and increasingly complicated computing environments.
“Today’s accelerator chips are very fixed function, and rapidly become obsolete, for example, if a new video encoding standard is developed,” Taylor said. “With some fresh approaches to the problem, accelerators in older cell phones would still be able to decode the newer video standards.”
Both Krishnamurthy and Taylor will contribute to the ACE Center’s goal to create an ecosystem that fosters direct engagement and collaborative research projects with industry partners drawn from SRC member companies as well as companies in the broader areas of microelectronics and distributed systems.
In addition to Taylor and Krishnamurthy at the University of Washington, other contributors to the ACE Center include faculty from the University of Illinois, Harvard, Cornell, Georgia Tech, MIT, Ohio State, Purdue, Stanford, the University of California San Diego, the University of Kansas, the University of Michigan and the University of Texas at Austin.
The Alfred P. Sloan Foundation has named the Allen School’s Leilani Battle (B.S., ‘11) a 2023 Sloan Research Fellow, a distinction that recognizes early-career researchers whose achievements place them among the next generation of scientific leaders in the U.S. and Canada. The two-year, $75,000 fellowships support research across the sciences and have been awarded to some of the world’s most preeminent minds in their respective fields.
“My research is not traditional computer science research so it’s wonderful to be recognized,” Battle said. “I strive to be myself in everything I do, so it’s awesome to see that others appreciate my unique perspective.”
Battle co-leads the Interactive Data Lab with Allen School colleague Jeffrey Heer, the Jerre D. Noe Endowed Professor of Computer Science & Engineering. Her research investigates the interactive visual exploration of massive datasets and stands at the intersection of several academic disciplines, including healthcare, business and climate science. In each, data-driven decisions continue to drive innovation across the globe.
“What piqued my interest in data science was the juxtaposition of the incredible power of existing tools and their underutilization by the vast majority of data analysts in the world,” Battle said. “Why are we not making better use of these tools? This sparked a multi-year journey to better understand why people use or don’t use various data science tools and how those tools could be made accessible to and effective for a wider range of users.”
While pursuing her doctorate at MIT, Battle developed ForeCache, a big data visualization tool that allows researchers to explore large amounts of data with greater ease and precision. Through machine learning, ForeCache increased browsing speeds by reducing system latency by 88% when compared with existing prefetching techniques.
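The core idea behind such prefetching can be sketched simply: predict which tile of data the analyst will ask for next and load it before the request arrives. The momentum-based predictor and LRU cache below are generic placeholders, not ForeCache’s learned models.

```python
# Generic sketch of the prefetching idea behind a tool like ForeCache: guess
# which data tile the analyst will request next and load it before they ask.
# The simple "keep panning in the same direction" predictor is a placeholder,
# not ForeCache's actual learned models.
from collections import OrderedDict

class TileCache:
    def __init__(self, fetch, capacity=16):
        self.fetch = fetch          # expensive call to the backing database
        self.capacity = capacity
        self.cache = OrderedDict()  # tile id -> data, in LRU order

    def get(self, tile):
        if tile not in self.cache:
            self.cache[tile] = self.fetch(tile)
        self.cache.move_to_end(tile)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used tile
        return self.cache[tile]

    def prefetch(self, history):
        """Warm the tile the user's recent panning suggests comes next."""
        if len(history) < 2:
            return
        (x1, y1), (x2, y2) = history[-2], history[-1]
        predicted = (x2 + (x2 - x1), y2 + (y2 - y1))  # continue same direction
        self.get(predicted)

if __name__ == "__main__":
    cache = TileCache(fetch=lambda tile: f"data for {tile}")
    visited = [(0, 0), (1, 0)]
    cache.prefetch(visited)       # warms tile (2, 0) ahead of time
    print(cache.get((2, 0)))      # already cached, so no fetch latency
```

Hiding the database round trip behind the analyst’s own pauses is what turns a sluggish exploration session into an interactive one.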
Since then, Battle has built upon her previous work in data visualization. In one study, she led an international team in creating the first benchmark to test how database systems evaluate interactive visualization workloads. In another, she and Heer characterized analyst behavior when interacting with data exploration systems, providing a clearer picture of how data is inspected and ultimately used in industry tools such as Tableau Desktop.
“I’m interested in not only streamlining the data science pipeline but also making it more transparent, equitable and accountable,” Battle said. “Some of my latest ideas are headed in this direction, where my collaborators and I are investigating how the concept of interventions in psychology and human computer interaction (HCI) could bring a new perspective to promoting responsible data science work.”
Battle is one of two UW researchers to be recognized in the latest class of Sloan Research Fellows, which also included Jonathan Zhu, a professor in the Department of Mathematics. Other recent honorees in the Allen School include professor Yulia Tsvetkov in 2022 and professors Hannaneh Hajishirzi and Yin Tat Lee in 2020.
Whether traversing new frontiers or old, Jessica Colleran keeps moving forward.
The third-year computer science major, along with University of Washington teammates Curtis Anderson and Annika Mihata, recently won the Orienteering USA (OUSA) Junior National Intercollegiate Championships, which were held in Georgia earlier this year. Their victory marks the first time in more than two decades that a team other than West Point has taken home the trophy.
“When I came to UW, I found a group of people who were excited to compete nationally and being able to be surrounded by a team was very exciting,” Colleran said. “It’s hard to describe the sheer elation we felt when Annika, our last runner of the day, came across the finish line and we realized her time was fast enough to clinch our two-day victory.”
Navigating challenging terrain has become second nature for Colleran, who juggles life as a member of the OUSA national team with her studies in computer science, as well as minors in climate science and physics. She won her first competition in elementary school, kindling what would turn out to be a continued passion for exploration into the unknown.
Orienteering, a sport in which athletes race toward checkpoints using map-reading and directional skills, combines the physical with the mental. For Colleran, who developed an early affinity for puzzles and the outdoors, it was a perfect fit.
“Being active, having a technical component and being in nature sparked all three of my interests,” she said. “It wasn’t just running, but a very technical sport that I could exercise my brain with.”
Colleran’s accomplishments have taken her to landscapes far afield. In 2021, she was named to the OUSA Junior World Orienteering Team that competed in Kocaeli, Turkey. Last year, as part of the World University Championships Orienteering Team, she raced through alpine glades near the city of Biel in Switzerland.
But back at UW, her horizons are no less grand. She plans to combine her varied academic interests to combat climate change, seeing computer science as a pathway for exploring technologies geared toward clean and renewable energy. Orienteering, she said, gave her an innate appreciation for nature — a chance to connect with sights and sounds only found when getting lost. A swishing streambed, the snap of a twig, wind rustling the leaves before they crunch underfoot.
“I have been lucky to explore so many places through orienteering,” she said. “Especially in forests or nature that one wouldn’t usually find themselves in.”
Academics, higher education and UW hold places of high esteem in the Colleran family. Colleran’s parents, Allen School alum John Colleran (B.S., ‘87) and UW Psychology alum Michelle Kastner (B.S., ‘88) established the John Colleran and Michelle Kastner Colleran Endowed Scholarship in 2011. The scholarship supports outstanding undergraduate students in computer science or computer engineering for whom the cost of a UW education would be a significant personal or family financial hardship, but who do not qualify for traditional need-based grants or scholarships. John has been at Microsoft in the Operating Systems Group for more than three decades. Michelle is the Allen School’s representative to the UW Foundation Board.
As for the recent competition at nationals, Colleran recognizes success as a culmination of collective grit, crediting her teammates and family for their support.
“They’re my true compass,” she said.
Anderson, for instance, overcame a sprained ankle late in the competition. Adrenaline kicked in, Colleran recalled, and powered him through the rest of the race. Anderson is a fourth-year student majoring in environmental engineering. Mihata, Colleran’s teammate on the U.S. National Team, is a first-year student intending to major in psychology.
“We have gotten to know each other as both competitors and friends,” Colleran said. “I was really glad I could organize a new group of three from UW.”
Colleran relishes the thrill of competition, the thrum and vim of pounding feet, a quickening pulse tempered by a cool head. For this student-athlete, there is crossover among her callings. Finding a path, she said, requires “collecting features” — prominent landmarks or observable characteristics that act as anchors. With enough features collected, a mental map begins to form.
Or, in other words, divide and conquer.
“Being a computer science student takes a lot of time management skills and learning how to prioritize and set schedules,” Colleran said. “Often when I feel stressed about being assigned a lot of work in a week, I like to think about breaking it down like I would an orienteering leg as it makes a large workload seem manageable.”
For now, she’ll continue to explore new frontiers, whether they’re technological or terrestrial in nature. Another example of her shared passions: This summer, she has an internship in the San Francisco Bay Area, home to the North American Orienteering Championships taking place in July.
“Keeping your cool under pressure, finding a path, navigating the unknown — I think there are a lot of lessons that the sport teaches you and that translate to being a student, especially one nearing graduation,” Colleran said. “Whatever the challenge, you have to keep going.”
The Allen School has recognized Dhruv Jain (Ph.D., ‘22) and Kuikui Liu (Ph.D., ‘22) with the William Chan Memorial Dissertation Award, which honors graduate dissertations of exceptional merit and is named in memory of the late graduate student William Chan. Jain was chosen for his work in advancing new sound awareness systems for accessibility, while Liu was selected for his work on a new framework for analyzing the Markov Chain Monte Carlo method.
Jain’s dissertation, titled “Sound Sensing and Feedback Techniques for Deaf and Hard of Hearing People,” investigated the creation and use of several sound awareness systems and explored how d/Deaf and hard-of-hearing (DHH) individuals feel about emerging sound awareness technology. These systems included HoloSound, an augmented reality system that provides real-time captioning, sound identity and sound location information to DHH users via a wearable device, and HomeSound, an Internet-of-Things system that integrates smart displays throughout the home to sense common sounds and produce a single visualization of sound activity within the household. Jain also led the team that developed SoundWatch, a smartwatch app that gives DHH individuals better awareness of incoming sounds.
“Beyond accessibility, the technical innovations in the field of sound sensing and feedback proposed in the thesis have wide applications for other high-impact domains,” Jain said. “Some of which include ecological surveys, home-automation, game audio debugging and appliance repairs.”
Allen School professor Jon E. Froehlich and Human Centered Design & Engineering professor and Allen School adjunct professor Leah Findlater co-advised Jain, whose work in the Makeability Lab helped facilitate sound accessibility through systems employing human computer interaction (HCI) and artificial intelligence (AI).
“Dhruv’s dissertation research makes fundamental advances in the design of sound sensing and feedback systems for people who are deaf or hard of hearing,” Froehlich said. “Throughout his dissertation work, Dhruv has worked closely with the DHH community to understand diverse needs and evaluate his systems, including through large online surveys, interviews and field deployments.”
Jain’s own experiences as a DHH individual informed his research and helped shape his focus on the user experience.
“Dhruv’s dissertation not only exemplifies the human-centered design process in the creation of accessible technologies but also makes transformative technical innovations in integrating AI and HCI to improve information access,” Froehlich added. “As a testament to its impact, his work has received multiple paper awards and a Microsoft Research Dissertation Grant, and SoundWatch has been released and downloaded by over 2,000 Android watch users worldwide.”
Jain graduated in July and joined the University of Michigan’s Computer Science and Engineering department as a professor in September. He is also affiliated with the University of Michigan’s School of Information and Department of Family Medicine.
“I am immensely grateful to the Allen School staff and faculty in supporting me throughout my research journey,” Jain said. “Especially my advisors Jon Froehlich and Leah Findlater, committee members Jennifer Mankoff, Jacob Wobbrock and Richard Ladner, and staff members Elise Dorough, Elle Brown, Emma Gebben, Sandy Kaplan, Aaron Timss, Hector Rodriguez and Chiemi Yamaoka.”
Liu’s dissertation, titled “Spectral Independence: A New Tool to Analyze Markov Chains,” revolutionized the classical analysis of the Markov Chain Monte Carlo (MCMC) method. Probability distributions arising in fields such as physics, epidemiology and data privacy are often high-dimensional and immensely complex, with exponentially or even infinitely large domains. As a result, manipulating the data directly becomes impractical: the time needed to enumerate all possible outcomes would exceed the age of the universe.
The MCMC method, which uses random sampling to efficiently estimate otherwise unruly statistics, attempts to tackle this problem. A Markov chain is a random process that wanders through the space of possible outcomes, with each step depending only on the current state; run long enough, it yields samples from the target distribution. Markov chains are used in a variety of fields because they are easy to implement for high-dimensional sampling problems.
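As a textbook illustration of the method, the sketch below runs Glauber dynamics for the Ising model on a cycle — one of the statistical-physics models Liu’s techniques address — resampling one spin at a time from its conditional distribution. It is a generic example, not code from the dissertation.

```python
# A textbook illustration of MCMC: Glauber dynamics for the Ising model on a
# cycle of n sites. Each step resamples one spin from its conditional
# distribution given its two neighbors. Generic example, not dissertation code.
import math
import random

def glauber_ising(n=50, beta=0.4, steps=20000, seed=0):
    rng = random.Random(seed)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        s = spins[(i - 1) % n] + spins[(i + 1) % n]  # sum of the two neighbors
        # Conditional probability that spin i is +1 given its neighbors.
        p_plus = math.exp(beta * s) / (math.exp(beta * s) + math.exp(-beta * s))
        spins[i] = 1 if rng.random() < p_plus else -1
    return spins

if __name__ == "__main__":
    sample = glauber_ising()
    print("magnetization of one sample:", sum(sample) / len(sample))
```

The hard question, and the one spectral independence speaks to, is how many of these single-site updates are needed before the chain’s output looks like a genuine sample from the target distribution.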
But such chains remain notoriously difficult to analyze. Liu’s dissertation introduced spectral independence, a framework that makes it possible to prove when these chains converge quickly, turning complex, sometimes chaotic dynamics into problems with elegant solutions.
“Beyond practical motivations, the framework we developed also has intimate connections with beautiful and deep mathematics,” Liu said. “In particular, we also aimed to settle some of the longstanding conjectures at the intersection of pure mathematics, physics, and theoretical computer science — for example, counting certain fundamental combinatorial structures called ‘bases of matroids,’ and sampling from the hardcore gas and Ising models in statistical physics.”
“Right now, many experts are trying to absorb and employ Kuikui’s machinery to solve their own research problems,” said Allen School professor Shayan Oveis Gharan, Liu’s advisor. “I expect to see these techniques used in areas further away from computer science, such as physics, chemistry or applied mathematics and perhaps even in the social sciences soon.”
Together, Liu and Oveis Gharan and their co-authors earned a Best Paper Award in 2019 from the Association for Computing Machinery’s Symposium on the Theory of Computing (STOC) by presenting a novel approach for counting the bases of matroids. Liu was a first-year Ph.D. student at the time.
Since then, the pair have collaborated on several other projects involving spectral independence, mathematics and statistical physics.
“In my opinion, Kuikui’s thesis is one of the deepest and strongest dissertations to have been produced in all of computer science in the last year, combining beautiful and insightful mathematical proofs with high-impact applications,” Oveis Gharan added. “The interdisciplinary aspect of his thesis makes the results applicable and important to many fields beyond computer science and I am sure that as more scientists learn about it, they will find ways to exploit Kuikui’s techniques in their own research.”
Liu will join the MIT computer science department as a professor in the fall of 2023.
“Thank you so much to the Allen School!” Liu said. “I am immensely grateful for this recognition, and even more so to my mentors Shayan Oveis Gharan and Anna Karlin, my collaborators, our theory group and family and friends for the nurturing environment. It is a reminder of how fortunate I am to be able to work with such incredible researchers. I am honored to be a part of the Allen School community and will miss it dearly.”
As people engage artificial intelligence to solve problems at a human level, that growing reliance has exposed difficulties in the way language models learn from data. Often, the models memorize the peculiarities of a dataset rather than solving the underlying task for which they were developed. The problem has more to do with data quality than size, meaning it cannot be corrected by simply making the dataset larger.
Enter Alisa Liu, a Ph.D. student who works with Yejin Choi and Noah Smith in the Allen School’s Natural Language Processing group. Liu seeks to overcome shortcomings in how datasets are constructed by developing new methods of human-machine collaboration to improve the reliability of resulting models. In developing this new framework, Liu also aims to root out social biases that are present within the datasets and therefore reproduced by these models.
“I hypothesize that there is great potential in leveraging language models in a controlled way to aid humans in the dataset creation process,” Liu said.
Liu’s interest in the importance of data was sparked during her time as an undergraduate at Northwestern University. There, Liu felt drawn to the possibilities that machine learning offered to harness the potential of data and develop productive tools. She soon discovered that applying AI to language, music and audio research agendas often does not get the expected results because the external and social knowledge needed to solve certain tasks cannot easily be encoded into a dataset. And even high-performing models were not always useful for end user applications. This experience led Liu to ask questions about how researchers know whether their systems have learned that which they were asked to learn, what types of prior knowledge must be encoded in datasets by researchers, and how researchers can create meaningful tools for real people.
“I saw the importance and potential of AI that can reason about, be informed by, and serve the society in which it exists,” Liu explained.
In 2020, Liu began her graduate studies at the Allen School, where she is challenging previous modes of thinking in her field and incorporating human-centered design approaches to explore how AI can serve society. She earned a 2022 NSF Graduate Research Fellowship from the National Science Foundation to advance this work.
“Alisa’s recent work has really changed my thinking and that of many others in our group about the most impactful ways to use today’s language models,” said Smith, Amazon Professor of Machine Learning at the Allen School and senior director of NLP research at the Allen Institute for AI. “She brings so much creativity and independent thinking to our collaboration. It’s inspiring!”
In collaboration with AI2, Liu developed WANLI, which stands for “Worker and AI Collaboration for Natural Language Inference.” She was lead author of the paper, published in the Findings of last year’s Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), that introduced a novel approach to building datasets through a combination of machine generation and human editing. To demonstrate it, the researchers developed methods to automatically identify challenging reasoning patterns in existing data, then had GPT-3 generate new, related examples that were subsequently edited by human crowdworkers. The results point to a potential for rethinking natural language generation techniques as well as reenvisioning the role of humans in the dataset creation process.
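In code, the worker-and-AI loop might look roughly like the sketch below: sample seed examples, prompt a generator model for new, related ones and queue the outputs for human editing. The generator call is a stub and the seed selection is simplified; WANLI’s actual pipeline identifies challenging reasoning patterns in existing data and uses GPT-3 for generation.

```python
# Sketch of a worker-and-AI dataset loop: take seed NLI examples, prompt a
# generator model for new, related ones, and queue the outputs for human
# editing. The generator is a stub and the seed selection is simplified;
# WANLI's actual pipeline uses dataset heuristics and GPT-3.
import random

SEED_EXAMPLES = [
    {"premise": "The council passed the bill after a long debate.",
     "hypothesis": "The bill failed to pass.", "label": "contradiction"},
    {"premise": "A scientist typed a description of the virus into a search bar.",
     "hypothesis": "Someone searched for information.", "label": "entailment"},
]

def build_prompt(examples):
    """Format a few-shot prompt asking the model for one more example."""
    lines = ["Write a new premise/hypothesis pair with the same reasoning pattern."]
    for ex in examples:
        lines.append(f"Premise: {ex['premise']}")
        lines.append(f"Hypothesis: {ex['hypothesis']} ({ex['label']})")
    lines.append("Premise:")
    return "\n".join(lines)

def generate(prompt):
    """Stub standing in for a call to a large language model."""
    return "Premise: <model-written premise>\nHypothesis: <model-written hypothesis>"

def propose_examples(pool, n_prompts=3, shots=2, seed=0):
    rng = random.Random(seed)
    proposals = []
    for _ in range(n_prompts):
        shots_sample = rng.sample(pool, k=min(shots, len(pool)))
        prompt = build_prompt(shots_sample)
        proposals.append({"prompt": prompt,
                          "generation": generate(prompt),
                          "status": "needs_human_edit"})
    return proposals

if __name__ == "__main__":
    for item in propose_examples(SEED_EXAMPLES):
        print(item["generation"], "->", item["status"])
```

The division of labor is the point: the model supplies volume and variety, while human editors supply the judgment that keeps the resulting examples correct.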
“Humans are very good at coming up with examples that are correct, but it is challenging to achieve sufficient diversity across examples by hand at scale,” said Liu. “WANLI offers the best of both worlds. It couples the generative strength of AI models with the evaluative strength of humans to build a large and diverse set of high-quality examples, and do it efficiently. The next step will be to apply our approach to problems bottlenecked by a lack of annotated datasets, especially for non-English languages.”
“Alisa’s research has been extremely well received by the research community, drawn a lot of interest and inspired thought-provoking discussions,” reflected Choi, Brett Helsel Career Development Professor at the Allen School and senior research director of Mosaic at AI2. “Her innovative work is already making an impact on the field.”
In addition to her ambitious research agenda, Liu places mentorship and service at the center of her endeavors at the Allen School. Notably, Liu mentors UW undergraduates who are interested in doing research in NLP. And having begun her Ph.D. remotely as the COVID pandemic surged, Liu found other ways to support her fellow students as co-chair of the Allen School’s CARE committee, which offers a peer support network to graduate students. She also helped coordinate the Allen School’s visit days program for prospective graduate students and helped organize the Allen School’s orientation for new graduate students once they arrive on campus.
“I chose to pursue a Ph.D. not just because I enjoy thinking about research problems,” said Liu, “but because I knew I would be in a good position to direct my work toward positive applications and to bring more diverse voices into the community.”