
Goldwater Scholar Alex Mallen aims to make sense of the world — and make a positive impact — through research in beneficial AI


As a high school student, computer science major Alex Mallen had what he describes as a “rough” introduction to research. Fortunately, the Bellevue, Washington, native didn’t let that experience deter him at the University of Washington, where as a freshman he decided to try again as a step toward pursuing a Ph.D. after graduation. Mallen’s persistence has paid off in the form of multiple positive research experiences that have solidified his plans to enroll in graduate school and, most recently, a prestigious Goldwater Scholarship to support his goal of helping to build artificial intelligence that people can trust to be beneficial.

Of his renewed focus on research, Mallen describes how he cast a wide net and kept an open mind — useful advice for any student hoping to incorporate time in the lab as part of their own undergraduate experience.

“I reached out to professors, postdocs and graduate students whose research I found interesting,” he explained. “I also enrolled in a graduate class in an area I was interested in with my mentor, Professor Kutz.”

Nathan Kutz is a professor in the Department of Applied Mathematics and director of the AI Institute in Dynamic Systems at the UW. In collaboration with Kutz and postdoc Henning Lange, Mallen contributed to the development of Deep Probabilistic Koopman (DPK), a simple and computationally efficient technique that enables probabilistic forecasting of complex phenomena thousands of timesteps into the future with a reasonable degree of accuracy. The new class of models, which leverages recent advances in linear Koopman operator theory, returns a probability distribution whose parameters vary quasi-periodically with time. Mallen and his co-authors demonstrated how their approach could be applied effectively in a variety of domains, from forecasting energy demand, to predicting atmospheric pollution levels, to modeling a mouse’s cortical function for neuroscience research.
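The quasi-periodic modeling idea can be illustrated with a toy example. The sketch below is a loose simplification rather than the authors’ DPK implementation: it fits a Gaussian whose mean is a linear function of sinusoidal features of time, then forecasts far beyond the training window with a predictive interval. The series, period and noise scale are invented for the example.

```python
import numpy as np

def fourier_features(t, periods, n_harmonics=2):
    """Sinusoidal basis capturing quasi-periodic structure in time."""
    cols = [np.ones_like(t, dtype=float)]
    for p in periods:
        for k in range(1, n_harmonics + 1):
            cols.append(np.sin(2 * np.pi * k * t / p))
            cols.append(np.cos(2 * np.pi * k * t / p))
    return np.stack(cols, axis=1)

# Toy "energy demand" series: a daily (24-step) cycle plus noise.
rng = np.random.default_rng(0)
t_train = np.arange(500)
y_train = 10 + 3 * np.sin(2 * np.pi * t_train / 24) + rng.normal(0, 0.5, 500)

# Fit the distribution's mean as a linear function of the periodic basis.
X = fourier_features(t_train, periods=[24])
coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)
sigma = np.std(y_train - X @ coef)  # constant noise scale, for the sketch

# Forecast thousands of steps ahead: a mean plus a 95% predictive interval.
t_future = np.arange(500, 3000)
mu = fourier_features(t_future, periods=[24]) @ coef
lower, upper = mu - 1.96 * sigma, mu + 1.96 * sigma
print(mu[:3].round(2), round(float(sigma), 2))
```

Because the fitted parameters depend only on (periodic functions of) time, the forecast does not drift or compound errors the way a step-by-step autoregressive model would, which is what makes horizons of thousands of timesteps tractable.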

“I began working with Alex when he was just a freshman, and I’m not sure how you could find someone as talented and creative and productive as he has been so early in his career,” said Kutz. “He spearheaded our work on DPK, for which he provided critical missing theory for how nonstationary data relate to building Koopman embeddings to transform nonlinear dynamical systems into linear dynamical systems. When we applied this work to a challenge data set for power grid monitoring, his new method placed within the top three — whereas most of the other algorithms had been improved over several years. This is but one illustration of the quality of his work and his potential for transformative impact.”

Mallen subsequently contributed directly to neuroscience research, working with members of the Allen Institute for Brain Science. There, he helped to construct and analyze the dataset underpinning the MICrONS Explorer, which offers a comprehensive visualization of the mouse visual cortex. The team developed the tool as part of the Machine Intelligence from Cortical Networks Program to pave the way for a new generation of machine learning algorithms based on an enhanced understanding of “the algorithms of the brain.” More recently, Mallen has been collaborating with a group of researchers based predominantly in Europe and members of the grassroots research collective EleutherAI on a project to direct and characterize the behavior of large pretrained transformers, such as GPT-3, using the example of a large transformer pretrained on human chess games.

A visualization of mouse cortex nuclei from the MICrONS Explorer gallery.

Mallen aims to combine his passion for research with a commitment to effective altruism, which espouses an evidence-based approach to developing solutions to society’s most pressing problems. To that end, he and other members of the UW Effective Altruism group are working to build a community of people on campus who are looking to apply their expertise to do good.

He believes the approach could be particularly effective for addressing the outsized influence AI could have on society in the future.

“It seems reasonably likely that AI will have a very large impact on the world in the next hundred years, and that this shift will have a large and lasting effect on people’s lives for many generations,” Mallen observed. “The effects of AI systems we design are in theory predictable and controllable, but the challenge of properly steering them gets harder as they become more capable.

“I hope to tackle some of the general problems that may arise when training capable AI systems, such as misalignment with human values,” he continued. “We can already see some of these issues in current algorithms that produce toxic or biased output, or social media that harm discourse and mental health by overoptimizing for engagement.”

Mallen is one of two UW students to be named 2022 Goldwater Scholars by the Barry Goldwater Scholarship & Excellence in Education Foundation. Sharlene Shirali, a junior majoring in neuroscience, joined him among this year’s honorees, who are chosen for their potential to make significant research contributions in the natural sciences, engineering or mathematics.

While he is interested in many disciplines, Mallen chose to pursue computer science at the Allen School as an effective means for making sense of the world around him — and for achieving the altruistic impact that he seeks.

“I’m really interested in understanding things — society, philosophy, math, the world — but also I want to do something useful to other people,” Mallen said. “I think computer science is a really important tool to do both.”

Read the Goldwater Foundation announcement here, and the UW Undergraduate Academic Affairs announcement here.

Congratulations, Alex!

With CoAI, UW researchers demonstrate how predictive AI can benefit patient care — even on a budget

Photo: Mark Stone/University of Washington

Artificial intelligence tools have the potential to become as essential to medical research and patient care as centrifuges and x-ray machines. Advances in high-accuracy predictive modeling can enable providers to analyze a range of patient risk factors to facilitate better health care outcomes — from preventing the onset of complications during surgery, to assessing the risk of developing various diseases.

When it comes to emergency services or critical care settings, however, the potential benefits of AI in the treatment room are often outweighed by the costs. And in health care, cost isn’t just a matter of money.

“With sufficient data and the right parameters, AI models perform quite well when asked to predict clinical outcomes. But in this case, ‘sufficient data’ often translates as an impractical number of patient features to collect in many care settings,” noted Gabriel Erion (Ph.D., ‘21), who is combining an M.D. with a Ph.D. in computer science as part of the University of Washington’s Medical Scientist Training Program. “The cost, in terms of the time and effort required to collect that volume and variety of data, would be much too high in an ambulance or intensive care unit, for example, where every second counts and responders need to prioritize patient care.”

But thanks to Erion and collaborators at the UW’s Paul G. Allen School and the UW School of Medicine, providers needn’t make a choice between caring directly for patients and leveraging advances in AI to identify the interventions with the highest likelihood of success. In a paper published in Nature Biomedical Engineering, the team presents CoAI, short for Cost-Aware Artificial Intelligence, a new framework for dramatically reducing the time, effort and resources required to predict patient outcomes and inform treatment decisions without sacrificing the accuracy of more cost-intensive tools. 

To reduce the number of clinical risk factors required to be collected in real time, the researchers trained CoAI on a massive dataset combining patient features, prediction labels, expert annotations of feature cost, and a budget representing total acceptable cost. They applied Shapley values to calculate a quantitative measure of the predictive power of every single feature in the dataset; since Shapley values are additive, this approach enables CoAI to calculate the importance of a group of features relative to their cost. CoAI then recommends which subset of features would enable the most accurate prediction of patient risk within a specified budget. And some of those budgets are very tight, indeed.
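The budget-constrained selection step can be sketched as a greedy knapsack over precomputed importance scores. This is a simplified illustration of the idea rather than the released CoAI code, and the feature names, importance values (standing in for Shapley-derived scores) and per-feature time costs are all hypothetical.

```python
def select_features(importance, cost, budget):
    """Greedily pick features by importance-per-unit-cost until the
    acquisition budget is exhausted (a simplified sketch of CoAI's
    budget-constrained selection, not the released implementation)."""
    order = sorted(importance, key=lambda f: importance[f] / cost[f],
                   reverse=True)
    chosen, spent = [], 0.0
    for f in order:
        if spent + cost[f] <= budget:
            chosen.append(f)
            spent += cost[f]
    return chosen, spent

# Hypothetical prehospital features, with collection costs in seconds.
importance = {"heart_rate": 0.30, "blood_pressure": 0.25,
              "gcs_score": 0.20, "lactate": 0.15, "age": 0.10}
cost = {"heart_rate": 5, "blood_pressure": 15,
        "gcs_score": 20, "lactate": 120, "age": 2}

chosen, spent = select_features(importance, cost, budget=50)
print(chosen, spent)
```

Here the greedy pass skips the expensive lactate measurement entirely, spending 42 of the 50 allotted seconds on the four remaining features. Because the importance scores are additive, as Shapley values are, ranking groups of features against their cost stays cheap even for large feature sets.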

Gabriel Erion (left) and Joseph Janizek

“Fifty seconds. That’s how long first responders told us they can spare to score patient risk factors when they are in the midst of performing a life-saving intervention,” said co-senior author and professor Su-In Lee, who leads the Allen School’s AIMS Lab focused on integrating AI and the biomedical sciences. “CoAI deals with this constraint by prioritizing a subset of features to gather while achieving the same or better accuracy in its predictions as other, less cost-aware models. And it is generalizable to a variety of care settings, such as cancer screening, where different feature costs come into play — including financial considerations.”

As co-author Joseph Janizek (Ph.D., ‘22) explained, CoAI has a significant advantage over even other cost-sensitive methods owing to its efficiency and flexibility.

“A notable difference between CoAI and other approaches is its robustness to ‘cost shift,’ wherein features become more or less expensive after the model has been trained. Since our framework decouples feature selection from training, CoAI continues to perform well even when this shift occurs,” noted Janizek, who is also pursuing his M.D. in combination with a Ph.D. from the Allen School via the MSTP. “And because it’s model-agnostic, CoAI can be used to adapt any predictive AI system to be cost-aware, enabling accurate predictions at lower cost within a wide variety of settings.”

Janizek and his AIMS Lab colleagues teamed up with clinicians at the UW School of Medicine and first responders with Airlift Northwest, American Medical Response and the Seattle Fire Department to validate the CoAI approach. In a series of experiments, the researchers evaluated CoAI’s performance compared to typical AI models in predicting the increased bleeding risk of trauma patients en route to the hospital and the in-hospital mortality risk of critical care patients in the ICU. They also surveyed first responders and nurses to understand how patient risk scoring works in practice — hence the aforementioned 50-second rule. In the case of trauma response, their experiments showed that CoAI dramatically reduces the cost of data acquisition — by around 90% — while still achieving levels of accuracy comparable to other, more cost-intensive approaches. They achieved similar results for the inpatient critical care setting.

According to co-senior author Dr. Nathan White, associate professor of Emergency Medicine at the UW School of Medicine, these results speak to what is possible when researchers break down barriers between disciplines and prioritize how new technologies will be put to real-world use.

Su-In Lee (left) and Dr. Nathan White

“A key contributor to the success of this project was the great synergy afforded by working across traditional silos of medicine and engineering,” said White. “AI is an important component of healthcare today, but we must always be aware of the clinical situations where AI is being used and seek out input from frontline health care workers involved directly in patient care. This will ensure that AI is always working optimally for the patients it intends to benefit.”

Lee agreed, noting that the UW’s MSTP serves to enhance this synergy with each new student who enters the program.

“Gabe and Joe were the first UW MSTP students to earn their Ph.D. in the Allen School. They exemplify the best of both worlds, combining rigorous computer science knowledge with hands-on clinical expertise,” Lee said. “This nexus of knowledge, spanning two traditionally disparate disciplines, will be essential to our future progress in developing AI as an effective and efficient tool used in biomedical research and treatment decisions.”

Dr. White’s colleagues in the Department of Emergency Medicine, Drs. Richard Utarnachitt, Andrew McCoy and Michael Sayre, along with Dr. Carly Hudelson of the Division of General Internal Medicine, are co-authors of the paper. An early preview of the project earned the Madrona Prize sponsored by Madrona Venture Group at the Allen School’s annual research day in 2019. The research was funded by the National Science Foundation, American Cancer Society, and National Institutes of Health.

Read the paper in Nature Biomedical Engineering.

NLP for all: Professor and 2022 Sloan Research Fellow Yulia Tsvetkov is on a quest to make natural language tools more equitable, inclusive and socially aware


Less than a year after her arrival at the University of Washington, professor Yulia Tsvetkov is making her mark as the newest member of the Allen School’s Natural Language Processing group. As head of the Tsvetshop — a clever play on words that would likely stymie your typical natural language model — Tsvetkov draws upon elements of linguistics, economics, and the social and political sciences to develop technologies that not only represent the leading edge of artificial intelligence and natural language processing, but also benefit users across populations, cultures and languages. Having recently earned a 2022 Sloan Research Fellowship from the Alfred P. Sloan Foundation, Tsvetkov is looking forward to adding to her record of producing new tools and techniques for making AI and NLP more equitable, inclusive and socially aware.

“One of the goals of my work is to uncover hidden insights into the relationship between language and biases in society and to develop technologies for identifying and mitigating such bias,” said Tsvetkov. “I also aim to build more equitable and robust models that reflect the needs and preferences of diverse users, because many speakers of diverse language varieties are not well-served by existing tools.”

Her focus at the intersection of computation and social sciences has enabled Tsvetkov to make inroads when it comes to protecting the integrity of information beyond “fake news” by identifying more subtle forms of media manipulation. Even with the growing attention being paid to identifying and filtering out misleading content, tactics such as distraction, propaganda and censorship can be challenging for automated tools to detect. To overcome this challenge, Tsvetkov has spearheaded efforts to develop capabilities for discerning “the language of manipulation” automatically and at scale. 

In one project, Tsvetkov and her colleagues devised computational approaches for detecting subtle manipulation strategies in Russian newspaper coverage by applying agenda-setting and framing — two concepts from political science — to tease out how one outlet’s decisions about what to cover and how were used to distract readers from economic conditions. She also produced a framework for examining the spread of polarizing content on social media based on an analysis of Indian and Pakistani posts following the 2019 terrorist attacks in Kashmir. Given the growth in AI-generated text, Tsvetkov has lately turned her attention to semantic forensics, including the analysis of the types of misinformation and factual inconsistencies produced by large AI models with a view to developing interpretable deep learning approaches that will control for factuality and other traits of machine-generated content. 

“Understanding the deeper meaning of human- or machine-generated text, the writer’s intent, and what emotional reactions the text is likely to evoke in its readers is the next frontier in NLP,” said Tsvetkov. “Language technologies that are capable of doing such fine-grained analysis of pragmatic and social meaning will be critical for combating misinformation and opinion manipulation in cyberspace.”

Another of the ways in which Tsvetkov’s work has contributed to researchers’ understanding of the interplay between language and social attitudes is by surfacing biases in narrative text targeting vulnerable audiences. NLP researchers — including several of Tsvetkov’s Allen School colleagues — have demonstrated effective techniques for identifying toxic content online, and yet more subtle forms continue to evade moderation. Tsvetkov has been at the forefront of developing new datasets, algorithms and tools grounded in social psychology to detect discrimination, at scale and across multiple languages, based on gender, race and/or sexual orientation that manifests in online text and conversations. 

“Although there are tools for detecting hate speech, most harmful web content remains hidden,” Tsvetkov noted. “Such content is hard to detect computationally, so it propagates into downstream NLP tools that then serve to amplify systematic biases.”

One approach that Tsvetkov has employed to great effect is an expansion of contextual affective analysis (CAA), a technique for examining how people are portrayed along dimensions of power, agency and sentiment, to multilingual settings in an effort to understand how narrative text across different languages reflects cultural stereotypes. After applying a multilingual model to English, Spanish and Russian Wikipedia entries about prominent LGBTQ figures in history, Tsvetkov and her team found systematic differences in phrasing that reflected social biases. For example, entries about the late Alan Turing, who was persecuted for his homosexuality, described how he “accepted” chemical castration (English), “chose” it (Spanish), or “preferred” it (Russian) — three verbs with three very different connotations as to Turing’s agency, power and sentiment at the time. Tsvetkov applied similar analyses to uncover gender bias in media coverage of #MeToo and assist the Washington Post in tracking racial discrimination in China, and has since built upon this work to produce the first intersectional analysis of bias in Wikipedia biographies that examines gender disparities beyond cisgender women alongside racial disparities.

The fact that most existing NLP tools are grounded in a specific variant of English has been a driving force in much of Tsvetkov’s research. 

“We researchers often say that a model’s outputs are only as good as its inputs,” Tsvetkov noted. “For the purposes of natural language models, those inputs have mostly been limited to a certain English dialect — but there are multiple English dialects and over 6,000 languages besides English spoken around the world! That’s a significant disconnect between current tools and the billions of people for whom English is not the default. We can’t achieve NLP for all without closing that gap.”

To that end, Tsvetkov has recently turned her attention to developing new capabilities for NLP technologies to adapt to multilingual users’ linguistic proficiencies and preferences. For example, she envisions tools that can match the ability of bilingual and non-native speakers of English and Spanish to switch fluidly between the two languages in conversation, often within the same sentence. Her work has the potential to bridge the human-computer divide where, currently, meaning and context can get lost in translation.

“Yulia is intellectually fearless and has a track record of blending technical creativity with a rigorous understanding of the social realities of language and the communities who use it,” said Magdalena Balazinska, professor and director of the Allen School. “Her commitment to advancing language technologies that adapt to previously ignored users sets her apart from her research peers. By recognizing that AI is not only about data and math, but also about people and societies, Yulia is poised to have an enormous impact on the field of AI and beyond.”

Tsvetkov joined the Allen School last July after spending four years on the faculty of Carnegie Mellon University. She is one of two UW researchers who were honored by the Sloan Foundation in its class of 2022 Fellows, who are chosen based on their research accomplishments and creativity as rising leaders in selected scientific or technical fields. Briana Adams, a professor in the UW Department of Biology, joined Tsvetkov among a total of 118 honorees drawn from 51 institutions across the United States and Canada.

Read the Sloan Foundation press release here and a related UW News release here.

Congratulations, Yulia!

Rebekka Coakley contributed to this story.

Allen School and AI2 researchers paint the NeurIPS conference MAUVE and take home an Outstanding Paper Award


Recent advances in open-ended text generation could enable machines to produce text that approaches or even mimics that generated by humans. However, evaluating the quality and accuracy of these large-scale models has remained a significant computational challenge. Recently, researchers at the Allen School and Allen Institute for AI (AI2) offered a solution in the form of MAUVE, a practical tool for assessing modern text generation models’ output compared to human-generated text that is both efficient and scalable. The team’s paper describing this new approach, “MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers,” earned an Outstanding Paper Award at the Conference on Neural Information Processing Systems (NeurIPS 2021) in December.

The goal of open-ended text generation is to achieve a level of coherence, creativity, and fluency that mimics human text. Because the task is, as the name suggests, open-ended, there is no correct answer; this makes evaluation of a model’s performance more difficult than with more concrete tasks such as translation or summarization. MAUVE solves this problem by employing information divergence frontiers — heretofore a little-used concept in NLP — to reduce the comparison between model-generated text and human text to a computationally tractable yet effective measurement.

“For open-ended text generation to make that next leap forward, we need to be able to evaluate a model’s performance on two key aspects that are prone to error: how much weight it gives to sequences that truly resemble human text, as opposed to gibberish, and whether the generated text exhibits the variety of expression we would expect to see from humans, instead of boring or repetitive text that reads like a template,” explained lead author Krishna Pillutla, a Ph.D. candidate in the Allen School. “The beauty of MAUVE is that it enables us to quantify both, using a simple interface and an approach that is easily scaled to whatever sized model you’re working with.”

Left to right: Krishna Pillutla, Swabha Swayamdipta, and Zaid Harchaoui

MAUVE computes the divergence between the model distribution and the target distribution of human text for the above-mentioned pair of criteria in a quantized embedding space. It then summarizes the results as a single scalar that captures the gap between the machine-generated and human text. To validate MAUVE’s effectiveness, the team tested the tool on three open-ended text completion tasks involving web text, news articles and stories. The results of these experiments confirmed that MAUVE reliably identifies the known properties of machine-generated text, aligns strongly with human judgments, and scales naturally with model size — and does so with fewer restrictions than existing distributional evaluation metrics. And whereas other language modeling tools and statistical measures typically capture only a single statistic, corresponding to just one point on the divergence curve, MAUVE offers expanded insights into a model’s performance.
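The divergence-frontier computation itself can be sketched in a few lines. The toy below uses two hand-made histograms standing in for distributions over a quantized embedding space; it illustrates the idea of summarizing a curve of mixture divergences by the area beneath it, and is not the released MAUVE package (the scaling constant and bin values are arbitrary).

```python
import numpy as np

def kl(p, q):
    """KL divergence between discrete distributions (q must cover p's support)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def frontier_score(p, q, c=5.0, n_points=99):
    """Area under the divergence frontier of p and q, in (0, 1]; 1 = identical."""
    pts = [(0.0, 1.0), (1.0, 0.0)]            # anchor the curve's endpoints
    for lam in np.linspace(0.01, 0.99, n_points):
        r = lam * p + (1 - lam) * q           # mixture distribution
        # One KL penalizes mass the model puts where humans don't (gibberish);
        # the other penalizes human variety the model fails to cover.
        pts.append((np.exp(-c * kl(q, r)), np.exp(-c * kl(p, r))))
    pts.sort(key=lambda t: (t[0], -t[1]))     # sweep the curve left to right
    return sum((x2 - x1) * (y1 + y2) / 2      # trapezoidal rule
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

# Toy histograms over, say, k-means bins of text embeddings.
human = np.array([0.25, 0.25, 0.25, 0.25])
model = np.array([0.40, 0.30, 0.20, 0.10])

print(frontier_score(human, human))
print(frontier_score(human, model))
```

Identical distributions trace the full unit square and score 1, while any gap between the model and human histograms pulls the frontier, and the score, inward.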

“MAUVE enables us to identify the properties of machine-generated text that a good measure should capture,” noted co-author Swabha Swayamdipta, a postdoctoral investigator at AI2. “This includes distribution-level information that enables us to understand how the quality of output changes based on the size of the model, the length of text we are asking it to generate, and the choice of decoding algorithm.”

While Swayamdipta and her colleagues designed MAUVE with the goal of improving the quality of machine-generated text — where “quality” is defined according to how closely it resembles the human-authored kind — they point out that its capabilities also provide a foundation for future work on how to spot the difference. 

“As with every new technology, there are benefits and risks,” said senior author Zaid Harchaoui, a professor in the University of Washington’s Department of Statistics and adjunct professor in the Allen School. “As the gap narrows between machine and human performance, having tools like MAUVE at our disposal will be critical to understanding how these more sophisticated emerging models work. The NLP community can then apply what we learn to the development of future tools for distinguishing between content generated by computers versus that which is produced by people.”

Clockwise from top left: Rowan Zellers, John Thickstun, Sean Welleck and Yejin Choi

Additional co-authors of the paper introducing MAUVE include Allen School Ph.D. student Rowan Zellers, postdoc Sean Welleck, alumnus John Thickstun (Ph.D., ‘21) — now a postdoc at Stanford University — and Yejin Choi, the Brett Helsel Career Development Professor in the Allen School and a senior research manager at AI2. The team received one of six Outstanding Paper Awards presented at NeurIPS 2021, which are chosen based on their “clarity, insight, creativity, and potential for lasting impact.”

Members of the team also studied the statistical aspects of MAUVE in another paper simultaneously published at NeurIPS 2021. Together with Lang Liu, a Ph.D. candidate in statistics at the UW, and Allen School professor Sewoong Oh, they established bounds on how many human-written and machine-generated text samples are necessary to accurately estimate MAUVE.

Read the research paper here and the NeurIPS award announcement here. Explore the MAUVE tool here.

Congratulations to the entire team!

Allen School’s Luke Zettlemoyer elected Fellow of the Association for Computational Linguistics for expanding the frontiers of natural language processing


Luke Zettlemoyer, a professor in the Allen School’s Natural Language Processing group and a research director at Meta AI, was recently elected a Fellow of the Association for Computational Linguistics (ACL) for “significant contributions to grounded semantics, semantic parsing, and representation learning for natural language processing.” Since he arrived at the University of Washington in 2010, Zettlemoyer has focused on advancing the state of the art in NLP while expanding its reach into other areas of artificial intelligence such as robotics and computer vision.

Zettlemoyer broke new ground as a Ph.D. student at MIT, where he advanced the field of semantic parsing through the application of statistical techniques to natural language problems. He and his advisor, Michael Collins, devised the first algorithm for automatically mapping natural language sentences to logical form by incorporating tractable statistical learning methods — specifically, the novel application of a log-linear model — in a combinatory categorial grammar (CCG) with integrated semantics. He followed up that work, for which he received the Best Paper Award at the Conference on Uncertainty in Artificial Intelligence (UAI 2005), by developing techniques for mapping natural language instructions to executable actions through reinforcement learning that rivaled the performance of supervised learning methods. Those results earned him another Best Paper Award with MIT colleagues, this time from the Association for Computational Linguistics (ACL 2009).

After he arrived at the Allen School, Zettlemoyer continued pushing the state of the art in semantic parsing by introducing the application of weak supervision and the use of neural networks, among other innovations. For example, he worked with student Yoav Artzi (Ph.D., ‘15) on the development of the first grounded CCG semantic parser capable of jointly reasoning about meaning and context to execute natural language instructions with limited human intervention. Later, Zettlemoyer teamed up with Allen School professor Yejin Choi, postdoc Ioannis Konstas, and students Srinivasan Iyer (Ph.D., ‘19) and Mark Yatskar (Ph.D., ‘17) to introduce Neural AMR, the first successful sequence-to-sequence model for parsing and generating text via Abstract Meaning Representation, a useful technique for applications ranging from machine translation to event extraction. Previously, the use of neural network models with AMR was limited due to the expense of annotating the training data; Zettlemoyer and his co-authors solved that challenge by combining a novel pretraining approach with preprocessing of the AMR graphs to overcome sparsity in the data while reducing complexity.

Question answering is another area of NLP where Zettlemoyer has made multiple influential contributions. For example, the same year he and his co-authors presented Neural AMR at ACL 2017, Zettlemoyer and Allen School colleague Daniel Weld worked with graduate students Mandar Joshi and Eunsol Choi (Ph.D., ‘19) to introduce TriviaQA, the first large-scale reading comprehension dataset that incorporated full-sentence, organically generated questions composed independently of a specific NLP task. According to another Allen School colleague, Noah Smith, Zettlemoyer’s vision and collaborative approach are a powerful combination that has enabled him to achieve a series of firsts while steering the field in exciting new directions.

“Simply put, Luke is one of natural language processing’s great pioneers,” said Smith. “From his graduate work on semantic parsing, to a range of contributions around question answering, to his extremely impactful work on large-scale representation learning, he’s shown foresight and also the ability to execute on his big ideas and the charisma to bring others on board to help.”

One of those big ideas Smith cited — large-scale representation learning — went on to become ubiquitous in NLP research. In 2018, Zettlemoyer, students Christopher Clark (Ph.D., ‘20) and Kenton Lee (Ph.D., ‘17), and collaborators at the Allen Institute for AI (AI2) presented ELMo, which demonstrated pretraining as an effective tool for enabling a language model to acquire deep contextualized word representations that could be incorporated into existing models and fine-tuned for a range of NLP tasks. ELMo, which is short for Embeddings from Language Models, satisfied the dual challenges of modeling the complex characteristics of word use such as semantics and syntax while also capturing how such uses vary across different linguistic contexts. Zettlemoyer subsequently did some fine-tuning of his own by contributing to new and improved pretrained models such as the popular RoBERTa — with more than 6,500 citations and counting — and BART. In addition to earning a Best Paper Award at the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), the paper describing ELMo has been cited more than 9,200 times.

Zettlemoyer pioneered another exciting research trend when he began connecting the language and vision aspects of AI. For example, he worked with Yatskar and Allen School colleague Ali Farhadi to introduce situation recognition, which applies a linguistic framework to a classic problem in computer vision — namely, how to concisely and holistically describe the situation an image depicts. Situation recognition represented a significant leap forward from independent object or activity recognition with its ability to summarize the main activity in a scene, the actors, objects and locations involved, and the relationship among all of these elements. Zettlemoyer also contributed to some of the first work on language grounding for robotic agents, which built in part on his original contributions to semantic parsing from his graduate student days. He and a team that included Allen School professor Dieter Fox, students Cynthia Matuszek (Ph.D., ‘14) and Nicholas FitzGerald (Ph.D., ‘18), and postdoc Liefeng Bo developed an approach for joint learning of perception and language that endows robots with the ability to recognize previously unknown objects based on natural language descriptions of their physical attributes. 

“It is an unexpected but much appreciated honor to be named an ACL Fellow. I am really grateful to and want to highlight all the folks whose research is being recognized, including especially all the students and research collaborators I have been fortunate enough to work with,” Zettlemoyer said. “The Allen School has been an amazing place to work for the last 10+ years. I really couldn’t imagine a better place to launch my research career, and can’t wait to see what the next 10 years — and beyond — will bring!”

Zettlemoyer previously earned a Presidential Early Career Award for Scientists and Engineers (PECASE) and was named an Allen Distinguished Investigator in addition to amassing multiple Best Paper Awards from the preeminent research conferences in NLP and adjacent fields. In addition to his faculty role at the Allen School, he joined Facebook AI Research in 2018 after spending a year as a senior research manager at the Allen Institute for AI. He is one of eight researchers named among the ACL’s 2021 class of Fellows and the third UW faculty member to have attained the honor, following the election of Smith in 2020 and Allen School adjunct faculty member Mari Ostendorf, a professor in the Department of Electrical & Computer Engineering, in 2018.

The designation of Fellow is reserved for ACL members who have made extraordinary contributions to the field through their scientific and technical excellence, service and educational and/or outreach activities with broad impact. Learn more about the ACL Fellows program here.

Congratulations, Luke!

Allen School undergraduates recognized by the Computing Research Association for advancing health sensing, programming languages and systems research

Computing Research Association logo

The Allen School has a proud tradition of nurturing undergraduate student researchers whose work has the potential for real-world impact. This year, three of those students — Jerry Cao, Mike He and Yu Xin — earned honorable mentions from the Computing Research Association (CRA) as part of its 2022 Outstanding Undergraduate Researcher Awards competition for their contributions in health sensing and fabrication, programming languages and machine learning, and building robust computer systems.

Jerry Cao


The CRA recognized senior Jerry Cao, who is majoring in computer science and applied mathematics, for his research in health sensing and fabrication. Advised by professors Jennifer Mankoff and Shwetak Patel, his work aims to apply computing and fabrication to improve individuals’ quality of life. To reduce the burden of health monitoring and make it easier for users to prototype custom tools that fit their personalized needs, Cao is creating a wearable device in the form of a compression sleeve for the leg that records changes in the blood volume in the body’s superficial tissue. This can help predict the onset of adverse symptoms throughout the day for conditions such as Postural Orthostatic Tachycardia Syndrome (POTS), in which blood flow is improperly regulated throughout the body.

Cao is also working on a project to rapidly prototype physical objects. He aims to reduce the number of iterations — currently several are required to reach the final product — by reconfiguring a model to support real-time iteration. He is developing a pipeline that takes a parametric model and produces a reconfigurable prototype in which each parameter can be adjusted within a specified allowed range. Users can more easily change the size of the physical model this way and record all the measurements needed to fabricate a final version. For example, when building a cabinet, builders must ensure it fits in its designated space. The reconfigurable prototype limits the iterations and allows users to explore different configurations of the object, then create the final version using actual materials.
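The core idea of a parametric model whose dimensions can only be adjusted within allowed bounds can be sketched in a few lines. This is purely illustrative: the `Parameter` class, cabinet dimensions and limits below are hypothetical stand-ins, not Cao’s actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Parameter:
    name: str
    value: float
    lo: float   # minimum allowed value
    hi: float   # maximum allowed value

    def set(self, v: float) -> float:
        # Clamp adjustments to the allowed range, mirroring how a
        # reconfigurable prototype constrains each dimension.
        self.value = max(self.lo, min(self.hi, v))
        return self.value

# Hypothetical cabinet model: each dimension adjustable within limits.
cabinet = [Parameter("width_cm", 80, 60, 120),
           Parameter("height_cm", 180, 150, 210)]
cabinet[0].set(130)       # requested width exceeds the range, so it is clamped
print(cabinet[0].value)   # 120
```

Once a configuration fits its designated space, the recorded parameter values become the measurements for fabricating the final version.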

Mike He


Mike He, a senior studying computer science, was recognized for his work in programming languages, formal verification, compilers and machine learning systems. Advised by professor Zachary Tatlock, He worked with the Allen School’s PLSE group on Dynamic Tensor Rematerialization (DTR), an algorithm for training deep learning models under constrained memory budgets. Because deep learning models consume large amounts of GPU memory during training, DTR frees memory dynamically at runtime: when memory fills up, it evicts the tensor that is stalest and cheapest to recompute to make room for the next allocation. If the training loop later tries to access a previously evicted tensor, DTR recomputes it on demand by tracking operator dependencies. 
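The evict-and-recompute loop can be sketched in a few lines of Python. This is an illustrative simplification rather than the DTR implementation; the scoring heuristic, tensor bookkeeping and string "values" here are stand-ins for the real runtime.

```python
class Tensor:
    def __init__(self, name, size, compute_cost, parents):
        self.name, self.size = name, size
        self.compute_cost = compute_cost   # estimated time to recompute
        self.parents = parents             # tensors this one is computed from
        self.value = None                  # None means evicted / not computed
        self.last_use = 0.0

def score(t, now):
    # Hypothetical stand-in for DTR's heuristic: prefer evicting tensors
    # that are cheap to recompute, large, and stale (unused for a long time).
    return t.compute_cost / (t.size * (now - t.last_use + 1e-9))

def materialize(t, cache, budget, now=0.0):
    """Return t's value, recomputing evicted ancestors on demand."""
    if t.value is None:
        for p in t.parents:                  # recompute dependencies first
            materialize(p, cache, budget, now)
        t.value = f"recomputed({t.name})"    # stand-in for the real operator
    t.last_use = now
    # Evict the lowest-scoring resident tensors until we fit the budget.
    while sum(x.size for x in cache if x.value is not None) > budget:
        victim = min((x for x in cache if x.value is not None and x is not t),
                     key=lambda x: score(x, now))
        victim.value = None
    return t.value

a = Tensor("a", 2, 1.0, []); a.value = "input(a)"
b = Tensor("b", 2, 1.0, [a])
cache = [a, b]
materialize(b, cache, budget=2)   # computing b overflows the 2-unit budget...
print(a.value)                    # ...so a is evicted: prints None
```

Accessing `a` again would transparently recompute it from its (empty) parent set, which is the behavior that lets training proceed under a memory budget smaller than the full set of intermediate tensors.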

In addition to his contributions to DTR, He led an effort to develop flexible accelerator-matching compiler techniques for easily targeting new hardware accelerators from deep learning frameworks. The goal is to enable devices to be more easily incorporated into an existing DL framework and, in principle, to allow formal functional verification down to the hardware implementation. The project, 3LA, has a built-in pattern-matching algorithm that can find accelerator-supported workloads in deep learning models using equality saturation. The project addresses the mapping gap between deep learning models represented in high-level domain-specific languages and specialized accelerators, using instruction-level abstraction as the software-hardware interface.
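A toy version of the pattern-matching step conveys the flavor. The real 3LA flow matches patterns via equality saturation over program representations; this sketch only walks an expression tree and rewrites a hypothetical matmul-plus-add pattern into a single made-up accelerator instruction.

```python
def offload(expr):
    # Bottom-up rewrite: recurse into children first, then try to match the
    # pattern add(matmul(x, w), b) and replace it with one accelerator call.
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(offload(e) for e in expr)
    if expr[0] == "add" and isinstance(expr[1], tuple) and expr[1][0] == "matmul":
        _, (_, x, w), b = expr
        return ("accel.linear", x, w, b)   # accelerator-supported workload
    return expr

model = ("relu", ("add", ("matmul", "x", "W"), "b"))
print(offload(model))   # ('relu', ('accel.linear', 'x', 'W', 'b'))
```

Equality saturation generalizes this by exploring many equivalent rewritings at once rather than committing to a single greedy pass.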

Yu Xin 


Yu Xin, a senior studying computer science and applied and computational mathematical science, was honored by the CRA for his work with Allen School professor Arvind Krishnamurthy on building effective and robust computer systems. In particular, Xin worked to develop a scheduler for serving deep learning inference tasks. When deployed at large scale, applications backed by cloud-based deep learning models tend to flood data center GPU clusters, driving up response times and cost. To address this, Xin and his collaborators created Symphony, a centralized dispatcher that satisfies requests within a latency bound, ensuring load balance across GPUs and maximizing their efficiency by forming appropriately sized, dynamic batches of inference requests. By loading dozens of deep learning models on each GPU, Symphony enables burst amortization across models and has the potential to eliminate the need for overprovisioning. To enable multiple dispatchers for better scalability, Xin designed an algorithm that partitions the model space into many disjoint subsets, with each dispatcher handling one subset. The algorithm finds the partitioning scheme that minimizes the deviation between partitions in terms of total request rates and model sizes by generating and solving a Mixed Integer Linear Programming (MILP) problem.
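The partitioning objective can be illustrated with a simple greedy approximation: assign models to dispatchers so that total request rate and total model size stay balanced across partitions. Symphony solves this exactly as a MILP; the greedy pass and the model names, request rates and sizes below are hypothetical and purely illustrative.

```python
def partition(models, k):
    """models: list of (name, request_rate, size_mb); returns k disjoint subsets."""
    parts = [{"models": [], "rate": 0.0, "size": 0.0} for _ in range(k)]
    # Place the heaviest models first, each onto the currently lightest part.
    for name, rate, size in sorted(models, key=lambda m: -(m[1] + m[2])):
        target = min(parts, key=lambda p: p["rate"] + p["size"])
        target["models"].append(name)
        target["rate"] += rate
        target["size"] += size
    return parts

models = [("resnet", 90, 100), ("bert", 80, 400),
          ("gpt2", 60, 500), ("vit", 30, 300)]
for p in partition(models, 2):
    print(p["models"], p["rate"], p["size"])
```

Each model lands in exactly one subset, so each dispatcher owns a disjoint slice of the model space, which is the property that lets multiple dispatchers run independently.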

Xin’s previous work includes developing tools for analyzing images of proteins generated by a cryo-electron microscope. One such tool filters out high-frequency noise by generating an artificial template from a mathematical model, comparing it against every patch of the image to see if there is a match, and outputting all the matched results. This approach saves researchers time while increasing their effectiveness by directing their attention to the most relevant sites.
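That template-matching procedure can be sketched as a sliding-window comparison: slide the model-generated template over the image and report positions where the patch differs from the template by less than a threshold. The pixel values and threshold below are hypothetical stand-ins for real micrograph data.

```python
def match_template(image, template, threshold):
    h, w = len(template), len(template[0])
    hits = []
    for i in range(len(image) - h + 1):
        for j in range(len(image[0]) - w + 1):
            # Sum of squared differences between the patch and the template.
            ssd = sum((image[i + di][j + dj] - template[di][dj]) ** 2
                      for di in range(h) for dj in range(w))
            if ssd < threshold:
                hits.append((i, j))
    return hits

image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 8, 9, 0],
         [0, 0, 0, 0]]
template = [[9, 8],
            [8, 9]]
print(match_template(image, template, threshold=1))   # [(1, 1)]
```

Production tools typically use normalized cross-correlation in the frequency domain for speed, but the principle of scoring every patch against the template is the same.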

Congratulations to Jerry, Mike and Yu! 


Allen School alumni Maarten Sap and Ivan Evtimov earn dissertation awards for contributions to more socially aware and secure AI

Maarten Sap

During their time at the Allen School, recent alumni Maarten Sap (Ph.D., ‘21) and Ivan Evtimov (Ph.D., ‘21) tackled some of the thorniest issues raised by emerging natural language processing and machine learning technologies — from endowing NLP systems with social intelligence while combating inequity and bias, to addressing security vulnerabilities in the convolutional neural networks that fuel state-of-the-art computer vision systems. Recently, the faculty honored both for their contributions with the William Chan Memorial Dissertation Award, which was named in memory of the late graduate student William Chan to recognize dissertations of exceptional merit. Evtimov earned additional recognition for his work from the Western Association of Graduate Schools and ProQuest as the recipient of the WAGS/ProQuest Innovation in Technology Award, which recognizes distinguished scholarly achievement at the master’s or doctoral level.

Sap — who is currently a postdoctoral/young investigator at the Allen Institute for AI (AI2) — worked with Allen School professors Yejin Choi and Noah Smith. His dissertation, “Positive AI with Social Commonsense Models,” advanced new techniques for making NLP systems more human-centric, socially aware and equity-driven.

“Maarten’s dissertation presents groundbreaking work advancing social commonsense reasoning and computational models serving equity and inclusion. More specifically, his work presents technical and conceptual innovations that make deep learning methods significantly more equitable,” said Choi and Smith, both of whom are also senior research managers at AI2. “Maarten’s research steers the field of NLP and its products toward a better future.”

One example is ATOMIC, a large-scale social commonsense knowledge graph Sap and collaborators created to help machines comprehend day-to-day practical reasoning about events, causes and effects. To create equity-driven NLP systems, he also helped develop PowerTransformer, a controllable text rewriting model that helps authors mitigate biases in their writing, particularly biases related to how the public describes people of different genders. Sap also tackled the problem of detecting biases and toxicity in language by identifying issues with current hate speech detectors that lead to racial biases. His work introduced Social Bias Frames, a linguistic framework for explaining the biased or harmful implications in text. The supporting papers, “The Risk of Racial Bias in Hate Speech Detection” and “Social Bias Frames: Reasoning about Social and Power Implications of Language,” were nominated for a Best Short Paper Award by the Association for Computational Linguistics in 2019 and won the Best Paper Award at the West Coast NLP Summit in 2020, respectively. Sap was also a member of the team that won the first Amazon Alexa Prize for a conversational chatbot called Sounding Board that engages with users about current topics.

TechCrunch, Forbes, Fortune and Vox have all covered Sap’s research. After completing his postdoc with AI2’s MOSAIC team, he will join Carnegie Mellon University’s Language Technology Institute as a professor in the fall.

Evtimov’s dissertation, “Disrupting Machine Learning: Emerging Threats and Applications for Privacy and Dataset Ownership,” makes significant contributions to the field of adversarial machine learning and its security implications. His research as a member of the Allen School’s Security & Privacy Research Lab focused particularly on the vulnerabilities of convolutional neural networks (CNNs) that allow maliciously crafted inputs to affect both their inference and training. Evtimov said that understanding new technologies in terms of security and privacy is important in order to stay ahead of adversarial actors. 

“Ivan’s dissertation is highly innovative, and contributed significant results to the field of real-world attacks against computer vision algorithms. His work is of fundamental importance to the field,” Allen School professor and lab co-director Tadayoshi Kohno said. “Computer vision is everywhere — in autonomous cars, in computer authentication schemes, and more. Ivan’s dissertation helps the field develop secure computer vision systems and also provides foundations for helping users protect their privacy in the face of such systems.”

Evtimov’s work shows that the vulnerabilities of CNNs exhibit a duality when it comes to security and privacy. For example, he found that the algorithms autonomous cars use to read traffic signs could be tricked by an object as simple as a sticker on a stop sign, fooling the cameras into reading the stop sign as a speed limit sign. In such a safety-critical setting, it is essential to identify anything that could be exploited by malicious parties. Machine learning, Evtimov found, can also be used in an unauthorized manner. Take, for example, a search engine for facial recognition. To protect privacy, Evtimov studied the conditions under which people could flood a database of photos, gathered from the public without permission, with decoys. He proposed FoggySight, a tool in which community users upload modified photos — for instance, labeling photos of Madonna as photos of Queen Elizabeth — to poison the facial search database and throw off its results. He also found ways to protect visual data released for human consumption from misuse through machine learning, including a protective mechanism that can be applied to the information contained in datasets before public release to prevent unauthorized parties from training their own models on the data. 

Evtimov’s research has been covered by Ars Technica, IEEE Spectrum and more. He previously won a Distinguished Paper Award at the Workshop on Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges for his work examining the vulnerability of combined image and text models to adversarial threats. After graduating from the Allen School, Evtimov joined Meta as a research scientist. 

Congratulations to Maarten and Ivan! 


Richard Ladner elected AAAS Fellow for his leadership in making computing education and careers accessible to people with disabilities

Portrait of Richard Ladner with books and framed photos behind him

Allen School professor emeritus Richard Ladner has been elected a Fellow of the American Association for the Advancement of Science (AAAS) for his “distinguished contributions to the inclusion of persons with disabilities in the computing fields.” One of 26 leading scientists in the organization’s Information, Computing & Communications section to attain the rank of Fellow this year, Ladner has devoted the past two decades to research and advocacy aimed at making computing education and careers more accessible while designing technologies that empower all users.

A mathematician by training, Ladner helped establish the University of Washington’s theoretical computer science group shortly after joining the faculty in the early 1970s. At the time, Ladner’s interest in disability issues was personal, having been raised by parents who were deaf. Later, after he completed an American Sign Language course at a local community college, Ladner began doing volunteer work with people who were deaf and blind as well as writing about accessibility issues. Having worked on several accessibility projects in the 1980s and 1990s, his first full-time foray into accessible technology development — an experience that would alter the course of his career in terms of both research and advocacy — would not come until 2002.

That year, Ladner met Sangyun Hahn, a graduate student who was blind. Hahn related to his new advisor his frustration at being unable to easily access certain content in his textbooks, such as mathematical formulas and diagrams. Their discussions led to the launch of the Tactile Graphics project to automate the conversion of textbook figures into an accessible format. A series of accessibility projects followed, including MobileASL, a collaboration between Ladner and UW Electrical & Computer Engineering professor Eve Riskin to enable people to communicate using American Sign Language via a mobile phone; WebAnywhere, a non-visual platform enabling people who are blind to navigate the web using any browser, on any device, with Jeffrey Bigham (Ph.D., ‘09); Perkinput, a Braille-based text entry system for touchscreen devices, with Shiri Azenkot (Ph.D., ‘14) and iSchool professor Jacob Wobbrock; and Blocks4All, an accessible, touchscreen-based blocks environment for children who are blind to learn programming, with Lauren Milne (Ph.D., ‘18). 

As a student with a disability, Hahn was a relative rarity in computer science graduate programs at the time he and Ladner met. When the latter turned his attention full-time from exploring the theoretical underpinnings of computing to making computing more accessible to all users, he recognized that one of the obstacles was the lack of pathways for more people with disabilities to pursue computer science and bring their perspectives to the development of new technologies. This led him to partner with Sheryl Burgstahler, director of UW’s DO-IT Center, to establish the Alliance for Access to Computing Careers, or AccessComputing, in 2006 with support from the National Science Foundation’s Broadening Participation in Computing program. AccessComputing helps high school, undergraduate and graduate students to build skills and connections with mentors and professional opportunities in the computing fields. So far, the program has directly served more than 2,400 students with disabilities through a range of activities, from academies and workshops to research and work-based internships.

Ladner subsequently teamed up with Andreas Stefik at the University of Nevada, Las Vegas to launch AccessCSforAll, an initiative aimed at providing accessible curriculum and resources to engage students with disabilities in K-12 computer science education. That work led Code.org and the Computer Science Teachers Association to name Ladner, Stefik and the Quorum programming team 2018 Computer Science Champions. A year later, Ladner and Stefik were again recognized — this time alongside collaborators William Allee and Sean Mealin — with a Best Paper Award from the Association for Computing Machinery’s Special Interest Group in Computer Science Education at the SIGCSE 2019 conference. In the winning paper, “Computer Science Principles for Teachers of Blind and Visually Impaired Students,” the team presented the results of its partnership with Code.org to review and revamp the Advanced Placement CSP curriculum and tools for accessibility.

Richard Ladner seated across from group of three students conversing in sign language in between long rectangular tables with other students working on computers in the background
Ladner (left) converses with students in AccessComputing’s Summer Academy for Advancing Deaf and Hard of Hearing in Computing (Mary Levin)

In 2020, the National Science Board recognized Ladner with its Public Service Award for his exemplary science communication and diversity advocacy — the latest in a long line of previous accolades for his leadership on accessible technology and education that includes the Strache Leadership Award from the Center on Disabilities at California State University, Northridge, the Award for Outstanding Contributions to Computing and Accessibility from the ACM Special Interest Group on Accessible Computing (SIGACCESS), the Richard A. Tapia Achievement Award for Scientific Scholarship, Civic Science and Diversifying Computing from the Center for Minorities and People with Disabilities in Information Technology (CMD-IT), and more. Along the way, Ladner also earned the 2019 Harrold and Notkin Research and Graduate Mentoring Award — named in part for the late David Notkin, former chair of what was then known as the UW Department of Computer Science & Engineering — from the National Center for Women & Information Technology (NCWIT) for his long-standing efforts to advance gender diversity in computing.

Even after he officially attained emeritus status at the Allen School in 2017, Ladner remained active in research and mentoring students in addition to advocacy and program leadership. Over the course of his career, he has supervised or co-supervised 30 Ph.D. students and more than 100 undergraduate researchers — many of whom sought Ladner out for his focus on accessibility before that line of research entered the mainstream. Some of those same students later established the Richard E. Ladner Endowed Professorship, currently held by his faculty colleague Jennifer Mankoff, and the Richard Ladner Endowed Fund for Graduate Student Support in his honor.

Ladner also continues to build on his legacy of advocacy for engaging people with disabilities in technology development. The same year he was recognized for making the K-12 computer science curriculum more accessible, he helped establish the UW’s Center for Research and Education on Accessible Technology and Experiences (CREATE) alongside eight colleagues from multiple UW departments with an inaugural investment from Microsoft. Under the slogan “making technology accessible and making the world accessible through technology,” CREATE supports transformational, multidisciplinary research that will translate into real-world impact while building expertise in accessible technologies and increasing representation in the field for people with disabilities.

“Richard is truly a pioneer in the field of accessible computing,” said professor Magdalena Balazinska, director of the Allen School. “He understood the importance of fully including people with disabilities long before the rest of the field recognized this challenge and he continues to innovate today. He’s an inspiration to all of us.”

Ladner was previously elected a Fellow of the ACM and of the IEEE. He is one of four UW faculty members recognized in the 2021 class of AAAS Fellows, including Emily Carrington, who was honored in the Biological Sciences, and Julia A. Kovacs and Stefan Stoll, who were both honored in Chemistry. Founded in 1848, AAAS is the world’s largest general scientific society.

Learn more about the newly elected AAAS Fellows here.

Congratulations, Richard!

Allen School student Mohit Shridhar earns NVIDIA Fellowship for his work in grounding language for vision-based robots

Mohit Shridhar in front of a mountain

Mohit Shridhar, a Ph.D. student working with Allen School professor Dieter Fox, has been named a 2022-2023 NVIDIA Graduate Fellow for his research in building generalizable systems for human-robot collaboration. Shridhar’s work is focused on connecting language to perception and action for vision-based robotics.

Shridhar aims to use deep learning to connect abstract concepts to concrete physical actions, with long-term reasoning, in order to develop robot butlers. The Fellowship will help him continue his work in building robots that learn through embodied interactions rather than from static datasets. His own creation, CLIPort, a language-conditioned imitation-learning agent, advances precise spatial reasoning while learning generalizable semantic representations for vision and language. Shridhar’s framework combines two streams, a semantic pathway and a spatial pathway, where the semantic stream uses an internet-pretrained vision-language model to bootstrap learning. This end-to-end framework can solve a variety of language-specified tabletop tasks, from packing unseen objects to folding clothes with centimeter-level precision.

“Mohit’s CLIPort work is the first to show the power of combining general language and image understanding models with fine-grained robot manipulation capabilities,” said Fox, who leads the Allen School’s Robotics & State Estimation Lab and is senior director of robotics research at NVIDIA.

To communicate with the butlers, Shridhar developed the Action Learning From Realistic Environments and Directives (ALFRED) dataset, which teaches agents to map natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED consists of 25,000 natural language directives, including high-level instructions like “rinse off a mug and place it in the coffee maker” and lower-level language directions like “walk to the coffee maker on the right.” Tasks given to ALFRED are more complex in terms of sequence length, action space and language than those in previous vision-and-language task datasets.

Taking the next step beyond communicating tasks to the robots, Shridhar wants the robots to reason about long-term actions without directly dealing with the complexities of the physical world. An example he gives is telling an agent to make an appetizer with sliced apples. ALFWorld, a simulator that enables agents to learn abstract, “textual” policies in an interactive TextWorld, trains the agent, without any physical interaction, to check the fruit bowl for apples and look in the drawers for a knife to make the appetizer. Before ALFWorld, agents lacked the infrastructure necessary for both reasoning abstractly and executing concretely. 

Shridhar intends to deploy ALFRED-trained models in household environments where a mobile manipulator can be commanded to perform tasks such as putting two plates on the dining table.

“I hope to build collaborative butler robots that aid and better human living,” Shridhar said.

Before coming to the Allen School, Shridhar received his Bachelor’s in Engineering from the National University of Singapore. He has interned at Microsoft Research, NVIDIA and an augmented reality startup. 

Shridhar is one of only 10 students recognized by the Graduate Fellowship Program based on their innovative research in Graphics Processing Unit (GPU) computing. Previous Allen School recipients of the NVIDIA Fellowship include Anqi Li (2020) and Daniel Gordon (2019).

Read more about the 2022-2023 NVIDIA Graduate Fellowship awards here.

Congratulations, Mohit!

Deserts, demographics and diet: UW and Stanford researchers reveal findings of nationwide study of the relationship between food environment and healthy eating

Grocery store produce shelves filled with different varieties of fruit, including apples, oranges, lemons and pears.
Credit: gemma on Unsplash

“You are what you eat,” as the saying goes. But not everyone has the same degree of choice in the matter. An estimated 19 million people in the United States live in so-called food deserts, where they have lower access to healthy and nutritious food. More than 32 million people live below the poverty line — limiting their options to the cheapest food regardless of proximity to potentially healthier options. Meanwhile, numerous studies have pointed to the role of diet in early mortality and the development of chronic diseases such as heart disease, type 2 diabetes and cancer.

Researchers are just beginning to understand how the complex interplay of individual and community characteristics influence diet and health. An interdisciplinary team of researchers from the University of Washington and Stanford University recently completed the largest nationwide study to date conducted in the U.S. on the relationship between food environment, demographics, and dietary health with the help of a popular smartphone-based food journaling app. The results of that five-year effort, published today in the journal Nature Communications, should give scientists, health care practitioners and policymakers plenty of food for thought. 

“Our findings indicate that higher access to grocery stores, lower access to fast food, higher income and college education are independently associated with higher consumption of fresh fruits and vegetables, lower consumption of fast food and soda, and less likelihood of being classified as overweight or obese,” explained lead author Tim Althoff, professor and director of the Behavioral Data Science Group at the Allen School. “While these results probably come as no surprise, until now our ability to gauge the relationship between environment, socioeconomic factors and diet has been challenged by small sample sizes, single locations, and non-uniform design across studies. Different from traditional epidemiological studies, our quasi-experimental methodology enabled us to explore the impact on a nationwide scale and identify which factors matter the most.”

Tim Althoff
Tim Althoff (Dennis Wise/University of Washington)

Althoff’s involvement in the study dates from when he was a Ph.D. student at Stanford working with professor and senior author Jure Leskovec and fellow student and co-author Hamed Nilforoshan. Together with co-author Dr. Jenna Hua, a former postdoctoral fellow at Stanford University School of Medicine and founder and CEO of Million Marker Wellness, Inc., the team analyzed data from more than 1.1 million users of the MyFitnessPal app — spanning roughly 2.3 billion food entries and encompassing more than 9,800 U.S. zip codes — to gain insights into how factors such as access to grocery stores and fast food, family income level, and educational attainment contribute to people’s food consumption and overall dietary health. 

The team measured the association of the aforementioned input variables with each of four dietary outcomes: fresh fruit and vegetable consumption, fast food consumption, soda consumption, and incidence of overweight or obese classified by body mass index (BMI). To understand how each variable corresponded positively or negatively with those outcomes, the researchers employed a matching-based approach wherein they divided the available zip codes into treatment and control groups, split along the median for each input. This enabled them to compare app user logs in zip codes that were statistically above the median — for example, those with more than 20.3% of the population living within half a mile of the nearest grocery store — with those below the median.
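The median-split comparison can be illustrated in a few lines of Python. The zip code figures below are invented for illustration only; the study itself drew on logs spanning more than 9,800 real zip codes.

```python
from statistics import mean, median

# Hypothetical data per zip code:
# (fraction of residents near a grocery store, avg fruit/veg entries per day)
zipcodes = [(0.31, 1.9), (0.12, 1.1), (0.45, 2.2),
            (0.18, 1.3), (0.27, 1.6), (0.09, 1.0)]

cut = median(z[0] for z in zipcodes)   # split the input variable at its median
treatment = [diet for access, diet in zipcodes if access > cut]   # above median
control = [diet for access, diet in zipcodes if access <= cut]    # at or below
print(round(mean(treatment) - mean(control), 3))   # 0.767
```

Repeating this comparison for each input variable, and within demographic subpopulations, is what surfaced the community-level differences described below.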

Among the four inputs the team examined, educational attainment above the median, defined as 29.8% or more of the population holding a college degree, was the greatest positive predictor of a healthier diet and BMI. All four inputs were found to contribute positively to dietary outcomes, with one exception: high family income, defined as income at or above $70,241, was associated with a marginally higher percentage of people with a BMI qualifying as overweight or obese. But upon further investigation, these results only scratched the surface of a complex issue that varies from community to community.

Three maps of the United States with counties color-coded to indicate percentile in three categories: average fresh fruits and vegetables entries logged per day, average fast food entries logged per day and fraction affected by overweight/obesity (BMI 25+)
The team analyzed data on food consumption logged by fitness app users across more than 9,800 U.S. zip codes along with the percentage of residents affected by overweight/obesity in those communities. They found significant variation in dietary health across zip codes.

“When we dug into the data further, we discovered that the population-level results masked significant differences in how the food environment and socioeconomic factors corresponded with dietary health across subpopulations,” noted Nilforoshan.

As an example, Nilforoshan pointed to the notably higher association between above-median grocery store access and increased fruit and vegetable consumption in zip codes with a majority of Black residents, at a 10.2% difference, and with a majority of Hispanic residents, at a 7.4% difference, compared to those with a majority of non-Hispanic white residents, where he and his colleagues found only a 1.7% difference. These and other findings indicate that factors such as proximity to grocery stores or higher income, on their own, are not sufficient for people to bypass the drive-thru or kick the (soda) can to the curb — and that future attempts to address dietary disparities need to take variations across zip codes into account.

Portraits of Hamed Nilforoshan, Jenna Hua and Jure Leskovec
Left to right: Hamed Nilforoshan, Jenna Hua and Jure Leskovec

“People assume that if we eliminate food deserts, that will automatically lead to healthier eating, and that a higher income and a higher degree lead to a higher quality diet. These assumptions are, indeed, borne out by the data at the whole population level,” explained Hua. “But if you segment the data out, you see the impacts can vary significantly depending on the community. Diet is a complex issue! While policies aimed at improving food access, economic opportunity and education can and do support healthy eating, our findings strongly suggest that we need to tailor interventions to communities rather than pursuing a one-size-fits-all approach.”

Althoff believes that both the team’s approach and its findings can guide future research on this complex topic that has implications for both individuals and entire communities.

“We hope that this study will impact public health and epidemiological research methods as well as policy research,” said Althoff. “Regarding the former, we demonstrated that the increasing volume and variety of consumer-reported health data being made available due to mobile devices and applications can be leveraged for public health research at unprecedented scale and granularity. For the latter, we see many opportunities for future research to investigate the mechanisms driving the disparate diet relationships across subpopulations in the U.S.”

Read the paper in Nature Communications. The data and code associated with the study are publicly available.
