
First-gen Allen School students and faculty reflect on finding their footing and what it means to be first

At the Allen School, 22% of Fall 2025 undergraduate students are first-generation, or among the first in their family to pursue a bachelor’s degree. For first-gen students, navigating the twists and turns of higher education without a roadmap can be a daunting challenge — what are office hours? How do I choose the right courses for me? What do I do after college? Still, these students have persevered, figuring it out on their own or leveraging resources such as the Allen School’s first-gen student group, GEN1, to help them find their way.

In honor of the National First-Generation College Celebration on November 8, we asked students and faculty to share their experience being first and what advice they have for others still embarking on their first-gen journey. Here are some of their stories.

‘You made it here for a reason’: Kurtis Heimerl, professor

What does it mean to you and your family to be among the first to pursue a bachelor’s degree?

Allen School professor Kurtis Heimerl holds a glowing lantern above his head, illuminating his face.
Kurtis Heimerl

Kurtis Heimerl: Not much! My parents didn’t really (and still don’t really) understand higher education and the value of it. There was an expectation of increased wages (which was true!) but not really a push to attend. 

How has being first-gen influenced your studies or career path?

KH: Dramatically, I think. I worked in industry at points but I think the lack of college being a “normal” path meant that I didn’t leave when it was economically sensible to do so; instead, I stayed in and got more degrees (as those credentials are things that are hard to lose once you have them). I’m not sure I’d advise my children to go all the way to a Ph.D. but the journey obviously worked for me.

What are some of the most challenging or most rewarding parts about being first-gen?

KH: There’s a lot of imposter syndrome. The reality is that you made it here for a reason. It has been rewarding being able to give back to the community that supported you; I do a lot of work for UW because it was instrumental in my own success and it’s great to see it continue doing that for new students as well.

What advice would you give to first-gen students?

KH: That the doors are open. I personally have struggled with feeling like opportunities (research, internships, etc.) weren’t for me; it seemed like others slid in so naturally to these paths, but I felt like I didn’t understand a lot of the details or nuances to succeed. The reality is that no one really understood and everyone was figuring it out.

‘Your success multiplies far beyond yourself’: Sanket Gautam, master’s student, Professional Master’s Program 

What does it mean to you and your family to be among the first to pursue a bachelor’s degree?

Sanket Gautam poses in front of the Golden Gate Bridge in San Francisco.
Sanket Gautam

Sanket Gautam: It means breaking generational barriers and transforming not just my future, but my entire family’s trajectory. Coming from an underrepresented community in India, my degree created financial sustainability that enabled me to support others in my family to explore opportunities they never imagined possible. This achievement represents the fulfillment of countless sacrifices and dreams from those who came before me. It carries both deep pride and the profound responsibility of being a catalyst for change in my community.

How has being first-gen influenced your studies or career path?

SG: Being first-gen taught me that creating opportunity means lifting others as I climb. Without a family roadmap, I learned to seek mentors, build networks, and advocate for myself in unfamiliar environments — experiences that shaped my commitment to serving as a GEN1 mentor and participating in GEN1 coffee chats. These skills developed through navigating my own journey now drive how I approach problem-solving and collaboration in my studies and career. My pursuit of AI specialization reflects a desire to build systems that expand access and empower underrepresented communities to thrive.

What are some of the most challenging or most rewarding parts about being first-gen?

SG: The challenge is navigating uncharted territory — creating your own path when there’s no family blueprint for financial aid, career planning, or institutional processes. You become your own guide, figuring out everything from networking strategies to academic decisions through trial, error, and determination. Yet this challenge builds incredible resilience and problem-solving skills that become your greatest assets. The reward is knowing your success multiplies far beyond yourself, opening doors and inspiring others in your family and community to pursue their own dreams.

What advice would you give to first-gen students?

SG: Your unique perspective is your strength — embrace it with confidence and pride. Seek out mentors, build community with fellow first-gen students, and never hesitate to ask questions, even when it feels intimidating. The path may feel uncertain at times, but every challenge you overcome builds the resilience that will carry you through future obstacles. Remember that by forging your own path, you’re not just succeeding for yourself — you’re creating a roadmap for everyone who comes after you.

‘You get to define your own path’: Czarin Dela Cruz, undergraduate student 

What does it mean to you and your family to be among the first to pursue a bachelor’s degree?

Allen School undergraduate student Czarin Dela Cruz poses among the cherry blossoms in the University of Washington's Quad.
Czarin Dela Cruz

Czarin Dela Cruz: Being among the first in my family to pursue a bachelor’s degree means making the most of the opportunities my family and I worked hard for after moving from the Philippines. Having access to more resources here motivates me to push forward and make my parents proud. It’s a way to honor their sacrifices and show that their efforts were worth it.

How has being first-gen influenced your studies or career path?

CD: Being first-gen has encouraged me to stand up for myself and always look toward the future. It’s motivated me to work hard in my studies and constantly find ways to improve. More importantly, it’s inspired me to pursue my passions not just for myself, but to honor the people who supported me and made it possible for me to be here.

What are some of the most challenging or most rewarding parts about being first-gen?

CD: One of the most challenging parts of being first-gen is navigating everything on your own — from understanding college systems to balancing academics and family expectations. But it’s also incredibly rewarding to see that your hard work pays off. Being first-gen means you get to define your own path, make your own choices, and work toward your goals for a brighter future.

What advice would you give to first-gen students?

CD: My advice to first-gen students is to never be afraid to ask for help. Communicate with those around you and reach out when you need support. There are so many people and resources ready to help you succeed. Being proactive opens doors to new opportunities, and you never know what meaningful connections or experiences you might discover.

Learn more about the University of Washington’s National First-Generation College Celebration and the resources available through GEN1.

Allen School’s 2025 Research Showcase highlights faculty and student innovation across AI, robotics, neuroscience and more

Madrona Prize winners and runner-ups pose in a line on stage after accepting their awards.
Award winners and presenters, left to right: professor Shwetak Patel; Mark Nelson and Mike Fridgen, venture partners at Madrona Venture Group; Sabrina Albert, partner at Madrona Venture Group; Madrona Prize winner Yanming Wan; Madrona Prize runner-up Baback Elmieh; Madrona Prize runner-up Mateo Guaman Castro; Rasik Parikh and Rolanda Fu, investors at Madrona Venture Group.

From a robotic arm that learns how to pick up new objects in real time, to a model that converts 2D videos into 3D virtual reality, to a curious chatbot that adapts to its users, to machine learning techniques for decoding the brain (and so much more), the 2025 edition of the annual Research Showcase and Open House highlighted the breadth and depth of computing innovation coming out of the Allen School. 

Last week, roughly 400 Industry Affiliate partners, alumni and friends gathered on the University of Washington campus to learn how the school is working toward solutions to a set of societal “grand challenges” that the Allen School is uniquely positioned to lead; participate in technical sessions covering a range of topics, from artificial intelligence to sustainable technologies; explore, during a luncheon keynote, how researchers are leveraging machine learning to decode causality, learning and communication in the brain; and learn about more than 80 research projects directly from the student researchers themselves at the evening poster session and open house — including those who walked away with the coveted Madrona Prize or People’s Choice Award.

Unlocking the mysteries of brain learning with the help of machine learning

Professor Matt Golub presenting his luncheon keynote about neuroscience and machine learning.
Allen School professor Matt Golub presenting his luncheon keynote on causality, learning and communication in the brain.

In his luncheon keynote, Allen School professor Matt Golub shared recent research from the Systems Neuroscience and AI Lab (SNAIL) that uses machine learning techniques to better understand how the brain performs computations such as decision-making and learning — insights that could, in turn, inform future developments in AI. Previous work has focused on reconstructing neuron connections ex vivo, or outside of the body, by examining thin sections of the brain under a microscope. Golub and his team instead analyze neural activity as it unfolds in the living brain.

“If we can crack these problems, that will set us up in future work to understand how different brain networks perform distributed computation through their connections, through the dynamics of the activity that flows through those connections,” Golub said. “Then, we can ask how the connections change as the brain learns. We might be able to discover the learning rules that govern synaptic plasticity.”

Golub and his collaborators introduced a novel approach for analyzing neural population dynamics with optogenetics and computational modeling. Using lasers, the researchers targeted neurons to stimulate their activity; if they hit one neuron with the laser and another one fired reliably, they inferred a connection between the two. Because these experiments are resource intensive, the researchers designed an active learning algorithm that selected these stimulation patterns, or chose specific neurons to target with the laser, so that they could learn an accurate model of the neural network with as little data as possible.
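The selection loop described above can be illustrated with a toy uncertainty-driven criterion: stimulate the neuron whose outgoing connections the model is least sure about, observe which neurons fire, and update. Everything in this sketch (the Bernoulli firing model, the variance-based score, the 0.9/0.05 response rates) is an illustrative stand-in, not the lab’s actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # neurons

# Hypothetical ground-truth connectivity (unknown to the algorithm).
true_w = (rng.random((n, n)) < 0.2).astype(float)
np.fill_diagonal(true_w, 0.0)

# For each (source, target) pair, track how often the target fired
# when the source was stimulated (with a weak uninformative prior).
fired = np.ones((n, n))
trials = np.ones((n, n)) * 2

def uncertainty(f, t):
    p = f / t
    return p * (1 - p)  # Bernoulli variance as an uncertainty proxy

for _ in range(200):
    # Active selection: stimulate the neuron whose outgoing
    # connections are currently most uncertain.
    src = int(np.argmax(uncertainty(fired, trials).sum(axis=1)))
    # Simulated experiment: connected targets fire reliably, others rarely.
    responses = rng.random(n) < np.where(true_w[src] > 0, 0.9, 0.05)
    fired[src] += responses
    trials[src] += 1

# Inferred connections: pairs whose estimated firing rate exceeds chance.
estimate = (fired / trials) > 0.5
```

Because the score concentrates trials on still-ambiguous pairs, the network estimate converges with fewer stimulations than uniform sampling would need, which is the point of the active learning design.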

Zooming out for a higher-level view, Golub and his team also developed a new machine learning method that can identify communication between entire regions of the brain. The technique, called Multi-Region Latent Factor Analysis via Dynamical Systems (MR-LFADS), uses multi-region neural activity data to untangle how different parts of the brain talk to each other. Unlike existing approaches, MR-LFADS also infers what unobserved brain regions are likely saying — the team’s custom deep learning architecture can detect when a recorded region reflects an unobserved influence. When applied to real neural recordings, the researchers found that MR-LFADS could predict how disrupting one brain region would affect another — effects the system had never seen before.

Moving forward, Golub said that he is thinking about “the nexus of these two fronts.” In particular, he is interested in advancing active learning in nonlinear models and other deep learning models that can help generate AI-guided causal manipulations — experiments where the model can tell researchers what data it needs to improve.

“All of this is aimed at trying to understand how our brains compute and learn to drive our flexible behavior, and advances there can and have inspired innovation in AI systems and for new approaches to improving the health and function of our brains,” Golub said.

‘Quite a lot of amazing research’

The event culminated with the evening poster session and announcement of the Madrona Prize, an annual tradition in which Madrona Venture Group recognizes innovative student research with commercial potential. Past award winners have gone on to raise hundreds of millions of dollars in venture capital and build companies that have been acquired by tech giants such as Google and NVIDIA. 

Sabrina Albert, a partner in the firm, presented the 2025 prize to the winner and two runners-up. 

“There is quite a lot of amazing research that you guys are working on, so it was quite a challenge to pick which ones were most exciting,” Albert said.

Madrona Prize Winner / Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward

Yanming Wan, Madrona Prize winner, presenting his poster to attendees at the 2025 Research Day.
Yanming Wan presents his poster, which received the Madrona Prize, to attendees.

Allen School Ph.D. student Yanming Wan accepted the grand prize for CURIO, or Curiosity-driven User-modeling Reward as Intrinsic Objective, a new framework that enhances personalization in large language models (LLMs) in multi-turn conversations.

Conversational agents need to be able to adapt to varying user preferences and personalities as well as diverse domains such as education or health care. However, conventional methods for training LLMs often struggle with personalization because they require pre-collected user data, making them less effective for new or content-limited users.

To solve this problem, Wan and the team introduced an intrinsic motivation reward model that enables LLMs to actively learn about the user out of curiosity and then adapt to their individual preferences during the conversation. 

“We propose leveraging a user model to incorporate a curiosity-based intrinsic reward into multi-turn Reinforcement Learning from Human Feedback (RLHF),” said Wan. “This novel reward mechanism encourages the LLM agent to actively infer user traits by optimizing conversations to improve its user model’s belief. Consequently, the agent delivers more personalized interactions by learning more about the user across turns.”
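The information-gain idea behind a curiosity reward like the one Wan describes can be shown in a minimal numeric sketch: the agent’s user model holds a belief over user traits, a user response triggers a Bayesian update, and the intrinsic reward is the resulting drop in uncertainty. The three-trait belief and the likelihood numbers below are hypothetical, and this is only the reward computation, not the full RLHF training loop:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

# Hypothetical belief over three user-trait hypotheses
# (e.g. novice / intermediate / expert).
belief = np.array([1 / 3, 1 / 3, 1 / 3])

# Likelihood of the observed user response under each hypothesis
# (illustrative numbers only).
likelihood = np.array([0.7, 0.2, 0.1])

# Bayesian update of the user model's belief.
posterior = belief * likelihood
posterior /= posterior.sum()

# Curiosity reward: information gained about the user this turn.
# In training, this would be added to the extrinsic RLHF reward.
intrinsic_reward = entropy(belief) - entropy(posterior)
```

A turn that sharpens the user model’s belief earns a positive reward, which is what pushes the agent to ask questions that reveal user traits rather than generic ones.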

Additional members of the research team include Allen School professor Natasha Jaques, Jiaxing Wu of Google DeepMind, Lior Shani of Google Research and Marwa Abdulhai, a Ph.D. student at the University of California, Berkeley.

Madrona Prize Runner-up / VAMOS: A Hierarchical Vision-Language-Action Model for Capability-Modulated and Steerable Navigation

Mateo Guaman Castro presenting his poster at the 2025 Research Day.
Mateo Guaman Castro showcases his poster.

Another research team received accolades for VAMOS, a hierarchical vision-language-action (VLA) model that helps robots navigate across diverse environments. In both real-world indoor and outdoor navigation courses, VAMOS achieved a three times higher success rate compared to other state-of-the-art models.

“Contrary to the belief that simply scaling up data improves generalization, we found that mixing data from many robots and environments can actually hurt performance — since not all robots can perform the same actions or handle the same terrains,” said Allen School Ph.D. student Mateo Guaman Castro, who accepted the award on behalf of the team. “Our system, VAMOS, tackles this by decomposing navigation into a hierarchy: a high-level VLM planner proposes multiple possible paths, and a robot-specific ‘affordance modulator,’ trained safely in simulation, selects the path best suited to the robot’s physical abilities.”
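The hierarchy Guaman Castro describes can be caricatured in a few lines: a high-level planner proposes several candidate routes, and an embodiment-specific scorer picks the one the particular robot can actually execute. The path proposals, terrain classes and affordance scores below are all made-up stand-ins for the learned VLM planner and simulation-trained modulator:

```python
from dataclasses import dataclass

@dataclass
class Path:
    waypoints: list  # e.g. (x, y) tuples
    terrain: str     # dominant terrain class along the path

# Illustrative stand-in for the high-level planner: it proposes
# several plausible routes to the same goal.
def propose_paths():
    return [
        Path([(0, 0), (2, 1), (4, 2)], terrain="stairs"),
        Path([(0, 0), (1, 3), (4, 2)], terrain="grass"),
        Path([(0, 0), (3, 0), (4, 2)], terrain="gravel"),
    ]

# Illustrative robot-specific affordance scores: how well each
# embodiment handles each terrain (a wheeled robot cannot climb
# stairs; a legged robot can).
AFFORDANCE = {
    "wheeled": {"stairs": 0.0, "grass": 0.6, "gravel": 0.9},
    "legged": {"stairs": 0.8, "grass": 0.9, "gravel": 0.7},
}

def select_path(robot: str) -> Path:
    return max(propose_paths(), key=lambda p: AFFORDANCE[robot][p.terrain])

# The same planner output yields different choices per embodiment.
print(select_path("wheeled").terrain)  # gravel
print(select_path("legged").terrain)   # grass
```

The design point this illustrates is why mixing all robots’ data into one flat policy can hurt: the plan that is best for one body is infeasible for another, so feasibility is factored out into a per-robot module.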

The VAMOS research team also includes Allen School professors Byron Boots and Abhishek Gupta; Ph.D. students Rohan Baijal, Rosario Scalise and Sidharth Talia; undergraduate students Sidharth Rajagopal and Daniel Gorbatov; alumni Matt Schmittle (Ph.D., ‘25), now at OverlandAI, and Octi Zhang (B.S., ‘25), now at NVIDIA; Celso de Melo of the U.S. Army DEVCOM Research Laboratory; and Emma Romig, Robotics Lab director at the Allen School.

Madrona Prize Runner-up / Dynamic 6DOF VR Reconstruction from Monocular Videos

Baback Elmieh showcasing his project at the 2025 Research Day.
Baback Elmieh with his research poster.

Allen School researchers were also recognized for their research that transforms two-dimensional videos into a dynamic three-dimensional scene that users can experience in immersive virtual reality (VR) — revealing additional depth, motion and perspective.

“A key challenge is figuring out the 3D world from a single, flat video,” said Ph.D. student Baback Elmieh, who accepted the prize. “To address this, we first use an AI model to ‘imagine’ what the scene looks like from many different angles. We then use a new technique that models the scene’s overall motion while simultaneously fine-tuning the details in each frame. This combined approach allows us to handle fast-moving action, making progress towards reliving 2D videos or even historic moments as dynamic 3D experiences.”

Elmieh developed the project working with Allen School professors Steve Seitz, Ira Kemelmacher-Shlizerman and Brian Curless.


People’s Choice Award / MolmoAct

Rounding out the evening was the announcement of the People’s Choice Award, where attendees voted for their favorite poster or demo.

Winners of the People's Choice Award at the 2025 Research Showcase pose onstage after receiving their award.
People’s Choice Award presenter and winners, left to right: professor Shwetak Patel, Shirui Chen and Jiafei Duan.

Shwetak Patel, the Allen School’s director for development and entrepreneurship and the Washington Research Foundation Entrepreneurship Endowed Professor in Computer Science & Engineering and Electrical & Computer Engineering (ECE), presented the 2025 award to Allen School Ph.D. student Jiafei Duan and UW Department of Applied Mathematics Ph.D. student Shirui Chen for MolmoAct, an open-source action reasoning model developed by a team of UW and Allen Institute for AI (Ai2) researchers that enables robots to interpret and understand instructions, sense their environment, generate spatial plans and then execute them as goal-directed trajectories. 

“MolmoAct is the first fully open action reasoning model for robotics. Our goal is to build generalist robotic agents capable of reasoning before they act — a paradigm that has already inspired a wave of subsequent research,” said Duan, who worked on the project as a graduate student researcher at Ai2.

The researchers invited attendees to test MolmoAct’s reasoning capabilities by putting an object in front of a robotic arm — such as a pen or a tube of lip gloss — and watching the robot learn to pick it up in real time.

The team also includes, on the UW side, Allen School professors Ali Farhadi, Dieter Fox and Ranjay Krishna; Ph.D. students Jieyu Zhang and Yi Ru Wang; master’s student Jason Lee; undergraduate students Haoquan Fang, Boyang Li and Bohan Fang; ECE master’s student Shuo Liu; and recent Industrial & Systems Engineering undergraduate Angelica Wu. Farhadi is CEO of Ai2, where Fox and Krishna also spend a portion of their time leading research initiatives. On the Ai2 side, the team includes Yuquan Deng (B.S., ‘24), Sangho Lee, Winson Han, Wilbert Pumacay, Rose Hendrix, Karen Farley and Eli VanderBilt. 

For more about the Allen School’s 2025 Research Showcase and Open House, read GeekWire’s coverage and the Madrona Prize announcement.

Allen School faculty and alum recognized in MIT Technology Review’s Innovators Under 35 Asia Pacific

Each year, the MIT Technology Review recognizes science and technology trailblazers whose work is helping solve global problems. Three members of the Allen School community were named to this year’s MIT Technology Review Innovators Under 35 Asia Pacific list — professors Simon Shaolei Du and Ranjay Krishna, along with alum Sewon Min (Ph.D., ‘24), now a faculty member at the University of California, Berkeley and a research scientist at the Allen Institute for AI (Ai2). 

The three were recognized as pioneers for their innovative research that is advancing our understanding of artificial intelligence, large language models, computer vision and more.

Simon Shaolei Du: Tackling core AI challenges from a theoretical perspective

Portrait of Simon Shaolei Du
Simon Shaolei Du

Recent innovations in large-scale machine learning models have transformed data-driven decision making, from the rise of self-driving cars to ChatGPT. However, we still don’t understand exactly how these powerful models work, and Du is interested in unraveling some of the mysteries behind machine learning. 

“My work focuses on building the theoretical foundations of modern artificial intelligence, establishing a systematic research path around deep learning’s trainability and generalization, as well as sample complexity in reinforcement learning and representation learning,” Du said.

Du’s research has already provided theoretical explanations for some of deep learning’s black boxes. He and his collaborators provided one of the first proofs for how over-parameterized machine learning models and neural networks can be optimized using a simple algorithm such as gradient descent. The researchers found that, with enough over-parameterization, gradient descent could find the global minima, or the point where the model has zero error on the training data, even if the objective function is non-convex and non-smooth. He also established the connection between deep learning and kernel methods, explaining how these models are able to generalize so well despite their large size. Beyond demystifying how these models work, Du is helping make over-parameterized models more mainstream. In 2022, he received a National Science Foundation CAREER Award to support his research in designing a resource-efficient framework to help make modern machine learning technologies more accessible.
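The zero-training-error phenomenon described above can already be seen in a toy over-parameterized least-squares problem. Note this linear example is only an analogy for intuition — Du’s proofs concern non-convex neural network objectives, which are much harder:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 100  # far more parameters than data points
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.normal(size=n)

# Plain gradient descent on mean squared error from zero initialization.
w = np.zeros(d)
lr = 0.5
for _ in range(2000):
    grad = X.T @ (X @ w - y) / n
    w -= lr * grad

# With d >> n the model interpolates: training loss reaches numerical zero.
train_loss = np.mean((X @ w - y) ** 2)
```

In the over-parameterized regime the relevant Gram matrix is well-conditioned with high probability, so gradient descent converges to a global minimum of the training loss despite there being no unique solution — the linear analogue of the behavior Du’s work established for neural networks.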

He has also tackled core challenges in reinforcement learning, such as its high data requirements. Because agents learn through trial-and-error interactions with the environment, it can take many interactions, and data points, to build up a comprehensive understanding of the environment’s dynamics. To make the process more efficient, Du and his collaborators introduced a new algorithm to address an open problem on sample complexity that had remained unsolved for almost three decades. In their paper, the researchers prove that their algorithm can achieve optimal data efficiency and show that its complexity is not dependent on whether the planning horizon is long or short. Du also provided the first theoretical results showing that a good representation as well as data diversity are both necessary for effective pre-training. 

Prior to his TR35 recognition, Du was named a 2022 AI Researcher of the Year, a 2024 Sloan Research Fellow and a 2024 Schmidt Sciences AI2050 Early Career Fellow.

Ranjay Krishna: Developing ‘visual thinking’ for vision-language models

Portrait of Ranjay Krishna
Ranjay Krishna

Krishna’s research sits at the intersection of computer vision and human-computer interaction and integrates theories from cognitive science and social psychology. Through this multidisciplinary approach, his work enables machines to learn new knowledge and skills through social interactions with people and enhances models’ three-dimensional spatial perception capabilities.

“I design machines that understand the world — not just by recognizing pixels, but by reasoning about what they see, where they are and how to act to change the world around them,” said Krishna, who directs the UW RAIVN Lab and leads the PRIOR team at Ai2. 

Krishna addresses large vision-language models’ deficiencies in compositional reasoning, which stem from their inability to combine learned knowledge on the spot to handle new problems. After proving that simply scaling up the models is not an effective solution, he and his collaborators introduced an iterated learning algorithm inspired by cultural transmission theory in cognitive science. This method periodically resets and retrains the model, encouraging visual representations to evolve toward compositional structures. By combining this technique with the PixMo dataset, Krishna helped develop the Molmo series of models, Ai2’s family of open-source, state-of-the-art multimodal models that can both understand and generate content using text and images. 

He is also interested in tackling multimodal models’ challenges with spatial reasoning. While these models can often perform well in basic tasks like object detection, they struggle with tasks requiring a deeper understanding of geometric transformations and spatial context, such as sketching. Krishna and his collaborators developed Sketchpad, a framework that gives multimodal language models a visual sketchpad and tools to draw on it. Using this technique, a model can “think visually” like humans can by sketching with auxiliary lines and marking boxes, which helps it improve its reasoning accuracy and break down complex spatial and mathematical problems. He has taken this approach a step further with a training method that augments multimodal language models with perception tokens, improving their three-dimensional spatial perception. 

Krishna has received the 2025 Samsung START Faculty Award to study “Reasoning with Perception Tokens” as well as the 2024 Sony Faculty Innovation Award to research “Agile Machine Learning,” in addition to his TR35 Asia Pacific recognition. 

Sewon Min: Enabling models to find answers from the external world

Headshot of Sewon Min
Sewon Min

Min aims to build the next generation of AI systems that feature flexibility, superior performance and increased legal compliance. In her Allen School Ph.D. dissertation, she tackled fundamental challenges that current language models (LMs) face, such as factuality and privacy, by introducing nonparametric LMs. Her other work has only further driven the development of more open, controllable and trustworthy LMs.

“My research explores new ways of building models that use data creatively — for instance, by using retrieval (nonparametric LMs) and by designing modular LMs trained in a non-monolithic manner,” Min said.

Min has advanced retrieval-augmented and nonparametric language models on different fronts. She and her collaborators scaled a retrieval datastore, MassiveDS, to more than one trillion tokens — the largest and most diverse open-source datastore to date. The researchers also found that increasing the scale of data available at inference can improve model performance on various downstream tasks, providing a path for models to move from parametric memory toward data pluggability. At the systems interface level, Min helped develop REPLUG, a retrieval-augmented language modeling framework that augments black-box LMs with a tunable retriever. This simple design can be applied to existing LMs and helps improve the verifiability of text generation. 

She has also developed techniques to address reliability issues and legal risks that LMs run into. While the legality of training models on copyrighted or other high-risk data is under debate, Min and her collaborators found that model performance declines significantly when models are trained only on low-risk texts, due to their limited size and domain coverage. The researchers therefore introduced the SILO framework, which manages the risk-performance tradeoff by putting high-risk data in a replaceable datastore. To help quantify the trustworthiness of LM-generated text, Min developed FACTSCORE, a novel evaluation that breaks a generation down into atomic facts and then computes the percentage of these facts that are corroborated by a reliable knowledge source. Since its release, the tool has been widely adopted and used to evaluate and improve the reliability of long-form LM text generation. 
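The FACTSCORE computation reduces to a simple fraction once the decomposition and verification steps are done. In the sketch below, the example facts are invented, and exact string membership in a set stands in for the real pipeline, where an LM performs the decomposition and a retrieval-plus-judging step checks each fact against a source like Wikipedia:

```python
# Hypothetical atomic facts extracted from a model's generated paragraph
# (in FACTSCORE this decomposition is itself done by an LM).
atomic_facts = [
    "Ada Lovelace was born in 1815.",
    "Ada Lovelace wrote the first computer program.",
    "Ada Lovelace invented the telephone.",
]

# Toy stand-in for a knowledge source: statements we treat as verified.
knowledge_source = {
    "Ada Lovelace was born in 1815.",
    "Ada Lovelace wrote the first computer program.",
}

def is_supported(fact: str) -> bool:
    # The real system uses retrieval and an LM judge; set membership
    # is only an illustration of the supported/unsupported decision.
    return fact in knowledge_source

# Score = fraction of atomic facts corroborated by the source.
factscore = sum(is_supported(f) for f in atomic_facts) / len(atomic_facts)
print(round(factscore, 2))  # 0.67
```

Scoring at the level of atomic facts is what lets the metric give partial credit to a long generation that is mostly right instead of an all-or-nothing judgment.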

The TR35 recognition is only the latest in a series of recent accolades Min has received, after her Allen School Ph.D. dissertation earned the Association for Computing Machinery Doctoral Dissertation Award honorable mention and the inaugural Association for Computational Linguistics Dissertation Award.

Read more about this year’s TR35 Asia Pacific honorees.

Allen School professor Baris Kasikci earns Rising Star in Dependability Award for keeping software reliable and secure

Baris Kasikci

Every year, as the amount of data we create grows and software becomes increasingly complex, improving the efficiency of computer systems becomes more crucial. However, in this complex ecosystem, dependability, such as ensuring that software contains fewer bugs and is more secure, is often treated as an afterthought, said Allen School professor Baris Kasikci. Software and hardware have been plagued by bugs that can lead to data loss, security vulnerabilities and costly critical infrastructure failures.

In his research, Kasikci focuses on developing techniques for building systems that are both efficient and have a strong foundation of dependability, with an emphasis on real-world technical and societal impact. At the 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2025) in June, Kasikci was recognized with the Rising Star in Dependability Award for “his impressive track-record and contributions in the field of dependable systems, including multiple publications in highly regarded venues, and influence on current industry practice.”

“Building systems that are simultaneously efficient and dependable is challenging because there is a strong tension between techniques aimed to achieve these properties,” said Kasikci. “This tension is due to the different trade-offs involved in achieving efficiency and dependability (e.g., performance optimizations versus defenses against vulnerabilities that cause slowdown). To rise to this challenge, my work draws insights from a broad set of disciplines such as systems, computer architecture, programming languages, machine learning and security.”

Kasikci’s work helps improve society’s trust in computer systems with new secure hardware systems and bug detection techniques. He was part of the team that discovered Foreshadow, a speculative execution attack on Intel processors that enables an attacker to steal sensitive information stored inside personal computers and third-party clouds. Their work influenced the redesign of Intel processors and motivated all major cloud vendors to deploy mitigations against the vulnerability. He and his collaborators also developed REPT, one of the most widely deployed failure analysis systems in the world, in use across all modern Microsoft Windows platforms. It has allowed Microsoft engineers to tackle bugs that had been open for many years, and the techniques behind the system have been adopted by both Intel and Facebook. 

He is also interested in developing methods for optimizing the entire computing stack, from hardware to operating systems. At the hardware level, Kasikci helped introduce techniques to improve code performance, such as Ripple, a software-only technique that uses program context to inform the placement of cache eviction instructions. More recently, he developed new methods for making systems that serve large language models more efficient, including NanoFlow, a novel LLM-serving framework with close to optimal throughput. 

In future work, Kasikci is interested in advancing reliability techniques such as debugging and testing by making production execution data such as logs and core dumps more useful. For example, he envisions using this information to quickly find bugs that production systems may suffer from, while also managing how much storage and resources these logs can consume. 

In addition to receiving the Rising Star in Dependability Award, Kasikci previously earned a National Science Foundation Career Award and an Intel Rising Star Faculty Award.

Read more about the Rising Star in Dependability Award. Read more →

Allen School professor Ratul Mahajan receives SIGCOMM Networking Systems Award for making networks more secure and reliable with Batfish

Ratul Mahajan sitting at a desk facing the camera.
Ratul Mahajan

Networks have become one of the foundational pillars of modern society. When they go down, we cannot fly airplanes, access bank accounts or even scroll through social media. One of the long-standing challenges in networking, however, is the difficulty of accurately predicting how changes to device configurations will impact real-world traffic flows — especially when a single-line revision to a configuration can easily break a network, explained Allen School professor and alum Ratul Mahajan (Ph.D., ‘05). 

To help network engineers and operators ensure that their networks operate exactly as they intend, Mahajan and his collaborators introduced Batfish, an open source network configuration analysis tool that can find errors and ensure the accuracy of planned network configurations, helping to prevent costly outages. At the ACM Special Interest Group on Data Communication (SIGCOMM) conference last month, Batfish was recognized with the SIGCOMM Networking Systems Award for its “significant impact on the world of computer networking.” 

“Over the years, networks have become super complicated and it has gotten to the point where humans cannot assure that changes they make to the network are not going to bring it down,” said Mahajan, who is one of the co-directors for the UW Center for the Future of Cloud Infrastructure (FOCI). “With Batfish, we focus on what we call proactive validation. Instead of, after the fact, discovering that something bad has happened to the network, Batfish takes your planned change to the network and makes a model of how the network will behave if this change were to go through.”

The platform was first developed in 2015 by Mahajan and colleagues at Microsoft Research; the University of California, Los Angeles; and the University of Southern California. It was later commercialized by Intentionet, where Mahajan was the CEO and co-founder. Today, Batfish is managed by Amazon Web Services (AWS), and more than 75 companies rely on the tool to help design and test their networks. 

Batfish uses an offline “snapshot” of the network to build a model and infer if there are any issues present within the configuration. The platform takes in device configurations from various vendors including Cisco, Juniper and Arista, and it then converts these configurations into a unified and vendor-independent model of the network. Once the model is built, engineers can query Batfish about topics such as the reachability between various network parts, potential routing loops, access control list (ACL) configurations such as incorrectly assigned permissions, or other policy and security constraints. Batfish then provides specific information needed to find and fix the misconfigurations.
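The vendor-independent modeling idea can be illustrated with a small, self-contained sketch. This is not the real Batfish API (Batfish is queried through its own client libraries); the `normalize` and `shadowed_rules` functions below are hypothetical stand-ins that show how vendor-specific rules can be mapped into one unified model and then queried for a classic misconfiguration, a shadowed rule that can never match:

```python
# Toy sketch of Batfish's approach (NOT the real API): normalize
# vendor-specific ACL rules into one unified model, then query that
# model for misconfigurations such as shadowed (unreachable) rules.

def normalize(vendor, line):
    """Convert a vendor-specific ACL line into a unified (action, proto, prefix) rule."""
    if vendor == "cisco":          # e.g. "permit tcp 10.0.0.0/8"
        action, proto, prefix = line.split()
    elif vendor == "juniper":      # e.g. "discard tcp 10.0.0.0/8" (simplified syntax)
        action, proto, prefix = line.split()
        action = {"accept": "permit", "discard": "deny"}[action]
    else:
        raise ValueError(f"unknown vendor: {vendor}")
    return (action, proto, prefix)

def shadowed_rules(rules):
    """Return indices of rules that can never match because an earlier
    rule already covers the same (proto, prefix) pair."""
    seen, shadowed = set(), []
    for i, (_action, proto, prefix) in enumerate(rules):
        if (proto, prefix) in seen:
            shadowed.append(i)
        seen.add((proto, prefix))
    return shadowed

acl = [normalize("cisco", "permit tcp 10.0.0.0/8"),
       normalize("juniper", "discard tcp 10.0.0.0/8"),  # shadowed by rule 0
       normalize("cisco", "deny udp 0.0.0.0/0")]
print(shadowed_rules(acl))  # -> [1]
```

The real system builds a far richer model (routing, forwarding, reachability) from full device configurations, but the shape of the workflow is the same: normalize, model, query.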

While Batfish’s main architecture and original goal have stood the test of time, many of its underlying techniques have been revamped and enhanced to tackle the scalability and usability challenges that complex, real-world networks pose. For example, for each violated property, Batfish originally provided only a single counterexample packet, picked at random by the SMT solver from the violating header space; these counterexamples lacked context and could be confusing. To help engineers understand what went wrong, Batfish now provides a positive example, or a packet that does not violate the property, alongside the counterexample, which engineers can compare to pinpoint the issue.

As one of the earliest and most widely-adopted network verification platforms, it has helped shape key areas of research such as control-plane and data-plane modeling and network automation. From tech giants to small startups, multiple organizations rely on Batfish every day to both validate network configurations and drive innovations in network designs and operations. 

“The main lasting impact of Batfish, beyond the code itself, would be changing the practice of networking to use these types of tools,” Mahajan said. “It was one of the first pieces of technology that made automated reasoning for networks a reality.”

In addition to Mahajan, SIGCOMM also recognized Batfish team members Matt Brown, Ari Fogel, Spencer Fraint, Daniel Halperin (Ph.D., ‘12) and Victor Heorhiadi at AWS; Todd Millstein (Ph.D., ‘03), a faculty member at UCLA; Corina Miner at Sesame Sustainability; and Samir Parikh at Cisco.

Read more about the SIGCOMM Networking Systems Award, along with a related UCLA Samueli School of Engineering story. Read more →

Ph.D. alum Akari Asai recognized as one of MIT Technology Review’s Innovators Under 35 for reducing LLM hallucinations

Headshot of Akari Asai
Akari Asai

Akari Asai (Ph.D., ‘25), research scientist at the Allen Institute for AI (Ai2) and incoming faculty at Carnegie Mellon University, is interested in tackling one of the core challenges with today’s large language models (LLMs). Despite their increasing potential and popularity, LLMs often get facts wrong or combine tidbits of information into a nonsensical response, a failure known as hallucination. This can be especially concerning when LLMs are used for scientific literature or software development, where accuracy is vital. 

For Asai, the solution is developing retrieval-augmented language models, a new class of LLMs that pull relevant information from an external datastore using a query that the LLM generates. Her research has helped establish the foundations for retrieval-augmented generation (RAG) and showcase its effectiveness at reducing hallucinations. Since then, she has gone on to add adaptive and self-improvement capabilities and apply these innovations to practical applications such as multilingual natural language processing (NLP).

Asai was recently named one of MIT Technology Review’s Innovators Under 35 2025 for her pioneering research improving artificial intelligence. The TR35 award recognizes scientists and entrepreneurs from around the world who “stood out for their early accomplishments and the ways they’re using their skills and expertise to tackle important problems.”

“With the rapid adoption of LLMs, the need to investigate their limitations, develop more powerful models and apply them in safety-critical domains has never been more urgent,” said Asai.

Traditional LLMs generate responses to user inputs based solely on their training data. In comparison, RAG enhances the LLM with an additional information retrieval component that utilizes the user input to first pull information from a new, external datastore. This allows the model to generate responses that incorporate up-to-date information without needing additional training data. By checking this datastore, an LLM can better detect when it is generating a falsehood, which it can then verify and correct using the retrieved information. 
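The retrieval step can be sketched in a few lines. This is a minimal, self-contained illustration, not Asai’s models: the word-overlap scoring, the `DATASTORE` contents and the prompt template are all simplifications invented for this example (real retrievers use learned dense embeddings over large datastores):

```python
# Minimal sketch of the retrieval step in RAG (hypothetical scoring,
# not a production retriever): rank datastore documents by word overlap
# with the query, then build an augmented prompt from the best match.

DATASTORE = [
    "The Allen School is at the University of Washington in Seattle.",
    "Retrieval-augmented generation pulls external text at inference time.",
]

def retrieve(query, docs, k=1):
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def augmented_prompt(query):
    """Prepend retrieved context to the user query before generation."""
    context = "\n".join(retrieve(query, DATASTORE))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(augmented_prompt("What does retrieval-augmented generation do?"))
```

The generator then conditions on the retrieved context instead of relying solely on what it memorized during training, which is what allows it to incorporate up-to-date information and check its own claims.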

Asai took that research a step further: she and her collaborators introduced Self-reflective RAG, or Self-RAG, which improves LLMs’ quality and factual accuracy through retrieval and self-reflection. With Self-RAG, a model uses reflection tokens to decide when to retrieve relevant external information and to critique the quality of its own generations. While standard RAG retrieves a fixed number of times, Self-RAG can retrieve as many times as needed, making it useful for diverse downstream queries, including instruction following.

She is interested in utilizing these retrieval-augmented language models to solve real-world problems. In 2024, Asai introduced OpenScholar, a new model that can help scientists more effectively and efficiently navigate and synthesize scientific literature. She has also investigated how retrieval-augmented language models can be useful for code generation, and helped develop frameworks that can improve information access across multiple languages such as AfriQA, the first cross-lingual question answering dataset focused on African languages.

“Akari is among the pioneers in advancing retrieval-augmented language models, introducing several paradigm shifts in this area of research,” said Allen School professor Hannaneh Hajishirzi, Asai’s Ph.D. advisor and also senior director at Ai2. “Akari’s work not only provides a foundational framework but also highlights practical applications, particularly in synthesizing scientific literature.” 

This award comes on the heels of another MIT Technology Review recognition. Last year, Asai was named one of the publication’s Innovators Under 35 Japan. She has also received an IBM Ph.D. Fellowship and was selected as one of this year’s Forbes 30 Under 30 Asia in the Healthcare and Science category. 

Read more about the MIT Technology Review’s Innovators Under 35. Read more →

Allen School researchers recognized at ACL 2025 for shaping the present and future of natural language processing

An artist’s illustration of artificial intelligence depicting language models which generate text.
Wes Cockx/Unsplash

At the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), Allen School researchers brought home multiple awards for their work that is moving natural language processing research forward. Their projects ranged from laying the foundation for how artificial intelligence systems understand and follow human instructions to exploring how large language models (LLMs) pull responses from their training data — and more. 

TACL Test of Time Award: Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions

From robotics to voice assistants, today’s AI systems depend on the ability to understand human language and respond to it naturally.

Portrait of Luke Zettlemoyer
Luke Zettlemoyer

In their 2013 paper “Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions,” Allen School professor Luke Zettlemoyer and then student Yoav Artzi (Ph.D., ‘15), now a professor at Cornell Tech, set the groundwork for this capability with an approach for teaching models to follow human instructions without the need for detailed manual explanations. For their lasting contributions to the field, the researchers received the Test of Time Award as part of the inaugural Transactions of the Association for Computational Linguistics (TACL) Paper Awards presented at ACL 2025.

“This was the first paper in a line of work on learning semantic parsers from easily gathered interactions with an external world, instead of requiring supervised training data,” Zettlemoyer said. “Although the form of the models has changed significantly over time, this style of learning is still relevant today as an early precursor to techniques such as Reinforcement Learning with Verifiable Rewards (RLVR).”

Zettlemoyer and Artzi developed the first comprehensive model that tackles common issues that arise with learning and understanding unrestricted natural language. For example, if you told a robot to follow the navigation instructions “move forward twice to the chair and at the corner, turn left to face the blue hall,” it would need to solve multiple subproblems to interpret the instructions correctly. It would have to resolve references to specific objects in the environment such as “the chair,” clarify words based on context, and also understand implicit requests like “at the corner” that provide goals without specific steps. 

To address these challenges without the need for intensive engineering effort, the duo developed a grounded learning approach that can jointly reason about meaning and context, and then continue to learn from their interplay. It uses Combinatory Categorial Grammar (CCG), which is a framework that assigns words to different syntactic categories such as noun or prepositional phrase, efficiently parsing complex instructional language for meaning by mapping them into logical expressions. Then, the weighted CCG ranks possible meanings for each instruction. 

This joint model of meaning and context lets the system continue to learn from situated cues, such as the visible objects in the environment. For example, in the earlier set of navigation instructions, “the chair” can refer to multiple different objects such as chairs, barstools or even recliners. While the CCG framework would include a lexical item for each meaning of “the chair,” execution of the task might fail depending on what objects are in the world. The system learns by watching examples play out and observing whether the actions lead to successful outcomes, such as completing the task or reaching a destination.

The researchers tested their method using a benchmark navigational instructions dataset and found that their joint approach successfully completed 60% more instruction sets compared to the previous state-of-the-art methods.

Read the full paper, as well as a related Cornell Tech story.

Outstanding Paper Award: Byte Latent Transformer: Patches Scale Better Than Tokens

Allen School Ph.D. students Artidoro Pagnoni and Margaret Li earned an ACL Outstanding Paper Award for research done at Meta with their advisor Zettlemoyer, who is also the senior research director at Meta FAIR.

Alongside their collaborators, they introduced the Byte Latent Transformer (BLT), a new byte-level LLM architecture that is the first to be able to match the more standard tokenization-based LLM performance at scale. At the same time, BLT is also able to improve efficiency and increase robustness to noisy data. 

Many existing LLMs are trained using tokenization, where raw text is broken down into more manageable tokens which then serve as the model’s vocabulary. This process was essential because training LLMs directly on bytes was cost prohibitive at scale. However, these tokens can influence how a string is compressed and lead to issues such as domain sensitivity. 

Instead, BLT groups bytes into dynamically-sized patches, which serve as the primary units of computation. These patches are segmented based on the entropy of the next byte, allowing the system to allocate more model capacity where needed. For example, higher entropy indicates a more complex sequence, which can prompt a new, shorter patch. In the first floating-point operation (FLOP)-controlled scaling study of byte-level models, the team found that BLT’s performance was on par with or superior to models such as Llama 3. With its efficiency and adaptability, the researchers position BLT as a promising alternative to the traditional token-based models available.
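The entropy-driven patching idea can be sketched with a toy. BLT uses a small learned model to estimate next-byte entropy; the sliding-window frequency proxy below is a simplification invented for illustration, as is the threshold value, but the mechanism is the same: when estimated entropy spikes, a new (shorter) patch begins.

```python
# Toy sketch of BLT-style entropy patching (simplified: the real model
# uses a learned next-byte entropy estimate; here a sliding-window
# byte-frequency proxy stands in for it).
import math
from collections import Counter

def byte_entropies(data, window=4):
    """Estimate local entropy at each position from nearby byte frequencies."""
    ents = []
    for i in range(len(data)):
        ctx = data[max(0, i - window):i + 1]
        counts = Counter(ctx)
        total = sum(counts.values())
        ents.append(-sum(c / total * math.log2(c / total) for c in counts.values()))
    return ents

def patches(data, threshold=1.5, window=4):
    """Split bytes into dynamically sized patches at high-entropy points."""
    ents = byte_entropies(data, window)
    out, start = [], 0
    for i, e in enumerate(ents):
        if e > threshold and i > start:
            out.append(data[start:i])
            start = i
    out.append(data[start:])
    return out

# A predictable run of bytes stays in long patches; the unpredictable
# region in the middle is carved into short ones.
print(patches(b"aaaaXYZaaaa"))
```

This is the intuition behind allocating compute where the data is hard: low-entropy stretches are cheap to represent as one patch, while high-entropy stretches get more, smaller units.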

Additional authors include Ram Pasunuru, Pedro Rodriguez, John Nguyen, Benjamin Muller, Chunting Zhou, Lili Yu, Jason Weston, Gargi Ghosh, former Allen School postdoc Mike Lewis and Srinivasan Iyer (Ph.D., ‘19) at Meta, and Ari Holtzman (Ph.D., ‘23), now faculty at University of Chicago.

Read the full paper on BLT here.

Best Demo Paper: OLMoTrace

As LLMs become increasingly popular in higher-stakes scenarios, it is important to understand why they generate certain responses and where they get their answers. Fully open language models such as OLMo are trained on trillions of tokens that anyone can access, but current behavior-tracing methods do not scale to this multi-trillion-token setting.

To address this, a team of researchers at the Allen School and the Allen Institute for AI (Ai2) introduced OLMoTrace, the first system that allows users to explore in real time how LLM outputs connect back to their training data. For the research’s innovation and practical application, the team received the ACL 2025 Best Demo Paper Award.

Jiacheng Liu

“Today’s large language models are so complex that we barely understand anything about how they generate the responses we see,” said lead author and Allen School Ph.D. student Jiacheng Liu. “OLMoTrace is powered by a technology I previously developed at UW, ‘infini-gram,’ with numerous system optimizations enabling us to deliver instant insights to how LLMs likely have learned certain phrases and sentences.”

The OLMoTrace inference pipeline works by scanning the LLM’s output and identifying long, unique and relevant text spans that appear verbatim in the model’s training data. For each span, the system retrieves up to 10 snippets from the training data that contain the span, prioritizing the most relevant documents. Finally, the system does some post-processing on the spans and document snippets, and presents them to the user through the chat interface. OLMoTrace is publicly available in the Ai2 model playground with the OLMo 2 family of models.
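The core matching step can be illustrated with a brute-force toy. OLMoTrace actually runs on infini-gram indexes over trillions of tokens; the substring search, `CORPUS` text and `min_len` cutoff below are stand-ins invented for this sketch, which shows only the idea of finding maximal spans of model output that appear verbatim in training data:

```python
# Toy sketch of OLMoTrace's matching step (the real system uses
# infini-gram over trillions of tokens; this is a brute-force stand-in):
# find maximal word spans of a model output that appear verbatim in a
# small "training corpus".

CORPUS = "the quick brown fox jumps over the lazy dog"

def verbatim_spans(output, corpus, min_len=3):
    """Return maximal spans of >= min_len words found verbatim in corpus."""
    words = output.split()
    spans = []
    i = 0
    while i < len(words):
        # Grow the longest span starting at word i that appears in corpus.
        j = i
        while j < len(words) and " ".join(words[i:j + 1]) in corpus:
            j += 1
        if j - i >= min_len:
            spans.append(" ".join(words[i:j]))
            i = j  # continue after the matched span
        else:
            i += 1
    return spans

print(verbatim_spans("a quick brown fox appeared", CORPUS))  # -> ['quick brown fox']
```

Each such span would then be linked back to the training documents that contain it, which is what lets a user click from a model response to its likely sources.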

The researchers proposed multiple practical applications for OLMoTrace. For example, if the model generates a fact, users can look back at the training data to fact-check the statement. It can also reveal the potential source of seemingly creative and novel LLM-generated expressions. In addition, OLMoTrace can help debug erratic LLM behaviors, such as hallucinations or incorrect self-knowledge, which are crucial to address as LLMs become increasingly commonplace, Liu explained.

Additional authors include Allen School professors and Ai2 researchers Ali Farhadi, Hannaneh Hajishirzi, Pang Wei Koh and Noah A. Smith, along with Allen School Ph.D. students Arnavi Chheda-Kothary and Rock Yuren Pang. The team also includes Taylor Blanton, Yanai Elazar, Sewon Min (Ph.D., ‘24), Yen-Sung Chen, Huy Tran, Byron Bischoff, Eric Marsh, Michael Schmitz (B.S., ‘08), Cassidy Trier, Aaron Sarnat, Jenna James, Jon Borchardt (B.S., ‘01), Bailey Kuehl, Evie Cheng, Karen Farley, Sruthi Sreeram, Taira Anderson, David Albright, Carissa Schoenick, Luca Soldaini, Dirk Groeneveld, Sophie Lebrecht and Jesse Dodge of Ai2, along with former Allen School professor Yejin Choi, now a faculty member at Stanford University.

Read the full paper on OLMoTrace.

ACL Dissertation Award: Rethinking Data Use in Large Language Models

For her Allen School Ph.D. dissertation titled “Rethinking Data Use in Large Language Models,” Sewon Min, now faculty at University of California, Berkeley, received the inaugural ACL Dissertation Award. In her work, Min tackled fundamental issues that current language models face, such as factuality and privacy, by introducing a new class of language models and alternative approaches for training such models. 

This new class of models, called nonparametric language models, is able to identify and reason with relevant text from its datastore during inference. Compared to conventional models that have to remember every applicable detail from their training set, models with a datastore available at inference time have the potential to be more efficient and flexible.

Nonparametric language models can also help address the legal constraints that traditional models often face. Language models are commonly trained using all available online data, which can lead to concerns with copyright infringement and crediting data creators. Min developed a new approach where language models are trained solely on public domain data. Copyrighted or other high-risk data is then kept in a datastore that the model can only access during inference and which can be modified at any time. 

In addition to receiving the ACL Dissertation Award, Min has also earned honorable mentions for the ACM Doctoral Dissertation Award from the Association for Computing Machinery and the Association for the Advancement of Artificial Intelligence Doctoral Dissertation Award.

Read more →

Two Allen School teams receive Qualcomm Innovation Fellowships to advance research in programming languages and AI

Each year, Qualcomm recognizes exceptional Ph.D. students whose research proposals promote the company’s core values of innovation, execution and teamwork with the Qualcomm Innovation Fellowship. With the goal of enabling “students to pursue their futuristic innovative ideas,” the winning teams receive a one-year fellowship as well as mentorship from Qualcomm engineers to help their projects succeed.

Two of this year’s winning teams from North America feature Allen School students. Andrew Alex and Megan Frisella received support for their proposed project “Productive Programming of Multi-Threaded Hardware Accelerators.” Fellow Allen School Ph.D. student Zixian Ma and Yushi Hu, a Ph.D. student in the University of Washington’s Department of Electrical & Computer Engineering (ECE), also earned a fellowship for their project “Learning Multi-modal Agents to Reason and Act for Complex Multi-modal Tasks.”

Productive Programming of Multi-Threaded Hardware Accelerators 

Headshot of Andrew Alex
Andrew Alex

The hardware landscape of modern systems-on-chips (SoCs), which are made up of a variety of digital signal processors (DSPs), graphics processing units (GPUs) and general-purpose cores of differing sizes, is constantly evolving. Because each new generation of hardware requires an optimized kernel library to ensure that machine learning and signal processing workloads run smoothly and efficiently, it can be difficult for performance engineers to keep up. 

Performance engineers are turning to user-scheduled languages (USLs), an emerging class of programming languages designed for heterogeneous hardware. These languages separate the algorithm, which specifies the program’s functional behavior, from the schedule, which defines how that computation is carried out. Alex and Frisella aim to build a language system extending Exo, a popular USL that can optimize high-performance computing kernels for new hardware accelerators but lacks support for asynchronous parallelism and concurrency. Their proposed system can schedule programs to exploit asynchronous and concurrent SoC targets while ensuring that the program’s behavior is preserved.
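The algorithm/schedule split can be sketched in plain Python. This is not Exo syntax; `saxpy_algorithm` and `saxpy_tiled` are hypothetical names for this illustration, where the algorithm states what is computed and a schedule transformation (here, simple loop tiling) changes how, while preserving the result:

```python
# Toy sketch of the algorithm/schedule split in user-scheduled languages
# (plain Python, not Exo syntax): the algorithm says WHAT is computed;
# a schedule transformation (here, loop tiling) changes HOW, while
# preserving the program's observable behavior.

def saxpy_algorithm(a, x, y):
    """Algorithm: the functional specification only."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def saxpy_tiled(a, x, y, tile=4):
    """Schedule: the same computation, iterated in hardware-friendly tiles."""
    out = [0.0] * len(x)
    for t in range(0, len(x), tile):
        for i in range(t, min(t + tile, len(x))):  # process one tile at a time
            out[i] = a * x[i] + y[i]
    return out

x, y = list(range(10)), [1.0] * 10
assert saxpy_algorithm(2.0, x, y) == saxpy_tiled(2.0, x, y)  # equivalence preserved
```

In a USL, transformations like this tiling step are applied as a checked sequence of equivalence-preserving rewrites rather than hand-written twice, which is the property Alex and Frisella aim to extend to asynchronous and concurrent targets.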

Headshot of Megan Frisella
Megan Frisella

“The tools for producing the highly-performant code that is important for fields like machine learning and signal processing have not kept pace with the ever-expanding capabilities of the hardware that runs the code,” said Alex, who is advised by Allen School professor Gilbert Bernstein. “Our project aims to remedy this gap by enabling a programmer to express this highly-performant, concurrent code as a sequence of equivalence-preserving transformations of a sequential program.”

With such a system, performance engineers are no longer limited to the transformations an optimizing compiler chooses using a cost model that may not reflect all the newly available features of the target hardware. Instead, engineers can apply their own domain- and hardware-specific knowledge to the problem without existing tools such as compilers getting in the way, helping them write code faster and with less effort.

“Extending the power of user-scheduling to asynchronous and concurrent SoCs will unlock productivity in programming emerging hardware,” said Frisella, who is co-advised by Bernstein and faculty colleague Stephanie Wang.

Learning Multi-modal Agents to Reason and Act for Complex Multi-modal Tasks

Headshot of Zixian Ma
Zixian Ma

Multi-modal foundation models can help with a range of real-world tasks, from answering simple visual questions about objects in daily life to solving more difficult problems such as travel planning. Although these state-of-the-art models can answer generic and straightforward questions well, they struggle with complex questions and with generalizing to new tasks. For example, a user may take a picture of a panel showing different gas prices and ask the model how many gallons they can buy within a certain budget, but the model will have trouble answering. 

To address these challenges, Ma and Hu propose to develop multi-modal agents that can explicitly reason about and act on these complex tasks using chains-of-thought-and-action. They aim to curate a new dataset with images from across various domains — such as daily life images, web screenshots and medical images — and pair it with a novel learning method.

“Our work aims to enhance open-source foundation multi-modal models’ capabilities to not only perform complex tasks through reasoning and actions but also do so in a more interpretable manner,” said Ma, who is co-advised by Allen School professor Ranjay Krishna and professor emeritus Daniel Weld.

With the large-scale dataset, Ma and Hu plan to train generalizable multi-modal agents using heterogeneous pretraining along with domain-specific supervised finetuning and reinforcement learning. The researchers will build an architecture similar to that of heterogeneous pretrained transformers, which combine a huge amount of data from multiple sources into one system, organized into stems, trunks and heads, to teach a robot an array of tasks. 

Headshot of Yushi Hu
Yushi Hu

In their proposed system, each stem features a domain-specific vision encoder that maps the visual data from various domains to visual features, or the numerical representations of an image’s visual content. The shared trunk is a transformer encoder block which connects these domain-specific visual features to shared representations in the same dimension of the text embeddings. Then the shared head, which is a decoder-only language model, takes both the visual tokens from the shared encoder as well as the text tokens of the input query, and generates the next set of text tokens following the inputs.

“This research focuses on developing artificial intelligence that can seamlessly understand, reason and generate across vision, language, audio — mirroring the way people interact with the world. By unifying these diverse streams, we aim to move beyond passive chatbots toward truly helpful agents that collaborate with humans and complete real-world tasks,” said Hu, who is co-advised by Allen School professor Noah A. Smith and ECE professor Mari Ostendorf.

The Allen School-affiliated teams are among three UW winners of the Qualcomm Innovation Fellowship this year. They are joined by ECE Ph.D. students Marziyeh Rezaei and Pengyu Zeng, who earned a fellowship to pursue their research proposal titled “Ultra-low Power Coherent Front-haul Optical Links to enable multi-Tb/s Capacity for 6G Massive MIMOs and Edge AI Datacenters.”

Read more about this year’s Qualcomm Innovation Fellowship North America recipients. Read more →

Allen School researchers receive ICWSM Best Paper Award for analyzing how Reddit rules influence online community outcomes

A close-up image of various social media icons including Reddit, Facebook and TikTok on a smart phone.
Photo by Ralph Olazo/Unsplash

Rules are vital for building a safe and healthy functioning online community, and Reddit is no exception. For community moderators, however, it can be difficult to make data-driven decisions on what rules are best for their community.

A team of researchers in the Allen School’s Social Futures Lab and Behavioral Data Science Lab conducted the largest analysis to date of rules on Reddit, examining over 67,000 rules and their evolution across more than 5,000 communities over a period of five years — accounting for almost 70% of all content on the platform. The study is the first to connect Reddit rules to community outcomes. They found that rules about who participates, how content is formatted and tagged, and commercial activities were most strongly associated with community members speaking positively about how their community is governed.

The team presented their work titled “Reddit Rules and Rulers: Quantifying the Link Between Rules and Perceptions of Governance Across Thousands of Communities” at the 2025 International AAAI Conference on Web and Social Media (ICWSM 2025) in June and received the sole Best Paper Award out of 138 papers.

“This was my first paper, and I am extremely grateful for it to be named best paper of ICWSM 2025,” said lead author Leon Liebmann (B.S., ‘25), now at the online privacy company Westbold. “The work was difficult at times, and my advisors and co-authors Allen School Ph.D. student Galen Weld and professor Tim Althoff provided me with the direction and methods I needed to get it done. These people shaped my time in the Allen School and gave me a love for research I’d love to revisit.”

To better understand the rules on Reddit, the team first had to map out which communities had what rules and when. The researchers developed a retrieval-augmented GPT-4o model to classify rules into different categories based on their target, tone and topic. They then assessed the rules based on how common they were and how they varied across different communities, and also collected timelines on how the communities’ rules changed over time. At the same time, the researchers used a classification pipeline to identify posts and comments discussing community governance.

Taken together, this study can help inform Reddit moderators and community leaders on what rules their communities should have. The researchers found that the most common rules across communities covered post content, spam and low quality content, and respect for others — guidelines that platforms could use to create “starter packs” for new communities. They also found that how moderators word rules could influence how positively or negatively communities view their governance. For example, prescriptive rules, or those that describe what community members should do, are viewed more favorably than restrictive rules that focus on what community members should not do. By choosing to phrase rules prescriptively, moderators can help communities have a positive view of their governance.

In addition to Liebmann, Weld and Althoff, Allen School professor Amy X. Zhang was also a co-author on the paper.

Read the full paper here and the datasets for Reddit rules and outcomes are available here. Read more →

MythBusters, computer science edition: Why an Allen School degree continues to be a great choice for students

Magnifying glass resting on the keys of a laptop computer
“The industry will continue to need smart, creative software engineers who understand how to build and harness the latest tools — including AI.” Allen School leaders Magdalena Balazinska and Dan Grossman examine some of the myths surrounding computer science careers in the era of artificial intelligence. Photo by Yevhen Smyk, Vecteezy

There has been a lot of chatter lately, online and in various media outlets, about the supposed dwindling prospects for new computer science graduates in the artificial intelligence era. Recent layoffs in the technology sector have students, parents and educators worried that a degree in computing, once seen as a sure path to a fulfilling career, is no longer a reliable bet.

“The alarmist doom and gloom prevalent in the news is not consistent with the experiences of the vast majority of our graduates,” said Magdalena Balazinska, professor and director of the Allen School. “The industry will continue to need smart, creative software engineers who understand how to build and harness the latest tools — including AI. And the fact remains that a computer science degree is great preparation for a broad range of fields within and beyond technology, including the natural sciences, finance, medicine and law.”

In our own version of MythBusters, we asked Balazinska and professor Dan Grossman, vice director of the Allen School, to examine the myths and realities surrounding AI and the prospects for current and future Allen School majors. Their answers indicate that the rumored demise of software engineering as a career path has been greatly exaggerated — and that no matter what path computer science majors choose after graduation, they can use their education to change the world.

Also be sure to check out our fact sheet, Computer Science Careers and AI: Myth vs. Reality.

Let’s start with the question on everyone’s mind: What’s the job market like for computer science graduates these days?

Dan Grossman: Individuals’ mileage may vary depending on a range of factors, but what’s being reported in some media outlets doesn’t reflect what we’re seeing here at the University of Washington. More than 120 different companies hired this year’s Allen School graduates into software engineering roles. Amazon alone hired more than 100 graduates from the Allen School’s 2024-25 class. Google and Meta didn’t hire at that scale, but each hired 20 graduates this year — more than last year. Microsoft hired more than two dozen graduates from this latest class. I expect these numbers to grow as we hear from more graduates. 

So, while the job market is tighter now than it was a few years ago, the sky is not falling. It’s important to remember that the Allen School is one of the top computer science programs in the nation, with a cutting-edge curriculum that evolves alongside the industry while remaining grounded in the fundamentals of our field. A B.S. in CS is not a uniform credential: our graduates are highly sought after, so their experience in the job market doesn’t necessarily reflect that of others. That has always been the case, even before this latest handwringing over AI; the Allen School has always produced highly competitive graduates.

Magdalena Balazinska: In addition to those who found employment after graduation, more than 100 of our recent graduates opted to continue their education by enrolling in a master’s or Ph.D. program, which, of course, also makes us immensely proud!

How is AI impacting software engineering jobs?

Portrait of Magdalena Balazinska
Magdalena Balazinska: “The industry will continue to need smart, creative software engineers who understand how to build and harness the latest tools — including AI.”

MB: There are two factors: (1) AI’s impact on the work of a software engineer, and (2) AI’s impact on the job market for software engineers. Regarding the latter, it’s not so much that AI is taking the jobs, but that companies are having to devote tremendous resources to the infrastructure behind AI, which is very expensive. Also, many companies over-hired during COVID, and now they’re doing a course-correction for the AI era. I look at this as more of a reset. There’s no question that AI is affecting many areas of computing, just as it’s affecting just about every other sector of the economy. Companies will continue to invest in the people who know how to build and leverage these and other tools.

To my first point: With AI, we should expect the work of a software engineer to change, but to change in a really exciting way! The task of coding, or the translation of a very precise design into software instructions, can largely be handled by AI. But that’s not the most exciting or challenging part of software engineering. Understanding the requirements, figuring out an appropriate design, and articulating it as a precise specification are the hard parts. Going forward, software engineers will spend more time imagining what systems to build and how to organize the implementation of those systems, and then let AI handle many of the details of converting those ideas into code.

DG: One of our former faculty colleagues, Oren Etzioni, said, “You won’t be replaced by an AI system, but you might be replaced by a person who uses AI better than you.” I think that’s the direction we’re headed. Not AI as a replacement for people, but as a differentiator. Here at the Allen School, one of our goals is to enable students to differentiate themselves in this rapidly evolving landscape. For example, we are introducing a course on AI-aided software development, which will teach students how to effectively harness these tools. 

How has AI affected student interest in the Allen School?

DG: Student interest remains strong — we received roughly 7,000 applications for Fall 2025 first-year admission.

MB: That may sound like a daunting number. However, we were able to offer admission to 37% of the Washington high school students who applied. That’s not as high as we would like it to be, but it’s far higher than public perception. We achieve this by heavily favoring applicants from Washington. For Fall 2025, we offered admission to only 4% of applicants from outside Washington.

If AI can write code, why should students major in computer science?

MB: Because computer science is so much more than coding! Creating a new system or application — perhaps a system to help the elderly take care of their daily tasks and manage their paperwork, or a new approach for doctors to perform long-distance tele-operations — isn’t just a matter of “writing code.” A software engineer begins by clearly understanding the requirements: what the system needs to provide. Then the software engineer will decompose the problem into pieces, understand how those pieces will fit together, and anticipate failures. What happens if there is a power or network failure, or someone tries to hack the system? This gets progressively more challenging with the complexity and scale of the systems that software engineers build, typically on teams with many people working together. Coding is the relatively easier part.

DG: In that spirit, here at the Allen School, we do teach students how to code, but as a component of how to envision, design and build systems and applications that solve complex problems and touch people’s lives. The principles, precision and reasoning gained from reading and writing code are a necessary foundation that serves our students very well — including graduates who now use AI in industry. It is the software engineers with the deepest knowledge who will be most effective at using AI to write their code, because they will know how and where AI can go wrong and how to steer it toward producing a correct output.

MB: Engineers have always used tools, and their tools have always advanced with time and opened the door to innovation. Thanks to developments like modern coding libraries and languages, online repositories like StackOverflow and GitHub, automated testing, cloud computing, and more, software engineers today are far more efficient and can develop applications more quickly than ever before. All of that was true before AI for coding had really taken off, and yet there are more software engineers doing more interesting and important things than ever before!

How does the Allen School prepare students for a workplace — and a world — being transformed by AI?

Portrait of Dan Grossman
Dan Grossman: “One of our goals is to enable students to differentiate themselves in this rapidly evolving landscape.”

MB: As a leader in AI research, the Allen School is ideally positioned to help students learn how to use AI, how to build AI, and how to move the field of AI forward to benefit humanity. We give students multiple opportunities to explore AI topics and tools as part of our curriculum. Dan mentioned our AI-assisted software development course, and many of our other courses allow for using AI assistance in well-prescribed ways. This enables students to focus on core course concepts, take on more complex projects, and so on. Gaining experience with any AI tool can give a sense of what the technology can help with — along with its limitations. That said, we will continue in some courses to expect students to design, build, test, and document software without AI assistance.

DG: Our courses sometimes use the same cutting-edge tools used in industry, and other times will provide a simpler setting for pedagogical purposes. Software engineering tools change rapidly, so we tend not to get into the weeds on any one particular tool but give students the confidence to pick up future tools. Importantly, we don’t just teach students how to build and use AI. We also help them to think critically about the ethics and societal impacts of these technologies, such as their potential to reinforce bias or be used as a surveillance tool, and ways to mitigate those impacts.

MB: Another advantage we have at the Allen School is that we are a leading research institution, and our faculty are among the foremost experts in the field. This gives us the ability to incorporate new concepts and techniques into our coursework quickly. We also have a program devoted to supporting undergraduates in engaging in hands-on research in our labs alongside those very same faculty and our amazing graduate students and postdocs. Many students choose to get involved in research during their undergraduate studies.

What if a student is interested in computing, but not AI?

DG: Great! There are many open challenges across computing, from systems development, to human-facing interactive software design, to hardware design, to data management, and many others. Even if you are not using or developing AI itself, building systems that can run AI efficiently is driving a lot of exciting work in the field these days. While the big breakthroughs that have been driving rapid change over the last few years are AI-centered, computing remains a broad field.

MB: A student can major in computer science and follow their passion wherever it takes them. A subset of students will choose to study AI and build the next AI technologies, but the vast majority will use AI as a tool while building systems for medicine, education, transportation, the environment, and other important purposes. Or they will build back-end infrastructure at global companies like Google, Amazon, or Microsoft, or tackle other challenges like those that Dan mentioned. The more we advance computing, the more we open new opportunities. I think that’s why the number of software engineers just keeps growing. There is always more to do. The job is never done.

What is your advice to current and aspiring computer science majors who worry about their career prospects with the rise of AI?

MB: First, if you think you want to be a computer scientist or a computer engineer, pursue that! If you choose a major that you are excited about, you will not mind spending hours deepening your knowledge and sharpening your skills, which will help you to become an expert and to enjoy your chosen profession even more. My advice to every student is to take a broad range of challenging courses. Learn how to use the current tools, with the understanding that the tools you use today will not be the ones you use tomorrow. This field moves fast, which is what makes it exciting.

When it’s time to start your job search, whether for an internship or a full-time job, apply broadly. Apply to large companies, small companies, companies in various sectors, non-profits, and so on. Many organizations need software engineers! And not all interesting technical jobs that use a computing degree carry the title of software engineer. Pick the position where you will learn the most. It’s important to optimize for learning and for growth, especially early in your career.

DG: I would also remind students that a UW education is not about vocational training; our goal is that students graduate with the knowledge and skills to succeed in their chosen career, yes, but also to be engaged citizens of the world. While you’re here, make the most of your education — take a range of challenging courses and put in the time to learn the material. After all, it is a multi-year investment on your part, and the faculty have invested a lot of time and effort into creating a challenging, coherent curriculum for you. Take the hardest classes that you think are also the most exciting ones, and then focus on learning as much as you can.

Any final thoughts?

DG: Don’t choose a major solely because it’s popular. Choose a major that you’re passionate about. If that’s computer science or computer engineering, we’d love to see you at the Allen School. If it’s something else, we’d still love to see you in some of our classes.

MB: Everyone can benefit from learning at least a little computer science, especially now in the AI era!

Download the fact sheet: Computer Science Careers and AI: Myth vs. Reality
