Fascinated by the inner workings of machine learning models for data-driven decision-making, Allen School professor Simon Shaolei Du constructs their theoretical foundations to better understand what makes them tick and then designs algorithms that translate theory into practice. Du’s faculty colleague Adriana Schulz, meanwhile, aims to make the act of making more accessible and sustainable through novel techniques in computer-aided design and manufacturing, drawing upon advances in machine learning, fabrication, programming languages and more.
Those efforts received a boost from the Alfred P. Sloan Foundation earlier this year, when Schulz and Du were recognized among the 2024 class of Sloan Research Fellows representing the next generation of scientific leaders.
Simon Shaolei Du: Unlocking the mysteries of machine learning
Deep learning. Reinforcement learning. Representation learning. Recent breakthroughs in the training of large-scale machine learning models are transforming data-driven decision-making across a variety of domains and fueling developments ranging from self-driving vehicles to ChatGPT. But while we know that such models work, we don’t really know why.
“We still don’t have a good understanding of why these paradigms are so powerful,” Du explained in a UW News release. “My research aims to open the black box.”
Du has already poked several holes in said box by demystifying principles underlying the success of such models. For example, he offered the first proof of how gradient descent optimizes the training of over-parameterized deep neural networks — so-called because the number of parameters significantly exceeds the minimum required relative to the size of the training dataset. Du and his co-authors showed that, with sufficient over-parameterization, gradient descent can find a global minimum and achieve zero training loss even though the objective function is non-convex and non-smooth. Du was also able to explain how these models generalize so well despite their enormous size by proving a fundamental connection between deep neural network learning and kernel learning.
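To make that result concrete, the minimal sketch below trains a two-layer ReLU network whose width far exceeds the number of training samples: the output layer is fixed at random, the hidden layer is trained by plain gradient descent, and the non-convex training loss is driven toward zero. The dimensions, learning rate and exact architecture are our own illustrative assumptions, not the precise setting of Du's proofs.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 10, 2000        # n training samples, input dim d, width m >> n
lr = 0.1

# Unit-norm inputs and arbitrary labels: the loss is non-convex and non-smooth,
# yet gradient descent still drives the training loss toward zero.
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.standard_normal(n)

W = rng.standard_normal((m, d))          # trained hidden layer
a = rng.choice([-1.0, 1.0], size=m)      # fixed random output layer

for step in range(2001):
    pre = X @ W.T                        # (n, m) pre-activations
    out = np.maximum(pre, 0.0) @ a / np.sqrt(m)
    err = out - y                        # residuals
    # Gradient of 0.5 * ||err||^2 with respect to W
    grad = ((err[:, None] * (pre > 0) * a[None, :]).T @ X) / np.sqrt(m)
    W -= lr * grad
    if step % 500 == 0:
        print(f"step {step:4d}  loss {0.5 * np.dot(err, err):.6f}")
```

With the width so much larger than the sample count, the printed loss shrinks steadily toward zero even though the labels are pure noise, which is the behavior the theory predicts.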
Another connection Du has investigated is that between representation learning and recent advances in computer vision and natural language processing. Representation learning bypasses the need to train on each new task from scratch by drawing upon the commonalities underlying different but related tasks. Du was keen to understand how pre-training foundation models on large-scale but low-quality data in these domains effectively improves their performance on downstream tasks for which data is scarce — a condition known as few-shot learning. He and his collaborators developed a novel theoretical explanation for this phenomenon by proving that a good representation combined with diverse source training data is both necessary and sufficient for few-shot learning on a target task. Following this discovery, Du contributed to the first active learning algorithm for selecting pre-training data from source tasks based on its relevance to the target task, making representation learning more efficient.
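A toy version of that selection step might look like the following: score each source example by how close its learned representation sits to the handful of labeled target examples, and pre-train only on the most relevant ones. The cosine-similarity heuristic here is our own illustrative stand-in, not the algorithm from Du's paper.

```python
import numpy as np

def select_pretraining_data(source_feats, target_feats, k):
    """Hypothetical relevance-based selection: score each source example by its
    mean cosine similarity to the few target examples, then keep the top-k."""
    s = source_feats / np.linalg.norm(source_feats, axis=1, keepdims=True)
    t = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    relevance = (s @ t.T).mean(axis=1)    # higher = closer to the target task
    return np.argsort(relevance)[-k:]     # indices of the k most relevant examples

# Usage: pick the 1,000 source examples most relevant to a 5-shot target task.
source = np.random.randn(50_000, 128)     # representations of source examples
target = np.random.randn(5, 128)          # representations of the few target examples
chosen = select_pretraining_data(source, target, k=1_000)
```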
From representation to reinforcement: when it comes to modeling problems in data-driven decision-making, reinforcement learning is the gold standard. Standard wisdom held that long planning horizons and large state spaces are what make it so difficult — or at least it did until recently. Du and his collaborators turned the first assumption on its head by showing that the sample complexity of reinforcement learning does not depend on whether the planning horizon is long or short. Du further challenged prevailing wisdom by demonstrating that a good representation of the optimal value function — which was presumed to address the state space problem — is not sufficient to ensure sample-efficient reinforcement learning across states.
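For readers who want the objects behind these claims, the standard definitions are sketched below; the notation is generic rather than lifted from the papers. The first line is the value of a policy π over a planning horizon of H steps; the second is the "good representation" assumption (linear realizability of the optimal action-value function under a known feature map φ) that Du showed is not, by itself, enough for sample efficiency.

```latex
% Value of policy \pi over a planning horizon of H steps:
V^{\pi}(s) \;=\; \mathbb{E}\!\left[\sum_{h=1}^{H} r(s_h, a_h) \,\middle|\, s_1 = s,\ \pi\right]
% Linear realizability: the optimal action-value function is exactly linear
% in a known feature map \phi, with unknown weights \theta^{*}:
Q^{*}(s, a) \;=\; (\theta^{*})^{\top} \phi(s, a)
```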
“My goal is to design machine learning tools that are theoretically principled, resource-efficient and broadly accessible to practitioners across a variety of domains,” said Du. “This will also help us to ensure they are aligned with human values, because it is apparent that these models are going to play an increasingly important role in our society.”
Adriana Schulz: Making a mark by remaking manufacturing-oriented design
AI’s influence on design is already being felt in a variety of sectors. But despite its promise to enhance quality and productivity, its application to design for manufacturing has lagged. So, too, has the software side of the personalized manufacturing revolution, which has failed to keep pace with hardware advances in 3D printing, machine knitting, robotics and more. This is where Schulz aims to make her mark.
“Design for manufacturing is where ideas are transformed into products that influence our daily lives,” Schulz said. “We have the potential to redefine how we ideate, prototype and produce almost everything.”
To realize this potential, Schulz develops computer-aided design tools for manufacturing that are grounded in the fundamentals of geometric data processing and physics-based modeling and also draw from domains such as machine learning and programming languages. The goal is to empower users of varying skill levels and backgrounds to flex their creativity while optimizing their designs for functionality and production.
One strategy is to treat design and fabrication as programs — that is, sets of physical instructions — and leverage formal reasoning and domain-specific languages to enable users to adjust plans on the fly based on their specific goals and constraints. Schulz and her collaborators took this approach with Carpentry Compiler, a tool that lets users explore tradeoffs between production time, material cost and other factors before generating fabrication plans. She subsequently parlayed advances in program synthesis into a new tool for efficiently optimizing design and fabrication plans at the same time. Leveraging a technique called equivalence graphs, or e-graphs, Schulz and her team took advantage of inherent redundancies across design variations and fabrication alternatives to eliminate the need to recompute fabrication costs from scratch with every design change. In a series of experiments, the new framework was shown to reduce project costs by as much as 60%.
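The payoff of the e-graph idea can be seen in miniature below: equivalent fabrication plans are grouped into classes, and the best cost of each shared sub-plan is computed once and memoized, so re-costing a design variant never starts from scratch. The part names, operations and costs are invented for illustration; the real system's e-graphs and cost models are far richer.

```python
from functools import lru_cache

# Each e-class maps to its alternative plans; a plan is (operation, child e-classes).
# Parts, operations and costs are hypothetical.
ECLASSES = {
    "leg":   [("cut_on_chopsaw", ()), ("cut_on_bandsaw", ())],
    "panel": [("cut_on_tablesaw", ()), ("cut_on_bandsaw", ())],
    "table": [("assemble", ("leg", "leg", "leg", "leg", "panel"))],
}
OP_COST = {"cut_on_chopsaw": 2.0, "cut_on_bandsaw": 3.5,
           "cut_on_tablesaw": 1.5, "assemble": 5.0}

@lru_cache(maxsize=None)   # memoization is the "no recomputation" payoff
def best_cost(eclass):
    """Cheapest way to produce an e-class, reusing cached sub-plan costs."""
    return min(OP_COST[op] + sum(best_cost(c) for c in children)
               for op, children in ECLASSES[eclass])

print(best_cost("table"))  # the shared "leg" sub-plan is costed only once
```

Here the four table legs share one e-class, so their cheapest cut is computed a single time no matter how many design variants reference it.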
Rising capabilities in AI have also given rise to a new field in computer science known as neurosymbolic reasoning, a hybrid approach to representing visual and other types of data that combines techniques from machine learning and symbolic program analysis. Schulz leveraged this emerging paradigm to make it easier for users of parametric CAD models for manufacturing to explore and manipulate variations of their designs while automatically retaining essential structural constraints. Typically, CAD users who wish to engage in such exploration have to go to the time and trouble of modifying multiple parameters simultaneously and then sifting through a slew of irrelevant outcomes to identify the meaningful ones. Schulz and her team streamlined the process by employing large language and image models to infer the space of meaningful variations of a shape, and then applying symbolic program analysis to identify common constraints across designs. Their system, ReparamCAD, offers a more intuitive, efficient and interactive alternative to conventional CAD programs.
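The symbolic half of that pipeline can be suggested with a toy example: given a few "meaningful" variants of a parametric model (the kind a large language or image model might propose), scan for constraints that hold across all of them. The parameter names and the two constraint templates below are our own illustrative assumptions, not ReparamCAD's actual analysis.

```python
# Hypothetical parameter assignments for three meaningful variants of a table.
variants = [
    {"leg_width": 4, "leg_depth": 4, "top_thickness": 2, "height": 70},
    {"leg_width": 6, "leg_depth": 6, "top_thickness": 2, "height": 90},
    {"leg_width": 5, "leg_depth": 5, "top_thickness": 2, "height": 75},
]

params = variants[0].keys()
# Constraint template 1: a parameter is constant across all meaningful variants.
constant = {p for p in params if len({v[p] for v in variants}) == 1}
# Constraint template 2: two parameters are always equal (the legs stay square).
equal = {(p, q) for p in params for q in params
         if p < q and all(v[p] == v[q] for v in variants)}

print("fixed:", constant)   # {'top_thickness'}
print("tied:", equal)       # {('leg_depth', 'leg_width')}
```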
In addition to introducing more flexible design processes, Schulz has also contributed to more flexibility on the factory floor. Many assembly lines rely on robots that are task-specific, making it complex and costly to pivot the line to new tasks. Schulz and her colleagues sidestepped this problem by enabling the creation of 3D-printable passive grippers that can be swapped out at the end of a robotic arm to handle a variety of objects — including irregular shapes that would be a challenge for conventional grippers to manipulate. She and her team developed an algorithm that, when fed a 3D model of an object and its orientation, co-optimizes a gripper design and lift trajectory that will enable the robot to successfully pick up the item.
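The shape of that co-optimization can be conveyed with a deliberately simplified random-search sketch: sample a gripper design and a lift trajectory together, score the pair, and keep the best. The one-parameter design space and the toy score function below stand in for the paper's geometry- and physics-based evaluation, which we make no attempt to reproduce.

```python
import random

def score(gripper, trajectory):
    # Hypothetical objective: reward a moderate finger spread and a lift angle
    # matched to it. A real evaluation would check contacts, friction and
    # collisions against the object's 3D model and orientation.
    spread, angle = gripper["spread"], trajectory["angle"]
    return -(spread - 0.5) ** 2 - (angle - spread) ** 2

best_pair, best_score = None, float("-inf")
for _ in range(10_000):
    gripper = {"spread": random.uniform(0.0, 1.0)}     # design variable
    trajectory = {"angle": random.uniform(0.0, 1.0)}   # motion variable
    s = score(gripper, trajectory)
    if s > best_score:        # design and trajectory improve jointly
        best_pair, best_score = (gripper, trajectory), s

print(best_pair, round(best_score, 4))
```

The point of the sketch is the joint search: neither the gripper nor the trajectory is fixed first, so the optimizer can trade one off against the other.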
Whether it’s repurposed robots or software that minimizes material waste, Schulz’s past work offers a glimpse into manufacturing’s future — one that she hopes will be friendlier not just to people, but also to the planet.
“Moving forward, I plan to expand my efforts on sustainable design, exploring innovative design solutions that prioritize reusability and recyclability to foster circular ecosystems,” she told UW News.
Two other researchers with Allen School connections were among the 22 computer scientists across North America recognized in the 2024 class of Sloan Research Fellows. Justine Sherry (B.S., ’10) is a professor at Carnegie Mellon University, where she leads research to modernize the hardware and software for implementing middleboxes to make the internet more reliable, efficient, secure and equitable for users. Former visiting student Arvind Satyanarayan, who earned his Ph.D. from Stanford University while working with Allen School professor Jeffrey Heer in the UW Interactive Data Lab, is a professor at MIT, where he leads the MIT Visualization Group in using interactive data visualization to explore intelligence augmentation that amplifies creativity and cognition while respecting human agency.
In addition, a third UW faculty member, chemistry professor Alexandra Velian, earned a Sloan Research Fellowship for her work on new materials to advance decarbonization, clean energy and quantum information technologies.
For more on the 2024 Sloan Research Fellows, see the Sloan Foundation’s announcement and a related story by UW News.