The Association for Computing Machinery’s Special Interest Group on the Management of Data honored Allen School professor Dan Suciu with its 2022 Edgar F. Codd Innovations Award in recognition of his “lasting contributions to the foundations of novel data management trends.” The award recognizes a member of the ACM SIGMOD community who has made enduring and highly significant contributions to the development, understanding or use of databases and database systems over the course of their career. In the same week he was recognized for his influential body of work, Suciu also collected a Best Paper Award for his latest contribution — an indication that he has no intention of resting on his laurels.
Throughout his career, Suciu has shown a propensity for drawing deep connections between logic, database theory, algorithms and systems. According to Dan Olteanu, professor and head of the Data Systems and Theory Group at the University of Zurich, Suciu’s approach to advancing new data management paradigms has also set a new standard for research in the field.
“Dan Suciu is the most prominent researcher bridging database systems and theory,” said Olteanu. “His numerous contributions to the theory and practice of databases have fundamentally transformed a wide range of areas of research, such as semistructured data, data security, querying unreliable and inconsistent sources, probabilistic databases, data pricing, distributed and parallel query processing, and causality inference. They established clean formal foundations and practical and elegant data processing techniques. They further changed how younger generations of computer science researchers, including myself, approach research problems, and we relentlessly strive to meet the bar set by his work.”
Suciu, who holds the Microsoft Endowed Professorship in the Allen School, has been setting the bar high ever since he joined the University of Washington nearly 22 years ago. For example, in a paper appearing at the 2002 Symposium on Principles of Database Systems (PODS), Suciu and then-student Gerome Miklau (Ph.D., ‘05), now a faculty member at the University of Massachusetts Amherst, examined the complexity of containment and equivalence for a core fragment of the XPath query language for XML applications. The fragment in question covers features that are frequently applied in practice, specifically branching, label wildcards and descendant relationships between nodes. Whereas prior work had established that efficient containment algorithms exist for any combination of two of those, Suciu and Miklau established, to their own surprise, that the problem of checking containment when all three are present is coNP-complete. Based on their findings, the duo designed an efficient containment algorithm capable of running in polynomial time for several cases with practical significance that involve all three. Their paper was singled out for its impact with the PODS Alberto O. Mendelzon Test of Time Award in 2012, one of two PODS Test of Time Awards that Suciu has received in his career so far.
In another paper that combined rigorous theoretical evaluation with real-world concerns (in this case, those of every individual who has purchased a product or service or surfed online), Suciu and his collaborators introduced a framework for the pricing of private data. This work, published in 2013, conceived of a market balancing the interests of individuals in safeguarding their personal information with the sometimes incompatible interests of companies and organizations seeking to extract value from said data by offering more personalized services, targeting their marketing to specific interests, and selling the data directly to third parties. To find that elusive balance, Suciu, Miklau, and co-authors Chao Li and Daniel Yang Li drew from and expanded upon elements of differential privacy and data markets to construct a framework in which a “market maker” acts as an intermediary between individual data owners and institutional data buyers. In this scenario, the market maker responds to queries from the latter and sets prices to compensate the former based on a variety of factors, including the amount of perturbation, or noise, in the data, the option to source data from less expensive query sources, and individuals’ own risk tolerance for potential privacy loss. Suciu and the team earned a Best Paper Award at the 16th International Conference on Database Theory (ICDT 2013) for this work.
Around the same time, against the backdrop of the rapid rise in cloud computing and big data, Suciu tackled the algorithmic aspects of parallel data processing over large-scale distributed systems such as the MapReduce framework and UW’s own Myria system. Working with colleague Paul Beame of the Allen School’s Theory of Computation group and former student Paraschos Koutris (Ph.D., ‘15), now a faculty member at the University of Wisconsin-Madison, Suciu introduced a new theoretical model, Massively Parallel Computation (MPC), that separates the cost of computation from that of communication. By focusing the cost of parallel processing exclusively on the latter, measured by the amount of communication and the number of communication rounds, Suciu and his collaborators led a paradigm shift in how the community analyzed the complexity of distributed large-scale data queries: from run-time or the number of disk input/output operations to the amount of data being reshuffled while maintaining server-load balance. Suciu and Koutris subsequently applied the MPC model in a comprehensive survey of algorithms for different data processing tasks in collaboration with Semih Salihoglu, a faculty member at the University of Waterloo.
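To make the idea of charging only for communication concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' formal model: the function name `one_round_hash_join`, the relations `R` and `S`, and the server count `p` are all hypothetical. The sketch simulates a single communication round in which tuples are routed to servers by hashing the join attribute, and it reports how many tuples each server receives.

```python
from collections import defaultdict

def one_round_hash_join(R, S, p):
    """Join R(x, y) with S(y, z) on y using p simulated servers.

    Returns the join output and the per-server load, i.e. the number
    of tuples each server receives in the single communication round.
    Local computation is deliberately not counted as cost.
    """
    inbox_r = defaultdict(list)
    inbox_s = defaultdict(list)

    # Communication round: route every tuple by the hash of its y value.
    for x, y in R:
        inbox_r[hash(y) % p].append((x, y))
    for y, z in S:
        inbox_s[hash(y) % p].append((y, z))

    loads = [len(inbox_r[s]) + len(inbox_s[s]) for s in range(p)]

    # Local computation: each server joins only what it received.
    output = set()
    for s in range(p):
        z_by_y = defaultdict(list)
        for y, z in inbox_s[s]:
            z_by_y[y].append(z)
        for x, y in inbox_r[s]:
            for z in z_by_y.get(y, []):
                output.add((x, y, z))
    return output, loads

# Evenly spread join keys: every server receives the same number of tuples.
R_even = [(i, i % 4) for i in range(8)]
S = [(j, 10 * j) for j in range(4)]
out_even, loads_even = one_round_hash_join(R_even, S, p=4)

# Skewed join keys: one server receives almost all of the data.
R_skew = [(i, 0) for i in range(8)]
out_skew, loads_skew = one_round_hash_join(R_skew, S, p=4)
```

Both inputs produce eight output tuples in one round, but the maximum load differs sharply between the even and skewed cases, which is the kind of imbalance a communication-centered analysis is meant to expose.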
In addition to publishing nearly 300 conference or journal papers, Suciu has contributed to multiple highly cited books in data management. One of those, published in 2011, was an authoritative work on probabilistic databases he co-authored with Olteanu; former student Christopher Ré (Ph.D., ‘09), now a faculty member at Stanford University; and Christoph Koch, a faculty member at the École Polytechnique Fédérale de Lausanne. In the book, Suciu and his collaborators put forward novel representation formalisms and query processing techniques for modeling and processing probabilistic data used in information extraction, scientific data management, data cleaning, and other use cases that involve large volumes of uncertain data.
Three of Suciu’s aforementioned student collaborators — Miklau, Ré and Koutris — earned the ACM SIGMOD Doctoral Dissertation Award working with him. According to his UW Database Group colleague Magdalena Balazinska, Suciu’s impact in advancing new paradigms in data management is rooted not only in his vision and leadership in bridging theory and practice, but also in his devotion to developing the next generation of researchers to carry that work forward.
“As a database researcher myself, I have appreciated Dan’s approach to breaking new intellectual ground while offering a path to practical implementation,” said Balazinska, professor and director of the Allen School. “He has a knack for setting new directions for theoreticians to explore while also guiding engineers in the actual development of systems aligned with emerging trends. Not only is Dan a leader in the database research community, but he is a wonderful mentor to all who have the privilege of studying with him in addition to being a treasured colleague and friend.”
In his award talk at the SIGMOD/PODS conference in Philadelphia, Pennsylvania earlier this month, Suciu credited one of his own early mentors, Val Tannen, with “teaching me, and teaching me how to teach” and setting him on the path to his life’s work. It was Tannen who first introduced Suciu, back when he was a college student in Romania, to a new kind of mathematics that featured lattices, category theory and universal algebras — elements that Suciu found compelling and “strangely relevant” to his newfound passion for programming. Suciu later followed Tannen to the University of Pennsylvania, where he began his career in database research as a Ph.D. student working to redesign query languages grounded in mathematics — collecting the first of his many conference paper awards, at ICDT 1995, in the process.
Fast forward 27 years, and Suciu collected his latest Best Paper Award, this time from PODS, for his work on “Convergence of Datalog over (Pre-) Semirings.” In the winning paper, Suciu and his co-authors (Mahmoud Abo Khamis and Hung Q. Ngo at RelationalAI, Reinhard Pichler at TU Wien, and Allen School Ph.D. student Remy Wang) make progress on an open problem related to enabling recursive queries beyond Boolean space, as required by modern data processing and tensor computations that power applications ranging from program analysis and machine learning to graph algorithms and linear algebra. To enable this progress, the team introduced datalogo, a powerful language for expressing recursive computations over general semirings.
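As a rough illustration of what a recursive query "beyond Boolean space" can look like, the Python sketch below evaluates one recursive rule by naive iteration, with the semiring operations passed in as parameters. It is not the authors' language or algorithm; the function `naive_fixpoint` and its parameters are hypothetical names chosen for this example. With the Boolean semiring (or, and) the rule computes reachability, and with the tropical semiring (min, +) the same rule computes shortest-path distances.

```python
def naive_fixpoint(edges, source, plus, times, zero, one, max_iters=1000):
    """Iterate T(y) = [y is source] + sum over x of T(x) * E(x, y).

    'plus' and 'times' are the semiring operations, 'zero' and 'one'
    their identities. Iteration stops when no value changes; this is
    not guaranteed for every semiring, hence the iteration cap.
    """
    nodes = {source} | {n for edge in edges for n in edge}
    value = {n: zero for n in nodes}
    for _ in range(max_iters):
        new = {}
        for y in nodes:
            acc = one if y == source else zero
            for (x, target), weight in edges.items():
                if target == y:
                    acc = plus(acc, times(value[x], weight))
            new[y] = acc
        if new == value:
            return value
        value = new
    raise RuntimeError("no fixpoint reached within max_iters")

# Boolean semiring: which nodes are reachable from "a"?
bool_edges = {("a", "b"): True, ("b", "c"): True, ("d", "a"): True}
reachable = naive_fixpoint(
    bool_edges, "a",
    plus=lambda u, v: u or v, times=lambda u, v: u and v,
    zero=False, one=True,
)

# Tropical semiring (min, +): shortest distances from "a".
weighted_edges = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 5}
distance = naive_fixpoint(
    weighted_edges, "a",
    plus=min, times=lambda u, v: u + v,
    zero=float("inf"), one=0,
)
```

The iteration cap is there because this kind of loop need not terminate for an arbitrary choice of operations; when and how fast such iterations converge is the question the paper's title refers to.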
This latest accolade brings Suciu’s career tally at influential database, data management, and related conferences to six Best Paper or Distinguished Paper Awards, three Best Demo Awards, and five Test of Time or Influential Paper Awards.
Congratulations, Dan!