Skip to main content

Allen School professor Dan Suciu receives Best Paper Award for a novel solution to the cardinality estimation problem

Photo by GuerrillaBuzz on Unsplash

The cardinality estimation problem, or the challenge of accurately predicting the size of the output to a query without actually evaluating the query, is one of the oldest and most important problems in databases and data management. Cardinality estimation helps guide decisions on every aspect of query execution, from how much memory should be allocated for storing the query result to the number of servers needed to successfully process an expensive query. However, cardinality estimation is notoriously difficult; current methods can often have large errors, leading to poor decisions downstream. 

Headshot of Dan Suciu
Dan Suciu

A team of researchers led by Allen School professor Dan Suciu of the UW Database Group introduced a new pessimistic cardinality estimator called LpBound which provides a guaranteed upper bound on the query output size. This method offers a strong, theoretical guarantee that for any database that meets the given statistics, the query output size will always be below the bound set by LpBound. They presented their research titled “LpBound: Pessimistic Cardinality Estimation using Lp-Norms of Degree Sequences” at the 2025 ACM SIGMOD/PODS International Conference on Management of Data last month and received a Best Paper Award for their work.

“Cardinality estimation is difficult, because it needs to rely on a very small amount of information (statistics on the input data), and needs to produce an accurate estimate,” said senior author Suciu, who also holds the Microsoft Endowed Professorship in the Allen School. “The novel solution described in the paper estimates the cardinality of the output by using simple statistics on the input data, and applying Shannon inequalities from information theory. The method outperforms not only traditional cardinality estimators, but also novel estimators based on machine learning.”

The LpBound cardinality estimator provides several advantages over other learned estimators currently available, including FactorJoin, BayesCard and DeepDB. In addition to the guaranteed upper bounds, it has a low estimation time and error as well as space requirements, making it useful for practical applications. LpBound also works for both cyclic and acyclic queries — meaning it can estimate the cardinality in traditional SQL workloads, which are often acyclic, and in graph pattern matching or SparQL queries, which are more likely to be cyclic. When integrated into the query optimization framework PostgreSQL, the researchers found that LpBound’s estimates led to query plans as good as those made from true cardinalities, making it more applicable for real-world database systems.

Additional authors include Haozhe Zhang, Christoph Mayer and Dan Olteanu from the University of Zurich, along with Mahmoud Abo Khamis from RelationalAI.

Read the full paper on LpBound.