Allen School student Victor Zhong named an Apple Scholar for his efforts to teach machines to generalize by reading natural language specifications

Photo of Victor Zhong

Victor Zhong, a Ph.D. student working with Allen School professor Luke Zettlemoyer in the Natural Language Processing (NLP) group, has been selected as a recipient of the Apple Scholars in Artificial Intelligence and Machine Learning fellowship. Zhong, one of 15 student researchers recognized as Apple Scholars this year, was chosen for his innovative research, his contributions as an emerging leader in his area and his commitment to taking risks and pushing the envelope in machine learning and artificial intelligence.

“Victor is working to enable systems to solve new problems by simply reading natural language texts that describe what needs to be done. This work generalizes existing supervised machine learning approaches that require large quantities of labeled training data, and is applicable to a wide range of language understanding tasks such as semantic parsing, question answering, and dialogue systems,” said Zettlemoyer. “This work is particularly exciting because it opens up new ways to think about how to build more sophisticated language understanding systems with significantly less manual engineering effort.”

Zhong’s research focuses on teaching machines to read language specifications that characterize key aspects of a problem in order to efficiently learn solutions that generalize to new problems. Machine learning models typically train on large, fixed datasets and do not generalize to even closely related problems. For example, an object recognition system requires millions of training images and cannot classify new object types after deployment, while a dialogue system requires difficult-to-annotate conversations and cannot converse about new topics that emerge. Similarly, a robot trained to clean one house learns policies that will not work in new houses. 

“For many such problems, large-scale data gathering is difficult, but the collection of specifications — high-level natural language descriptions of what the problem is and how the algorithm should behave — is relatively easy,” Zhong said. “I hypothesize that by reading language specifications that characterize key aspects of the problem, we can efficiently learn solutions that generalize to new problems.” 

When his approach is applied to the previous examples, the object recognition system identifies new products by reading product descriptions; the dialogue system converses about new topics by reading documents and databases about those topics; and the robot identifies appropriate policies in a new house by reading about the objects present in that house.

Zhong’s work involves zero-shot learning, in which machine learning models trained on one distribution of data must perform well at inference time on a new, different distribution. His most recent work spans language-to-SQL semantic parsing, game playing through reinforcement learning and task-oriented dialogue.

Most semantic parsing systems are learned for a single target database. According to Zhong, a system trained on one database should also perform well when deployed to a new database, such as a production sales database, for which there is no training data. In a paper published at the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Zhong and his co-authors proposed grounded adaptation, a framework that adapts semantic parsers to new databases by synthesizing high-quality training data for those databases.

Given a new inference database, grounded adaptation first samples SQL queries according to the database content and generates corresponding natural language utterances. The parser then parses these generated utterances back into SQL, and only the utterance-SQL pairs whose parses are consistent with the originally sampled queries are kept and used to adapt the parser to the new database. Grounded adaptation obtained state-of-the-art results on zero-shot SQL parsing on new databases, outperforming alternative techniques such as data augmentation.
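The cycle-consistency idea at the heart of this loop can be illustrated with a small, self-contained sketch. The database, query sampler, utterance generator and parser below are toy template functions standing in for the learned components and database utilities described in the paper, not the framework's actual API.

```python
import random

# Toy database standing in for a real inference database.
DB = {"products": [{"name": "lamp", "price": 20}, {"name": "desk", "price": 90}]}

def sample_sql(db):
    # 1. Sample a SQL query grounded in the database's actual content.
    table = random.choice(list(db))
    row = random.choice(db[table])
    col = random.choice(list(row))
    return f"SELECT * FROM {table} WHERE {col} = '{row[col]}'"

def generate_utterance(sql):
    # 2. Generate a natural language utterance for the sampled query
    #    (stand-in for a learned SQL-to-text generator).
    table = sql.split("FROM ")[1].split(" WHERE")[0]
    condition = sql.split("WHERE ")[1]
    return f"list {table} where {condition}"

def parse(utterance):
    # 3. Parse the utterance back into SQL (stand-in for the semantic
    #    parser being adapted).
    table, condition = utterance.removeprefix("list ").split(" where ")
    return f"SELECT * FROM {table} WHERE {condition}"

def synthesize(n=100):
    # 4. Keep only the cycle-consistent utterance-SQL pairs; the parser
    #    would then be fine-tuned on this synthesized adaptation data.
    pairs = []
    for _ in range(n):
        sql = sample_sql(DB)
        utterance = generate_utterance(sql)
        if parse(utterance) == sql:  # proxy for matching execution results
            pairs.append((utterance, sql))
    return pairs

print(len(synthesize()), "synthesized utterance-SQL pairs")
```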

Zhong also worked on extending reinforcement learning (RL) policies trained in one environment to new environments that have different underlying dynamics and rules, without retraining. While RL is flexible and powerful, its lack of inductive bias necessitates a large amount of training experience. In a paper published at the 2020 International Conference on Learning Representations (ICLR), Zhong and his collaborators proposed a benchmark and a language-conditioned model that generalizes to new game dynamics by reading high-level descriptions of those dynamics. To succeed on this benchmark, an agent must perform multiple high-level reasoning steps to solve new games by cross-referencing the language instruction, the description of the environment dynamics and its environment observations. Zhong’s reading model achieved top zero-shot performance in new environments with previously unseen dynamics by efficiently reasoning over multiple texts and observations.
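As a rough illustration of what such a reading model can look like, the sketch below builds a small language-conditioned policy that grounds an observation in the instruction and then cross-references the result against the dynamics description using attention. The architecture, dimensions and names are illustrative assumptions, not the model from the paper.

```python
import torch
import torch.nn as nn

class ReadingPolicy(nn.Module):
    """Toy policy that cross-references observation, instruction and manual."""

    def __init__(self, vocab_size, dim=64, num_actions=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # One attention step per cross-reference: the observation first
        # attends to the instruction, then to the dynamics manual.
        self.attend_instruction = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.attend_manual = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.policy_head = nn.Linear(dim, num_actions)

    def forward(self, obs_tokens, instruction_tokens, manual_tokens):
        obs = self.embed(obs_tokens)                  # (batch, obs_len, dim)
        instruction = self.embed(instruction_tokens)  # (batch, instr_len, dim)
        manual = self.embed(manual_tokens)            # (batch, manual_len, dim)
        # Ground the observation in the instruction...
        x, _ = self.attend_instruction(obs, instruction, instruction)
        # ...then cross-reference the result against the dynamics manual.
        x, _ = self.attend_manual(x, manual, manual)
        # Pool over observation positions and score the available actions.
        return self.policy_head(x.mean(dim=1))        # (batch, num_actions)

policy = ReadingPolicy(vocab_size=1000)
logits = policy(torch.randint(0, 1000, (2, 8)),    # observation tokens
                torch.randint(0, 1000, (2, 12)),   # instruction tokens
                torch.randint(0, 1000, (2, 40)))   # dynamics manual tokens
```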

Traditional task-oriented dialogue systems focus on a single domain and cannot converse with users to solve tasks in new domains; a system trained to make restaurant reservations, for example, should also be able to help users book flights. In work published at the 2019 Annual Meeting of the Association for Computational Linguistics (ACL), Zhong proposed a reading task-oriented dialogue system that solves new tasks by reading documents describing how to solve them. The system extracts rule specifications from the documents and decides how to respond to the user by editing extracted rules that have not yet been resolved. Zhong’s system obtained state-of-the-art performance on zero-shot dialogue tasks that require conversing with users to answer questions about topics not seen during training.
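A minimal sketch of this reading loop, assuming a hypothetical pattern-based rule extractor and a simple unresolved-rule policy in place of the learned extraction and editing components, might look like this:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    condition: str   # e.g. "you want a window seat"
    action: str      # e.g. "check window availability"
    resolved: bool = False

def extract_rules(document: str) -> list[Rule]:
    # Stand-in for the learned extractor: match "If X, then Y." sentences.
    rules = []
    for line in document.lower().splitlines():
        if "if " in line and ", then " in line:
            condition, action = line.split("if ", 1)[1].split(", then ", 1)
            rules.append(Rule(condition.strip(), action.strip().rstrip(".")))
    return rules

def next_response(rules: list[Rule], history: list[str]) -> str:
    # Respond based on the first rule that has not yet been resolved.
    context = " ".join(history).lower()
    for rule in rules:
        if rule.resolved:
            continue
        if rule.condition in context:     # the user satisfied the condition
            rule.resolved = True
            return f"Since {rule.condition}, I will {rule.action}."
        return f"Can you confirm whether {rule.condition}?"
    return "Everything is resolved; proceeding with your request."

rules = extract_rules("If you want a window seat, then check window availability.")
print(next_response(rules, ["I'd like to book a flight."]))
# -> Can you confirm whether you want a window seat?
```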

Zhong is now working to build intelligent, embodied assistants for every home. 

“Assistants such as Siri that interpret language will need to plan not only verbal responses but physical responses. Because it is infeasible to train these assistants in every home, we must learn policies that generalize to new environments not seen during training,” Zhong said. “I hypothesize that language, given its compositional nature and the wealth of information already encoded in text, is key to enabling this kind of generalization.”

In addition to his work on learning to generalize by reading, Zhong’s research spans a variety of topics in NLP such as dialogue, question answering, semantic parsing, and knowledge base population. He has published 13 papers at top NLP and machine learning conferences. 

This is the second year in a row that Apple has supported Allen School graduate students, after Jeong Joon Park was recognized with a 2020 Apple Scholars fellowship. 

Congratulations, Victor!