Hao Zhu

I am a second-year doctoral student at the Language Technologies Institute, Carnegie Mellon University, where I am fortunate to be advised by Graham Neubig and Yonatan Bisk. Before joining CMU, I enjoyed four fantastic undergraduate years at Tsinghua University, advised by Zhiyuan Liu. I have also worked closely with Jason Eisner and Matt Gormley. You can find my CV here (updated March 2019).

My ultimate goal is to understand human intelligence. Believing in Feynman's famous quote, "What I cannot create, I do not understand," I work on building machine learning models that acquire human-like intelligence. More specifically, I am currently interested in teaching machines to speak human language and to perform human-level logical reasoning. I also have broad interests in other areas of cognitive science.

Email: {last.name}{first.name} [at] cmu.edu

Twitter: @_Hao_Zhu

GitHub: ProKil

[Full Publication List & Preprints]   [Google Scholar]


[May. 2019] Two papers accepted at ACL 2019. See you in Firenze!

[Feb. 2019] One paper accepted at NAACL 2019 in the machine learning area.

[Oct. 2018] We proudly released our FewRel dataset. Consider submitting your models!

[Aug. 2018] Four papers accepted at EMNLP 2018 in the IE and semantics areas. You can now read the abstracts of these papers. Stay tuned for our preprints, code, and datasets! See you in Brussels!

[Jun. 2018] Released the camera-ready version of our ACL paper and its code.

[Apr. 2018] Our paper "Incorporating Chinese Characters of Words for Lexical Sememe Prediction" has been accepted to ACL 2018! My co-authors, Huiming, Ruobing, and Prof. Zhiyuan Liu, are really fantastic! See you in Melbourne!

[Apr. 2018] I have become a fellow of the Tsinghua University Initiative Scientific Research Program, with funding of 32,000 USD!

[Apr. 2018] This summer and fall I will be fortunate to do research with Prof. Matt Gormley at CMU and Prof. Jason Eisner at JHU. See you there!


Reviewer: EMNLP 2018/2019/2020, AACL 2020, NAACL 2019, ACL 2019/2020, ICLR 2021

Volunteer/Review Assistant: IJCAI 2017/2018

Research Highlights

Unified Grammar Induction

We demonstrate that context-free grammar (CFG) based methods for grammar induction benefit from modeling lexical dependencies. This contrasts with the most popular current methods for grammar induction, which focus on discovering either constituents or dependencies. Previous approaches to marrying these two disparate syntactic formalisms (e.g., lexicalized PCFGs) have been plagued by sparsity, making them unsuitable for unsupervised grammar induction. In this work, we present novel neural models of lexicalized PCFGs that allow us to overcome sparsity problems and effectively induce both constituents and dependencies within a single model. Experiments demonstrate that this unified framework achieves stronger results on both representations than modeling either formalism alone.

The Return of Lexical Dependencies: Neural Lexicalized PCFGs. Transactions of the Association for Computational Linguistics (2020).
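As a generic sketch of the formalism discussed above (not the paper's exact parameterization), a lexicalized PCFG annotates each nonterminal with a head word, so a binary rule and its probability can be written as:

```latex
% A lexicalized binary rule: parent nonterminal A, headed by word h,
% rewrites to B (which inherits the head h) and C, headed by a new word h'.
A[h] \rightarrow B[h]\; C[h']

% The rule probability can be factored into a syntactic part and a
% lexical (dependency) part:
p\bigl(B, C, h' \mid A, h\bigr)
  = p\bigl(B, C \mid A, h\bigr)\,
    p\bigl(h' \mid A, B, C, h\bigr)
```

Counting full (A, B, C, h, h') tuples directly is what causes the sparsity problem; a neural parameterization instead shares statistics across words (e.g., through word embeddings), which is what makes unsupervised induction feasible in this setting.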