I'm currently a Senior Engineering Manager at Twitter Cortex. Previously, I led the Machine Learning and Data Platform efforts at Fast. Before that, I was a Research Scientist Manager at Facebook (now Meta), supporting a team focused on building a multimodal AI assistant. I received my Ph.D. in September 2017 from the Department of Computer Science, University of California, Santa Barbara, under the supervision of Prof. Xifeng Yan. During my Ph.D. studies, I worked on various research projects on sequence mining, information extraction, active learning, and deep learning. In general, my research focused on developing better knowledge extraction tools for sequence data (e.g., biological sequences, event streams, text corpora). In addition, I gained some experience with bioinformatics research such as DNA sequence assembly, SNP calling, and gene expression analysis.
Before I started my Ph.D. studies at UCSB, I obtained B.S. and M.S. degrees in Computer Science from Northeastern University, China. Find more details about me in my CV.
- 08/2021: Paper "NUANCED: Natural Utterance Annotation for Nuanced Conversation with Estimated Distributions" accepted to EMNLP 2021.
- 08/2021: Open-sourced NUANCED: a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns.
- 03/2021: Paper "Adding Chit-Chats to Enhance Task-Oriented Dialogues" accepted to NAACL 2021.
- 09/2020: Paper "User Memory Reasoning for Conversational Recommendation" accepted to COLING 2020.
- 10/2019: Open-sourced ReAgent: a modular, end-to-end platform for building reasoning systems. It closes the loop of turning actions into feedback, and feedback into training data for RL and online learning. ReAgent is used at Facebook to drive tens of billions of decisions per day.
- 12/2018: Open-sourced PyText: a deep-learning-based NLP modeling framework, now used as the go-to platform at Facebook and in the open-source community.
- 07/2017: We developed a new motif discovery tool, DeepMotif, that achieves even better performance than ASC+MEME (which is already 10,000 times faster than MEME). DeepMotif is 10-100 times faster than ASC+MEME and doesn't rely on MEME. Learn more about both tools here. For more information or licensing, please contact me.
- 06/2017: Our motif discovery solution, ASC+MEME (paper), is licensed to SerImmune Inc. (funded by NIH, Illumina, Merck, etc.) to find motifs in massive protein sequence data generated by modern sequencing techniques.
NUANCED: Natural Utterance Annotation for Nuanced Conversation with Estimated Distributions
Zhiyu Chen, Honglei Liu, Hu Xu, Seungwhan Moon, Hao Zhou, Bing Liu. Proc. of the Conf. on Empirical Methods in Natural Language Processing (EMNLP 2021). [paper][data]
Adding Chit-Chats to Enhance Task-Oriented Dialogues
Kai Sun, Seungwhan Moon, Paul Crook, Stephen Roller, Becka Silvert, Bing Liu, Zhiguang Wang, Honglei Liu, Eunjoon Cho, Claire Cardie. Proc. of the Annual Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021). [paper]
User Memory Reasoning for Conversational Recommendation
Hu Xu, Seungwhan Moon, Honglei Liu, Bing Liu, Pararth Shah, Bing Liu, Philip S. Yu. Proc. of Int. Conf. on Computational Linguistics (COLING 2020). [paper]
Federated User Representation Learning
Duc Bui, Kshitiz Malik, Jack Goetz, Honglei Liu, Seungwhan Moon, Anuj Kumar, Kang G. Shin. arXiv preprint arXiv:1909.12535 (2019). [paper]
Active Federated Learning
Jack Goetz, Kshitiz Malik, Duc Bui, Seungwhan Moon, Honglei Liu, Anuj Kumar. Workshop on Federated Learning for Data Privacy and Confidentiality at Neural Information Processing Systems (NeurIPS 2019). [paper]
Global Textual Relation Embedding for Relational
Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su. Proc. of the Annual Meeting of the Association for Computational Linguistics (ACL 2019). (Short Paper) [paper]
Interpretability of Deep Reinforcement Learning Models in a
Honglei Liu, Pararth Shah, Wenxuan Li, Wenhai Yang, Anuj Kumar. In submission.
Explore-Exploit: A Framework for Interactive and Online
Honglei Liu, Anuj Kumar, Wenhai Yang, Benoit Dumoulin. Systems for Machine Learning Workshop at Neural Information Processing Systems (NeurIPS 2018). [paper]
Open-sourced as ReAgent, a modular, end-to-end platform for building reasoning systems
Text Mining and Knowledge Sharing for Scientific Publications
Keqian Li, Ping Zhang, Honglei Liu, Hanwen Zha, Xifeng Yan. Proc. of Int. Conf. on Knowledge Discovery and Data Mining (KDD 2018). (demo) [paper][video]
In Vitro Validation of in Silico Identified Inhibitory
Honglei Liu, Daniel Bridges, Connor Randall, Sara A. Solla, Bian Wu, Paul Hansma, Xifeng Yan, Kenneth S. Kosik, Kristofer Bouchard. Journal of Neuroscience Methods 321 (2019): 39-48. [paper]
Global Relation Embedding for Relation Extraction
Yu Su*, Honglei Liu*, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan. Proc. of the Annual Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018). (*: Equal Contribution) [paper][source code]
Active Learning of Functional Networks from Spike Trains
Honglei Liu, Bian Wu. SIAM Int. Conf. on Data Mining (SDM 2017). [paper][supplementary materials][source code]
Fast Motif Discovery in Short Sequences
Honglei Liu, Fangqiu Han, Hongjun Zhou, Xifeng Yan, Kenneth S. Kosik. Proc. of Int. Conf. on Data Engineering (ICDE 2016). [paper] [slides] [poster] [software]
Software licensed to SerImmune Inc. to produce real-world value
ALAE: Accelerating Local Alignment with Affine Gap Exactly in
Xiaochun Yang, Honglei Liu, Bin Wang. Proc. of Int. Conf. on Very Large Data Bases (VLDB 2012). [paper][source code]
Approximate Substring Query Algorithms Supporting Local Optimal
Honglei Liu, Xiaochun Yang, Bin Wang, Rong Jin. Journal of Frontiers of Computer Science and Technology, 2011. [source code]
Generating Proactive Reminders for Assistant Systems
US Patent Pending US17035253
Personalized Conversational Recommendations by Assistant
US Patent Pending US16921665
Personalized Federated Learning for Assistant Systems
US Patent Pending US16815990
Active Federated Learning for Assistant Systems
US Patent Pending US16815960
Methods, Mediums, and Systems for Training a Model
US Patent Pending US16731321
Methods, Mediums, and Systems for Representing a Model in a Memory of
US Patent Pending US16731345
Methods, Mediums, and Systems for Providing a Model for an End-User
US Patent Pending US16731304
Suppressing Reminders for Assistant Systems
US Patent Pending US16733044
Interpretability of Deep Reinforcement Learning Models in Assistant
US Patent Pending US16389769
Building Customized User Profiles Based on Conversational Data
US/EU Patent Pending US2019327330A1/EP3557500A1
Smart Assistant Systems
US Patent Pending US62923342
Systems and Methods for Off-Target Sequence Detection
US Patent Pending US20180075186A1
Biological Sequence Local Comparison Method Capable of Obtaining
CN Patent Granted CN102750461B
An Electric Automobile Battery Replacing Device
CN Patent Granted CN202089042U
- 07/2022 - Present: Senior Engineering Manager - Machine Learning, Twitter
- 04/2022 - 07/2022: Senior Engineering Manager - Machine Learning, Affirm
- 09/2021 - 04/2022: Head of Machine Learning and Data Platform, Fast
- 04/2020 - 09/2021: Research Scientist Manager, Facebook
- 02/2020 - 04/2020: Staff Research Scientist, Facebook
- 01/2019 - 02/2020: Senior Research Scientist, Facebook
- 07/2017 - 12/2018: Research Scientist, Facebook
- 06/2016 - 09/2016: Intern, Facebook
Topic: Indexing and Mining Billions of Time Series
- 07/2015 - 09/2015: Bioinformatics Intern, Illumina
Topic: Fast Specificity Checking for Multiplex PCR Primer Design