Research
My current research focuses on enhancing large language models (LLMs) for social good, with an emphasis on health-related domains. I am actively working on post-training techniques and synthetic data generation to equip LLMs with complex and reliable abilities. I am also interested in language agents and vision LLMs.
I am looking for a Research Intern position next summer. Feel free to reach out!
|
CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy
Mian Zhang*,
Xianjun Yang*,
Xinlu Zhang,
Travis Labrum,
Jamie C. Chiu,
Shaun M. Eack,
Fei Fang,
William Yang Wang,
Zhiyu Zoey Chen
arXiv, 2024
arXiv
/
dataset
We present the first systematic benchmark to evaluate LLMs' efficacy for CBT. The key finding is that while LLMs excel at recalling CBT knowledge, they struggle with complex tasks requiring deep cognitive analysis and patient-specific responses.
|
IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction
Kaiyu He,
Mian Zhang,
Shuo Yan,
Peilin Wu,
Zhiyu Zoey Chen
arXiv, 2024
arXiv
/
code and data
We propose IDEA, a holistic LLM agent framework integrating Induction, DEduction, and Abduction, to generate hypotheses, devise plans, and revise hypotheses iteratively to learn new rules in novel environments.
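The induce–deduce–abduce loop can be pictured with a toy sketch: the agent proposes a rule from its observations (induction), uses it to predict environment feedback (deduction), and discards and revises the rule when a prediction fails (abduction). The hypothesis space and environment below are illustrative assumptions, not the paper's actual setup.

```python
# Toy induction-deduction-abduction loop over a tiny rule space.

def induce(observations):
    """Induction: return the first candidate rule consistent with
    all (input, label) observations seen so far."""
    candidates = [
        ("even", lambda n: n % 2 == 0),
        ("positive", lambda n: n > 0),
        ("multiple_of_3", lambda n: n % 3 == 0),
    ]
    for name, rule in candidates:
        if all(rule(x) == label for x, label in observations):
            return name, rule
    return None

def run_agent(env_rule, queries):
    observations, hypothesis = [], None
    for x in queries:
        label = env_rule(x)                   # environment feedback
        if hypothesis is not None:
            _, rule = hypothesis
            if rule(x) != label:              # deduction failed:
                hypothesis = None             # abduction -> revise
        observations.append((x, label))
        if hypothesis is None:
            hypothesis = induce(observations)  # induction step
    return hypothesis[0] if hypothesis else None

print(run_agent(lambda n: n % 2 == 0, [3, 4, 6, 7, 8]))  # prints "even"
```

The agent keeps a hypothesis only as long as it survives deduction against new feedback; a single contradicting observation triggers re-induction over the full history.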
|
Large Language Models for Disease Diagnosis: A Scoping Review
Shuang Zhou*,
Zidu Xu*,
Mian Zhang*,
Chunpu Xu*,
Yawen Guo,
Zaifu Zhan,
Sirui Ding,
Jiashuo Wang,
Kaishuai Xu,
Yi Fang,
Liqiao Xia,
Jeremy Yeung,
Daochen Zha,
Mingquan Lin,
Rui Zhang
arXiv, 2024
arXiv
We perform a comprehensive review of LLM-based methods for disease diagnosis, examining the existing literature across dimensions including disease types and associated clinical specialties, clinical data, LLM techniques, and evaluation methods. We also offer recommendations for applying and evaluating LLMs in diagnostic tasks, assess the limitations of current research, and discuss future directions.
|
Inconsistent dialogue responses and how to recover from them
Mian Zhang,
Lifeng Jin,
Linfeng Song,
Haitao Mi,
Dong Yu
EACL, 2024
paper
/
code and dataset
We propose the first comprehensive dataset for studying inconsistencies in dialogue, covering the full life span of an inconsistency: introduction, understanding, and resolution. Our experiments show that the dataset significantly advances the identification and resolution of conversational inconsistencies, and that popular large language models such as ChatGPT, while good at resolving inconsistencies, still struggle to detect them.
|
SafeConv: Explaining and Correcting Conversational Unsafe Behavior
Mian Zhang,
Lifeng Jin,
Linfeng Song,
Haitao Mi,
Wenliang Chen,
Dong Yu
ACL, 2023   (Oral Presentation)
paper
/
dataset
We construct a new dataset called SafeConv for research on conversational safety. SafeConv can power multiple models to detect, explain, and correct conversational unsafe behavior.
|
Friend-training: Learning from Different but Related Tasks
Mian Zhang,
Lifeng Jin,
Linfeng Song,
Haitao Mi,
Dong Yu
EACL, 2023
paper
We propose Friend-training, a cross-task self-training framework in which models trained on different but related tasks help each other select better pseudo-labels through an iterative process of training, pseudo-labeling, and retraining.
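The selection step can be sketched as follows: a pseudo-label from one model is kept only when a "friend" model on a related task, mapped into the first model's label space, confidently agrees. The models, label mapping, and agreement threshold here are illustrative assumptions, not the paper's architecture.

```python
# Toy cross-task pseudo-label selection in the spirit of friend-training.

def select_pseudo_labels(unlabeled, model_a, model_b, map_b_to_a, threshold=0.8):
    """Keep model A's pseudo-label for x only when model B's prediction
    on the related task, mapped into A's label space, agrees and both
    models are sufficiently confident."""
    selected = []
    for x in unlabeled:
        label_a, conf_a = model_a(x)
        label_b, conf_b = model_b(x)
        if map_b_to_a(label_b) == label_a and min(conf_a, conf_b) >= threshold:
            selected.append((x, label_a))
    return selected

# Hypothetical demo: task A is sentiment, task B is emotion.
model_a = lambda x: ("pos", 0.9) if "good" in x else ("neg", 0.9)
model_b = lambda x: ("happy", 0.85) if "good" in x else ("sad", 0.5)
mapping = {"happy": "pos", "sad": "neg"}
print(select_pseudo_labels(["a good day", "a bad day"], model_a, model_b, mapping.get))
# prints [('a good day', 'pos')]
```

Here the second example is filtered out because the friend model's confidence is below the threshold, even though the mapped labels agree.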
|
A Pairing Enhancement Approach for Aspect Sentiment Triplet Extraction
Fang Yang,
Mian Zhang,
Gongzhen Hu,
Xiabing Zhou
KSEM, 2023
paper
We propose a pairing enhancement approach for ASTE, which incorporates contrastive learning during the training stage to inject aspect-opinion pairing knowledge into the triplet extraction model.
|
Emotion Recognition in Conversation from Variable-Length Context
Mian Zhang,
Xiabing Zhou,
Wenliang Chen,
Min Zhang
ICASSP, 2023
paper
We propose a new method that leverages variable-length context windows when predicting the emotion of each utterance.
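One simple way to picture variable-length context is to score several window lengths per utterance and keep the most confident prediction. This is only an illustrative sketch; the `classify` scorer below is a stand-in assumption, not the paper's model.

```python
# Illustrative per-utterance context-window selection for emotion
# recognition in conversation.

def predict_with_variable_context(dialogue, index, classify, max_window=4):
    """classify(context, utterance) -> (emotion, confidence).
    Try window lengths 0..max_window and return the emotion
    predicted with the highest confidence."""
    best = None
    for w in range(0, min(max_window, index) + 1):
        context = dialogue[index - w:index]   # the w preceding utterances
        emotion, conf = classify(context, dialogue[index])
        if best is None or conf > best[1]:
            best = (emotion, conf)
    return best[0]
```

The design choice this illustrates is that the useful amount of context differs across utterances, so the window length is chosen per utterance rather than fixed globally.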
|