Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?

Zhanke Zhou, Rong Tao, Jianing Zhu, Yiwen Luo, Zengmao Wang, Bo Han*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

1 Citation (Scopus)

Abstract

This paper investigates an under-explored challenge in large language models (LLMs): chain-of-thought prompting with noisy rationales, i.e., irrelevant or inaccurate reasoning thoughts within the examples used for in-context learning. We construct the NoRa dataset, tailored to evaluate the robustness of reasoning in the presence of noisy rationales. Our findings on the NoRa dataset reveal a prevalent vulnerability to such noise among current LLMs, with existing robust methods like self-correction and self-consistency showing limited efficacy. Notably, compared to prompting with clean rationales, GPT-3.5 drops by 1.4%-19.8% in accuracy with irrelevant thoughts and more drastically by 2.2%-40.4% with inaccurate thoughts. Addressing this challenge necessitates external supervision that should be accessible in practice. Here, we propose the method of contrastive denoising with noisy chain-of-thought (CD-CoT). It enhances LLMs' denoising-reasoning capabilities by contrasting noisy rationales with only one clean rationale, which can be the minimal requirement for denoising-purpose prompting. This method follows a principle of exploration and exploitation: (1) rephrasing and selecting rationales in the input space to achieve explicit denoising, and (2) exploring diverse reasoning paths and voting on answers in the output space. Empirically, CD-CoT demonstrates an average improvement of 17.8% in accuracy over the base model and shows significantly stronger denoising capabilities than baseline methods. The source code is publicly available at: https://github.com/tmlr-group/NoisyRationales.
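The output-space step of the exploration-and-exploitation principle (sampling diverse reasoning paths, then voting on the final answers) can be sketched as below. This is a minimal illustration of majority voting over sampled answers, not the paper's implementation; the function names and the example answers are hypothetical.

```python
from collections import Counter

def vote_on_answers(candidate_answers):
    """Majority vote over final answers extracted from sampled reasoning paths."""
    if not candidate_answers:
        raise ValueError("need at least one candidate answer")
    return Counter(candidate_answers).most_common(1)[0][0]

# In practice, each candidate would come from decoding one chain of thought
# with temperature > 0 and parsing out its final answer; here we stub them in.
answers = ["12", "12", "15", "12", "9"]
print(vote_on_answers(answers))  # -> 12
```

Voting makes the final prediction robust to individual reasoning paths that were derailed by residual noise, at the cost of extra decoding passes.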

Original language: English
Title of host publication: 38th Conference on Neural Information Processing Systems, NeurIPS 2024
Editors: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang
Publisher: Neural Information Processing Systems Foundation
Number of pages: 65
ISBN (Electronic): 9798331314385
Publication status: Published - Dec 2024
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver Convention Center, Vancouver, Canada
Duration: 9 Dec 2024 - 15 Dec 2024
https://neurips.cc/Conferences/2024
https://openreview.net/group?id=NeurIPS.cc/2024
https://proceedings.neurips.cc/paper_files/paper/2024

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: Neural Information Processing Systems Foundation
Volume: 37
ISSN (Print): 1049-5258
Name: NeurIPS Proceedings

Conference

Conference: 38th Conference on Neural Information Processing Systems, NeurIPS 2024
Country/Territory: Canada
City: Vancouver
Period: 9/12/24 - 15/12/24
