Reliable and Efficient In-Memory Fault Tolerance of Large Language Model Pretraining

  • Yuxin Wang
  • Shaohuai Shi
  • Xin He
  • Zhenheng Tang
  • Xinglin Pan
  • Yang Zheng
  • Xiaoyu Wu
  • Amelie Chi Zhou
  • Bingsheng He
  • Xiaowen Chu*

*Corresponding author for this work

Research output: Working paper / Preprint
