Issues · CarperAI/trlx · GitHub

Division by Zero in GPTRewardModel with Empty Batches

#609 · ohsono opened on Nov 28, 2025

ValueError: Invalid pattern: '**' can only be an entire path component

#607 · amrzv opened on Jun 7, 2025

ImportError: cannot import name 'prepare_model_for_int8_training' from 'peft'

#606 · amrzv opened on Apr 23, 2025

Extracting total-loss, PPO-loss, rewards per step, returns per step in RLHF-PPO implementation

#605 · rithammajumdarr opened on Feb 17, 2025

Question about model load

#604 · gray311 opened on Feb 11, 2025

Does the framework support PPO training for Qwen2?

feature request

#603 · oldwangggggg opened on Dec 21, 2024

reward_fn in accelerate_ppo_trainer.py

#602 · Jerrrrykun opened on Dec 11, 2024

OOM error with PEFT LoRA on Llama2-7B

#601 · arpaiva opened on Sep 20, 2024

Load the checkpoint fails

#600 · AfraAmini opened on Sep 6, 2024

cannot import name 'flatten_dataclass' from 'trlx.data.ilql_types'

#599 · AfraAmini opened on Jul 31, 2024

maybe bug in prepare & load's order

#598 · daiwk opened on Jul 27, 2024

Error when running Ray Tune to launch hyperparameter sweep

#597 · Jing-L97 opened on Jul 26, 2024