Hi everyone,
I’ve been diving into the world of generative AI within the Aviary framework, and I’ve hit a bit of a roadblock.
I also checked this guide: GPT-J-6B Fine-Tuning with Ray Train and DeepSpeed (Ray 2.22.0).
During my generative AI training course, I keep running into a persistent and puzzling error when training a generative AI model. Whenever training reaches a certain stage, it stops abruptly with the error message: “Error: Unable to converge due to gradient vanishing problem.”
I’ve tried adjusting various hyperparameters, tinkering with the architecture, and even modifying the dataset preprocessing steps, but nothing seems to resolve this issue. I’ve also checked for any anomalies in the dataset, but everything appears to be in order.
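To help narrow it down, one check that could be run is logging per-layer gradient norms and flagging the layers where they collapse toward zero. Below is a minimal, framework-agnostic sketch of that idea. It is not Aviary or Ray API: the function names `grad_norms` and `find_vanishing_layers`, the `1e-7` threshold, and the example numbers are all hypothetical, and the gradient values would have to be extracted from whatever framework the model actually runs on.

```python
import math


def grad_norms(grads_by_layer):
    """Return the L2 norm of each layer's gradient values.

    grads_by_layer: dict mapping a layer name to a flat list of floats
    (hypothetical input format, assumed for this sketch).
    """
    return {
        name: math.sqrt(sum(g * g for g in grads))
        for name, grads in grads_by_layer.items()
    }


def find_vanishing_layers(grads_by_layer, threshold=1e-7):
    """Return the names of layers whose gradient L2 norm is below threshold.

    The default threshold is an arbitrary assumption, not a recommended value.
    """
    norms = grad_norms(grads_by_layer)
    return [name for name, norm in norms.items() if norm < threshold]


# Made-up gradient values, for illustration only.
example = {
    "layer_0": [1e-9, -2e-9, 5e-10],
    "layer_1": [1e-4, -3e-4, 2e-4],
    "layer_2": [0.02, -0.01, 0.03],
}

print(find_vanishing_layers(example))  # prints ['layer_0']
```

If the earliest layers are the ones that show up in the output as training progresses, that would point to the gradient shrinking during backpropagation rather than to a problem in the data.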
Has anyone else encountered a similar error while working with generative AI models in Aviary?
Thanks in advance.