ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on.
```
ValueError                                Traceback (most recent call last)
Cell In[17], line 25
     22 print_trainable_parameters(model)
     24 # Apply the accelerator. You can comment this out to remove the accelerator.
---> 25 model = accelerator.prepare_model(model)

File /vc_data/shankum/miniconda3/envs/llm2/lib/python3.11/site-packages/accelerate/accelerator.py:1392, in Accelerator.prepare_model(self, model, device_placement, evaluation_mode)
   1389 if torch.device(current_device_index) != self.device:
   1390     # if on the first device (GPU 0) we don't care
   1391     if (self.device.index is not None) or (current_device_index != 0):
-> 1392         raise ValueError(
   1393             "You can't train a model that has been loaded in 8-bit precision on a different device than the one "
   1394             "you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device() or device_map={'':torch.xpu.current_device()}"
   1395         )
   1397 if "cpu" in model_devices or "disk" in model_devices:
   1398     raise ValueError(
   1399         "You can't train a model that has been loaded in 8-bit precision with CPU or disk offload."
   1400     )

ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device() or device_map={'':torch.xpu.current_device()}
```
Are the token tensors produced by your tokenizer on the same device as your model?
If the model is on a GPU, make sure you move the tokenizer outputs to that GPU as well.
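For example, a minimal sketch of what I mean (the checkpoint name and input text are just placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; use your own checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda")

# tokenize on CPU, then move the resulting tensors to the model's device
inputs = tokenizer("Hello world", return_tensors="pt").to(model.device)
outputs = model(**inputs)
```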
This is an accelerate issue on a multi-GPU setup. I have used the same setup with other SLMs like Zephyr and Llama 2, and they seem to work.
Playing around with the accelerate settings fixed it for me.
Same issue here.
> Playing around with the accelerate settings fixed it for me.
Could you elaborate more, please? Thanks!
Hi everyone!
To fix this issue, you need to force-load the entire model onto a single GPU and replicate it across all GPUs, rather than sharding it across devices. To achieve this, please follow the solution proposed here: https://github.com/huggingface/accelerate/issues/1840#issuecomment-1683105994
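A sketch of that approach, assuming you load the model in 8-bit with transformers + bitsandbytes (the model name is a placeholder): map the empty-string key, i.e. the whole model, to the current process's device instead of using `device_map="auto"`.

```python
from accelerate import Accelerator
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

accelerator = Accelerator()

# Place the whole model on this process's GPU; each process gets a full replica.
device_map = {"": accelerator.process_index}

model = AutoModelForCausalLM.from_pretrained(
    "your-model-name",  # placeholder
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map=device_map,
)
```

With this, `accelerator.prepare_model(model)` no longer sees the model split across devices, so the check in `prepare_model` passes.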