Looks like this model is BF16. Altogether it needs ~36 GB of VRAM to run. Maybe with 8-bit weights and a slightly smaller context it could fit in 24 GB of VRAM?
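Rough back-of-envelope for whether that works, as a sketch only. I don't know how the ~36 GB splits between weights and everything else (KV cache, activations, buffers), so the weight sizes below are made-up placeholders, and `estimate_8bit_total_gb` is just a helper name I picked. The only hard fact used is that BF16 is 2 bytes per parameter and 8-bit is 1 byte, so the weight portion roughly halves.

```python
def estimate_8bit_total_gb(total_bf16_gb, weights_bf16_gb, overhead_scale=1.0):
    """Estimate total VRAM after going from BF16 (2 bytes/param) to 8-bit (1 byte/param).

    overhead_scale rescales the non-weight part (KV cache, activations, buffers),
    e.g. 0.5 as a crude stand-in for cutting the context length in half.
    """
    overhead_gb = total_bf16_gb - weights_bf16_gb
    return weights_bf16_gb / 2 + overhead_gb * overhead_scale


TOTAL_BF16_GB = 36   # figure from the comment above
TARGET_GB = 24

# Hypothetical weight sizes; the real split is not known here.
for weights_gb in (20, 24, 30):
    total = estimate_8bit_total_gb(TOTAL_BF16_GB, weights_gb)
    verdict = "fits" if total <= TARGET_GB else "does not fit"
    print(f"weights={weights_gb} GB in BF16 -> ~{total:.0f} GB in 8-bit ({verdict})")
```

Under these assumptions, 8-bit weights alone only get under 24 GB if the weights are at least about 24 GB of the original 36 GB. If they are a smaller share, the context would have to shrink too (lower `overhead_scale`). This ignores any extra memory the quantization library itself needs.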