Hacker News

At 4-bit quantization the weights only take half the RAM. You need a good chunk for context as well, but in my limited testing Qwen3-30B ran well on a single RTX 3090 (24 GB VRAM).
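The back-of-the-envelope math can be sketched as below. The ~4.5 bits/weight figure is an assumption (4-bit quants carry some overhead for scales/zero-points); actual usage depends on the quantization scheme and runtime.

```python
# Rough VRAM estimate for quantized model weights.
# All figures are approximations, not measured values.

def weight_vram_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB: params * bits / 8 bytes per param."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# Assumed effective bit widths (including quantization metadata overhead).
q4 = weight_vram_gib(30, 4.5)  # ~15.7 GiB
q8 = weight_vram_gib(30, 8.0)  # ~27.9 GiB

print(f"4-bit: {q4:.1f} GiB, 8-bit: {q8:.1f} GiB")
```

At 4 bits, a 30B model's weights fit in ~16 GiB, leaving ~8 GiB of a 3090's 24 GiB for KV cache and activations; at 8 bits the weights alone already exceed the card.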

