That's what they say, but I just spent 10 minutes searching the git repo, reading the relevant .py files, and looking at their homepage, and the vicuna-7b-delta and vicuna-13b-delta-v0 files are nowhere to be found. Am I blind, or did they announce a release without actually releasing?
If you run this command from their instructions, the delta will be automatically downloaded and applied to the base model.
https://github.com/lm-sys/FastChat#vicuna-13b:
`python3 -m fastchat.model.apply_delta --base /path/to/llama-13b --target /output/path/to/vicuna-13b --delta lmsys/vicuna-13b-delta-v0`
This can then be quantized to the llama.cpp/gpt4all format, right? Specifically, this only tweaks the existing weights slightly, without changing the structure?
You can use this command to apply the delta weights. (https://github.com/lm-sys/FastChat#vicuna-13b)
The delta weights are hosted on Hugging Face and will be downloaded automatically.
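To the structural question above: yes, a delta release only shifts the existing weights; the published delta is the elementwise difference between the fine-tuned and base checkpoints, so applying it is just tensor addition over matching names and shapes. Here is a minimal toy sketch of that idea (the tensor names and values are illustrative, not the real checkpoint layout):

```python
import numpy as np

# Toy stand-ins for two checkpoints: same tensor names, same shapes.
# The released delta is (finetuned - base), so recovering the
# fine-tuned model is elementwise addition, tensor by tensor.
base = {"layer.weight": np.array([[1.0, 2.0], [3.0, 4.0]])}
delta = {"layer.weight": np.array([[0.1, -0.1], [0.0, 0.2]])}

# Apply the delta: structure (names, shapes) is unchanged, only values move.
target = {name: base[name] + delta[name] for name in base}

assert target["layer.weight"].shape == base["layer.weight"].shape
print(target["layer.weight"])
```

Because the architecture and tensor shapes are untouched, the resulting model can be fed to the usual llama.cpp conversion/quantization scripts like any other checkpoint of that architecture.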
> Unfortunately there's a mismatch between the model generated by the delta patcher and the tokenizer (32001 vs 32000 tokens). There's a tool to fix this at llama-tools (https://github.com/Ronsor/llama-tools). Add 1 token (e.g. a control token), and then run the conversion script.
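The mismatch above comes from the patched model having one more embedding row (32001) than the tokenizer defines tokens (32000), so the fix is to pad the tokenizer's vocabulary by one entry before conversion. A toy sketch of that padding step (the placeholder token name is made up; llama-tools handles the real tokenizer file format):

```python
# Toy vocab standing in for the converted LLaMA tokenizer: it knows
# 32000 tokens, but the delta-patched model has 32001 embedding rows.
tokenizer_vocab = [f"tok{i}" for i in range(32000)]
model_vocab_size = 32001

# Pad the tokenizer with placeholder control tokens until the sizes
# match; with them aligned, the conversion script no longer complains.
while len(tokenizer_vocab) < model_vocab_size:
    tokenizer_vocab.append(f"<extra_{len(tokenizer_vocab)}>")

assert len(tokenizer_vocab) == model_vocab_size
print(tokenizer_vocab[-1])
```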