
That's what they say, but I just spent 10 minutes searching the git repo, reading the relevant .py files, and looking at their homepage, and the vicuna-7b-delta and vicuna-13b-delta-v0 files are nowhere to be found. Am I blind, or did they announce a release without actually releasing?


If you run this command from their instructions (https://github.com/lm-sys/FastChat#vicuna-13b), the delta will be automatically downloaded and applied to the base model: `python3 -m fastchat.model.apply_delta --base /path/to/llama-13b --target /output/path/to/vicuna-13b --delta lmsys/vicuna-13b-delta-v0`
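For anyone wondering what "applying the delta" amounts to, here is a minimal sketch of the idea: each target weight is the base weight plus the published delta, tensor by tensor. This is not FastChat's actual code. The function name `apply_delta`, the plain-list "tensors", and the parameter name `layer0.weight` are all made up for illustration, and the sketch assumes base and delta have identical names and shapes.

```python
def apply_delta(base, delta):
    """Return target weights where target[name] = base[name] + delta[name].

    `base` and `delta` are hypothetical dicts mapping a parameter name to a
    flat list of floats, standing in for real model tensors.
    """
    if base.keys() != delta.keys():
        raise KeyError("base and delta must contain the same parameter names")
    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise addition: the delta only nudges the existing weights.
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Toy example with a single three-element "tensor".
base = {"layer0.weight": [0.5, -1.0, 2.0]}
delta = {"layer0.weight": [0.25, 0.5, -1.0]}
target = apply_delta(base, delta)
print(target)
```

The delta is useless without the original LLaMA weights, which is why the command needs a `--base` path.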


This can then be quantized to the llama.cpp/gpt4all format, right? Specifically, this only tweaks the existing weights slightly, without changing the structure?


I may have missed the detail, but it also expects the PyTorch conversion rather than the original LLaMA model.


Yes, you need to convert the original LLaMA model to the Hugging Face format, according to https://github.com/lm-sys/FastChat#vicuna-weights and https://huggingface.co/docs/transformers/main/model_doc/llam...


You can use this command to apply the delta weights. (https://github.com/lm-sys/FastChat#vicuna-13b) The delta weights are hosted on huggingface and will be automatically downloaded.


Thanks! https://huggingface.co/lmsys/vicuna-13b-delta-v0

Edit, later: I found some instructive pages on how to use the vicuna weights with llama.cpp (https://lmsysvicuna.miraheze.org/wiki/How_to_use_Vicuna#Use_...) and pre-made ggml format compatible 4-bit quantized vicuna weights, https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/tree/ma... (8GB ready to go, no 60+GB RAM steps needed)
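A rough back-of-the-envelope check on those sizes, as a sketch only. It assumes about 13 billion parameters and counts nothing but the raw weight bits; the helper `model_size_gb` is made up for illustration. Real quantized files are larger than the nominal 4-bit figure because they also store scaling data and some unquantized tensors, which this does not model.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in GB (10^9 bytes), ignoring all overhead."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 13e9  # assumption: roughly 13 billion parameters

fp16_gb = model_size_gb(N_PARAMS, 16)  # half-precision weights
q4_gb = model_size_gb(N_PARAMS, 4)     # nominal 4-bit weights, no scale data

print(f"fp16: ~{fp16_gb:.1f} GB, 4-bit: ~{q4_gb:.1f} GB")
```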


I did try, but got:

``` ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported. ```


> Unfortunately there's a mismatch between the model generated by the delta patcher and the tokenizer (32001 vs 32000 tokens). There's a tool to fix this at llama-tools (https://github.com/Ronsor/llama-tools). Add 1 token like (C controltoken), and then run the conversion script.


Just rename the tokenizer class in tokenizer_config.json (`LLaMATokenizer` to `LlamaTokenizer`).
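A minimal sketch of that rename, in case editing by hand is inconvenient. It assumes the file is `tokenizer_config.json` in the converted model directory, that the class name lives under the `tokenizer_class` key, and that your installed transformers version expects the spelling `LlamaTokenizer`. The helper `rename_tokenizer_class` is made up for illustration.

```python
import json
from pathlib import Path


def rename_tokenizer_class(config_path, old="LLaMATokenizer", new="LlamaTokenizer"):
    """Rewrite the tokenizer class name in a tokenizer config file.

    Returns True if the file was changed, False if the old name was not found.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("tokenizer_class") != old:
        return False  # nothing to do; leave the file untouched
    config["tokenizer_class"] = new
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Call it as `rename_tokenizer_class("/output/path/to/vicuna-13b/tokenizer_config.json")`, adjusting the path to wherever your converted model lives. Back up the file first if you are unsure.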


Thanks, that indeed worked!

That, plus using conda in WSL2 instead of bare Windows.



