Model description

This is a Vicuna-like model with only 68M parameters, fine-tuned from LLaMA-68m on ShareGPT data.

The training setup follows the Vicuna suite.

The model was developed mainly as a base Small Speculative Model (the draft model in speculative decoding) for the MCSD paper. Compared with LLaMA-68m, it is better aligned to the Vicuna models, with little loss of alignment to the LLaMA models.
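As a rough illustration of the draft-model role, the sketch below pairs this checkpoint with a larger Vicuna model through the assisted-generation option of Hugging Face `transformers` (`generate(..., assistant_model=...)`). This is not the MCSD implementation. The target ID `lmsys/vicuna-13b-v1.3` and the helper name `generate_with_draft` are assumptions chosen for the example, and the sketch assumes both models share the LLaMA tokenizer.

```python
# Draft ID is this model's Hub ID; the target ID is an assumption for illustration.
DRAFT_ID = "double7/vicuna-68m"
TARGET_ID = "lmsys/vicuna-13b-v1.3"  # assumed target checkpoint


def generate_with_draft(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate with the target model while the 68M model proposes tokens.

    Hypothetical helper: uses transformers' assisted generation, which is a
    stand-in here and not the MCSD algorithm itself.
    """
    # Imported lazily so this module loads without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TARGET_ID)
    target = AutoModelForCausalLM.from_pretrained(TARGET_ID)
    draft = AutoModelForCausalLM.from_pretrained(DRAFT_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Passing `assistant_model` switches generate() to assisted decoding:
    # the draft proposes tokens and the target verifies them.
    output_ids = target.generate(
        **inputs, assistant_model=draft, max_new_tokens=max_new_tokens
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_with_draft("...")` downloads both checkpoints, so the 13B target needs substantial memory; a smaller Vicuna checkpoint with the same tokenizer could be substituted for a quick trial.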

| Draft Model    | Target Model  | Alignment |
|----------------|---------------|-----------|
| LLaMA-68/160M  | LLaMA-13/33B  | 😃        |
| LLaMA-68/160M  | Vicuna-13/33B | 😟        |
| Vicuna-68/160M | LLaMA-13/33B  | 😃        |
| Vicuna-68/160M | Vicuna-13/33B | 😃        |
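One way to make "alignment" in the comparison above concrete is the fraction of draft-proposed tokens that the target model would also have chosen. The sketch below is a toy under stated assumptions: greedy decoding, and hypothetical stand-in "models" that are plain Python functions mapping a token prefix to the next token. It is not the metric or code used in the MCSD paper.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a language model: token prefix -> greedy next token.
NextToken = Callable[[Sequence[int]], int]


def accepted_prefix_len(draft: NextToken, target: NextToken,
                        prefix: Sequence[int], k: int) -> int:
    """Draft k tokens greedily, then count the leading ones the target agrees with."""
    context = list(prefix)
    proposals: List[int] = []
    for _ in range(k):
        proposals.append(draft(context + proposals))

    accepted = 0
    for token in proposals:
        # The target verifies each proposal given the tokens accepted so far.
        if target(context) != token:
            break
        context.append(token)
        accepted += 1
    return accepted


def acceptance_rate(draft: NextToken, target: NextToken,
                    prefixes: Sequence[Sequence[int]], k: int = 4) -> float:
    """Average fraction of drafted tokens accepted over a set of prefixes."""
    total = sum(accepted_prefix_len(draft, target, p, k) for p in prefixes)
    return total / (k * len(prefixes))


# Toy models over integer tokens (purely illustrative).
def target_model(ctx: Sequence[int]) -> int:
    return (ctx[-1] + 1) % 10


def aligned_draft(ctx: Sequence[int]) -> int:
    return (ctx[-1] + 1) % 10


def misaligned_draft(ctx: Sequence[int]) -> int:
    # Agrees with the target on short contexts, then diverges.
    return (ctx[-1] + 1) % 10 if len(ctx) < 3 else 0


prefixes = [[1], [4, 5]]
print(acceptance_rate(aligned_draft, target_model, prefixes))
print(acceptance_rate(misaligned_draft, target_model, prefixes))
```

A better-aligned draft has more of its proposals accepted per verification step, which is where the speed-up in speculative decoding comes from.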