Taiwan Models
A model quantized from taide/Llama-3.1-TAIDE-LX-8B-Chat using llama.cpp build b4739.
Windows version:
1. In the C:\kuwa\GenAI OS\windows\executors\taide directory, back up the original model taide-8b-a.3-q4_k_m.gguf to a location outside this directory, and delete run.bat.
2. Place Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf in the C:\kuwa\GenAI OS\windows\executors\taide directory.
3. Run C:\kuwa\GenAI OS\windows\executors\taide\init.bat with the following settings:
   - Model name: Llama-3.1 TAIDE LX-8B Chat Q4_K_M
   - Access code: llama-3.1-taide-lx-8b-chat-q4_k_m
   - Extra arguments: --stop "<|eot_id|>"
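Before wiring the model into Kuwa, you can optionally smoke-test the GGUF directly with llama.cpp's llama-cli binary. This is only an illustrative sketch: the file path and prompt are placeholders, and it assumes a llama.cpp build (such as b4739) is on your PATH.

```shell
# Quick local smoke test of the quantized model with llama.cpp's llama-cli.
# -m: model path (adjust to where the GGUF actually lives)
# -n: max tokens to generate; -r: stop when the reverse-prompt string appears
llama-cli -m Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf -p "你好,請自我介紹" -n 64 -r "<|eot_id|>"
```

If the model loads and produces coherent Traditional Chinese output, the quantized file is intact and ready to register with Kuwa.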
Docker version:
Download Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf to any directory, then create the configuration file genai-os/docker/compose/sample/taide-llamacpp.yaml with the following content:

services:
  llamacpp-executor:
    image: kuwaai/model-executor
    environment:
      EXECUTOR_TYPE: llamacpp
      EXECUTOR_ACCESS_CODE: llama-3.1-taide-lx-8b-chat-q4_k_m
      EXECUTOR_NAME: Llama-3.1 TAIDE LX-8B Chat Q4_K_M
      EXECUTOR_IMAGE: llamacpp.png # Refer to src/multi-chat/public/images
    depends_on:
      - executor-builder
      - kernel
      - multi-chat
    command: ["--model_path", "/var/model/Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf", "--temperature", "0"]
    # or use GPU
    # command: ["--model_path", "/var/model/Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf", "--ngl", "-1", "--temperature", "0"]
    restart: unless-stopped
    volumes: ["/path/to/Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf:/var/model/Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf"] # Remember to change the path
    # Uncomment to use GPU
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           device_ids: ['0']
    #           capabilities: [gpu]
    networks: ["backend"]
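Before launching, it can help to sanity-check the compose file. Assuming Docker Compose v2 is installed, something like the following works (the relative path assumes you run it from the genai-os/docker directory):

```shell
# Validate the compose file syntax without starting any containers.
# `config --quiet` exits non-zero if the YAML is malformed.
docker compose -f compose/sample/taide-llamacpp.yaml config --quiet && echo "compose file OK"
```

This catches indentation mistakes and typos in the service definition before they surface as a failed container start.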
Then, in the genai-os/docker directory, run ./run.sh to start the new model.

For the method and challenges of quantizing Llama-3.1-TAIDE-LX-8B-Chat, see San-Li Hsu's note "使用llama.cpp將Hugging Face模型權重(safetensors)轉換成GGUF並進行量化" (Using llama.cpp to convert Hugging Face model weights (safetensors) to GGUF and quantize them).
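At a high level, the quantization pipeline described in that note boils down to two llama.cpp steps: convert the safetensors checkpoint to a 16-bit GGUF, then quantize it. A minimal sketch, assuming a built llama.cpp checkout and illustrative paths:

```shell
# Sketch of producing a Q4_K_M GGUF with llama.cpp; paths are illustrative.
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp
pip install -r requirements.txt

# 1. Convert the Hugging Face safetensors checkpoint to a 16-bit GGUF.
python convert_hf_to_gguf.py /path/to/Llama-3.1-TAIDE-LX-8B-Chat \
  --outfile taide-f16.gguf --outtype f16

# 2. Quantize to Q4_K_M (llama-quantize is built as part of llama.cpp).
./llama-quantize taide-f16.gguf Llama-3.1-TAIDE-LX-8B-Chat-Q4_K_M.gguf Q4_K_M
```

Q4_K_M trades a small quality loss for roughly a 4x size reduction versus f16, which is why it is a common choice for 8B-class chat models.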
Base model: taide/Llama-3.1-TAIDE-LX-8B-Chat