🇹🇭 OpenThaiGPT 14b 1.5 Instruct

🇹🇭 OpenThaiGPT 14b Version 1.5 is an advanced 14-billion-parameter Thai language chat model based on Qwen v2.5, released on October 13, 2024. It has been fine-tuned on more than 2,000,000 Thai instruction pairs and can answer Thai-specific domain questions.

Online Demo:

https://demo72b.aieat.or.th/

Example code for API Calling

https://github.com/OpenThaiGPT/openthaigpt1.5_api_examples

Highlights

  • State-of-the-art Thai language LLM, achieving the highest average scores across various Thai language exams compared to other open-source Thai LLMs.
  • Multi-turn conversation support for extended dialogues.
  • Retrieval Augmented Generation (RAG) compatibility for enhanced response generation.
  • Impressive context handling: Processes up to 131,072 tokens of input and generates up to 8,192 tokens, enabling detailed and complex interactions.
  • Tool calling support: Enables users to efficiently call various functions through intelligent responses.

Benchmark on OpenThaiGPT Eval

** Please take a look at openthaigpt/openthaigpt1.5-14b-instruct for this model's evaluation result.

| Exam names | scb10x/llama-3-typhoon-v1.5x-70b-instruct | Qwen/Qwen2.5-14B-Instruct | openthaigpt/openthaigpt1.5-14b | openthaigpt/openthaigpt1.5-72b |
|---|---|---|---|---|
| 01_a_level | 59.17% | 61.67% | 65.00% | 76.67% |
| 02_tgat | 46.00% | 44.00% | 50.00% | 46.00% |
| 03_tpat1 | 52.50% | 60.00% | 52.50% | 55.00% |
| 04_investment_consult | 60.00% | 76.00% | 72.00% | 72.00% |
| 05_facebook_beleble_th_200 | 87.50% | 84.50% | 87.00% | 90.00% |
| 06_xcopa_th_200 | 84.50% | 85.00% | 86.50% | 90.50% |
| 07_xnli2.0_th_200 | 62.50% | 69.50% | 64.50% | 70.50% |
| 08_onet_m3_thai | 76.00% | 76.00% | 84.00% | 84.00% |
| 09_onet_m3_social | 95.00% | 90.00% | 90.00% | 95.00% |
| 10_onet_m3_math | 43.75% | 43.75% | 12.50% | 37.50% |
| 11_onet_m3_science | 53.85% | 50.00% | 53.85% | 73.08% |
| 12_onet_m3_english | 93.33% | 93.33% | 93.33% | 96.67% |
| 13_onet_m6_thai | 55.38% | 52.31% | 56.92% | 56.92% |
| 14_onet_m6_math | 41.18% | 23.53% | 41.18% | 41.18% |
| 15_onet_m6_social | 67.27% | 60.00% | 61.82% | 65.45% |
| 16_onet_m6_science | 50.00% | 50.00% | 57.14% | 67.86% |
| 17_onet_m6_english | 73.08% | 82.69% | 78.85% | 90.38% |
| Micro Average | 69.97% | 71.00% | 71.51% | 76.73% |

Thai-language multiple-choice exams, evaluated zero-shot on an unseen test set. Benchmark source code and exam information: https://github.com/OpenThaiGPT/openthaigpt_eval
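The "Micro Average" row is presumably question-weighted (the usual meaning of the term), so it need not equal the plain mean of the per-exam percentages. The sketch below contrasts the two, using the openthaigpt1.5-14b column from the table. The per-exam question counts in the second part are hypothetical, since the table does not list them, and the function names are illustrative only.

```python
# Per-exam accuracies (%) for openthaigpt/openthaigpt1.5-14b, copied from the table above.
scores_14b = [65.00, 50.00, 52.50, 72.00, 87.00, 86.50, 64.50, 84.00, 90.00,
              12.50, 53.85, 93.33, 56.92, 41.18, 61.82, 57.14, 78.85]


def macro_average(percentages):
    """Unweighted mean of per-exam accuracies: every exam counts equally."""
    return sum(percentages) / len(percentages)


def micro_average(results):
    """Question-weighted accuracy from (correct, total) pairs: every question counts equally."""
    correct = sum(c for c, _ in results)
    total = sum(t for _, t in results)
    return 100.0 * correct / total


print(f"macro average (14b column): {macro_average(scores_14b):.2f}%")

# Hypothetical question counts, for illustration only (not taken from the benchmark).
toy_results = [(2, 16), (173, 200)]
print(f"toy micro average: {micro_average(toy_results):.2f}%")
```

The macro average of this column comes out lower than the reported micro average of 71.51%, which is what one would expect if the larger exams are also the ones with higher accuracy.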

(Updated on: 13 October 2024)

Benchmark on scb10x/thai_exam

| Models | Thai Exam (Acc) |
|---|---|
| api/claude-3-5-sonnet-20240620 | 69.2 |
| openthaigpt/openthaigpt1.5-72b-instruct* | 64.07 |
| api/gpt-4o-2024-05-13 | 63.89 |
| hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4 | 63.54 |
| openthaigpt/openthaigpt1.5-14b-instruct* | 59.65 |
| scb10x/llama-3-typhoon-v1.5x-70b-instruct | 58.76 |
| Qwen/Qwen2-72B-Instruct | 58.23 |
| meta-llama/Meta-Llama-3.1-70B-Instruct | 58.23 |
| Qwen/Qwen2.5-14B-Instruct | 57.35 |
| api/gpt-4o-mini-2024-07-18 | 54.51 |
| openthaigpt/openthaigpt1.5-7b-instruct* | 52.04 |
| SeaLLMs/SeaLLMs-v3-7B-Chat | 51.33 |
| openthaigpt/openthaigpt-1.0.0-70b-chat | 50.09 |

* Evaluated by OpenThaiGPT team using scb10x/thai_exam.

(Updated on: 13 October 2024)

Licenses

  • Built with Qwen
  • Qwen License: Allows research and commercial use; however, if your user base exceeds 100 million monthly active users, you must negotiate a separate commercial license. See the LICENSE file for more information.

Prompt Format

Prompt format is based on ChatML.

<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n

System prompt:

คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์

(English: "You are a smart and honest question-answering assistant.")

Examples

Single Turn Conversation Example

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n

Single Turn Conversation with Context (RAG) Example

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร เป็นเมืองหลวง นครและมหานครที่มีประชากรมากที่สุดของประเทศไทย กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม. มีประชากรตามทะเบียนราษฎรกว่า 8 ล้านคน\nกรุงเทพมหานครมีพื้นที่เท่าไร่<|im_end|>\n<|im_start|>assistant\n
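The RAG example above places the retrieved passage and the question in a single user turn, separated by a newline. A minimal sketch of assembling such a prompt follows. The helper name `build_rag_prompt` and the English context and question strings are hypothetical; only the ChatML layout and the system prompt come from this card.

```python
SYSTEM_PROMPT = "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"


def build_rag_prompt(context: str, question: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Place the retrieved context before the question in a single user turn."""
    user_content = f"{context}\n{question}"
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_content}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_rag_prompt(
    context="Bangkok covers a total area of 1,568.737 sq km.",  # hypothetical retrieved passage
    question="What is the area of Bangkok?",
)
print(prompt)
```

The resulting string can be sent as the `prompt` field of a completions request, with `<|im_end|>` as the stop sequence.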

Multi Turn Conversation Example

First turn
<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n
Second turn
<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\nสวัสดีครับ ยินดีต้อนรับครับ คุณต้องการให้ฉันช่วยอะไรครับ?<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร ชื่อเต็มยาวๆคืออะไร<|im_end|>\n<|im_start|>assistant\n
Result
<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\nสวัสดีครับ ยินดีต้อนรับครับ คุณต้องการให้ฉันช่วยอะไรครับ?<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร ชื่อเต็มยาวๆคืออะไร<|im_end|>\n<|im_start|>assistant\nชื่อเต็มของกรุงเทพมหานครคือ \"กรุงเทพมหานคร อมรรัตนโกสินทร์ มหินทรายุธยา มหาดิลกภพ นพรัตนราชธานีบูรีรมย์ อุดมราชนิเวศน์มหาสถาน อมรพิมานอวตารสถิต สักกะทัตติยวิษณุกรรมประสิทธิ์\"
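As the turns above show, every request re-sends the whole conversation, and each new prompt is the previous one extended with the model's reply and the next user message. A minimal sketch of that bookkeeping follows; `render_chatml` is a hypothetical helper written for illustration, not part of any library.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the next reply.
        prompt += "<|im_start|>assistant\n"
    return prompt


history = [
    {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
    {"role": "user", "content": "สวัสดีครับ"},
]
first_turn_prompt = render_chatml(history)

# After the model replies, append its answer and the next user message, then render again.
history.append({"role": "assistant", "content": "สวัสดีครับ ยินดีต้อนรับครับ คุณต้องการให้ฉันช่วยอะไรครับ?"})
history.append({"role": "user", "content": "กรุงเทพมหานคร ชื่อเต็มยาวๆคืออะไร"})
second_turn_prompt = render_chatml(history)
print(second_turn_prompt)
```

When using Hugging Face Transformers, `tokenizer.apply_chat_template` does the same rendering from the message list, so a hand-written renderer like this is mainly useful with the raw completions endpoints.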

How to use

Free API Service (hosted by Siam.Ai and Float16.cloud)

Siam.AI

curl https://api.aieat.or.th/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy" \
  -d '{
    "model": ".",
    "prompt": "<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nกรุงเทพมหานครคืออะไร<|im_end|>\n<|im_start|>assistant\n",
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 40,
    "stop": ["<|im_end|>"]
  }'

Float16

curl -X POST https://api.float16.cloud/dedicate/78y8fJLuzE/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer float16-AG0F8yNce5s1DiXm1ujcNrTaZquEdaikLwhZBRhyZQNeS7Dv0X" \
  -d '{
    "model": "openthaigpt/openthaigpt1.5-7b-instruct",
    "messages": [
      {
        "role": "system",
        "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"
      },
      {
        "role": "user",
        "content": "สวัสดี"
      }
    ]
   }'

OpenAI Client Library (served by vLLM; see the vLLM section below)

from openai import OpenAI  # requires openai>=1.0

# Configure the OpenAI client to use the local vLLM server
client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="dummy",  # vLLM doesn't require a real API key
)

# ChatML prompt; the user message asks "What is Bangkok?"
prompt = "<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nกรุงเทพมหานครคืออะไร<|im_end|>\n<|im_start|>assistant\n"

try:
    response = client.completions.create(
        model=".",  # Specify the model you're using with vLLM
        prompt=prompt,
        max_tokens=512,
        temperature=0.7,
        top_p=0.8,
        stop=["<|im_end|>"],
        extra_body={"top_k": 40},  # top_k is a vLLM extension, not a standard OpenAI parameter
    )
    print("Generated Text:", response.choices[0].text)
except Exception as e:
    print("Error:", str(e))

Huggingface

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "openthaigpt/openthaigpt1.5-14b-instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "ประเทศไทยคืออะไร"  # "What is Thailand?"
messages = [
    {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

vLLM

  1. Install vLLM (https://github.com/vllm-project/vllm)

  2. Run server

vllm serve openthaigpt/openthaigpt1.5-14b-instruct --tensor-parallel-size 4 
  • Note: change --tensor-parallel-size 4 to match the number of available GPUs.

To enable the tool calling feature, add --enable-auto-tool-choice --tool-call-parser hermes to the command, e.g.:

vllm serve openthaigpt/openthaigpt1.5-14b-instruct --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser hermes
  3. Run inference (CURL example)
curl -X POST 'http://127.0.0.1:8000/v1/completions' \
-H 'Content-Type: application/json' \
-d '{
  "model": ".",
  "prompt": "<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n",
  "max_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 40,
  "stop": ["<|im_end|>"]
}'

Processing Long Texts

The current config.json is set for a context length of up to 32,768 tokens. To handle inputs longer than 32,768 tokens, use YaRN, a technique for improving length extrapolation, so that the model keeps performing well on long texts.

For supported frameworks, you could add the following to config.json to enable YaRN:

{
  ...
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
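With these values, the scaling factor of 4.0 applied to the original 32,768 positions gives 131,072 tokens, which matches the maximum input length quoted in the Highlights. The sketch below applies the same block to a config dictionary and checks that arithmetic. The helper name `enable_yarn` and the stand-in config contents are assumptions for illustration; a real config.json contains many more keys.

```python
import json


def enable_yarn(config: dict, factor: float = 4.0, original_max: int = 32768) -> dict:
    """Return a copy of a config.json dict with the YaRN rope_scaling block added."""
    updated = dict(config)
    updated["rope_scaling"] = {
        "factor": factor,
        "original_max_position_embeddings": original_max,
        "type": "yarn",
    }
    return updated


config = {"max_position_embeddings": 32768}  # stand-in for the real config.json contents
patched = enable_yarn(config)

rope = patched["rope_scaling"]
extended_context = int(rope["factor"] * rope["original_max_position_embeddings"])

print(json.dumps(patched, indent=2))
print("Extended context length:", extended_context)
```

To apply this to a downloaded checkpoint, one would load config.json with `json.load`, pass the result through the helper, and write it back with `json.dump`.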

Tool Calling

The Tool Calling feature in OpenThaiGPT 1.5 lets users call functions through the model's responses. This includes making external API calls to retrieve real-time data, such as current temperature information, or predicting future data simply by submitting a query. For example, a user can ask OpenThaiGPT, “What is the current temperature in San Francisco?” and the AI will execute a pre-defined function to provide an immediate response without the need for additional coding. The feature also supports broader applications with external data sources, such as calling APIs for weather updates, stock market information, or data from within the user’s own system.

Example:

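from openai import OpenAI  # requires openai>=1.0

# Assumes an OpenAI-compatible server with tool calling enabled,
# e.g. the local vLLM server from the vLLM section.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="dummy")

def get_temperature(location, date=None, unit="celsius"):
    """Get temperature for a location (current or specific date)."""
    if date:
        return {"temperature": 25.9, "location": location, "date": date, "unit": unit}
    return {"temperature": 26.1, "location": location, "unit": unit}

# Tool schema in the OpenAI function-calling format (JSON Schema parameters)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_temperature",
            "description": "Get temperature for a location (current or by date).",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "date": {"type": "string", "description": "Optional date"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

# "What is the temperature in San Francisco today and tomorrow?"
messages = [{"role": "user", "content": "อุณหภูมิที่ San Francisco วันนี้และพรุ่งนี้คือเท่าไร?"}]

# Ask the model; it may answer with tool calls instead of plain text
response = client.chat.completions.create(
    model=".", messages=messages, tools=tools, temperature=0.7, max_tokens=512
)

print(response.choices[0].message)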

Full example: https://github.com/OpenThaiGPT/openthaigpt1.5_api_examples/blob/main/api_tool_calling_powered_by_siamai.py
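Once the model returns a tool call, the application still has to run the function and send the result back as a "tool" message. The sketch below shows that dispatch step on a hand-written tool call, so it runs without a server. It assumes the OpenAI-compatible tool-call shape (a function name plus a JSON-encoded argument string); the helper `run_tool_call`, the `AVAILABLE_TOOLS` registry, and the call id are hypothetical.

```python
import json


def get_temperature(location, date=None, unit="celsius"):
    """Stub tool mirroring the example above; returns fixed values."""
    if date:
        return {"temperature": 25.9, "location": location, "date": date, "unit": unit}
    return {"temperature": 26.1, "location": location, "unit": unit}


# Registry mapping tool names the model may request to local Python functions.
AVAILABLE_TOOLS = {"get_temperature": get_temperature}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one OpenAI-style tool call and wrap the result as a "tool" message."""
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    result = AVAILABLE_TOOLS[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result, ensure_ascii=False),
    }


# Hand-written tool call in the OpenAI-compatible shape (hypothetical id and arguments).
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_temperature", "arguments": '{"location": "San Francisco"}'},
}
tool_message = run_tool_call(example_call)
print(tool_message)
```

In a real exchange, the assistant message containing the tool calls and each resulting tool message are appended to `messages`, and the chat completion is requested again so the model can phrase the final answer.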

GPU Memory Requirements

| Number of Parameters | FP16 | 8 bits (Quantized) | 4 bits (Quantized) | Example Graphics Card for 4 bits |
|---|---|---|---|---|
| 7b | 24 GB | 12 GB | 6 GB | Nvidia RTX 4060 8GB |
| 14b | 48 GB | 24 GB | 12 GB | Nvidia RTX 4070 16GB |
| 72b | 192 GB | 96 GB | 48 GB | Nvidia RTX 4090 24GB x 2 cards |
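The quantized figures in the table scale linearly with the number of bits per weight: the 8-bit requirement is half the FP16 figure and the 4-bit requirement is a quarter. A small sketch of that relationship follows, using only the FP16 values from the table. The function name is illustrative, and the linear rule is read off the table rather than stated by the authors.

```python
# FP16 requirements (GB) copied from the table above.
FP16_MEMORY_GB = {"7b": 24, "14b": 48, "72b": 192}


def quantized_memory_gb(model_size: str, bits: int) -> float:
    """Scale the table's FP16 figure linearly with the number of bits per weight."""
    return FP16_MEMORY_GB[model_size] * bits / 16


for size in FP16_MEMORY_GB:
    row = [quantized_memory_gb(size, bits) for bits in (16, 8, 4)]
    print(size, row)
```

Actual usage also depends on context length, batch size, and the serving framework, so these numbers are best treated as planning estimates.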

Disclaimer: Provided responses are not guaranteed.

Model size: 14.8B params (Safetensors, tensor type BF16)