metadata
language:
  - en
library_name: transformers
pipeline_tag: text-generation
datasets:
  - teknium/OpenHermes-2.5
  - TokenBender/python_eval_instruct_51k
  - codefuse-ai/Evol-instruction-66k
tags:
  - code
license: apache-2.0
model-index:
  - name: SpeechlessCoder
    results:
      - task:
          type: text-generation
        dataset:
          type: openai_humaneval
          name: HumanEval
        metrics:
          - name: pass@1
            type: pass@1
            value: 0
            verified: false

speechless-starcoder2-7b

Code: https://github.com/uukuguy/speechless

bigcode/starcoder2-7b was fine-tuned on the following datasets to improve the model's reasoning and planning abilities.

Total: 986k samples.

  • teknium/OpenHermes-2.5
  • TokenBender/python_eval_instruct_51k
  • Spider
  • codefuse-ai/Evol-instruction-66k

How to Prompt the Model

This model accepts the Alpaca instruction format.

For example:

You are an intelligent programming assistant.

### Instruction:
Implement a linked list in C++

### Response:
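
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not the author's reference code: the helper names `build_alpaca_prompt` and `generate`, the trailing newline after `### Response:`, and the repository id `uukuguy/speechless-starcoder2-7b` are assumptions made for illustration. The generation step assumes `transformers` and `torch` are installed and uses only the standard `AutoTokenizer` / `AutoModelForCausalLM` loading path.

```python
DEFAULT_SYSTEM = "You are an intelligent programming assistant."

# Assumed repository id, not stated in this card; replace with the checkpoint you use.
ASSUMED_MODEL_ID = "uukuguy/speechless-starcoder2-7b"


def build_alpaca_prompt(instruction: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return an Alpaca-style prompt matching the example in this card.

    The trailing newline after "### Response:" is an assumption.
    """
    return (
        f"{system}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Response:\n"
    )


def generate(instruction: str, model_id: str = ASSUMED_MODEL_ID,
             max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load the model and complete the prompt."""
    # Imported lazily so the prompt builder works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_alpaca_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(build_alpaca_prompt("Implement a linked list in C++"))
```

Running the script prints the prompt only; call `generate(...)` explicitly to download the weights and produce a completion.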

HumanEval

| Metric | Value |
| --- | --- |
| humaneval-python | |

lm-evaluation-harness

{'ARC (acc_norm)': ,
'HellaSwag (acc_norm)': ,
'MMLU (acc)': ,
'TruthfulQA (mc2)': ,
'Winogrande (acc)': ,
'GSM8K (acc)': ,
'DROP (f1)': ,
'Open LLM Score': }

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
| --- | --- |
| Avg. | |
| ARC (25-shot) | |
| HellaSwag (10-shot) | |
| MMLU (5-shot) | |
| TruthfulQA (0-shot) | |
| Winogrande (5-shot) | |
| GSM8K (5-shot) | |
| DROP (3-shot) | |