Text Generation
GGUF
English
mixture of experts
Mixture of Experts
4x8B
32 bit enhanced
float 32 quants
LLama MOE
uncensored
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
horror
mergekit
Inference Endpoints
conversational
Update README.md
README.md
CHANGED
@@ -76,7 +76,7 @@ Several prompts and outputs below.
 
 <B>QUANTS From Float 32 (32-bit) Source:</B>
 
-- All quants have been
+- All quants have been quanted with the lastest LLAMACPP improvements : Better instruction following, output generation across all quants.
 - All quants have also been upgraded with "more bits" for output tensor (all set at Q8_0) and embed for better performance (this is in addition to the "refresh")
 - New specialized quants (in addition to the new refresh/upgrades): "max, max-cpu" (will include this in the file name) for quants "Q2K", "IQ4_XS", "Q6_K" and "Q8_0"
 - "MAX": output tensor / embed at float 32. You get better instruction following/output generation than standard/upgraded quants.
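The README does not say which command produced these quants. As a rough illustration of what "more bits" for the output tensor and embed could look like in practice, the sketch below builds (but does not run) an argument list for llama.cpp's `llama-quantize` tool, using its `--output-tensor-type` and `--token-embedding-type` options. The helper `build_quantize_cmd`, the file names, and the choice of Q8_0 for the embed tensor in the "upgraded" case are assumptions made for illustration, not the author's actual procedure.

```python
from typing import List, Optional


def build_quantize_cmd(
    src: str,
    dst: str,
    quant: str,
    output_type: Optional[str] = None,
    embed_type: Optional[str] = None,
    binary: str = "llama-quantize",
) -> List[str]:
    """Build an argv list for llama.cpp's quantize tool (nothing is executed).

    Hypothetical helper: it only assembles the command line.
    """
    cmd = [binary]
    if output_type is not None:
        # Override the type used for the output tensor.
        cmd += ["--output-tensor-type", output_type]
    if embed_type is not None:
        # Override the type used for the token embedding tensor.
        cmd += ["--token-embedding-type", embed_type]
    # Positional arguments: source GGUF, destination GGUF, target quant type.
    cmd += [src, dst, quant]
    return cmd


# File names below are hypothetical placeholders.
# "Upgraded" style: output tensor at Q8_0 (embed at Q8_0 is an assumption).
upgraded = build_quantize_cmd(
    "model-f32.gguf", "model-Q6_K.gguf", "Q6_K",
    output_type="q8_0", embed_type="q8_0",
)

# "MAX" style: output tensor and embed kept at float 32.
maxed = build_quantize_cmd(
    "model-f32.gguf", "model-Q6_K-max.gguf", "Q6_K",
    output_type="f32", embed_type="f32",
)

print(" ".join(upgraded))
print(" ".join(maxed))
```

The returned list can be passed to `subprocess.run` on a machine where llama.cpp is built and the binary is on the `PATH`; keeping command construction separate from execution makes the exact flags easy to inspect before anything is run.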