This config also has cache disabled.
#1 by autobots - opened
With cache enabled:
Output generated in 1.79 seconds (7.83 tokens/s, 14 tokens, context 39, seed 453061271)
Output generated in 5.06 seconds (2.37 tokens/s, 12 tokens, context 1848, seed 115540957)
Still a little slower than a normal 7B model, but very usable.
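For anyone who wants to apply the same change to a local copy, here is a minimal sketch. It assumes the setting is the Hugging Face-style `use_cache` key in the model's `config.json`; the thread does not name the key, and the helper names `enable_cache` and `patch_config_file` are made up for illustration.

```python
import json
from pathlib import Path


def enable_cache(config: dict) -> dict:
    """Return a copy of the config with the (assumed) `use_cache` key set to True."""
    patched = dict(config)
    patched["use_cache"] = True
    return patched


def patch_config_file(path: str) -> None:
    """Rewrite a local config.json in place with the cache enabled."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    patched = enable_cache(config)
    config_path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
```

Calling `patch_config_file("config.json")` on a downloaded copy should leave every other key untouched and only flip the cache flag, if the key name assumption holds for this model.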
Thank you! I forgot about this one. Fixed now.
TheBloke changed discussion status to closed