mlx_lm.server gives wonky answers
#49
by conleysa - opened
Hello! I am noticing that when I run llama-3-8b-instruct using mlx_lm.server, I get strange answers. For example, I ask it a query and it tells me about dog breeds. On the other hand, if I use mlx_lm.load and mlx_lm.generate directly, I get reasonable responses.
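For reference, this is roughly the direct path that works for me; a minimal sketch, where the mlx-community 4-bit checkpoint ID and the example prompt are my assumptions (substitute whatever model you are running):

```python
# Direct load/generate path (the one that gives reasonable responses).
from mlx_lm import load, generate

# Assumed model ID for illustration; swap in your local path or repo.
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

# Apply the llama-3 chat template so the prompt carries the model's
# special tokens before generation.
messages = [{"role": "user", "content": "What is the capital of France?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```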
Is there any reason the new llama-3 shouldn't be run from the mlx_lm server?
I can run llama-2-13b as a server and get reasonable responses.
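And this is roughly how I am querying the server; a sketch that assumes the default localhost:8080 address and the OpenAI-style chat completions endpoint (the server itself is started separately, e.g. `python -m mlx_lm.server --model <model>`):

```python
# Query a running mlx_lm.server instance over HTTP.
import requests

# Assumes the server's default host/port; adjust if you launched it
# with different settings.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 100,
    },
)
print(resp.json())
```

With llama-2-13b this returns sensible text, but with llama-3-8b-instruct the same request comes back off-topic.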