# phate334/multilingual-e5-large-gguf

This model was converted to GGUF format from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) using llama.cpp.

## Run it

- Deploy using Docker:

```shell
docker run -p 8080:8080 \
  -v ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf \
  ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb \
  --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
```

- Or with Docker Compose (serving the f16 and q4_k_m variants on separate ports):

```yaml
services:
  e5-f16:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8080:8080
    volumes:
      - ./multilingual-e5-large-f16.gguf:/multilingual-e5-large-f16.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-f16.gguf
  e5-q4:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8081:8080
    volumes:
      - ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
```
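Once a container is up, the server can be queried over HTTP. A minimal client sketch, assuming the llama.cpp server's `/embedding` endpoint at `http://localhost:8080` (the endpoint path and response shape may differ across llama.cpp builds); note that the E5 family expects inputs prefixed with `query: ` or `passage: `:

```python
# Minimal client sketch for the embedding server started above.
# Assumptions: server reachable at localhost:8080, POST /embedding accepts
# {"content": "..."} and returns {"embedding": [...]} (older llama.cpp builds).
import json
import math
import urllib.request

E5_SERVER = "http://localhost:8080/embedding"  # assumed endpoint


def embed(text: str, role: str = "query") -> list[float]:
    """POST a prefixed text to the server and return its embedding vector."""
    # E5 models expect a "query: " or "passage: " prefix on every input.
    payload = json.dumps({"content": f"{role}: {text}"}).encode()
    req = urllib.request.Request(
        E5_SERVER, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


if __name__ == "__main__":
    try:
        q = embed("how many people live in Taiwan", "query")
        p = embed("Taiwan has a population of about 23 million.", "passage")
        print("similarity:", cosine(q, p))
    except OSError:
        print("server not reachable; start one of the containers above first")
```

Scoring query/passage pairs by cosine similarity is the intended retrieval usage for E5 embeddings.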
## Model details

- Format: GGUF
- Model size: 559M params
- Architecture: bert
- Quantizations: 4-bit (q4_k_m), 16-bit (f16)

