Post
1832
ever wondered how you can make an API call to a visual-question-answering model without sending an image url 👀
you can do that by converting your local image to base64 and sending it to the API.
recently I made some changes to my library "loadimg" that allows you to make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg
API request example 🛠️:
you can do that by converting your local image to base64 and sending it to the API.
recently I made some changes to my library "loadimg" that allows you to make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg
API request example 🛠️:
from loadimg import load_img
from huggingface_hub import InferenceClient
# or load a local image
my_b64_img = load_img(imgPath_url_pillow_or_numpy ,output_type="base64" )
client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
messages = [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this image in one sentence."
},
{
"type": "image_url",
"image_url": {
"url": my_b64_img # base64 allows using images without uploading them to the web
}
}
]
}
]
stream = client.chat.completions.create(
model="meta-llama/Llama-3.2-11B-Vision-Instruct",
messages=messages,
max_tokens=500,
stream=True
)
for chunk in stream:
print(chunk.choices[0].delta.content, end="")