Update README.md
README.md CHANGED
````diff
@@ -10,6 +10,9 @@ metrics:
 - accuracy
 library_name: transformers
 ---
+
+[Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions](https://arxiv.org/abs/2412.08737)
+
 # Model Card for Euclid-convnext-large (Version on 12/05/2024)
 
 A multimodal large language model specifically trained for strong low-level geometric perception.
@@ -23,11 +26,11 @@ Euclid is trained on 1.6M synthetic geometry images with high-fidelity question-
 It combines a ConvNeXt visual encoder with a Qwen-2.5 language model, connected through a 2-layer MLP multimodal connector.
 
 
-### Model Sources
+### Model Sources
 
 - **Repository:** https://github.com/euclid-multimodal/Euclid
-- **Paper:**
-- **Demo:**
+- **Paper:** https://arxiv.org/abs/2412.08737
+- **Demo:** https://euclid-multimodal.github.io/
 
 ## Uses
 
@@ -83,10 +86,9 @@ Performance on Geoperception benchmark tasks:
 
 If you find Euclid useful for your research and applications, please cite using this BibTeX:
 ```bibtex
-@
-
-
-
-
-year={2024}
+@article{zhang2024euclid,
+  title={Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions},
+  author={Zhang, Jiarui and Liu, Ollie and Yu, Tianyu and Hu, Jinyi and Neiswanger, Willie},
+  journal={arXiv preprint arXiv:2412.08737},
+  year={2024}
 }
````
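The second hunk's context line, "It combines a ConvNeXt visual encoder with a Qwen-2.5 language model, connected through a 2-layer MLP multimodal connector," is the only architectural detail in this change. The sketch below shows what such a two-layer MLP connector could look like in PyTorch; the class name, hidden sizes, and patch count are illustrative assumptions rather than code from the linked repository.

```python
import torch
import torch.nn as nn


class MLPConnector(nn.Module):
    """Two-layer MLP that projects visual patch features into the LM embedding space.

    Illustrative sketch only: the dimensions below (ConvNeXt-large feature width,
    Qwen-2.5 hidden size) are assumptions, not values taken from the Euclid code.
    """

    def __init__(self, vision_dim: int = 1536, lm_dim: int = 3584):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, lm_dim),
            nn.GELU(),
            nn.Linear(lm_dim, lm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: [batch, num_patches, vision_dim] from the ConvNeXt encoder
        return self.proj(patch_features)


# Toy usage: project dummy ConvNeXt features and prepend them to text embeddings
# before they would be fed to the Qwen-2.5 language model.
connector = MLPConnector()
vision_tokens = connector(torch.randn(1, 144, 1536))        # -> [1, 144, 3584]
text_embeds = torch.randn(1, 32, 3584)                       # stand-in for token embeddings
lm_inputs = torch.cat([vision_tokens, text_embeds], dim=1)   # multimodal prefix sequence
```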
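Because the front matter keeps `library_name: transformers`, readers will likely try the standard Auto classes first. The snippet below is a hedged sketch of that pattern: the Hub id is a guess based on the GitHub organization listed under Model Sources, and this diff does not say whether the checkpoint loads with stock `transformers` classes or needs the loading code from the linked repository, so verify both against the repo.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub id inferred from the GitHub org in "Model Sources"; verify before use.
model_id = "euclid-multimodal/Euclid-convnext-large"

# trust_remote_code=True lets any custom Euclid modeling code bundled with the
# checkpoint run; drop it if the checkpoint turns out to use a stock architecture.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
```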