Tesseract vs. GOT-OCR2_0: Which Performs Better for Text Extraction from Images?

#17
by bubbleMilkTea - opened

I'm curious about the differences between Tesseract and GOT-OCR2_0. Which one performs better?
My main goal is to convert an image file into general Markdown format. Do you recommend using GOT-OCR2_0's plain text OCR to extract text and then applying markdownify, or using GOT-OCR2_0's formatted text OCR to extract Mathpix Markdown and convert it to general Markdown? Which approach would be more efficient, and which would you recommend?

I had the same question. My goal was to extract mathematical expressions from a photo, and in my experience GOT-OCR2_0 is far better at this than Tesseract.

GOT-OCR2_0 is the better option because it is a model that can be fine-tuned on custom data. It extracts text well even when the image is noisy or unclear, but it struggles with handwritten images and does not extract their text accurately. If needed, we can fine-tune the model to improve performance.

On the other hand, Tesseract is a tool specifically designed for OCR tasks. It produces accurate output when the image is clear and free of noise. However, like GOT-OCR2_0, it also struggles to extract text accurately from handwritten images.
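
On the second question (formatted output to general Markdown), here is a minimal sketch of the conversion step, not a definitive implementation. It assumes the formatted OCR result is a string that marks math with the LaTeX-style delimiters `\( ... \)` and `\[ ... \]`, which many general Markdown renderers do not recognize, and rewrites them as `$...$` and `$$...$$`. The function name `mathpix_to_markdown` and the sample string are hypothetical, and real output may need extra handling for tables or other constructs.

```python
import re


def mathpix_to_markdown(text: str) -> str:
    """Rewrite LaTeX-style math delimiters as dollar-sign delimiters.

    Assumption: math is delimited by \\( ... \\) and \\[ ... \\] in the input.
    """
    # Display math: \[ ... \] -> $$ ... $$ (the body may span several lines).
    # The lookbehind skips LaTeX line breaks such as \\[2pt].
    text = re.sub(
        r"(?<!\\)\\\[(.+?)\\\]",
        lambda m: "$$" + m.group(1).strip() + "$$",
        text,
        flags=re.DOTALL,
    )
    # Inline math: \( ... \) -> $ ... $
    text = re.sub(
        r"(?<!\\)\\\((.+?)\\\)",
        lambda m: "$" + m.group(1).strip() + "$",
        text,
        flags=re.DOTALL,
    )
    return text


# Hypothetical formatted-OCR output used only for illustration.
sample = r"The area is \( A = \pi r^2 \), and" + "\n" + r"\[ E = mc^2 \]"
print(mathpix_to_markdown(sample))
```

A function is passed as the replacement so that backslashes inside the formulas (for example `\pi`) are copied through untouched instead of being interpreted as escape sequences by `re.sub`.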
