How to annotate scanned document images to train the PaddleOCR model

How do I annotate scanned document images to train a PaddleOCR model? I can already run PaddleOCR training on my machine with the Total-Text dataset, but now I need to assemble my own dataset. I have the document images and have uploaded them to Roboflow, but I haven’t found a tool that helps annotate scanned documents. Any suggestions or solutions?

Another problem is that I can’t type special characters in class names. Am I doing something wrong, or am I missing something?

Hey @insinfo

Roboflow’s object detection and instance segmentation project types are designed for class labeling, not free-text annotation, so class names only support alphanumeric characters.

It might be worth taking a look at the captioning (text-image pair) dataset type, but that does not allow for localization or object-level annotation.
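For reference, here is a minimal sketch of what a text-detection label line for PaddleOCR could look like once you have boxes and transcriptions from whatever tool you end up using. It assumes the tab-separated layout described in the PaddleOCR detection docs (image path, a tab, then a JSON list of `transcription`/`points` entries); please check it against the version you are training with. The helper name `to_paddleocr_det_line`, the image path, and the sample text are all made up for illustration.

```python
import json


def to_paddleocr_det_line(image_path, regions):
    """Build one label line: image path, a tab, then a JSON list of regions.

    regions is a list of (text, points) pairs, where points holds the four
    [x, y] corners of the text box. This helper is hypothetical, not part of
    PaddleOCR or Roboflow.
    """
    items = [{"transcription": text, "points": points} for text, points in regions]
    # ensure_ascii=False keeps special characters readable in the label file.
    return image_path + "\t" + json.dumps(items, ensure_ascii=False)


# Hypothetical scanned page with a single annotated text region.
line = to_paddleocr_det_line(
    "scans/page_001.jpg",
    [("Nº 42/2023", [[10, 12], [180, 12], [180, 40], [10, 40]])],
)
print(line)
```

Because the transcription is stored as free text in JSON rather than as a class name, special characters are not an issue in this format.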
