Document Classification Models

  • Project Type: Work Project
  • Operating System & Browser: Windows 11 & MS Edge
  • Project Universe Link or Workspace/Project ID: None. This isn’t on Roboflow.

I am looking for good models for document classification. My documents are structured PDFs: invoices, delivery notes, identity documents, etc.

I am building a classifier that runs on CPU and classifies these documents based on their content. I will convert the PDFs to images and then pass them to an image classifier.

I need help finding which lightweight classifiers are best suited to this task. I have enough data for training.

This is my first post, so I apologise for anything missing or any formatting I got wrong. I have been using Roboflow for quite some time and first heard about it from Piotr on LinkedIn.

If you’re looking to train your own document classifier, I’d recommend a ResNet as a small, fast option that runs on CPU. A ViT would likely be more accurate, though it may be too slow on your CPU. Upload your documents, assign them class labels, and train a model. Both can be custom trained in Roboflow today.

