
Document classifier

Which type of document is this? Upload a PDF or JPG and find out now.

Image by Kirsty Pargeter on Vecteezy

Our AI model had a look at your document.

Image by Maew Dulaekan on Vecteezy

Powered by artificial intelligence

How does it work? A machine learning model running in the background identifies typical visual features of your document and uses them to distinguish between nine document types.

Training data were provided by the Organized Crime and Corruption Reporting Project (OCCRP).

For technical information see our paper or visit our GitHub repository.

Image by Rhetorika Studio on Vecteezy

Feedback? Suggestions?

We are happy to share our work or to assist you in setting up your own document classifier, customised for your needs. Feel free to contact us via ...

Image by Atcharanee on Vecteezy

Cite us

To cite this work, please refer to our paper: ...

Made by...

...
José Miguel Cordero Carvacho

Department of Computer Science, University of Chile, Santiago, Chile

...
Theresa Henn

Department of Information Systems and Social Networks, University of Bamberg, Bamberg, Germany

...
Frederik Holtel

M.A. Political Sciences/International Economic Policy, Free University Berlin and Sciences Po Paris

...
Jose Angel Sanchez Gomez

Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, USA

...
Diego Arenas

German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany

...
Andrea Sipka

German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany

...
Sebastian Vollmer

Department of Computer Science, Rhineland-Palatinate Technical University at Kaiserslautern-Landau, Germany

In cooperation with OCCRP