Whether you are dealing with paper documents that need to be digitised or documents that are already digital, the characteristics of a digital document are often missing. These characteristics are needed to realise a digital format or to comply with laws and regulations. To assign such characteristics to a type of document, classification is indispensable. In classification, software automatically assigns each document to a predefined document type, so that unstructured data is transformed into structured information. But how exactly does the classification process work?

When to classify?

It regularly happens that a company initially keeps only paper files and wants to digitise them after a certain period. For this purpose, documents from the physical archive must be assigned to predefined document types. Digital files are classified into document types as well, so that specific documents can be found more quickly. Take a personnel file as an example. When it is digitised, the software uses classification to recognise whether a document is an employment contract, a certificate of conduct, a wage tax statement, a copy of an ID, and so on.
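To make the idea of assigning a document to a predefined type concrete, here is a minimal sketch in Python. It is not how any particular classification product works: it assumes the text has already been extracted from the scan, and the keyword lists and the function name `classify` are hypothetical, chosen only for illustration. Real software typically relies on far richer signals than keywords.

```python
from typing import Optional

# Hypothetical keyword lists per document type. In a real project these
# would be derived from sample documents while setting up the software.
DOCUMENT_TYPE_KEYWORDS = {
    "employment_contract": ["employment contract", "employer", "employee"],
    "certificate_of_conduct": ["certificate of conduct"],
    "wage_tax_statement": ["wage tax"],
    "id_copy": ["passport", "identity card"],
}


def classify(text: str) -> Optional[str]:
    """Return the document type with the most keyword hits, or None."""
    lowered = text.lower()
    best_type, best_hits = None, 0
    for doc_type, keywords in DOCUMENT_TYPE_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in lowered)
        # Only a strictly higher count wins, so ties keep the earlier type.
        if hits > best_hits:
            best_type, best_hits = doc_type, hits
    return best_type
```

A document for which no keyword matches returns `None`, which corresponds to a document the software could not recognise and that would need to be handled manually.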

Classification is also used to apply legislation and regulations. For example, by extracting a date from a certain document, the retention period can be determined. Suppose you want to replace your employees' certificates of conduct every three years. By using both classification and data extraction, the effective date of the certificate of conduct can be added to the file as an attribute, to which you can then apply your deletion policy. There are many more examples where classification (with or without data extraction) can be used.
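The three-year example can be sketched as a small date calculation. This is an illustration under stated assumptions: the effective date is taken to be already extracted and parsed, the function names `replacement_due_date` and `is_due` are hypothetical, and moving 29 February to 28 February in a non-leap year is one possible convention, not a legal rule.

```python
from datetime import date


def replacement_due_date(effective_date: date, years: int = 3) -> date:
    """Return the date on which the document should be replaced."""
    target_year = effective_date.year + years
    try:
        return effective_date.replace(year=target_year)
    except ValueError:
        # 29 February does not exist in the target year: use 28 February.
        return effective_date.replace(year=target_year, day=28)


def is_due(effective_date: date, today: date, years: int = 3) -> bool:
    """Return True once the replacement date has been reached."""
    return today >= replacement_due_date(effective_date, years)
```

For a certificate with an effective date of 1 March 2022, `replacement_due_date(date(2022, 3, 1))` gives 1 March 2025, and a deletion policy could select every file for which `is_due` returns `True`.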

Recognition rate

A classification project is characterised by a recognition rate. The recognition rate is expressed as a percentage and indicates for what share of the documents the software has recognised a document type. The advantage of classification is that you can easily create categories in your documents. This gives you a better overview, fits your current working method more closely and simplifies searching. It is also possible to link individual retention periods to a specific document, because documents are classified at the document level.
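As a minimal sketch, the recognition rate described above can be computed from a list of classification results. The function name `recognition_rate` and the convention that an unrecognised document is represented by `None` are assumptions made for this example.

```python
from typing import Optional, Sequence


def recognition_rate(results: Sequence[Optional[str]]) -> float:
    """Percentage of documents for which a document type was recognised."""
    if not results:
        # No documents processed: report 0 rather than dividing by zero.
        return 0.0
    recognised = sum(1 for doc_type in results if doc_type is not None)
    return 100.0 * recognised / len(results)
```

If three out of four documents received a document type, the function returns 75.0. Note that this figure only counts documents that received a type; whether each assigned type is correct would have to be checked separately.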

The classification process

Before you start classifying documents, it is important to make an inventory of how many documents you want to classify and which document types are involved. Next, the document types are set up in the software. This is followed by a training day, during which a number of documents are tested. The software can then be adjusted and supplemented where necessary. Once the software is set up correctly, the bulk of the documents can be classified. Any stragglers that remain after classification can still be classified manually.

