Automatically dispatch documents based on business rules

Content Script is an extremely flexible tool for defining business rules that classify and dispatch incoming documents. The wide range of Content Script APIs currently available lets you access different properties of the incoming documents and process them as needed.

In this recipe, we extract basic properties of each document, such as its name and file type, as well as more advanced ones, such as the standard and custom properties of MS Office documents. Once this information is available, we can feed it to our business rules to decide where to dispatch the document.

In this example, we dispatch each document by moving it to a “target” folder, with one folder per document classification. We consider three sample classifications for incoming documents: “Accounting”, “Human Resources”, and “Unclassified”, the last for documents that don’t match any defined business rule.

def globalInbox = docman.getNodeByName('01. Global Inbox')
def accountingInbox = docman.getNodeByName('02. Accounting Inbox')
def humanResourcesInbox = docman.getNodeByName('03. Human Resources Inbox')
def unclassifiedInbox = docman.getNodeByName('99. Unclassified Inbox')

The sample business rules are applied as follows:

(1) IF the document is an MS Office document (Word or Excel), check the document properties for a property named “Document Classification”. If found, that will be the final value.

(2) IF nothing is found, process the document name for specific keywords. ‘Employee’, ‘Policy’, ‘Letter’ will identify an HR document. ‘Invoice’, ‘Purchase Order’ will identify an Accounting document.

(3) IF no match is found, the document is flagged as “Unclassified”.
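
Rules (2) and (3) can be tried out on plain strings before involving any Content Script service. The following is a minimal sketch in plain Groovy; classifyByName and the rules map are hypothetical names introduced here for illustration and are not part of the Content Script API.

```groovy
// Hypothetical helper: applies rules (2) and (3) to a plain document name.
// It uses no Content Script services, so it runs in any Groovy console.
def classifyByName = { String name ->
    // Keyword lists taken from rule (2); Human Resources is checked first
    def rules = [
        'Human Resources': ['Employee', 'Policy', 'Letter'],
        'Accounting'     : ['Invoice', 'Purchase Order']
    ]
    def lowerName = name.toLowerCase()
    // First classification with a case-insensitive keyword match in the name
    def match = rules.find { classification, keys ->
        keys.any { key -> lowerName.contains(key.toLowerCase()) }
    }
    // Rule (3): fall back to "Unclassified" when nothing matched
    return match?.key ?: 'Unclassified'
}

println classifyByName('Invoice 0042.pdf')
println classifyByName('Meeting notes.txt')
```

Because the match is a simple substring test, a name such as 'Newsletter.pdf' also matches the 'Letter' keyword. If that is not desired, the rule would need a stricter comparison, for example on whole words.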

Let’s define a closure for the document classification operations:

def classifyDocument = { document ->

    String documentClassification
    
    String docxMimeType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    String xlsxMimeType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
    
    if( document.subtype == 144 ){ // Only process documents

        //
        // 1. Attempt to extract classification from document properties (for MS Office files)
        //

        if( document.mimeType == docxMimeType ){ // docx document

            // Read custom properties
            def wordDocProperties = docx.loadWordDoc(document).listCustomProperties()
            documentClassification = wordDocProperties?."Document Classification"

        } else if( document.mimeType == xlsxMimeType ){ // xlsx document

            // Read custom properties
            def excelProperties = xlsx.loadSpreadsheet(document).listCustomProperties()
            documentClassification = excelProperties?."Document Classification"

        } else { // generic document (PDF, etc)
            // Do nothing
        }

        //
        // 2. Apply other business rules
        // 

        if( !documentClassification ){

            // Try to use the document name

            def humanResourcesKeys = [ 'Employee', 'Policy', 'Letter' ]
            def accountingKeys     = [ 'Invoice', 'Purchase Order' ]

            if( humanResourcesKeys.find{ key -> document.name.toLowerCase().contains( key.toLowerCase() ) } ){

                documentClassification = "Human Resources"

            } else if(accountingKeys.find{ key -> document.name.toLowerCase().contains( key.toLowerCase() ) } ){

                documentClassification = "Accounting"

            }

        }

    }
    
    return documentClassification

}

Once we have a classification, we can use it for dispatching. Let’s define a closure for that too:

def dispatchDocument = { document, documentClassification ->

    //
    // Apply dispatching rules to select the target folder
    // 

    def targetFolder

    switch( documentClassification ){

        case "Accounting": 
            targetFolder = accountingInbox
            break

        case "Human Resources": 
            targetFolder = humanResourcesInbox
            break

        default :
            targetFolder = unclassifiedInbox
            break

    }

    docman.moveNode( document, targetFolder ) // Move to the target folder

}
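
When the number of classifications grows, the target selection could also be expressed as a map lookup with a fallback instead of a switch statement. The sketch below is one possible variant, not the recipe's own method: it uses folder names as strings in place of the folder nodes so that it runs outside Content Script, and inboxByClassification, unclassifiedInboxName and selectTarget are hypothetical names.

```groovy
// Strings stand in for the folder nodes (assumption made so the sketch is self-contained)
def inboxByClassification = [
    'Accounting'     : '02. Accounting Inbox',
    'Human Resources': '03. Human Resources Inbox'
]
def unclassifiedInboxName = '99. Unclassified Inbox'

// Hypothetical helper: returns the target for a classification,
// falling back to the unclassified inbox for null or unknown values
def selectTarget = { documentClassification ->
    inboxByClassification[documentClassification] ?: unclassifiedInboxName
}

println selectTarget('Accounting')
println selectTarget(null)
```

In the recipe itself, the map values would be the folder nodes (accountingInbox, humanResourcesInbox) rather than strings, and adding a classification would only require a new map entry.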

As a last step, we will simply run the classification and dispatching routines on each document in the Global Inbox:

globalInbox.getChildrenFast().each{ document ->
    
    dispatchDocument(document, classifyDocument(document) )
    
}
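
As written, the loop stops at the first document that raises an error. One possible refinement, sketched here under stated assumptions, is to catch errors per document so that the remaining documents are still processed. The sketch is plain Groovy: processDocument is a hypothetical stand-in for dispatchDocument(document, classifyDocument(document)), and the sample documents are plain maps rather than real nodes.

```groovy
// Hypothetical stand-in for the classify-and-dispatch call; fails on unnamed documents
def processDocument = { document ->
    if (!document.name) {
        throw new IllegalArgumentException('Document has no name')
    }
    return document.name
}

// Sample data (assumption): plain maps in place of the Global Inbox children
def sampleDocuments = [
    [name: 'Invoice 0042.pdf'],
    [name: null],
    [name: 'Policy update.docx']
]

def dispatched = []
def failures = []

sampleDocuments.each { document ->
    try {
        dispatched << processDocument(document)
    } catch (Exception e) {
        // Record the failure and carry on with the next document
        failures << e.message
    }
}

println "Dispatched: ${dispatched}"
println "Failures: ${failures}"
```

Collecting the failures in a list makes it possible to report them once the whole inbox has been processed, instead of aborting on the first problem.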