Import module

The import module accesses and crawls external repositories and gathers their data. In a pre-processing step called format conversion, documents in various formats (PDF, MS Word, HTML, etc.) are transformed into an internal format so that other modules can access them in a standardized way.
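
The format conversion step could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the names InternalDocument, CONVERTERS and convert are hypothetical, the internal format is assumed to be plain text plus a source identifier, and converters are assumed to be selected by file extension. Only HTML and plain text are handled here, using the Python standard library; PDF and MS Word would need additional converters registered in the same way.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from pathlib import PurePath
from typing import Callable, Dict, List


@dataclass
class InternalDocument:
    """Hypothetical internal format: a source identifier plus plain text."""
    source: str
    text: str


class _TextExtractor(HTMLParser):
    """Collects text nodes of an HTML document, skipping script and style."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def convert_html(source: str, raw: bytes) -> InternalDocument:
    """Convert raw HTML bytes into the (assumed) internal format."""
    parser = _TextExtractor()
    parser.feed(raw.decode("utf-8", errors="replace"))
    parser.close()
    return InternalDocument(source=source, text=parser.text())


def convert_plain_text(source: str, raw: bytes) -> InternalDocument:
    """Plain text only needs decoding."""
    return InternalDocument(source=source, text=raw.decode("utf-8", errors="replace"))


# Registry mapping file extensions to converters (an assumed design choice).
# Converters for .pdf or .doc would be added here once implemented.
CONVERTERS: Dict[str, Callable[[str, bytes], InternalDocument]] = {
    ".html": convert_html,
    ".htm": convert_html,
    ".txt": convert_plain_text,
}


def convert(source: str, raw: bytes) -> InternalDocument:
    """Dispatch to the converter registered for the source's extension."""
    suffix = PurePath(source).suffix.lower()
    converter = CONVERTERS.get(suffix)
    if converter is None:
        raise ValueError(f"no converter registered for {suffix!r}")
    return converter(source, raw)
```

With this layout, downstream modules only ever see InternalDocument objects, and supporting a new input format means registering one more converter function rather than changing the callers.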