Vectorizing module

The vectorizing module generates a statistical representation of each document from the features produced by the extraction module.

Multiple vectors can be constructed per document to capture different aspects of it (for example, a full-content vector or an extracted-people vector). Various weighting and feature selection schemes are also available.
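As a rough illustration of the idea, the sketch below builds one sparse vector per aspect for each document. It assumes TF-IDF as the weighting scheme and a minimum document frequency as the feature selection scheme. The function name tfidf_vectors, the min_df parameter, and the "content" and "people" keys are hypothetical and are not part of the module's actual interface.

```python
import math
from collections import Counter


def tfidf_vectors(docs, min_df=1):
    """Turn a list of feature lists into sparse TF-IDF vectors (dicts).

    Hypothetical helper: the real module's interface may differ.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each feature.
    df = Counter()
    for features in docs:
        df.update(set(features))
    # Feature selection: keep features seen in at least min_df documents.
    vocab = {f for f, count in df.items() if count >= min_df}
    vectors = []
    for features in docs:
        tf = Counter(f for f in features if f in vocab)
        # Weighting: raw term frequency times log inverse document frequency.
        vectors.append({f: n * math.log(n_docs / df[f]) for f, n in tf.items()})
    return vectors


# Assumed shape of the extraction output: several feature lists per document.
documents = [
    {"content": ["merger", "bank", "bank"], "people": ["Alice Smith"]},
    {"content": ["bank", "election"], "people": ["Alice Smith", "Bob Jones"]},
]

# One vector space per aspect, so each document ends up with multiple vectors.
vectors_by_aspect = {
    aspect: tfidf_vectors([doc[aspect] for doc in documents])
    for aspect in ("content", "people")
}
```

Keeping a separate vector space per aspect means that a feature shared by every document, such as "bank" here, receives zero weight only within its own aspect and does not affect the weights in the others.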

Feature vectors are rarely presented to end users; they are intended as input for other knowledge discovery algorithms.