Text To Bag-Of-Words Converter

The utility converts raw text in various formats, such as Distance-Matrix-File ("-imtx"),
Tab-Separated-File ("-itab"), Transactions-File ("-itsc"), Sparse-File ("-ispr"), Compact-Documents-File ".Cpd" ("-icpd"),
Text-Base ("-itbs"), and Reuters-21578 ("-ir21578"), into a file in the Bag-Of-Words format ".Bow" ("-o"). The parameter "-docs" controls the number of converted documents.
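
As a rough illustration of what a bag-of-words representation is, the Python sketch below counts word occurrences per document. It does not reproduce the ".Bow" file layout, which is not described here, and it is not the utility's own code. The function name to_bag_of_words, the whitespace tokenization, and the max_docs parameter (loosely mirroring "-docs", with a negative value assumed to mean "no limit") are assumptions made for illustration only.

```python
from collections import Counter

def to_bag_of_words(documents, max_docs=-1):
    """Return one word-count mapping per document (hypothetical helper)."""
    # Assumption: a negative max_docs means "no limit".
    selected = documents if max_docs < 0 else documents[:max_docs]
    # Lowercase and split on whitespace; real tokenizers are more elaborate.
    return [Counter(doc.lower().split()) for doc in selected]

docs = ["grain prices rise", "oil prices fall as oil supply grows"]
bags = to_bag_of_words(docs)
print(bags[1]["oil"])  # 2
```

Word order is discarded in this representation; only the per-document counts remain.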

usage: Txt2Bow.exe
-imtx:Input-Matrix-File (default:'')
-itab:Input-Tab-File (default:'')
-itsc:Input-Transaction-File (default:'')
-ispr:Input-Sparse-File (default:'')
-icpd:Input-CompactDocuments-File (default:'')
-itbs:Input-TextBase-File (default:'')
-ir21578:Input-Reuters21578-Path (default:'')
-o:Bow-Output-File (.bow) (default:'')
-docs:Documents (default:-1)

Txt2Bow.exe -icpd:Reuters21578.Cpd -o:Reuters21578.Bow

The example call above converts Reuters21578.Cpd, a file in Compact-Documents format, into Reuters21578.Bow in Bag-Of-Words format.
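
When the conversion is driven from a script, the command line can be assembled programmatically. The following Python sketch builds the argument list from the options in the usage text. The helper name build_txt2bow_command and its parameters are hypothetical and not part of the utility; the sketch also assumes that "-docs" can simply be omitted when its default value of -1 is wanted.

```python
import subprocess

# Input options listed in the usage text, without the leading dash.
INPUT_OPTIONS = {"imtx", "itab", "itsc", "ispr", "icpd", "itbs", "ir21578"}

def build_txt2bow_command(input_option, input_path, output_path, docs=-1,
                          exe="Txt2Bow.exe"):
    """Build the argument list for one Txt2Bow call (hypothetical helper)."""
    if input_option not in INPUT_OPTIONS:
        raise ValueError(f"unknown input option: {input_option}")
    cmd = [exe, f"-{input_option}:{input_path}", f"-o:{output_path}"]
    if docs != -1:
        # Only pass -docs when it differs from the documented default of -1.
        cmd.append(f"-docs:{docs}")
    return cmd

cmd = build_txt2bow_command("icpd", "Reuters21578.Cpd", "Reuters21578.Bow")
print(" ".join(cmd))
# To actually run it (requires Txt2Bow.exe to be available on the PATH):
# subprocess.run(cmd, check=True)
```

With the arguments shown, the printed command matches the example call: Txt2Bow.exe -icpd:Reuters21578.Cpd -o:Reuters21578.Bow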