Text To Compact-Documents Converter

The utility converts raw text from one of several input formats, namely Text-Base ("-itbs"), Reuters-2000 ("-ir2000"), or Reuters-21578 ("-ir21578"), into a file in Compact-Documents format (".Cpd"), specified with "-o". The "-docs" parameter sets the number of documents to convert (the value "-1" means "all documents"). With the "-test" parameter, an additional test is performed at the end of the transformation to check the correctness of the resulting ".Cpd" file.

usage: Txt2Cpd.exe
-itbs:Input-TextBase-File (default:'')
-ir2000:Input-Reuters2000-Path (default:'')
-ir21578:Input-Reuters21578-Path (default:'')
-o:Output-Cpd-FileName (default:'.')
-docs:Documents (default:-1)
-test:Testing (default:'F')

Txt2Cpd.exe -ir21578:Reuters21578Data -o:.

The example call above reads the original Reuters *.sgm files from the directory 'Reuters21578Data' (-ir21578:) and writes the output file 'Reuters21578.Cpd', in the Cpd (Compact-Documents) format, to the current directory (-o:.).
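
When the converter is driven from a script, the same command line can be assembled programmatically. The following Python sketch is only an illustration: the helper name build_txt2cpd_args is hypothetical and not part of the utility. It assumes that parameters left at the defaults listed in the usage text can simply be omitted, and it passes the "-test" value through verbatim because the accepted values other than the default 'F' are not documented here.

```python
def build_txt2cpd_args(exe="Txt2Cpd.exe", itbs="", ir2000="", ir21578="",
                       out=".", docs=-1, test="F"):
    """Build the argument list for a Txt2Cpd call (hypothetical helper)."""
    args = [exe]
    # Input parameters: only those with a non-empty value are emitted.
    if itbs:
        args.append(f"-itbs:{itbs}")
    if ir2000:
        args.append(f"-ir2000:{ir2000}")
    if ir21578:
        args.append(f"-ir21578:{ir21578}")
    # The output parameter is always emitted, mirroring the example call.
    args.append(f"-o:{out}")
    # Assumption: omitting a parameter is equivalent to passing its default.
    if docs != -1:
        args.append(f"-docs:{docs}")
    if test != "F":
        args.append(f"-test:{test}")
    return args


if __name__ == "__main__":
    # Reproduces the example call shown above.
    print(" ".join(build_txt2cpd_args(ir21578="Reuters21578Data")))
```

The returned list can be handed to subprocess.run from the Python standard library on a machine where Txt2Cpd.exe is available.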