HTML to XML Converter

Transforms HTML documents into cleaned XML documents. The format of the output file ('-o' parameter) is controlled by several parameters, all of which are turned on by default. '-otxt' enables output of continuous parts of text. '-ourl' enables output of URLs appearing in the text (which may be made absolute by providing a base URL in the '-u' parameter). '-otok' enables output of single tokens from the original HTML. '-otag' enables output of tags, and '-oarg' enables output of tag parameters.
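
To illustrate what making URLs absolute against a base URL means, here is a minimal Python sketch. It is not the converter's own code: it assumes standard relative-reference resolution as implemented by Python's urllib.parse.urljoin, and the helper name absolutize and the example.org addresses are hypothetical. The converter's exact resolution rules are not documented here and may differ.

```python
from urllib.parse import urljoin


def absolutize(base_url, links):
    """Resolve each link against base_url (hypothetical helper).

    With an empty base URL the links are returned unchanged, which is
    an assumption about what an empty '-u' parameter would mean.
    """
    if not base_url:
        return list(links)
    return [urljoin(base_url, link) for link in links]


# Hypothetical base URL and links, for illustration only.
base = "http://example.org/docs/index.html"
links = ["page2.html", "../img/logo.png", "/about", "http://other.example/x"]

for url in absolutize(base, links):
    print(url)
```

Relative links are resolved against the directory of the base document, root-relative links against its host, and links that are already absolute pass through unchanged.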

usage: Html2Xml.exe
-i:Input-Html-File (default:'')
-o:Output-XML-File (default:'')
-u:Base-Url (default:'')
-otxt:Output-Text (default:'T')
-ourl:Output-Urls (default:'T')
-otok:Output-Tokens (default:'T')
-otag:Output-Tags (default:'T')
-oarg:Output-Arguments (default:'T')
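
When the converter is called from a script, the argument list can be assembled programmatically. The following Python sketch builds the arguments in the '-name:value' form shown in the usage listing. The function name build_html2xml_args is hypothetical, and the value 'F' for switching an option off is an assumption, since the usage listing only shows the default 'T'.

```python
def build_html2xml_args(input_path, output_path, base_url="", *,
                        txt=True, url=True, tok=True, tag=True, arg=True,
                        exe="Html2Xml.exe"):
    """Build an argument list for Html2Xml.exe (hypothetical helper)."""

    def flag(on):
        # 'T' is the documented default; 'F' as the "off" value is an assumption.
        return "T" if on else "F"

    return [
        exe,
        f"-i:{input_path}",
        f"-o:{output_path}",
        f"-u:{base_url}",
        f"-otxt:{flag(txt)}",
        f"-ourl:{flag(url)}",
        f"-otok:{flag(tok)}",
        f"-otag:{flag(tag)}",
        f"-oarg:{flag(arg)}",
    ]


# Example: keep everything except single tokens in the output.
args = build_html2xml_args("test.html", "test.xml", tok=False)
print(" ".join(args))
# The list can then be passed to subprocess.run(args, check=True)
# on a machine where Html2Xml.exe is available.
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.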


Html2Xml.Exe -i:test.html -o:test.xml -u:
The example call above transforms the file test.html (-i:) into the file test.xml (-o:), using the value given after -u: as the base URL.