---------------------------------------- ----------------------------------------

Text-Garden Command Line Utilities

---------------------------------------- ----------------------------------------

(c) Marko Grobelnik, Dunja Mladenic
Department of Knowledge Technologies
Jozef Stefan Institute, Slovenia

Text-Garden Components enable easy handling of text documents for the purpose of data analysis, including automatic model generation, document classification, document clustering, document visualization, handling of Web documents, Web crawling, and many other tasks. The code is written in C++ and runs natively on Windows; it can be run on Linux/Unix using Wine or a similar utility. The code grew out of our own research needs, guided by our research projects, and was refined as time permitted. The top-level components, built on the core of the software, were contributed over time by several people from our group, including Luka Bradesko, Janez Brank, Blaz Fortuna, Miha Grcar, Jure Leskovec, and Blaz Novak.

Please reference the Web site <www.textmining.net> if you use any of the provided utilities.

----------------------------------------

Lexical text processing

Lexical text processing in Text-Garden includes operations such as tokenization, stop-word removal, stemming, n-gram generation, and WordNet usage. This functionality is exposed mainly through the parameters of the utility that transforms textual data into a vector representation in Bag-Of-Words format (file extension ".Bow").
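
To make these operations concrete, here is a minimal Python sketch of the general pipeline: tokenize, drop stop-words, stem, generate n-grams, and count the results into a bag-of-words. It is an illustration only and does not reproduce Text-Garden's C++ implementation, its command-line parameters, or the ".Bow" file format. The stop-word list, the suffix-stripping stemmer (a crude stand-in for a real stemmer), and all function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not Text-Garden's own list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "are", "to"}


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip one common English suffix, keeping at least 3 characters.

    This is a deliberately crude stand-in for a real stemming algorithm.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, max_n):
    """Return all contiguous n-grams of length 1..max_n, joined with '_'."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append("_".join(tokens[i : i + n]))
    return grams


def bag_of_words(text, max_n=2):
    """Map each n-gram of the processed text to its number of occurrences.

    Stop-words are removed before n-grams are formed, so an n-gram may
    span a position where a stop-word used to be.
    """
    tokens = [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(ngrams(tokens, max_n))
```

Calling bag_of_words on a document returns a sparse term-count mapping, which is the kind of vector representation a bag-of-words file holds for each document. Weighting schemes such as TF-IDF are omitted from this sketch.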

----------------------------------------

Unsupervised Learning

----------------------------------------

Semi-Supervised Learning

----------------------------------------

Supervised Learning

----------------------------------------

Classification of Documents

----------------------------------------

Feature construction/extraction

----------------------------------------

Visualization of documents based on clustering

----------------------------------------

Visualization of documents based on semantic-space

----------------------------------------

Crawling

---------------------------------------- ----------------------------------------