{"id":32,"date":"2011-03-07T10:28:52","date_gmt":"2011-03-07T10:28:52","guid":{"rendered":"http:\/\/ailab.ijs.si\/dunja_mladenic\/"},"modified":"2014-05-19T14:40:13","modified_gmt":"2014-05-19T12:40:13","slug":"textmining-software-tools","status":"publish","type":"page","link":"https:\/\/ailab.ijs.si\/dunja_mladenic\/research\/textmining-software-tools\/","title":{"rendered":"TextMining Software Tools"},"content":{"rendered":"<p style=\"text-align: justify;\">\n<h4 style=\"text-align: justify;\"><a href=\"https:\/\/ailab.ijs.si\/marko_grobelnik\/\" target=\"_blank\">Marko Grobelnik<\/a>, <a href=\"https:\/\/ailab.ijs.si\/dunja_mladenic\/\" target=\"_blank\">Dunja Mladenic<\/a><\/h4>\n<ul style=\"text-align: justify;\">\n<li><a href=\"https:\/\/ailab.ijs.si\/\" target=\"_blank\">Artificial Intelligence Laboratory<\/a><\/li>\n<li><a href=\"http:\/\/www.ijs.si\/\" target=\"_blank\">Jozef Stefan Institute<\/a>, <a href=\"http:\/\/www.slovenia.info\/?lng=2\" target=\"_blank\">Slovenia<\/a><\/li>\n<\/ul>\n<p style=\"text-align: justify;\">Text-Garden is a software library and collection of software tools for solving large-scale tasks dealing with structured, semi-structured and unstructured data, with an emphasis on text. It can be used in a variety of research and applied scenarios. Text-Garden is used by several institutions, including British Telecom, Carnegie Mellon University, Microsoft Research and Cycorp.<\/p>\n<p style=\"text-align: justify;\">\n<p style=\"text-align: justify;\">\n<h2 style=\"text-align: justify;\">Some history<\/h2>\n<p style=\"text-align: justify;\">The development of Text-Garden started in 1996 as a set of C++ classes for dealing with text in order to perform text-learning tasks. Two people worked on it until 2002, and it evolved slowly, driven by the academic tasks on our agenda. 
From 2003 on, Text-Garden became the central software platform in our <a href=\"https:\/\/ailab.ijs.si\/dunja\/TextWebJSI\/\" target=\"_blank\">research group<\/a> at the J. Stefan Institute. Text-Garden is used in a number of research and applied projects (~10 people contributing).<\/p>\n<p style=\"text-align: justify;\">\n<p style=\"text-align: justify;\">\n<h2 style=\"text-align: justify;\">Technical Aspects<\/h2>\n<p style=\"text-align: justify;\">Text-Garden is almost entirely written in portable C++.<\/p>\n<ul style=\"text-align: justify;\">\n<li>It compiles under Windows (Microsoft Visual C++, Borland C++) and Unix\/Linux (GNU C)<\/li>\n<\/ul>\n<ul style=\"text-align: justify;\">\n<li>It runs on 32-bit and 64-bit platforms<\/li>\n<\/ul>\n<ul style=\"text-align: justify;\">\n<li>It consists of ~200,000 relatively compact lines of code<\/li>\n<\/ul>\n<p style=\"text-align: justify;\">\n<p style=\"text-align: justify;\">\n<h2 style=\"text-align: justify;\">Using Text-Garden Functionality<\/h2>\n<p style=\"text-align: justify;\">Text-Garden functionality can be accessed in a number of ways:<\/p>\n<ul style=\"text-align: justify;\">\n<li>As plain C++ classes giving complete functionality.<\/li>\n<li>As a DLL library of ~250 functions giving a simplified extract of the major functionality.<\/li>\n<li>As ~60 command-line utilities that can be connected in a pipeline. <a href=\"https:\/\/ailab.ijs.si\/dunja_mladenic\/research\/textmining-software-tools\/text-garden-command-line-utilities\/\">Basic utilities<\/a> covering document classification, clustering and visualization can be downloaded under the LGPL license.<\/li>\n<li>Through GUI tools developed on top of Text-Garden, including Document Atlas and OntoGen.<\/li>\n<li>Through interfaces to several platforms with the same API:<\/li>\n<\/ul>\n<blockquote>\n<ul>\n<li>C\/C++ &#8211; through simplified DLL &amp; native C++<\/li>\n<li>Java &#8211; through JNI<\/li>\n<li>.NET &#8211; e.g. 
accessible through C#, VB, &#8230;<\/li>\n<li>Matlab &#8211; through standard Matlab interface<\/li>\n<li>Python &#8211; through standard Python interface<\/li>\n<li>Mathematica, Prolog, R &#8211; in preparation<\/li>\n<\/ul>\n<\/blockquote>\n<p style=\"text-align: justify;\">The API has ~40 classes and ~250 functions. Interfaces to all of the above platforms are generated automatically from the master Text-Garden header file.<\/p>\n<p style=\"text-align: justify;\">\n","protected":false},"excerpt":{"rendered":"<p>Marko Grobelnik, Dunja Mladenic Artificial Intelligence Laboratory Jozef Stefan Institute, Slovenia Text-Garden is a software library and collection of software tools for solving large scale tasks dealing with structured, semi-structured and unstructured data &#8211; emphasis of functionality is on dealing with text. It can be used in various ways covering research and applicative scenarios. Text-Garden [&hellip;]<\/p>\n","protected":false},"author":53,"featured_media":0,"parent":79,"menu_order":1,"comment_status":"closed","ping_status":"open","template":"","meta":{"vkexunit_cta_each_option":"","footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-32","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/pages\/32","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/users\/53"}],"replies":[{"embeddable":true,"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/comments?post=32"}],"version-history":[{"count":29,"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/pages\/32\/revisions"}],"predecessor-version":[{"id":376,"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/pages\/32\/rev
isions\/376"}],"up":[{"embeddable":true,"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/pages\/79"}],"wp:attachment":[{"href":"https:\/\/ailab.ijs.si\/dunja_mladenic\/wp-json\/wp\/v2\/media?parent=32"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}