Description of Text-Garden

Purpose and Functionality

Text-Garden is a software suite for working with unstructured data. In particular, it covers (1) analysis of multiple modalities (such as text or social networks), (2) cross-modal analysis, (3) components for text and network visualization, and (4) scalable implementations of many standard and newer analytic methods from machine learning, text mining, kernel methods, and related fields.

Availability, Preconditions and Licensing

The installation package is publicly available as binaries at www.textmining.net. The software runs under Windows and (partly) Linux, and has interfaces for working from Matlab, Java, Python, and R. The tool is free for research purposes.

Integration with other SEKT Tools

Some components of Text-Garden are integrated into KAON2, DocumentAtlas, OntoGen, SEKTbar, and OntoClassify. The tool was originally developed before the SEKT project and was extended with some components within it.

Publications

MLADENIĆ, Dunja, GROBELNIK, Marko. Text and web mining. In: Data mining and decision support : integration and collaboration, MLADENIĆ, Dunja, LAVRAČ, Nada, BOHANEC, Marko, MOYLE, Steve (eds.), The Kluwer international series in engineering and computer science, SECS 745, Boston; Dordrecht; London: Kluwer Academic Publishers, 2003, 13-14.

MLADENIĆ, Dunja, GROBELNIK, Marko. Summarization and visualization. In: ZANASI, A. (ed.). Text mining and its applications to intelligence, CRM and knowledge management, (Advances in management information, vol. 2). Southampton; Boston: WIT, 2003, 131-143.

MLADENIĆ, Dunja. How do approach data analysis of texts. Zb. rad. – Fak. organ. inform. Varažd., 2004, vol. 28, 123-134.

GROBELNIK, Marko, MLADENIĆ, Dunja. Automated knowledge discovery in advanced knowledge management. J. knowl. manag., 2005, vol. 9, 132-149.

MLADENIĆ, Dunja. Text mining in action!. In: From data and information analysis to knowledge engineering : proceedings of the 29th annual conference of the Gesellschaft für Klassifikation e.V., University of Magdeburg, March 9-11, 2005, (Studies in classification, data analysis, and knowledge organization). Berlin: Springer, 2006, 52-62.

MLADENIĆ, Dunja. Feature selection for dimensionality reduction. In: Subspace, latent structure and feature selection : statistical and optimization perspectives workshop, SLSFS 2005, SAUNDERS, Craig, GUNN, Steve, SHAWE-TAYLOR, John, GROBELNIK, Marko (eds.), Bohinj, Slovenia, February 23-25, 2005 : revised selected papers, (Lecture notes in computer science, Vol. 3940). Berlin; Heidelberg: Springer, 2006, 84-102.

MLADENIĆ, Dunja. Text mining – machine learning on document. In: WANG, John (ed.). Encyclopedia of data warehousing and mining. Hershey [etc.]: Idea Group Reference, cop. 2006, 1109-1112.

GROBELNIK, Marko, MLADENIĆ, Dunja. Knowledge discovery for ontology construction. In: Semantic web technologies : trends and research in ontology-based systems. DAVIES, John, STUDER, Rudi, WARREN, Paul (eds.), Chichester: John Wiley & Sons, cop. 2006, 9-27.

NOVAK, Blaž, MLADENIĆ, Dunja, GROBELNIK, Marko. Text classification with active learning. In: From data and information analysis to knowledge engineering : proceedings of the 29th annual conference of the Gesellschaft für Klassifikation e.V., University of Magdeburg, March 9-11, 2005, Studies in classification, data analysis, and knowledge organization. Berlin: Springer, 2006, 398-405.

BRANK, Janez, MLADENIĆ, Dunja, GROBELNIK, Marko. Hierarchical text categorization using coding matrices. In: Proceedings of the 9th International multi-conference Information Society IS-2006, BOHANEC, Marko, GAMS, Matjaž, RAJKOVIČ, Vladislav, URBANČIČ, Tanja, BERNIK, Mojca, MLADENIĆ, Dunja, GROBELNIK, Marko, HERIČKO, Marjan, KORDEŠ, Urban, MARKIČ, Olga, MUSEK, Janek, OSREDKAR, Mari Jože, KONONENKO, Igor, NOVAK ŠKARJA, Barbara (eds.), October 2006, Ljubljana: Institut “Jožef Stefan”, 2006, 219-222.

BERENDT, Bettina, HOTHO, Andreas, MLADENIĆ, Dunja, SOMEREN, Maarten W. van, SPILIOPOULOU, Myra, STUMME, Gerd. A roadmap for web mining : from web to semantic web. In: BERENDT, Bettina (ed.), HOTHO, Andreas (ed.), MLADENIĆ, Dunja (ed.), SOMEREN, Maarten W. van (ed.), SPILIOPOULOU, Myra (ed.), STUMME, Gerd (ed.). Web mining : from web to semantic web : First European Web Mining Forum, EWMF 2003, Cavtat-Dubrovnik, Croatia, September 22, 2003 : invited and selected revised papers, (Lecture notes in artificial intelligence, Lecture notes in computer science, vol. 3209). Berlin; Heidelberg; New York: Springer, 2004, 1-22.

Publications on Applications Built on Top of the Tool

GROBELNIK, Marko, MLADENIĆ, Dunja. Visualization of news articles. Informatica, 2004, vol. 28, no. 4, 375-380.

GROBELNIK, Marko, MLADENIĆ, Dunja. Analysis of a database of research projects using text mining and link analysis. In: Data mining and decision support : integration and collaboration, MLADENIĆ, Dunja, LAVRAČ, Nada, BOHANEC, Marko, MOYLE, Steve (eds.), The Kluwer international series in engineering and computer science, SECS 745, Boston; Dordrecht; London: Kluwer Academic Publishers, 2003, 157-166.

MLADENIĆ, Dunja, KAVČIČ-ČOLIĆ, Alenka, GROBELNIK, Marko. Initiatives to preserve Slovenian digital heritage. In: Innovation and knowledge economy: issues, applications, case studies, CUNNINGHAM, Paul, CUNNINGHAM, Miriam (eds.), Information and communication technologies and the knowledge economy. Amsterdam [etc.]: IOS Press, 2005, 993-998.

JORGE, Alípio, ALVES, Mário A., GROBELNIK, Marko, MLADENIĆ, Dunja, PETRAK, Johann. Web site access analysis for a national statistical agency. In: Data mining and decision support : integration and collaboration, MLADENIĆ, Dunja, LAVRAČ, Nada, BOHANEC, Marko, MOYLE, Steve (eds.), The Kluwer international series in engineering and computer science, SECS 745, Boston; Dordrecht; London: Kluwer Academic Publishers, 2003, 167-176.

GROBELNIK, Marko, MLADENIĆ, Dunja, JERMOL, Mitja. Towards the EU IST projects knowledge map and project partners competence directory. In: Fourth European Conference on Knowledge Management : Oriel College, Oxford University, MCGRATH, Fergal, REMENYI, D. (eds.), United Kingdom, 18-19 September 2003.

MLADENIĆ, Dunja. Web browsing using machine learning on text data. In: SZCZEPANIAK, Piotr S. (ed.). Intelligent exploration of the web, (Studies in fuzziness and soft computing, vol. 111). New York; Heidelberg: Physica-Verlag, 2002, 288-303.

FLACH, Peter A., GROBELNIK, Marko, KAVŠEK, Branko, LAVRAČ, Nada, LJUBIČ, Peter, MLADENIĆ, Dunja, TODOROVSKI, Ljupčo. On the road to knowledge : mining 21 years of UK traffic accident reports. In: Data mining and decision support : integration and collaboration, MLADENIĆ, Dunja, LAVRAČ, Nada, BOHANEC, Marko, MOYLE, Steve (eds.), The Kluwer international series in engineering and computer science, SECS 745, Boston; Dordrecht; London: Kluwer Academic Publishers, 2003, 143-155.

GHANI, Rayid, JONES, Rosie, MLADENIĆ, Dunja. Building minority language corpora by learning to generate web search queries. Knowledge and information systems, 2005, vol. 7, 56-83.

MLADENIĆ, Dunja, GROBELNIK, Marko. Visualizing very large graphs using clustering neighborhoods. In: MORIK, Katharina (ed.), BOULICAUT, Jean-François (ed.), SIEBES, Arno (ed.). Local pattern detection : international seminar : Dagstuhl Castle, Germany, April 12-16, 2004 : revised selected papers, (Lecture notes in computer science, Lecture notes in artificial intelligence, 3539), (State-of-the-art survey). Berlin; Heidelberg; New York: Springer, cop. 2005, 89-97.

BEVK, Matjaž, MLADENIĆ, Dunja. Analysis of demining project proposals. In: MARKIČ, Olga (ed.), GAMS, Matjaž (ed.), KORDEŠ, Urban (ed.), HERIČKO, Marjan (ed.), MLADENIĆ, Dunja (ed.), GROBELNIK, Marko (ed.), ROZMAN, Ivan (ed.), RAJKOVIČ, Vladislav (ed.), URBANČIČ, Tanja (ed.), BERNIK, Mojca (ed.), BOHANEC, Marko (ed.). Zbornik 8. mednarodne multikonference Informacijska družba IS 2005, 11. do 17. oktober 2005, (Informacijska družba). Ljubljana: Institut “Jožef Stefan”, 2005, 208-211.