Description of OntoClassify

Purpose and Functionality

OntoClassify is a system for scalable classification of text into large topic ontologies. It currently supports DMoz and Inspec.

Availability, Preconditions and Licensing

The system is available as a Web service. The software runs on the Windows platform. The tool is free for research purposes.

Integration with other SEKT Tools

OntoClassify is developed on top of the Text Garden tool, integrated with it, and extended for the needs of the SEKT project.
The Inspec classifier is integrated into the Search and Browse tool of SEKT WP5 [1].

Publications

MLADENIĆ, Dunja, GROBELNIK, Marko. Feature selection on hierarchy of web documents. Decision support systems, 2003, vol. 35, 45-87.

MLADENIĆ, Dunja, GROBELNIK, Marko. Mapping documents onto web page ontology. In: BERENDT, B. et al. (eds.) Web mining : from web to semantic web : First European Web Mining Forum (Lecture notes in artificial intelligence, Lecture notes in computer science, vol. 3209). Berlin; Heidelberg; New York: Springer, 2004, 77-96.

MLADENIĆ, Dunja, BRANK, Janez, GROBELNIK, Marko, MILIĆ-FRAYLING, Nataša. Feature selection using linear classifier weights : interaction with classification models. In: Proceedings of SIGIR 2004 : the Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, [Sheffield], July 25th-29th. New York: Association for Computing Machinery, 2004, 234-241.

GROBELNIK, Marko, MLADENIĆ, Dunja. Simple classification into large topic ontology of web documents. CIT. Journal of Comput. Inf. Technol., 2005, vol. 13, 279-285.

GROBELNIK, Marko, BRANK, Janez, MLADENIĆ, Dunja, NOVAK, Blaž, FORTUNA, Blaž. Using DMoz for constructing ontology from data stream. In: 28th International Conference on Information Technology Interfaces, LUŽAR – STIFFLER, Vesna, HLJUZ DOBRIĆ, Vesna (eds.), June 19-22, 2006, Cavtat/Dubrovnik, Croatia. ITI 2006 : proceedings of the 28th International Conference on Information Technology Interfaces, June 19-22, 2006, Cavtat/Dubrovnik, Croatia, (IEEE Catalog, No. 06EX1244). Zagreb: University of Zagreb, SRCE University Computing Centre, cop. 2006, 439-444.

FORTUNA, Blaž, GROBELNIK, Marko, MLADENIĆ, Dunja. Classification of documents into hierarchy using string kernels. In: 29th Annual Conference of the German Classification Society, March 9-11, 2005, Magdeburg. From data and information analysis to knowledge engineering : program and abstracts. Magdeburg: Otto-von-Guericke-University, 2005, p. 223.

RADOŠEVIĆ, Daniel, DOBŠA, Jasminka, MLADENIĆ, Dunja. Flexible length phrases in document classification. In: LUŽAR – STIFFLER, Vesna (ed.), HLJUZ DOBRIĆ, Vesna (ed.). 28th International Conference on Information Technology Interfaces, June 19-22, 2006, Cavtat/Dubrovnik, Croatia. ITI 2006 : proceedings of the 28th International Conference on Information Technology Interfaces, June 19-22, 2006, Cavtat/Dubrovnik, Croatia, (IEEE Catalog, No. 06EX1244). Zagreb: University of Zagreb, SRCE University Computing Centre, cop. 2006, 457-462.

RADOŠEVIĆ, Daniel, DOBŠA, Jasminka, MLADENIĆ, Dunja, STAPIĆ, Zlatko, NOVAK, Miroslav. Genre document classification using flexible length phrases. In: AURER, Boris (ed.), BAČA, Miroslav (ed.). 17th International Conference on Information and Intelligent Systems, September 20-22, 2006, Varaždin, Croatia. Conference proceedings. Varaždin: Faculty of Organisation and Informatics, FOI, cop. 2006, 23-28.

[1] DAVIES, J., DUKE, Alistair, KINGS, Nick, MLADENIĆ, Dunja, BONTCHEVA, Kalina, GRČAR, Miha, BENJAMINS, Richard, CONTRERAS, Jesus, BLAZQUES CÍVICO, Mercedes, GLOVER, Tim. Next generation knowledge access. Journal of knowledge management, 2005, vol. 9, 64-84.