Yahoo Planet, Project Page
Overview
Yahoo Planet is a project in which we use the
Yahoo hierarchy of Web documents
as a basis for automatic document categorization.
Several top-level categories are treated as separate problems, and for each one an automatic document
classifier is generated.
The demo version of our system
categorizes typed text within the sub-hierarchy of
the selected top category. Currently, whole documents can be categorized by simply
copying their content into a window and requesting categorization of the "typed" text.
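To make the idea of one classifier per top category concrete, here is a minimal sketch in Python. It is not the Yahoo Planet implementation: the toy training data, the category names, the crude whitespace tokenizer and the helper names (NaiveBayes, categorize) are all assumptions made for illustration. Naive Bayes is used only because it appears in the publication titles below.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # subcategory -> word counts
        self.doc_counts = Counter()              # subcategory -> number of documents
        self.vocabulary = set()

    def train(self, documents):
        """documents is a list of (text, subcategory) pairs."""
        for text, subcategory in documents:
            words = tokenize(text)
            self.word_counts[subcategory].update(words)
            self.doc_counts[subcategory] += 1
            self.vocabulary.update(words)

    def rank(self, text):
        """Return subcategories sorted from most to least probable."""
        # Words never seen in training carry no evidence and are skipped.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for subcategory, n_docs in self.doc_counts.items():
            counts = self.word_counts[subcategory]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for w in words:
                score += math.log((counts[w] + 1) / denom)  # smoothed log likelihood
            scores[subcategory] = score
        return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: top category -> [(document text, subcategory)].
TRAINING_DATA = {
    "Science": [
        ("stars galaxies telescope orbit", "Astronomy"),
        ("planets orbit telescope moon", "Astronomy"),
        ("cells genes proteins evolution", "Biology"),
        ("genes species evolution cells", "Biology"),
    ],
    "Recreation": [
        ("football goal league match", "Sports"),
        ("hotel flight beach travel guide", "Travel"),
    ],
}

# Each top category is treated as a separate problem with its own classifier.
classifiers = {}
for top_category, documents in TRAINING_DATA.items():
    model = NaiveBayes()
    model.train(documents)
    classifiers[top_category] = model


def categorize(top_category, typed_text):
    """Rank the subcategories of the selected top category for the typed text."""
    return classifiers[top_category].rank(typed_text)
```

Calling categorize("Science", "a telescope pointed at distant galaxies") returns the subcategories of the selected top category ordered by score. Training a separate model per top category keeps each problem small, which is the design choice the overview describes; a real system would need proper tokenization and feature selection, and far more training data.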
More about Yahoo Planet
- Learning from text hierarchy
- Mladenic, D., Grobelnik, M. (1999)
Assigning keywords to documents using machine learning
(uncompressed)
Proceedings of the 10th International Conference on Information and Intelligent Systems IIS-99,
Varazdin, Croatia, September.
(abstract)
- Mladenic, D., Grobelnik, M. (1998)
Word sequences as features in text-learning
(uncompressed)
Proceedings of the Seventh Electrotechnical and Computer Sc. Conference ERK'98,
Ljubljana, Slovenia: IEEE section, 1998, pp. 145-148.
(abstract)
- Mladenic, D. (1998)
Turning Yahoo into an Automatic Web-Page Classifier
(uncompressed)
Proceedings of the 13th European Conference on Artificial Intelligence ECAI'98,
pp. 473-474.
(abstract)
- Grobelnik, M., Mladenic, D. (1998)
Efficient text categorization
(uncompressed)
Proceedings of the Text Mining Workshop at ECML-98,
Chemnitzer Informatik-Berichte 0947-5125, pp. 1-10.
(abstract)
- Feature selection
- Mladenic, D., Grobelnik, M. (1999)
Feature selection for unbalanced class distribution and Naive Bayes
(uncompressed)
Proceedings of the 16th International Conference on Machine Learning ICML-99.
Morgan Kaufmann Publishers, San Francisco, CA, pp. 258-267.
(abstract)
- Mladenic, D., Grobelnik, M. (1998)
Feature selection for classification based on text hierarchy
(uncompressed)
Working notes of Learning from Text and the Web, Conference on Automated Learning and Discovery CONALD-98.
Carnegie Mellon Univ., Pittsburgh, 6 pages.
(abstract)
- Mladenic, D. (1998)
Feature subset selection in text-learning
(uncompressed)
10th European Conference on Machine Learning ECML-98.
Springer, Berlin, pp. 95-100.
(abstract)
- Grobelnik, M., Mladenic, D. (1998)
Learning Machine: design and implementation
(uncompressed)
Technical Report IJS-DP-7824,
Department of Intelligent Systems, J.Stefan Institute,
January, 1998.
Project Members
This project is strongly related to the Personal WebWatcher project,
the Word-Mining project
and the PhD thesis project "Machine Learning on non-homogeneous, distributed text data".
Dunja Mladenic
Last modified: Fri Jun 29 20:21:01 EDT 2001