Description of Slovenian Lemmatizer

Purpose and Functionality

A web service offering lemmatization for the Slovenian language. The underlying lemmatization model was generated automatically using a machine learning approach, specifically Ripple Down Rules. Unlike standard if-then classification rules, the Ripple Down Rules representation resembles decision lists of the form if-then-else: new rules are added by creating except or else branches on the existing rules.
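
The except/else structure can be illustrated with a minimal sketch. It assumes a rule whose condition is a word suffix and whose conclusion replaces that suffix; the Rule class, the lemmatize function and the four toy suffix rules are hypothetical and hand-written for illustration. They are not taken from the published model, which is learned automatically and is far larger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One node of a (hypothetical) Ripple Down Rules tree."""
    suffix: str                           # condition: the word ends with this suffix
    replacement: str                      # conclusion: replace the suffix with this
    except_rule: Optional["Rule"] = None  # tried when the condition holds
    else_rule: Optional["Rule"] = None    # tried when the condition fails


def lemmatize(word: str, rule: Optional[Rule]) -> Optional[str]:
    """Return the lemma from the most specific rule that fires, or None."""
    if rule is None:
        return None
    if word.endswith(rule.suffix):
        # An exception that fires overrides this rule's own conclusion.
        refined = lemmatize(word, rule.except_rule)
        if refined is not None:
            return refined
        return word[: len(word) - len(rule.suffix)] + rule.replacement
    return lemmatize(word, rule.else_rule)


# Toy rules, invented for this example only:
#   default: keep the word unchanged
#     except if it ends in "e" -> replace with "a"
#         except if it ends in "je" -> keep as is
#     else if it ends in "i" -> replace with "a"
#         except if it ends in "ti" -> keep as is
TOY_RULES = Rule(
    "", "",
    except_rule=Rule(
        "e", "a",
        except_rule=Rule("je", "je"),
        else_rule=Rule("i", "a", except_rule=Rule("ti", "ti")),
    ),
)

for form in ["mize", "morje", "mizi", "delati", "miza"]:
    print(form, "->", lemmatize(form, TOY_RULES))
```

Each new exception refines an existing rule without changing it: adding the "je" branch fixes "morje" while leaving the behaviour of the "e" rule on other words untouched.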

Availability, Preconditions and Licensing

The service is available at http://nl2.ijs.si/analyze/. The model is publicly available and can be downloaded from http://nl2.ijs.si/analyze/lemRDR.tgz. The web service can be used in two ways: through the HTML page or via the SOAP protocol.

The tool is free for research purposes.

Publications

PLISSON, Joël, LAVRAČ, Nada, MLADENIĆ, Dunja. A rule based approach to word lemmatization. In: TRČEK, Denis (ed.), LIKAR, Borut (ed.), GROBELNIK, Marko (ed.), MLADENIĆ, Dunja (ed.), GAMS, Matjaž (ed.), BOHANEC, Marko (ed.). Zbornik C 7. mednarodne multi-konference Informacijska družba IS 2004, 9. do 15. oktober 2004, (Informacijska družba). Ljubljana: Institut “Jožef Stefan”, 2004, 83-86.

PLISSON, Joël, MLADENIĆ, Dunja, LAVRAČ, Nada, ERJAVEC, Tomaž. A lemmatization web service based on machine learning techniques. In: VETULANI, Zygmunt (ed.). 2nd Language & Technology Conference, April 21-23, 2005, Poznań, Poland. Human language technologies as a challenge for computer science and linguistics : in memory of Maurice Gross and Antonio Zampolli : proceedings. Poznań: Wydawnictwo Poznańskie Sp. z o.o., 2005, 369-372.