Description of Slovenian Lemmatizer
Purpose and Functionality
A web service offering lemmatization for the Slovenian language. The underlying lemmatization model was generated automatically using a machine learning approach, specifically Ripple Down Rules (RDR). Unlike standard if-then classification rules, the Ripple Down Rules representation resembles decision lists of the form if-then-else: new rules are added by attaching except or else branches to existing rules.
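To illustrate how except and else branches refine existing rules, here is a minimal Python sketch of a Ripple Down Rules evaluator. It is not the service's implementation and does not read the lemRDR model: the `Rule` class, the `lemmatize` function, and the suffix rules are hypothetical, and the rules are English toy examples chosen only for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One hypothetical RDR node: if the word ends with `suffix`, replace it."""
    suffix: str                        # condition: word ends with this suffix
    replacement: str                   # conclusion: new ending for the word
    except_: Optional["Rule"] = None   # tried when this rule fires; may override it
    else_: Optional["Rule"] = None     # tried when this rule does not fire


def lemmatize(word: str, rule: Optional[Rule]) -> str:
    """Walk the rule tree; the last rule that fires determines the lemma."""
    result = word
    while rule is not None:
        if word.endswith(rule.suffix):
            # Rule fires: record its conclusion, then look for an exception.
            result = word[:len(word) - len(rule.suffix)] + rule.replacement
            rule = rule.except_
        else:
            # Rule does not fire: fall through to the alternative.
            rule = rule.else_
    return result


# Toy rule tree (assumption, for illustration only):
#   default: leave the word unchanged
#     except: "-s" -> ""            (cats -> cat)
#       except: "-ies" -> "y"       (ponies -> pony)
#         else: "-ss" -> "ss"       (class -> class)
toy_rules = Rule("", "", except_=Rule(
    "s", "", except_=Rule(
        "ies", "y", else_=Rule("ss", "ss"))))

for w in ["cats", "ponies", "class", "dog"]:
    print(w, "->", lemmatize(w, toy_rules))
```

Each new rule is attached where an existing rule gave a wrong answer, so earlier rules never need to be rewritten; the "-ss" rule, for example, corrects the general "-s" rule only for words it would otherwise mishandle.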
Availability, Preconditions and Licensing
The service is available at http://nl2.ijs.si/analyze/. The model is publicly available and can be downloaded from http://nl2.ijs.si/analyze/lemRDR.tgz. The web service can be used in two ways: through the HTML page or via the SOAP protocol.
The tool is free for research purposes.
Publications
PLISSON, Joël, LAVRAČ, Nada, MLADENIĆ, Dunja. A rule based approach to word lemmatization. In: TRČEK, Denis (ed.), LIKAR, Borut (ed.), GROBELNIK, Marko (ed.), MLADENIĆ, Dunja (ed.), GAMS, Matjaž (ed.), BOHANEC, Marko (ed.). Zbornik C 7. mednarodne multi-konference Informacijska družba IS 2004, 9. do 15. oktober 2004, (Informacijska družba). Ljubljana: Institut “Jožef Stefan”, 2004, 83-86.
PLISSON, Joël, MLADENIĆ, Dunja, LAVRAČ, Nada, ERJAVEC, Tomaž. A lemmatization web service based on machine learning techniques. In: VETULANI, Zygmunt (ed.). 2nd Language & Technology Conference, April 21-23, 2005, Poznań, Poland. Human language technologies as a challenge for computer science and linguistics : in memory of Maurice Gross and Antonio Zampolli : proceedings. Poznań: Wydawnictwo Poznańskie Sp. z o.o., 2005, 369-372.