perstem - Persian stemmer and morphological analyzer
Persian (Farsi) stemmer, morphological analyzer, transliterator, and partial part-of-speech tagger. Input may be encoded as Perso-Arabic script UTF-8, ISIRI 3342, Windows-1256, SGML/HTML/XML-style numeric character references (ncr), or dehdari-transliterated Latin-script text. Use the -i flag to specify the input encoding. Output encoding is handled similarly.

Thanks to Jace Livingston, David Zajic, and Corey Miller for their comprehensive error analysis and other suggestions. Thanks to Jay Ritch and Artyom Lukanin for spotting bugs.
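
To illustrate what supporting several of the input encodings listed above involves, here is a minimal Python sketch that normalizes UTF-8, Windows-1256, and numeric character reference input to a Unicode string. It is not perstem's own code: the function name `to_unicode` and the encoding labels are hypothetical and chosen only for this example, and ISIRI 3342 and the dehdari transliteration are left out because Python's standard library has no codec for them.

```python
import html


def to_unicode(raw: bytes, encoding: str) -> str:
    """Decode raw input bytes into a Unicode string.

    The `encoding` labels are hypothetical names for this sketch,
    not values accepted by perstem's -i flag.
    """
    if encoding == "utf-8":
        return raw.decode("utf-8")
    if encoding == "windows-1256":
        # Python's standard codec name for Windows-1256 is "cp1256".
        return raw.decode("cp1256")
    if encoding == "ncr":
        # Numeric character references are plain ASCII, e.g. "&#1570;".
        return html.unescape(raw.decode("ascii"))
    raise ValueError(f"unsupported encoding: {encoding}")


# The same two-letter word (alef with madda, then beh) in three encodings.
word = "\u0622\u0628"
assert to_unicode(word.encode("utf-8"), "utf-8") == word
assert to_unicode(b"\xc2\xc8", "windows-1256") == word
assert to_unicode(b"&#1570;&#1576;", "ncr") == word
```

Decoding everything to one internal representation first means later stages, such as stemming, need to handle only a single form of the text.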