PsychoPy is an open-source package for creating psychology stimuli in Python (A real and free alternative to Matlab). PsychoPy combines the graphical strengths of OpenGL with the easy Python syntax to give psychophysics a free and simple stimulus presentation and control package. The goal is to provide, for the busy scientist (including me!), tools to control timing and windowing and a simple set of pre-packaged stimuli and methods. The code is platform independent, using Python and C libraries that are widely available.
science neuroscience experiment experimental-design experiment-control psychophysics psycholinguistics linguistics psychopy psychology
Curated list of Sentiment Analysis methods, implementations and misc. The goal of this repository is to provide adequate links for scholars who want to research in this domain; and at the same time, be sufficiently accessible for developers who want to integrate sentiment analysis into their applications.
sentiment-analysis awesome-list machine-learning deep-learning supervised-machine-learning nlp linguistics
Lingo is a linguistics module, currently providing inflection and some string transformations. Eventually I would like to extend its capabilities and add additional languages. Can be viewed here.
language linguistics inflection
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation). The library is divided into several packages and modules. It works on Python 2.7, as well as Python 3.
nlp computational-linguistics linguistics library folia machine-learning language-modelling search-algorithms evaluation-metrics text-processing nlp-library natural-language-processing
The lingtypology package connects R with the Glottolog database (v. 2.7) and provides additional functionality for linguistic mapping. The Glottolog database contains the catalogue of the world's languages. This package helps researchers to make linguistic maps, following the philosophy of the Cross-Linguistic Linked Data project, which is uniform access to the data across publications. The package is based on the leaflet package, so lingtypology is a package for interactive linguistic mapping. Installation sometimes fails because the crosstalk package is missing. In that case, install it with the command install.packages("crosstalk").
clld linguistics typology linguistic-maps autotype wals phoible glottolog-database afbo sails abvd
PROSODIC is a metrical-phonological parser written in Python. Currently, it can parse English and Finnish text, but adding additional languages is easy with a pronunciation dictionary or a custom Python function. PROSODIC was built by Ryan Heuser, Josh Falk, and Arto Anttila, beginning in the summer of 2010. Josh also maintains another repository, in which he has rewritten the part of this project that does phonetic transcription for English and Finnish. Sam Bowman has contributed to the codebase as well, adding several new metrical constraints. PROSODIC does two main things. First, it tokenizes text into words, and then converts each word into its stressed, syllabified, phonetic transcription. Second, if desired, it finds the best available metrical parse for each line of text. In the style of Optimality Theory, (almost) all logically possible parses are attempted, but the best parses are those that least violate a set of user-defined constraints. The default metrical constraints are those proposed by Kiparsky and Hanson in their paper "A Parametric Theory of Poetic Meter" (Language, 1996). See below for how these and other constraints are implemented.
metrical-parser linguistics nlp finnish-language-analysis poetry rhythm
The Wiki for this repository is used as the general developer Wiki for Voikko.
linguistics spelling dictionary library
Resources for conservation, development, and documentation of low resource (human) languages. According to some estimates, half of the ~7,000 currently spoken languages are expected to become extinct this century (Krauss 1992). However, there is a lot of work by academics, independent scholars, organizations, communities, and individuals which goes towards stopping or slowing this trend. This list is intended to provide a list of open source code that would be useful for documenting, conserving, developing, preserving, or working with endangered languages.
endangered-languages natural-language language-resources human-language natural-language-processing language-learning language-documentation resourced-languages awesome-list awesome list minority-language low-resource-languages lrls nlp endangered languages linguistics low-resource resources
Skipgram and flexgram extraction are computationally more demanding but have been implemented with similar optimisations. Skipgrams are computed by abstracting over n-grams, and flexgrams in turn are computed either by abstracting over skipgrams, or directly from n-grams on the basis of co-occurrence information (pointwise mutual information). At the heart of the software is the notion of pattern models. The core tool, to be used from the command line, is colibri-patternmodeller, which enables you to build pattern models, generate statistical reports, query for specific patterns and relations, and manipulate models.
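As a rough illustration of what "abstracting over n-grams" can mean, the sketch below derives skipgrams from a single n-gram by replacing its interior tokens with a gap marker. It is plain Python and does not use the tool's own API; the function names and the `{*}` gap marker are hypothetical, and adjacent gaps are kept as separate markers for simplicity.

```python
from itertools import combinations

GAP = "{*}"  # hypothetical gap marker, chosen only for this sketch


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams_from_ngram(ngram):
    """Abstract over one n-gram.

    Every non-empty subset of the interior positions (everything except
    the first and last token) is replaced by a gap, yielding one skipgram
    per subset. N-grams shorter than 3 tokens have no interior and
    therefore produce no skipgrams.
    """
    interior = range(1, len(ngram) - 1)
    result = []
    for size in range(1, len(ngram) - 1):
        for positions in combinations(interior, size):
            result.append(
                tuple(GAP if i in positions else tok
                      for i, tok in enumerate(ngram))
            )
    return result


if __name__ == "__main__":
    for gram in ngrams("to be or not to be".split(), 4):
        print(gram, "->", skipgrams_from_ngram(gram))
```

A real implementation would also count how often each skipgram occurs and apply an occurrence threshold; that bookkeeping is left out here.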
c-plus-plus nlp ngrams skipgram ngram corpus linguistics library text-processing computational-linguistics pattern-recognition
FLAT is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. FLAT allows users to view annotated FoLiA documents and enrich these documents with new annotations. A wide variety of linguistic annotation types is supported through the FoLiA paradigm. It is a document-centric tool that fully preserves and visualises document structure. FLAT is written in Python using the Django framework. The user interface is written in JavaScript with jQuery. The FoLiA Document Server (https://github.com/proycon/foliadocserve), the back-end of the system, is written in Python with CherryPy and is used as a RESTful webservice.
nlp annotation-tool web-application folia computational-linguistics linguistics
FoLiA is an XML-based annotation format, suitable for the representation of linguistically annotated language resources. FoLiA’s intended use is as a format for storing and/or exchanging language resources, including corpora. Our aim is to introduce a single rich format that can accommodate a wide variety of linguistic annotation types through a single generalised paradigm. We do not commit to any label set, language or linguistic theory. This is always left to the developer of the language resource, and provides maximum flexibility. XML is an inherently hierarchic format. FoLiA does justice to this by maximally utilising a hierarchic, inline setup. We inherit from the D-Coi format, which is itself described as loosely based on a minimal subset of TEI. Because of the introduction of a new and much broader paradigm, FoLiA is not backwards-compatible with D-Coi, i.e. validators for D-Coi will not accept FoLiA XML. It is however easy to convert FoLiA to less complex or verbose formats such as the D-Coi format, or plain text. Converters are provided.
nlp computational-linguistics xml file-format linguistics corpus language library folia
This is the R package to support phonetic spelling algorithms in R. Several packages provide the Soundex algorithm. However, other algorithms have been developed since Soundex that can also provide phonetic spelling and test phonetic similarity. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. In particular, it used the Comet system at the San Diego Supercomputing Center (SDSC) through allocations TG-DBS170012 and TG-ASC150024.
phonetic-spelling-algorithms soundex phonics nysiis metaphone text-processing linguistics record-linkage
poem-gen is a poem generator created for NaNoGenMo 2014 by Camden Segal. It uses large source texts from Project Gutenberg to make poems. The source texts are converted into word maps, in which each word is linked with all words that are used before it, so the generator can imitate the usage of the word from the source text.
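The word-map idea can be sketched in a few lines of Python. This is only an illustration of the data structure described above, not the project's actual code: the function names are hypothetical, tokenization is a naive whitespace split, and building a line backwards from its last word is an assumption made for this sketch.

```python
import random
from collections import defaultdict


def build_word_map(text):
    """Map each word to the list of words that appear directly before it."""
    words = text.split()
    word_map = defaultdict(list)
    for previous, word in zip(words, words[1:]):
        word_map[word].append(previous)
    return word_map


def generate_line(word_map, last_word, length, rng=None):
    """Build a line backwards from last_word.

    At each step a recorded predecessor of the current first word is
    chosen at random, so word sequences mirror usage in the source text.
    Generation stops early when a word has no recorded predecessor.
    """
    rng = rng or random.Random()
    line = [last_word]
    while len(line) < length and word_map.get(line[0]):
        line.insert(0, rng.choice(word_map[line[0]]))
    return " ".join(line)


if __name__ == "__main__":
    source = "the cat sat on the mat and the dog sat on the rug"
    print(generate_line(build_word_map(source), "rug", 5))
```

Because predecessors are stored with repetition, words that precede a given word more often in the source are proportionally more likely to be picked.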
poem computational linguistics language novelty
If you have any one-liner cool facts that aren't already here, submit a PR so we both learn! Include your citations inline as link(s). Unfortunately, according to GitHub, I cannot stop you from forking this repository. So, whatever licence that is, whatever licence this inherits.
agile ansible anthropology apple archery aws biology business chemistry coffeescript django docker driving es6 flask random korean linguistics women
You have now defeated English class.
linguistics language text-analysis summarization
Resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. There is no centralized list of open-source code that would be useful for documenting, conserving, developing, preserving, or working with endangered languages. According to some estimates, half of the ~7,000 currently spoken languages are expected to become extinct this century (Wikipedia). However, there is a lot of work by academics, independent scholars, organizations, communities, and individuals which goes towards stopping or slowing this trend. This list is intended to provide a central location to document those efforts.
endangered-languages natural-language language-resources human-language natural-language-processing language-learning language-documentation resourced-languages awesome-list awesome list minority-language endangered languages linguistics low-resource resources
Expletives vomiting library...
linguistics nlp expletives bad-words vulgarities
Current documentation can be viewed at https://textgridtools.readthedocs.io/en/stable/.
praat textgrid elan annotation data-analysis linguistics
This gem allows the user to generate words for constructed languages, given a LANG file that describes the language. It can also be useful for linguists who want to study and generate valid words from a described language. The *.lang file must include sets of phonemes with their individual probability weights, and a grammatical expression that describes how to generate words for the language.
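The combination of weighted phoneme sets and a word-shape expression can be illustrated with a short Python sketch. It does not reproduce the gem's LANG file syntax or its Ruby API: the `PHONEMES` table, the pattern string, and the function name are all hypothetical, and the "grammatical expression" is reduced here to a plain sequence of set names.

```python
import random

# Hypothetical phoneme sets: each set name maps phonemes to relative weights.
PHONEMES = {
    "C": {"p": 3, "t": 3, "k": 2, "m": 1},
    "V": {"a": 4, "i": 2, "u": 1},
}


def generate_word(pattern, phonemes, rng=None):
    """Generate one word by drawing a weighted phoneme for each pattern symbol.

    pattern is a string of set names such as "CVCV"; every symbol must be
    a key of the phonemes table.
    """
    rng = rng or random.Random()
    word = []
    for symbol in pattern:
        options = phonemes[symbol]
        chosen = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        word.append(chosen)
    return "".join(word)


if __name__ == "__main__":
    for _ in range(5):
        print(generate_word("CVCV", PHONEMES))
```

A fuller expression language would add optional and alternative elements, for example an optional final consonant, but the weighted draw per slot stays the same.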
conlangs constructed-language language phonemes linguistics esperanto