Dec 20, 2018
Jacques Fize, 2018, "BVLAC corpus - Extracted Data", doi:10.18167/DVN1/8LIG1D, CIRAD Dataverse, V1
As part of the SONGES project on matching massive, heterogeneous textual data, we are developing data representation models as well as similarity measures based on indicators found in the texts (thematic, spatial and te... |
Dec 14, 2018
Fize, Jacques, 2018, "Données pour l'évaluation de méthodes de géocodage", doi:10.18167/DVN1/KH7YTO, CIRAD Dataverse, V1
This dataset contains the toponyms of various documents and their references in the Geodict gazetteer (http://dx.doi.org/10.18167/DVN1/MWQQOQ). The documents used come from two text sources: PadiWeb and AgroMada... |
Sep 29, 2018
Coste, Caroline; Roche, Mathieu; Falala, Sylvain; Touré, Ibra; Bonnet, Pascal, 2018, "Corpus en anglais sur la Mobilité", doi:10.18167/DVN1/GGBWWL, CIRAD Dataverse, V1
Questions related to "mobility" are clearly multidisciplinary within the Social Sciences (migration, demography, etc.) but also in other fields such as Health (for example, epidemiological risks) or Agriculture (for example, studies related to... |
Aug 20, 2018
Roche, Mathieu; Arsevska, Elena, 2018, "PADI-web: ASF corpora", doi:10.18167/DVN1/POIZMA, CIRAD Dataverse, V4
Both corpora (news articles) were manually collected with Google using the query "african swine fever outbreak". These English-language corpora have been semi-automatically normalized. They can be used as (a) input to the BioTex tool in order to extract terminology, (b) input to Weka... |
May 29, 2018
Arsevska, Elena; Valentin, Sarah; Rabatel, Julien; de Goër de Hervé, Jocelyn; Falala, Sylvain; Lancelot, Renaud; Roche, Mathieu, 2017, "PADI-web dataset manually evaluated [1st January - 28th June 2016]", doi:10.18167/DVN1/JZM34U, CIRAD Dataverse, V2
Data were downloaded from PADI-web for the period from 1st January to 28th June 2016 for the four diseases studied, i.e. African swine fever (ASF), foot-and-mouth disease (FMD), bluetongue (BTV), and avian influenza (AI). This dataset indicates information associated with "extract... |
Mar 6, 2018
Fize, Jacques; Gaurav, Shrivastava, 2017, "Geodict: an integrated gazetteer", doi:10.18167/DVN1/MWQQOQ, CIRAD Dataverse, V3
[EN] Geodict is a gazetteer in which 12 million spatial entities are referenced. Each entry is associated with basic yet detailed information such as multilingual labels, boundary polygons, coordinates, class, etc. Geodict data are extracted from well-known datasets: Geonames, Wiki... |
Dec 18, 2017
Rabatel, Julien; Arsevska, Elena; de Goër de Hervé, Jocelyn; Falala, Sylvain; Lancelot, Renaud; Roche, Mathieu, 2017, "PADI-web corpus: news manually labeled", doi:10.18167/DVN1/KMTIFG, CIRAD Dataverse, V2
This dataset contains a set of English-language news articles related to animal disease outbreaks that were used to train and evaluate the information extraction module of the PADI-web system (http://epia.clermont.inra.fr/vsi). It is composed of 532 articles (in JSON), with infor... |
Sep 19, 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Terms extracted with Biotex", doi:10.18167/DVN1/PGQGQL, CIRAD Dataverse, V1
Text-Mining: Terms extracted with Biotex tool (http://tubo.lirmm.fr/biotex) from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents for identific... |
Sep 19, 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Terms extracted with Rake", doi:10.18167/DVN1/YGYL3W, CIRAD Dataverse, V1
Text-Mining: Terms extracted with Rake tool (https://github.com/aneesha/RAKE) from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents for identif... |
Sep 19, 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Fusion of terms extracted with Biotex and Fastr", doi:10.18167/DVN1/CFBIYD, CIRAD Dataverse, V1
Text-Mining: Fusion of terms extracted with Biotex and Fastr from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents for identification of proces... |