20 Aug. 2018
Roche, Mathieu; Arsevska, Elena, 2018, "PADI-web: ASF corpora", https://doi.org/10.18167/DVN1/POIZMA, CIRAD Dataverse, V4
Both corpora (news articles) were manually collected with Google using the query "african swine fever outbreak". These English-language corpora have been semi-automatically normalized. They can be used as (a) input to the BioTex tool in order to extract terminology, (b) input to Weka... |
29 May 2018
Arsevska, Elena; Valentin, Sarah; Rabatel, Julien; de Goër de Hervé, Jocelyn; Falala, Sylvain; Lancelot, Renaud; Roche, Mathieu, 2017, "PADI-web dataset manually evaluated (1st January - 28th June 2016)", https://doi.org/10.18167/DVN1/JZM34U, CIRAD Dataverse, V2
Data are downloaded from PADI-web for the period from 1st January to 28th June 2016 for the four studied diseases, i.e. African swine fever (ASF), foot-and-mouth disease (FMD), bluetongue (BTV), and avian influenza (AI). This dataset indicates information associated with "extract... |
6 March 2018
Fize, Jacques; Gaurav, Shrivastava, 2017, "Geodict: an integrated gazetteer", https://doi.org/10.18167/DVN1/MWQQOQ, CIRAD Dataverse, V3
[EN] Geodict is a gazetteer that references 12 million spatial entities. Each entry is associated with basic yet detailed information such as multilingual labels, boundary polygons, coordinates, class, etc. Geodict data are extracted from well-known datasets: Geonames, Wiki... |
18 Dec. 2017
Rabatel, Julien; Arsevska, Elena; de Goër de Hervé, Jocelyn; Falala, Sylvain; Lancelot, Renaud; Roche, Mathieu, 2017, "PADI-web corpus: news manually labeled", https://doi.org/10.18167/DVN1/KMTIFG, CIRAD Dataverse, V2
This dataset contains a set of English-language news articles related to animal disease outbreaks, which were used to evaluate and train the information extraction module of the PADI-web system (http://epia.clermont.inra.fr/vsi). It is composed of 532 articles (in JSON), with infor... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Terms extracted with Biotex", https://doi.org/10.18167/DVN1/PGQGQL, CIRAD Dataverse, V1
Text-Mining: Terms extracted with Biotex tool (http://tubo.lirmm.fr/biotex) from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents for identific... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Terms extracted with Rake", https://doi.org/10.18167/DVN1/YGYL3W, CIRAD Dataverse, V1
Text-Mining: Terms extracted with Rake tool (https://github.com/aneesha/RAKE) from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Mining of scientific documents for identification of... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Fusion of terms extracted with Biotex and Fastr", https://doi.org/10.18167/DVN1/CFBIYD, CIRAD Dataverse, V1
Text-Mining: Fusion of terms extracted with Biotex and Fastr from "Valorcarn Corpus" (http://dx.doi.org/10.18167/DVN1/7YTQGQ). -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scientific documents for identification of proces... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Variations of terms extracted with Fastr (driven extraction)", https://doi.org/10.18167/DVN1/LPBHWP, CIRAD Dataverse, V1
Text mining: Extraction of term variations. Input: (1) list of terms, (2) corpus ("Valorcarn Corpus" - http://dx.doi.org/10.18167/DVN1/7YTQGQ). For instance, with "biltong samples", we obtain "biltong spice sample", "samples to produce biltong", etc. -- Valorcarn Pro... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Semantic groups of terms", https://doi.org/10.18167/DVN1/0WEHKT, CIRAD Dataverse, V1
Text-Mining: The extracted terms are grouped according to the head (first and last words), e.g. (1) food consumption / food pathogen / food preservation, (2) spoiled biltong / venison biltong / wet biltong, and so forth. -- Valorcarn Project (2015-2017) [project supported by GloFoo... |
19 Sept. 2017
Roche, Mathieu; Teisseire, Maguelonne; Shrivastava, Gaurav, 2017, "Valorcarn-TETIS: Candidates for OTR (Ontological and Terminological Resource)", https://doi.org/10.18167/DVN1/KNFAGG, CIRAD Dataverse, V1
Text Mining: The different terms extracted by text-mining approaches are candidates for an OTR (Ontological and Terminological Resource) associated with the Valorcarn Project. -- Valorcarn Project (2015-2017) [project supported by GloFoodS program (INRA-Cirad)]. Topic: Mining of scient... |