POSBIOTM/W: A Development Workbench for Machine Learning Oriented Biomedical Text Mining System
Abstract
POSBIOTM/W is a workbench for a machine-learning-oriented biomedical text mining system. It is intended to help biologists mine useful information efficiently from biomedical text resources. To that end, it provides a suite of tools for gathering, managing, analyzing, and annotating texts. The workbench is implemented in Java and is therefore platform-independent.
Similar resources
POSBIOTM-NER: a trainable biomedical named-entity recognition system
SUMMARY POSBIOTM-NER is a trainable biomedical named-entity recognition system. POSBIOTM-NER can be automatically trained and adapted to new datasets without performance degradation, using CRF (conditional random field) machine learning techniques and automatic linguistic feature analysis. Currently, we have trained our system on three different datasets. GENIA-NER was trained based on GENIA Co...
Advances in the Witchcraft Workbench Project
The Workbench for Intelligent exploraTion of Human ComputeR conversaTions is a new platform-independent open-source workbench designed for the analysis, mining and management of large spoken dialogue system corpora. What makes Witchcraft unique is its ability to visualize the effect of classification and prediction models on ongoing system-user interactions. Witchcraft is now able to handle pre...
POSBIOTM-NER in the Shared Task of BioNLP/NLPBA2004
Two classifiers, Support Vector Machine (SVM) and Conditional Random Fields (CRFs), are applied here for the recognition of biomedical named entities. According to their different characteristics, the results of the two classifiers are merged to achieve better performance. We propose an automatic corpus expansion method for SVM and CRF to overcome the shortage of the annotated training data. In addi...
Development of bespoke machine learning and biocuration workflows in a BioC-supporting text mining workbench
As part of our participation in the Collaborative Biocurator Assistant Task of BioCreative V, we developed methods and tools for recognising and normalising mentions denoting genes/proteins and organisms. A combination of different approaches was used in addressing these tasks. The recognition of gene/protein and organism names was cast as a sequence labelling problem to which the conditional ...
Searching for High-Utility Text in the Biomedical Literature
Much current research is concerned with extracting biomedical facts from text, so far with relatively modest results. Our work is motivated by the idea that text mining can be improved, if the system could first identify text regions that are rich in scientific content, retrieve documents that have many such regions, and focus on fact extraction from these regions. We call these parts of the te...