ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree
نویسندگان
چکیده
منابع مشابه
ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree
ProtoNet 6.0 (http://www.protonet.cs.huji.ac.il) is a data structure of protein families that cover the protein sequence space. These families are generated through an unsupervised bottom-up clustering algorithm. This algorithm organizes large sets of proteins in a hierarchical tree that yields high-quality protein families. The 2012 ProtoNet (Version 6.0) tree includes over 9 million proteins ...
متن کاملProtoNet 4.0: A hierarchical classification of one million protein sequences
ProtoNet is an automatic hierarchical classification of the protein sequence space. In 2004, the ProtoNet (version 4.0) presents the analysis of over one million proteins merged from SwissProt and TrEMBL databases. In addition to rich visualization and analysis tools to navigate the clustering hierarchy, we incorporated several improvements that allow a simplified view of the scaffold of the pr...
متن کاملProtoNet: hierarchical classification of the protein space
The ProtoNet site provides an automatic hierarchical clustering of the SWISS-PROT protein database. The clustering is based on an all-against-all BLAST similarity search. The similarities' E-score is used to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a...
متن کاملProtoNet : Navigating the Hierarchical Clustering of the Protein Space
The ProtoNet site provides an automatic hierarchical clustering of the protein space. The clustering is based on an all-against-all BLAST similarity test. With this similarity measure we proceed to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a hierarchy...
متن کاملPredicting fold novelty based on ProtoNet hierarchical classification
MOTIVATION Structural genomics projects aim to solve a large number of protein structures with the ultimate objective of representing the entire protein space. The computational challenge is to identify and prioritize a small set of proteins with new, currently unknown, superfamilies or folds. RESULTS We develop a method that assigns each protein a likelihood of it belonging to a new, yet und...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Nucleic Acids Research
سال: 2011
ISSN: 0305-1048,1362-4962
DOI: 10.1093/nar/gkr1027