More about Best Life Science Database of All Time:
Best Life Science Database of All Time is a public top list created by Listnerd on rankly.com on November 27th 2012. Items on the Best Life Science Database of All Time top list are added by the rankly.com community and ranked using our secret ranking sauce. Best Life Science Database of All Time has gotten 219 views and has gathered 100 votes from 100 voters.
Best Life Science Database of All Time is a top list in the General category on rankly.com.
The goal of this effort is to demonstrate the ability of RDF Gateway to efficiently store and query massive amounts of RDF data in its native RDF repository. The Uniprot RDF project provides all UniProt protein sequence and annotation data in RDF format and is an excellent large source of data.
SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information includes side effect frequency, drug and side effect classifications as well as links to further information, for example drug–target relations.
FlyMine is an integrated database of genomic, expression and protein data for Drosophila, Anopheles and C. elegans. Integrating data makes it possible to run sophisticated data mining queries that span domains of biological knowledge.
FlyMine is under continued development by a team of software developers and biologists in the Cambridge Systems Biology Centre at Cambridge University.
CardioSHARE is a unique framework for querying distributed data and performing data analysis using Semantic Web standards. CardioSHARE's two main innovations are an enhancement to a standard SPARQL query engine, which enables the required data to be retrieved dynamically from Web Services; and the ability to use OWL class restrictions to drive the discovery and execution of Web Services capable of generating that class's defining properties, thus allowing naive data to be "lifted" into more complex OWL classifications. Both of these behaviours are accomplished by mapping predicates onto Web Services capable of producing RDF data that satisfy those predicates. Our initial focus has been on integration with the BioMoby project: a set of 1500+ interoperable bioinformatics web services. CardioSHARE effectively brings this established pool of resources into conformance with Semantic Web standards. Given that much of the data from CardioSHARE is generated dynamically based on analysis of incoming query data, the effective size of the "virtual" triplestore is unmeasurable; it is limited only by the number of conceivable inputs.
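The idea of mapping predicates onto services that produce matching RDF data can be illustrated with a toy resolver. This is only a sketch under stated assumptions: the predicate names, the `SERVICES` registry and the stand-in functions are hypothetical and are not CardioSHARE or BioMoby APIs; in a real deployment each function would be a remote Web Service call.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-ins for remote Web Services: each takes a subject
# and returns the objects it can compute for one predicate.
def fake_length_service(subject: str) -> List[str]:
    return [str(len(subject))]

def fake_uppercase_service(subject: str) -> List[str]:
    return [subject.upper()]

# Registry mapping predicates onto the services that can satisfy them.
SERVICES: Dict[str, Callable[[str], List[str]]] = {
    "ex:hasLength": fake_length_service,
    "ex:hasUppercaseLabel": fake_uppercase_service,
}

def resolve(subject: str, predicate: str, store: List[Triple]) -> List[Triple]:
    """Return triples matching (subject, predicate, ?o).

    Triples already in the store are used first; otherwise the service
    registered for the predicate generates them on demand.
    """
    known = [t for t in store if t[0] == subject and t[1] == predicate]
    if known:
        return known
    service = SERVICES.get(predicate)
    if service is None:
        return []
    generated = [(subject, predicate, obj) for obj in service(subject)]
    store.extend(generated)  # cache so repeated queries reuse the result
    return generated

store: List[Triple] = []
result = resolve("ACGT", "ex:hasLength", store)
```

Because triples are produced only when a query asks for them, the store starts empty and grows with the inputs it sees, which is the sense in which such a triplestore is "virtual".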
DBpedia is a project aiming to extract structured content from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web. DBpedia allows users to query relationships and properties associated with Wikipedia resources, including links to other related datasets. DBpedia has been described by Tim Berners-Lee as one of the more famous parts of the Linked Data project.
The project was started by people at the Free University of Berlin and the University of Leipzig, in collaboration with OpenLink Software, and the first publicly available dataset was published in 2007. It is made available under free licences, allowing others to reuse the dataset.
Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables, categorisation information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.
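Querying such a dataset is typically done with SPARQL over HTTP. The following is a minimal sketch that only builds the request URL and does not send it; the endpoint address, the example resource and the `LIMIT` value are assumptions chosen for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the endpoint URL; substitute the endpoint you actually use.
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: the resource and LIMIT are arbitrary choices.
QUERY = """
SELECT ?property ?value
WHERE { <http://dbpedia.org/resource/Drosophila> ?property ?value }
LIMIT 10
""".strip()

def build_query_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as a GET request URL (query= parameter)."""
    return endpoint + "?" + urlencode({"query": query})

url = build_query_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The result format would normally be selected with an HTTP `Accept` header rather than in the URL.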
As of September 2011, the DBpedia dataset describes more than 3.64 million things, out of which 1.83 million are classified in a consistent ontology, including
BioPortal SPARQL is a service to query biomedical ontologies using the SPARQL standard. Ontologies have been transformed into RDF triples from their original formats (OWL, OBO and UMLS/RRF, ...) and asserted into a triple store. This service provides programmatic access to that triple store.
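Programmatic access to a SPARQL service usually means parsing its result document. The sketch below flattens a hand-written sample in the W3C SPARQL Query Results JSON format; the sample data, the variable names and the `rows` helper are illustrative assumptions, not output from BioPortal.

```python
import json
from typing import Dict, List

# A hand-written sample in the W3C SPARQL Query Results JSON format.
# The class URIs and label are made up for illustration.
SAMPLE = """
{
  "head": {"vars": ["cls", "label"]},
  "results": {"bindings": [
    {"cls": {"type": "uri", "value": "http://example.org/onto/C1"},
     "label": {"type": "literal", "value": "example class"}},
    {"cls": {"type": "uri", "value": "http://example.org/onto/C2"}}
  ]}
}
"""

def rows(result_json: str) -> List[Dict[str, str]]:
    """Flatten bindings into plain dicts; unbound variables are omitted."""
    doc = json.loads(result_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]

parsed = rows(SAMPLE)
```

Note that a variable may be missing from a binding (as `label` is in the second row), so callers should use `dict.get` rather than indexing.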
Cyanobacteria carry a complete set of genes for oxygenic photosynthesis, which is the most fundamental life process on Earth. These organisms are also interesting from an evolutionary viewpoint, for they arose in a very ancient age and have survived in various environments. The chloroplast is believed to have evolved from cyanobacterial ancestors which developed an endosymbiotic relationship with a eukaryotic host cell.
CyanoBase provides an easy way of accessing the sequences and all-inclusive annotation data on the structures of the cyanobacterial genomes. This database was originally developed by Makoto Hirosawa, Takakazu Kaneko and Satoshi Tabata, and the current version of CyanoBase has been developed and maintained by Mitsuteru Nakao, Shinobu Okamoto, Takatomo Fujisawa, Yasukazu Nakamura, Takakazu Kaneko, and Satoshi Tabata at Kazusa DNA Research Institute.
STITCH is a resource to explore known and predicted interactions of chemicals and proteins. Chemicals are linked to other chemicals and proteins by evidence derived from experiments, databases and the literature.
STITCH contains interactions for over 74,000 small molecules and over 2.5 million proteins in 630 organisms.
The ChEMBL team's research focuses on mapping the interactions and functional effects of small molecules binding to their macromolecular targets.
The group studies the interactions of pharmacologically active molecules and their receptors. In particular the group builds and maintains a series of drug discovery databases that are components of ChEMBL.
The Gene Ontology, or GO, is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to:
The GO is part of a larger classification effort, the Open Biomedical Ontologies (OBO).
There is no universal standard terminology in biology and related domains, and term usages may be specific to a species, research area or even a particular research group. This makes communication and sharing of data more difficult. The Gene Ontology project provides an ontology of defined terms representing gene product properties. The ontology covers three domains:
Each GO term within the ontology has a term name, which may be a word or string of words; a unique alphanumeric identifier; a definition with cited sources; and a namespace indicating the domain to which it belongs. Terms may also have synonyms, which are classed as being exactly equivalent to the term name, broader, narrower, or related; references to equivalent concepts in other databases; and comments on term meaning or usage. The GO ontology is structured as a directed acyclic graph, and each term has defined relationships to one or more other
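A directed acyclic graph differs from a simple tree in that a term may have several parents. The sketch below shows one way to collect all ancestors of a term in such a graph; the term names and the `PARENTS` table are placeholders, not real GO identifiers or relationships.

```python
from typing import Dict, Set

# Toy parent links (child -> parents). The term names are placeholders,
# not real GO identifiers.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A", "B"},   # a term may have more than one parent
    "D": {"C"},
}

def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every term reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen

d_ancestors = ancestors("D", PARENTS)
```

The `seen` set ensures that a shared ancestor such as `root` is visited once even though it is reachable through both `A` and `B`.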
A grand challenge in the post-genomic era is a complete computer representation of the cell, the organism, and the biosphere, which will enable computational prediction of higher-level complexity of cellular processes and organism behaviors from genomic and molecular information. Towards this end we have been developing a bioinformatics resource named KEGG as part of the research projects of the Kanehisa Laboratories in the Bioinformatics Center of Kyoto University and the Human Genome Center of the University of Tokyo.
Affymetrix' GeneChip® technology was invented in the late 1980s by a team of scientists led by Stephen P.A. Fodor, Ph.D. The theory behind their work was revolutionary: a notion that semiconductor manufacturing techniques could be united with advances in combinatorial chemistry to build vast amounts of biological data on a small glass chip. This technology became the basis of a new company, Affymetrix, formed as a division of Affymax, N.V. in 1991. Affymetrix began operating independently in 1992.
Chempedia is a free service for uniquely identifying and naming chemical substances. If you or your organization work with chemical substances and would like a convenient way to keep track of them in spreadsheets, wikis, web pages and other databases, Chempedia can help. If you just have a substance name, you can use Chempedia to find what's known about it.
Community Created and Reviewed
Chempedia is created and maintained by volunteers worldwide. We take quality very seriously. That's why all content is subjected to a streamlined form of peer-review that borrows from the best practices of modern social media.
Chempedia is as much about the people using it as the data it contains. Interested in knowing who submitted or named a substance? We make it easy to find other chemists likely to share your interests.
Free to Use and Adapt
We think information created by the community belongs to the community. That's why all information contained in Chempedia is free to download, use, and adapt.
Chemical Entities of Biological Interest, also known as ChEBI, is a database and ontology of molecular entities focused on 'small' chemical compounds, that is part of the Open Biomedical Ontologies effort. The term "molecular entity" refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity". The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms. Molecules directly encoded by the genome, such as nucleic acids, proteins and peptides derived from proteins by proteolytic cleavage, are not as a rule included in ChEBI.
ChEBI uses nomenclature, symbolism and terminology endorsed by the International Union of Pure and Applied Chemistry (IUPAC) and Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB).
All data in the database is non-proprietary or is derived from a non-proprietary source. It is thus freely accessible and available to anyone. In addition, each data item is fully traceable and explicitly referenced to the original
The Bio2RDF project is a tool to convert bioinformatics data and knowledge bases to RDF format. It is a kind of generalized rdfizer for bioinformatics applications, and it is a place for the semantic web life science community to develop and grow.
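What an "rdfizer" does can be sketched as turning a flat record into N-Triples lines. This is a simplified illustration, not Bio2RDF code: the base URI, the `namespace:identifier` naming pattern, the vocabulary suffix and the sample record are all assumptions made for the example.

```python
from typing import Dict, List

# Assumption: a base URI and "namespace:identifier" naming pattern chosen
# for illustration; check the project's own conventions before reuse.
BASE = "http://example.org/"

def escape_literal(text: str) -> str:
    """Escape characters that N-Triples requires escaping in literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def rdfize(namespace: str, record: Dict[str, str]) -> List[str]:
    """Turn one flat record with an 'id' field into N-Triples lines."""
    subject = f"<{BASE}{namespace}:{record['id']}>"
    lines = []
    for field, value in record.items():
        if field == "id":
            continue
        predicate = f"<{BASE}{namespace}_vocabulary:{field}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines

# Hypothetical record; the field names are invented.
triples = rdfize("gene", {"id": "42", "symbol": "abc1", "note": 'the "x" gene'})
```

Every value is emitted as a plain string literal here; a fuller converter would also emit URIs for cross-references so that records from different sources link to each other.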
BioPAX is a collaborative effort to create a data exchange format for biological pathway data.
BioPAX Level 3 covers metabolic pathways, molecular interactions, signaling pathways (including molecular states and generics), gene regulation and genetic interactions. BioPAX Level 3 is currently under development and review by pathway databases and is scheduled for release by mid-2008.
BioCyc is a collection of 371 Pathway/Genome Databases. Each Pathway/Genome Database in the BioCyc collection describes the genome and metabolic pathways of a single organism, with the exception of the MetaCyc database, which is a reference source on metabolic pathways from many organisms.
To learn more about BioCyc, read the Introduction to BioCyc or watch our free online instructional videos.
The BioCyc databases are divided into three tiers, based on their quality.
BioCyc Tier 1: Intensively Curated Databases
EcoCyc: Escherichia coli K-12
MetaCyc: Metabolic pathways and enzymes from more than 900 organisms
The BioCyc Open Chemical Database is also an intensively curated database. It is an open database of chemical compounds from other BioCyc databases. Because it contains chemical compounds only, it is not a Pathway/Genome Database.
BioCyc Tier 2: Computationally-Derived Databases Subject to Moderate Curation
20 databases are available.
BioCyc Tier 3: Computationally-Derived Databases Subject to No Curation
349 databases are available and ready for adoption by interested scientists for curation and updating. PGDBs in Tier 3 were produced as a collaboration between the groups of Peter D. Karp at SRI International and Christos Ouzounis at the European Bioinformatics Institute.
Observe how genes interact in dynamic graphical models. Our online maps depict molecular relationships from areas of active research. In an "open source" approach, this community-fed forum constantly integrates emerging proteomic information from the scientific community. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species. Find both classical pathways as well as current suggestions for new
The European Bioinformatics Institute (EBI) is a centre for research and services in bioinformatics, and is part of European Molecular Biology Laboratory (EMBL).
The roots of the EMBL-EBI lie in the EMBL Nucleotide Sequence Data Library (now known as EMBL-Bank), which was established in 1980 at the EMBL laboratories in Heidelberg, Germany and was the world's first nucleotide sequence database. The original goal was to establish a central computer database of DNA sequences, rather than have scientists submit sequences to journals. What began as a modest task of abstracting information from literature soon became a major database activity with direct electronic submissions of data and the need for highly skilled informatics staff. The task grew in scale with the start of the genome projects, and grew in visibility as the data became relevant to research in the commercial sector. It soon became apparent that the EMBL Nucleotide Sequence Data Library needed better financial security to ensure its long-term viability and to cope with the sheer scale of the task.
There was also a need for research and development to provide services, to collaborate with global partners to support the
TreeBASE is a relational database of phylogenetic information hosted by the University at Buffalo. In previous years the database has been hosted by Harvard University Herbaria, Leiden University EEW, and the University of California, Davis. TreeBASE stores phylogenetic trees and the data matrices used to generate them from published research papers. We encourage biologists to submit phylogenetic data that are either published or in press, especially if these data were not fully presented in the publication due to space limitations. TreeBASE accepts all types of phylogenetic data (e.g., trees of species, trees of populations, trees of genes) representing all biotic taxa. For more information, see an introduction to TreeBASE, information on searching, the database schema, and a graphic presentation of the web site's internal structure. Also, check out some ideas on why you might want to use TreeBASE.
TreeBASE is now a participant in CIPRES, the NSF-sponsored Cyberinfrastructure for Phylogenetic Research project. As such, it is being redesigned from the ground up through collaborative research among Computer Scientists, Biologists, and Programmers. Presently TreeBASE is being mirrored at the San Diego Supercomputer Center at UCSD. Eventually, the redesigned, new and improved CIPRES version of TreeBASE will take over. In the meantime, please send us suggestions for the kinds of features or functions you would like designed into the new database. Are there new or unusual data types, queries, and functions that are not already offered by the current version of TreeBASE?
The WWW implementation of TreeBASE requires a forms-capable and frames-capable browser. We would be very grateful for any feedback on TreeBASE, including suggestions for improvement. In particular, if you encounter any errors please let us know.
The Swiss-Prot, TrEMBL, and PIR protein database activities have united to form the Universal Protein Resource (UniProt), which provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations, and literature-based evidence attribution enable scientists to analyze proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90), or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. The UniProt databases continue to grow in size and in availability of information. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division, and complete proteomes. A bibliography mapping service has been added, and an ID mapping service is available.
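The notion of compressing sequence space by merging identical sequences can be sketched for the simplest, 100% identity case. This toy example is not UniProt's clustering pipeline: the accessions and sequences are invented, the choice of representative is arbitrary, and the 90% and 50% levels would need a sequence-similarity clustering method that is not shown here.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical accessions and sequences.
SEQUENCES = {
    "P1": "MKTAYIAK",
    "P2": "MKTAYIAK",   # identical to P1
    "P3": "MKTAYIAR",
}

def cluster_identical(seqs: Dict[str, str]) -> Dict[str, List[str]]:
    """Group accessions whose sequences are exactly identical.

    Returns {representative accession: sorted member accessions}; the
    representative is simply the alphabetically first member.
    """
    by_sequence: Dict[str, List[str]] = defaultdict(list)
    for accession, sequence in seqs.items():
        by_sequence[sequence].append(accession)
    clusters = {}
    for members in by_sequence.values():
        members.sort()
        clusters[members[0]] = members
    return clusters

clusters = cluster_identical(SEQUENCES)
```

A similarity search then only needs to scan one representative per cluster (two sequences instead of three in this example), which is where the speed-up comes from.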
LinkedCT is a Linked Data source of clinical trials available at ClinicalTrials.gov, a registry of federally and privately supported clinical trials conducted in the United States and around the world.
ClinicalTrials.gov gives you information about a trial's purpose, who may participate, locations, and phone numbers for more details. This information should be used in conjunction with advice from health care professionals.
The LinkedCT data space is published according to the principles of publishing Linked Data. Each entity in LinkedCT is identified by a unique HTTP dereferenceable Uniform Resource Identifier (URI). When the URI is looked up, related RDF statements about the entity are returned in HTML or RDF/XML, depending on the user agent. Moreover, a SPARQL endpoint is provided as the standard access method for RDF data.
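Looking up a dereferenceable URI and asking for RDF rather than HTML is commonly done through HTTP content negotiation. The sketch below builds such a request without sending it; the entity URI is hypothetical, and the use of the `Accept` header to select the representation is an assumption about how a Linked Data server of this kind is typically configured.

```python
from urllib.request import Request

# Hypothetical entity URI; substitute a real dereferenceable URI.
ENTITY_URI = "http://example.org/resource/trial/123"

def rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request preferring RDF/XML over HTML."""
    return Request(
        uri,
        headers={"Accept": "application/rdf+xml, text/html;q=0.5"},
    )

req = rdf_request(ENTITY_URI)
# Passing `req` to urllib.request.urlopen would perform the lookup.
```

A browser sends an `Accept` header that prefers HTML, so the same URI can serve a readable page to people and RDF/XML to programs.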
myExperiment is a social web site for researchers sharing Research Objects such as Scientific Workflows.
The myExperiment website was launched in November 2007 and contains a significant collection of scientific workflows for a variety of workflow systems, most notably Taverna, but also other tools such as Bioclipse. myExperiment has a REST API and is based on an open source Ruby on Rails codebase. It supports Linked Data and has a SPARQL endpoint, with an interactive tutorial.
The myExperiment project is directed by David De Roure at University of Oxford and is one of the activities of the myGrid consortium led by Carole Goble of The University of Manchester, UK and of the e-Research South UK regional consortium led by the Oxford e-Research Centre. It was originally funded by JISC under the Virtual Research Environment programme and by the Microsoft Technical Computing Initiative. myExperiment is being enhanced by the workflows for ever project (Wf4Ever) which aims to provide new features to support the preservation of Research Objects in conjunction with the dLibra digital library framework.
Recent advances in high-throughput techniques have generated vast volumes of biological data and have contributed to a newly emerged discipline, systems biology, which takes a comprehensive approach to studying biological systems. Chemogenomics, as an integral part of systems biology, studies the impact of small molecules on biological systems and describes interactions between chemical entities and protein molecules. The integration of chemical informatics and bioinformatics within the realm of systems biology leads to a new synergistic subject, namely systems chemical biology.
However, the current de facto practice of distributing chemical and biological data impedes the growth of systems chemical biology because of the heterogeneous formats used. This project is dedicated to addressing such challenges using existing Semantic Web technology, in particular Bio2RDF and Linking Open Drug Data. Beyond the generic scope of these two initiatives, we also plan to incorporate new semantic clauses that capture the core interests of systems chemical biology, for instance chemical structural similarity and biological sequence similarity. Figure 1 shows the overall scope of systems chemical biology.
IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
Provider: United States National Library of Medicine
DailyMed provides high quality information about marketed drugs. This information includes FDA labels (package inserts). This Web site provides health information providers and the public with a standard, comprehensive, up-to-date, look-up and download resource of medication content and labeling as found in medication package inserts. The National Library of Medicine (NLM) provides this as a public service and does not accept advertisements.
The DrugBank database, available at the University of Alberta, is a bioinformatics and cheminformatics resource that combines detailed drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, pathway) information. The database contains nearly 4800 drug entries including:
More than 2500 protein (i.e., drug target, non-redundant) sequences are linked to these drug entries.
Each DrugCard entry contains 148 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data.
It is maintained by David Wishart and Craig Knox.
Users may query DrugBank in a number of ways:
Entrez Gene (www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene) is NCBI's database for gene-specific information. It does not include all known or predicted genes; instead Entrez Gene focuses on the genomes that have been completely sequenced, that have an active research community to contribute gene-specific information, or that are scheduled for intense sequence analysis. The content of Entrez Gene represents the result of curation and automated integration of data from NCBI's Reference Sequence project (RefSeq), from collaborating model organism databases, and from many other databases available from NCBI. Records are assigned unique, stable and tracked integers as identifiers. The content (nomenclature, map location, gene products and their attributes, markers, phenotypes, and links to citations, sequences, variation details, maps, expression, homologs, protein domains and external databases) is updated as new information becomes available. Entrez Gene is a step forward from NCBI's LocusLink, with both a major increase in taxonomic scope and improved access through the many tools associated with NCBI Entrez.
Live Virtuoso instance hosting Linked Open Data (LOD) Cloud
We have reached a beachhead regarding the Virtuoso instance hosting the Linked Open Data (LOD) Cloud; meaning, we are not going to be performing any major updates and deletions in the short term, bar incorporation of fresh data sets from the Freebase and Bio2RDF projects (both communities are prepping new RDF data sets).
At the current time we have loaded 100% of all the very large data sets from the LOD Cloud. As a result, we can start the process of exposing Linked Data virtues in a manner that's palatable to users, developers, and database professionals across the Web 1.0, 2.0, and 3.0 spectrums.
What does this mean?
You can use the "Search & Find" or "URI Lookup" or SPARQL endpoint associated with the LOD cloud hosting instance to perform the following tasks:
Find entities associated with full text search patterns -- Google Style, but with Entity & Text proximity Rank instead of Page Rank, since we are dealing with Entities rather than documents about entities
Find and Lookup entities by Identifier (URI) -- which is helpful when locating URIs to use for identifying entities in your own linked data spaces on the Web
View entity descriptions via a variety of representation formats (HTML, RDFa, RDF/XML, N3, Turtle etc.)
Determine uses of entity identifiers across the LOD cloud -- which helps you select preferred URIs based on usage statistics.
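The last task in the list above, choosing a preferred URI from usage statistics, can be sketched as a simple count over triples. This is an illustration only: the triples, the `ex:` predicates and the counting rule (subject and object positions weighted equally) are assumptions, and the hosted instance may compute its statistics differently.

```python
from collections import Counter
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

# Invented triples in which two URIs identify the same entity.
TRIPLES = [
    ("http://a.example/aspirin", "ex:label", "Aspirin"),
    ("http://b.example/drug/1", "ex:sameAs", "http://a.example/aspirin"),
    ("http://c.example/study/9", "ex:about", "http://a.example/aspirin"),
    ("http://c.example/study/7", "ex:about", "http://b.example/drug/1"),
]

def uri_usage(triples: Iterable[Triple]) -> Counter:
    """Count how often each http(s) URI appears as subject or object."""
    counts: Counter = Counter()
    for subject, _predicate, obj in triples:
        for term in (subject, obj):
            if term.startswith(("http://", "https://")):
                counts[term] += 1
    return counts

usage = uri_usage(TRIPLES)
preferred = usage.most_common(1)[0][0]
```

Literal values such as `"Aspirin"` are skipped by the prefix check, so only identifiers contribute to the counts.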
PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. If you would like to learn more about how to use the PubChem resources, please go to our help page.