Sequence and metadata for various protein structures

Context

This is a protein data set retrieved from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB).

The PDB archive is a repository of atomic coordinates and other information describing proteins and other important biological macromolecules. Structural biologists use methods such as X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy to determine the location of each atom relative to the others in the molecule. They then deposit this information, which is annotated and publicly released into the archive by the wwPDB.

The constantly growing PDB is a reflection of the research that is happening in laboratories across the world. This can make it both exciting and challenging to use the database in research and education. Structures are available for many of the proteins and nucleic acids involved in the central processes of life, so you can go to the PDB archive to find structures for ribosomes, oncogenes, drug targets, and even whole viruses. However, it can be a challenge to find the information that you need, since the PDB archives so many different structures. You will often find multiple structures for a given molecule, or partial structures, or structures that have been modified or inactivated from their native form.

Content

There are two data files. Both are organized by the "structureId" of the protein:

pdb_data_no_dups.csv contains protein metadata, including details on protein classification, extraction methods, etc.
data_seq.csv contains more than 400,000 protein structure sequences.
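
Because both files share the "structureId" column, their rows can be linked on that column. The following is a minimal sketch of such a join using only the Python standard library. The function name join_on_structure_id, the sample IDs, and the "classification" and "sequence" column names in the sample data are illustrative assumptions; only "structureId" is taken from the description above. The sketch also assumes a structure may appear in more than one sequence row, so it keeps every matching pair.

```python
import csv
import io
from collections import defaultdict

KEY = "structureId"  # shared column named in the dataset description


def join_on_structure_id(meta_file, seq_file):
    """Inner-join two CSV file objects on the structureId column.

    Returns a list of dicts, one per (metadata row, sequence row) pair
    that shares the same structureId.
    """
    # Index the sequence rows by structureId; one ID may map to several rows.
    seq_by_id = defaultdict(list)
    for row in csv.DictReader(seq_file):
        seq_by_id[row[KEY]].append(row)

    joined = []
    for meta_row in csv.DictReader(meta_file):
        # .get avoids creating empty entries for IDs with no sequences.
        for seq_row in seq_by_id.get(meta_row[KEY], []):
            joined.append({**meta_row, **seq_row})
    return joined


# Made-up sample data standing in for the two files (hypothetical columns).
meta_csv = io.StringIO(
    "structureId,classification\n"
    "1ABC,HYDROLASE\n"
    "2XYZ,TRANSFERASE\n"
)
seq_csv = io.StringIO(
    "structureId,sequence\n"
    "1ABC,MKV\n"
    "1ABC,GHL\n"
    "3QQQ,AAA\n"
)

joined = join_on_structure_id(meta_csv, seq_csv)
for row in joined:
    print(row)
```

To run this against the real files, the two StringIO objects could be replaced with open("pdb_data_no_dups.csv", newline="") and open("data_seq.csv", newline=""), assuming the files sit in the working directory. IDs present in only one file are dropped by this inner join.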

Acknowledgements

Original data set downloaded from http://www.rcsb.org/pdb/

Inspiration

The Protein Data Bank has helped the life science community study different diseases and come up with new drugs and solutions that support human survival.

Structural Protein Sequences

