COCONUT: The COlleCtion Of Open NatUral ProducTs.
Unlocking Molecule Information
@kaggle.thedevastator_open_source_natural_product_annotations
By [source]
This dataset contains molecular annotations for a collection of natural products. Each record includes the molecular formula, a clean SMILES representation, the InChI representation, and the corresponding InChIKey. With this data you can explore the structures of a wide range of organic molecules. Because the collection is open, it can also help identify candidate compounds for drug discovery and other applications.
Molecular Formula: This is a string that represents the elemental composition of a molecule. It distinguishes molecules with different compositions, although isomers share the same formula.
Clean SMILES: This is a simplified molecular-input line-entry system (SMILES) representation of the natural product. It is a compact way to represent atom and bond connectivity within a molecule, making it readable by computers and databases while preserving the necessary chemical information about the compound.
InChI: This is the International Chemical Identifier (InChI) for each molecule in this collection. InChIs are designed to capture the structural characteristics of a chemical compound in a standardized text form that can be interpreted consistently across data sources and computer systems, so the identifier stays valid when the structure is converted between formats.
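As a rough illustration of the layered text form described above, the sketch below splits an InChI string into its slash-separated layers. The function name `split_inchi` is hypothetical, and the code assumes a standard InChI whose second segment is the molecular formula and in which no layer prefix repeats. Ethanol is used as a well-known example and is not claimed to be a record from this dataset.

```python
def split_inchi(inchi: str) -> dict:
    """Split an InChI string into its version and letter-prefixed layers.

    Simplifying assumptions: the second segment is the formula, and each
    layer prefix occurs at most once (later duplicates would overwrite).
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # The formula layer carries no letter prefix.
        layers["formula"] = parts[1]
    for segment in parts[2:]:
        if segment:
            # The first character names the layer, e.g. c = connectivity,
            # h = hydrogens, q = charge.
            layers[segment[0]] = segment[1:]
    return layers


ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol)
# {'version': '1S', 'formula': 'C2H6O', 'c': '1-2-3', 'h': '3H,2H2,1H3'}
```

For real structure handling a cheminformatics toolkit is the better choice; this only shows how the string is organized.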
This dataset gives researchers unified access to the molecular identifiers they need without requiring special software or hardware for analysis, which makes exploration easier. With it, researchers can study a wide range of molecular structures and explore applications such as drug discovery and materials science. If you are interested in learning more about other features available within our natural products database, please refer directly to our repository.
- Automatically predicting the effects of natural products on biochemical pathways in biological cells to explore potential therapeutic activities.
- Analyzing the effects of different mixtures of natural products and their individual components as starting points for drug discovery processes.
- Computationally screening a library of natural products to highlight active compounds that can serve as leads when designing novel drugs that target specific pathways.
If you use this dataset in your research, please credit the original authors.
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: COCONUT4MetFrag_april.csv
Column name | Description |
---|---|
molecular_formula | The chemical formula of the molecule, listing each element present and the number of atoms of each. (String) |
clean_smiles | The Simplified Molecular-Input Line-Entry System (SMILES) representation of the molecule, a line notation that encodes atoms and bonds. Hydrogen atoms are usually implicit, and atoms that need explicit detail are written in square brackets. (String) |
inchi | The International Chemical Identifier (InChI) of the molecule, a standardized layered text string that encodes the formula, atom connectivity, hydrogens and, where present, charge and stereochemistry. (String) |
inchikey | The InChIKey of the molecule, a fixed-length 27-character hash of the InChI made up of uppercase letters and hyphens, which makes comparison and lookup across independent resources easier. (String) |
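The column descriptions above suggest a couple of cheap sanity checks before analysis. The following is a minimal sketch, not part of the dataset's tooling: `looks_like_inchikey` and `count_atoms` are hypothetical helper names, the InChIKey check only tests the 14-10-1 block layout, and the formula parser assumes flat formulas without parentheses, charges, or isotope labels.

```python
import re
from collections import Counter

# 14 letters, hyphen, 10 letters, hyphen, 1 letter = 27 characters.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")
# An element symbol (capital plus optional lowercase) and an optional count.
ELEMENT_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def looks_like_inchikey(key: str) -> bool:
    """Return True if the string has the InChIKey block layout."""
    return INCHIKEY_RE.fullmatch(key) is not None


def count_atoms(formula: str) -> Counter:
    """Count atoms per element in a flat formula such as 'C2H6O'."""
    counts = Counter()
    for symbol, number in ELEMENT_RE.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


print(looks_like_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))  # True
print(dict(count_atoms("C2H6O")))  # {'C': 2, 'H': 6, 'O': 1}
```

A layout check like this cannot confirm that a key actually matches its InChI; that would require regenerating the key with an InChI implementation.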
CREATE TABLE coconut4metfrag_april (
"coconut_id" VARCHAR,
"molecular_formula" VARCHAR,
"clean_smiles" VARCHAR,
"inchi" VARCHAR,
"inchikey" VARCHAR,
"coconut_id_1" VARCHAR
);
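One way to try out this schema locally is with Python's built-in `sqlite3` module, as sketched below. This is an assumption-laden illustration: the original table may live in a different SQL engine, the row inserted here is ethanol with a made-up identifier (`EXAMPLE-0001`) rather than a real COCONUT record, and `find_by_inchikey` is a hypothetical helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE coconut4metfrag_april (
    "coconut_id" VARCHAR,
    "molecular_formula" VARCHAR,
    "clean_smiles" VARCHAR,
    "inchi" VARCHAR,
    "inchikey" VARCHAR,
    "coconut_id_1" VARCHAR
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Illustrative row only: ethanol with a placeholder ID, not dataset content.
row = (
    "EXAMPLE-0001",
    "C2H6O",
    "CCO",
    "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    "EXAMPLE-0001",
)
conn.execute("INSERT INTO coconut4metfrag_april VALUES (?, ?, ?, ?, ?, ?)", row)


def find_by_inchikey(connection, inchikey):
    """Look up records by InChIKey using a parameterized query."""
    cursor = connection.execute(
        "SELECT coconut_id, molecular_formula, clean_smiles "
        "FROM coconut4metfrag_april WHERE inchikey = ?",
        (inchikey,),
    )
    return cursor.fetchall()


matches = find_by_inchikey(conn, "LFQSCWFLJHTTHZ-UHFFFAOYSA-N")
print(matches)  # [('EXAMPLE-0001', 'C2H6O', 'CCO')]
```

Looking up by `inchikey` rather than by `inchi` or `clean_smiles` keeps the comparison to a short fixed-length string, which is the use the InChIKey was designed for.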