Complete Daily Price History (OHLC) and Metadata for the Leading Digital Assets
Dataset Description
Dataset Overview:
This dataset is a high-precision financial archive covering the Top 500 Cryptocurrencies by market capitalization. It captures historical price action and market metadata from late 2024 through 2025. Engineered for data scientists and quant analysts, the data has been strictly validated to ensure zero missing values, zero duplicates, and zero negative prices.
It allows for deep-dive analysis into the crypto market's structure, offering both granular daily price data (OHLC) for time-series forecasting and comprehensive metadata (Supply, Volume, ATH) for fundamental valuation.
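The validation claims above (no missing values, no duplicates, no negative prices) can be spot-checked after download. The following is a minimal, dependency-free sketch, not the author's validation script. It assumes the column names listed for crypto_ohlc_checkpoint.csv and treats a repeated (coin_id, date) pair as a duplicate, which is an assumption of this example. The sample rows are made up.

```python
import csv
import io

PRICE_COLUMNS = ("open", "high", "low", "close")
REQUIRED_COLUMNS = ("coin_id", "symbol", "timestamp", "date") + PRICE_COLUMNS


def validate_ohlc_rows(rows):
    """Count rows with missing fields, duplicate (coin_id, date) keys, and negative prices."""
    report = {"missing": 0, "duplicates": 0, "negative": 0}
    seen = set()
    for row in rows:
        # A field counts as missing if it is absent or blank.
        if any(not (row.get(col) or "").strip() for col in REQUIRED_COLUMNS):
            report["missing"] += 1
            continue
        # Assumption: one row per coin per day, so a repeated key is a duplicate.
        key = (row["coin_id"], row["date"])
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
        if any(float(row[col]) < 0 for col in PRICE_COLUMNS):
            report["negative"] += 1
    return report


# Tiny in-memory sample standing in for crypto_ohlc_checkpoint.csv (values are made up).
SAMPLE = """coin_id,symbol,timestamp,date,open,high,low,close
bitcoin,btc,1735689600000,2025-01-01,100.0,110.0,95.0,105.0
bitcoin,btc,1735776000000,2025-01-02,105.0,112.0,101.0,108.0
"""

print(validate_ohlc_rows(csv.DictReader(io.StringIO(SAMPLE))))
```

To check the real file, pass `csv.DictReader(open("crypto_ohlc_checkpoint.csv", newline=""))` instead of the in-memory sample.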
Data Science Applications:
- Price Prediction: Train LSTM/GRU models using `open`, `close`, `high`, and `low` features.
- Market Sentiment Analysis: Correlate `total_volume` and `market_cap` changes with price action.
- Portfolio Optimization: Analyze covariance matrices between top assets like Bitcoin (BTC) and emerging altcoins.
- Clustering: Group coins by `market_cap_rank` or `circulating_supply` to identify asset classes.
- Technical Analysis: Compute RSI, MACD, and Bollinger Bands using the clean OHLC history.
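As an illustration of the technical-analysis use case, here is a minimal sketch of two indicators computed from a list of closing prices. The function names are hypothetical. The RSI variant uses plain averages of gains and losses rather than Wilder's smoothing, so its values can differ from those shown by charting tools.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (middle, upper, lower) tuples, one per full window of closes."""
    bands = []
    for end in range(window, len(closes) + 1):
        chunk = closes[end - window:end]
        middle = mean(chunk)                # simple moving average
        spread = num_std * pstdev(chunk)    # population standard deviation
        bands.append((middle, middle + spread, middle - spread))
    return bands


def simple_rsi(closes, period=14):
    """RSI from plain averages of gains and losses over the last `period` changes."""
    changes = [b - a for a, b in zip(closes, closes[1:])][-period:]
    if len(changes) < period:
        raise ValueError("need at least period + 1 closes")
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_gain == 0 and avg_loss == 0:
        return 50.0   # flat prices: treated as neutral in this sketch
    if avg_loss == 0:
        return 100.0  # only gains in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


# Made-up closing prices, for illustration only.
closes = [10.0, 11.0, 10.0, 11.0, 10.0]
print(bollinger_bands(closes, window=5))
print(simple_rsi(closes, period=4))
```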
Column Descriptors:
top_500_metadata.csv
Contains snapshot data for the top 500 coins.
- `id`: Unique CoinGecko identifier (e.g., bitcoin, solana).
- `symbol`: Ticker symbol (e.g., btc, sol).
- `name`: Full name of the asset.
- `image`: URL to the coin's logo.
- `current_price`: Latest market price in USD.
- `market_cap`: Total market capitalization in USD.
- `market_cap_rank`: Global rank by market cap.
- `fully_diluted_valuation`: Theoretical market cap if the max supply were in circulation.
- `total_volume`: 24-hour trading volume.
- `high_24h` / `low_24h`: Highest and lowest price in the last 24 hours.
- `price_change_24h`: Absolute price change in the last 24 hours.
- `price_change_percentage_24h`: Percentage price change in the last 24 hours.
- `circulating_supply`: Amount of coins currently in the market.
- `total_supply`: Total amount of coins created.
- `max_supply`: Maximum amount of coins that will ever exist.
- `ath`: All-Time High price.
- `atl`: All-Time Low price.
- `last_updated`: Timestamp of the metadata snapshot.
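The `fully_diluted_valuation` definition above amounts to price times max supply. The sketch below recomputes it for one metadata row read as strings (as `csv.DictReader` would yield). The helper name and the example row are hypothetical, and the published column may have been computed differently upstream, so small discrepancies would not be surprising.

```python
def fully_diluted_valuation(row):
    """Return current_price * max_supply for one metadata row, or None if no cap is given."""
    max_supply = (row.get("max_supply") or "").strip()
    if not max_supply:
        return None  # no max supply recorded, so the product is undefined here
    return float(row["current_price"]) * float(max_supply)


# Made-up row shaped like top_500_metadata.csv (values are illustrative only).
example = {"id": "examplecoin", "current_price": "2.5", "max_supply": "1000000"}
print(fully_diluted_valuation(example))  # 2500000.0
```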
crypto_ohlc_checkpoint.csv
Contains the historical time-series data.
| Column Name | Data Type | Description |
|---|---|---|
| coin_id | String | Unique identifier (matches id in metadata). |
| symbol | String | Ticker symbol (e.g., btc). |
| timestamp | Integer | Unix timestamp (ms) of the record. |
| date | String | Date in YYYY-MM-DD format. |
| open | Float | Opening price (USD). |
| high | Float | Highest price (USD) during the day. |
| low | Float | Lowest price (USD) during the day. |
| close | Float | Closing price (USD). |
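A short sketch of working with these columns: converting the millisecond `timestamp` to a calendar date and grouping closing prices per coin. It assumes the `date` column is the UTC date of `timestamp`, which the description does not state explicitly. The helper names and sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def ms_to_utc_date(timestamp_ms):
    """Convert a Unix timestamp in milliseconds to a YYYY-MM-DD string (UTC)."""
    moment = datetime.fromtimestamp(int(timestamp_ms) / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


def closes_by_coin(rows):
    """Group closing prices by coin_id, ordered by timestamp."""
    series = defaultdict(list)
    for row in sorted(rows, key=lambda r: int(r["timestamp"])):
        series[row["coin_id"]].append(float(row["close"]))
    return dict(series)


# Made-up rows shaped like crypto_ohlc_checkpoint.csv, deliberately out of order.
rows = [
    {"coin_id": "bitcoin", "timestamp": "1735776000000", "close": "108.0"},
    {"coin_id": "solana", "timestamp": "1735689600000", "close": "20.0"},
    {"coin_id": "bitcoin", "timestamp": "1735689600000", "close": "105.0"},
]

print(ms_to_utc_date(1735689600000))  # 2025-01-01
print(closes_by_coin(rows))
```

The per-coin lists can then feed time-series models or be joined to the metadata file through `coin_id`, which matches `id` there.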
Ethically Obtained Data:
This dataset was constructed using the CoinGecko Public API (v3) in strict adherence to its terms of service.
- Rate Limits: Data extraction respected the 30 calls/minute limit using exponential backoff algorithms.
- Public Domain: All data points are publicly accessible market information.
- No Personal Data: Contains only aggregated financial market metrics.
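The extraction script itself is not included, so the following is only a sketch of what retrying with exponential backoff can look like. `RateLimitError` and the `fetch` callable are hypothetical stand-ins for whatever HTTP client is used. The 2-second base delay is an assumption derived from the 30 calls/minute limit (60 s / 30 calls).

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical signal that the API answered with a rate-limit response."""


def call_with_backoff(fetch, max_retries=5, base_delay=2.0, sleep=time.sleep):
    """Call fetch(), retrying with exponentially growing delays on RateLimitError."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            # Delay doubles on each attempt (base, 2*base, 4*base, ...) plus a little jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)


# Usage with a fake fetch that is rate-limited twice before succeeding.
attempts = {"count": 0}


def fake_fetch():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError()
    return {"status": "ok"}


recorded_delays = []
result = call_with_backoff(fake_fetch, sleep=recorded_delays.append)
print(result, len(recorded_delays))
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting. In real use the default `time.sleep` applies.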
Acknowledgements:
Data provided by the CoinGecko API.