Crypto Data Hourly Price Since 2017 To 2023-10
1M+ Hourly Price Data Tick for 34 Cryptocurrencies from 2017 to Present, OHLCV
@kaggle.franoisgeorgesjulien_crypto
1M+ Hourly Price Data Tick for 34 Cryptocurrencies from 2017 to Present, OHLCV
@kaggle.franoisgeorgesjulien_crypto
Find my notebook : Advanced EDA & Data Wrangling - Crypto Market Data where I cover the full EDA and advanced data wrangling to get beautiful dataset ready for analysis.
Find my Deep Reinforcement Learning v1 notebook: Deep Reinforcement Learning for Trading
Find my Quant Analysis notebook:💎 Quant Analysis & Visualization | BTC V1
Dataset Presentation:
This dataset provides a comprehensive collection of hourly price data for 34 major cryptocurrencies, covering a time span from January 2017 to the present day. The dataset includes Open, High, Low, Close, Volume (OHLCV), and the number of trades for each cryptocurrency for each hour (row).
Making it a valuable resource for cryptocurrency market analysis, research, and trading strategies. Whether you are interested in historical trends or real-time market dynamics, this dataset offers insights into the price movements of a diverse range of cryptocurrencies.
This is a pure gold mine, for all kind of analysis and predictive models. The granularity of the dataset offers a wide range of possibilities. Have Fun!
Ready to Use - Cleaned and arranged dataset less than 0.015% of missing data hour: crypto_data.csv
First Draft - Before External Sources Merge (to cover missing data points): crypto_force.csv
Original dataset merged from all individual token datasets: cryptotoken_full.csv
crypto_data.csv & cryptotoken_full.csv highly challenging wrangling situations:
see notebook 'Advanced EDA & Data Wrangling - Crypto Market Data' to follow along and have a look at the EDA, wrangling and cleaning process.
Date Range: From 2017-08-17 04:00:00 to 2023-10-19 23:00:00
Date Format: YYYY-MM-DD HH-MM-SS (raw data to be converted to datetime)
Data Source: Binance API (some missing rows filled using Kraken & Poloniex market data)
Crypto Token in the dataset (also available as independent dataset):
Date column presents some inconsistencies that need to be cleaned before formatting to datetime:
start_timestamp = pd.Timestamp('2023-07-27 00:00:00')
end_timestamp = pd.Timestamp('2023-07-27 23:00:00')
hourly_timestamps = pd.date_range(start=start_timestamp, end=end_timestamp, freq='H')
hourly_data = {
'Date': hourly_timestamps,
'Symbol': 'ETCUSDT',
'Open': 18.29,
'High': 18.3,
'Low': 18.17,
'Close': 18.22,
'Volume USDT': 127468,
'tradecount': 623,
'Token': 'ETC'
}
hourly_df = pd.DataFrame(hourly_data)
df = pd.concat([df, hourly_df], ignore_index=True)
df = df.drop(550341)
# Count the occurrences of the pattern '.xxx' in the 'Date' column
count_occurrences_before = df['Date'].str.count(r'\.\d{3}')
print("Occurrences before cleaning:", count_occurrences_before.sum())
# Remove '.xxx' pattern from the 'Date' column
df['Date'] = df['Date'].str.replace(r'\.\d{3}', '', regex=True)
# Count the occurrences of the pattern '.xxx' in the 'Date' column after cleaning
count_occurrences_after = df['Date'].str.count(r'\.\d{3}')
print("Occurrences after cleaning:", count_occurrences_after.sum())
Disclaimer: Any individual or entity choosing to engage in market analysis, develop predictive models, or utilize data for trading purposes must do so at their own discretion and risk. It is important to understand that trading involves potential financial loss, and decisions made in the financial markets carry inherent risks. This dataset is provided for informational and research purposes only, and its use in trading decisions should be made with full awareness of the associated risks. Users are urged to exercise caution, conduct thorough research, and consider seeking advice from qualified financial professionals when engaging in trading activities. The dataset provider assumes no responsibility for trading outcomes. NFA.
Anyone who has the link will be able to view this.