This is an augmented Chinese stock market dataset. In addition to OHLC prices and volume, it includes financial ratios at daily frequency, such as the PE, PB, and PS ratios and dividend yield. It covers the period from January 4, 2005 to May 11, 2022.
All data are provided at daily frequency, including financial ratios (FRs) such as the PE ratio and fundamentals such as total market capitalization.
It took a considerable amount of time to gather data on all liquid, publicly traded stocks on the Shanghai Stock Exchange and the Shenzhen Stock Exchange (4,714 stocks in total, identified by their ticker symbols).
Please note that some "ST" (Special Treatment) stocks are included in this dataset. Users and researchers should pay particular attention to them, because these companies are experiencing financial distress and are very likely to go bankrupt or be delisted within 3 years if their financial condition does not improve.
"ST" stocks can be found in "ticker_info.csv" file with "ST" included in the "company name" column. Users can merge it with "stock_data.csv" if they want to exclude these "ST" stock data.
All columns (features) in this dataset are pure features: none of them is derived from other features. For example, "20-day momentum" is not included, because it is a generated feature computed from the "close" data.
Users can generate technical indicators and factors themselves to augment the feature set, and then apply feature engineering to this richer pool of features.
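As an illustration of a generated feature, the sketch below computes an n-day momentum per stock from the closing price. It uses one common definition, close today divided by close n trading days earlier, minus 1. Other definitions exist, and this one is not prescribed by the dataset. The function name add_momentum and the column names "ticker", "date", and "close" are assumptions, and the sample prices are made up.

```python
import pandas as pd


def add_momentum(stock_data, window=20,
                 ticker_col="ticker", date_col="date", close_col="close"):
    """Add a momentum_<window>d column: close / close `window` rows earlier - 1.

    Column names are assumptions; rows are treated as consecutive trading days.
    """
    # Sort so that each ticker's rows are in chronological order.
    out = stock_data.sort_values([ticker_col, date_col]).reset_index(drop=True)
    # Shift within each ticker so one stock's prices never leak into another's.
    past_close = out.groupby(ticker_col)[close_col].shift(window)
    out[f"momentum_{window}d"] = out[close_col] / past_close - 1.0
    return out


# Made-up prices for two hypothetical tickers; window=2 keeps the demo short.
prices = pd.DataFrame({
    "ticker": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "date": ["2022-01-04", "2022-01-04", "2022-01-05", "2022-01-05",
             "2022-01-06", "2022-01-06", "2022-01-07", "2022-01-07"],
    "close": [10.0, 5.0, 11.0, 5.0, 12.0, 4.0, 13.2, 6.0],
})

demo = add_momentum(prices, window=2)
print(demo)
```

The first `window` rows of each ticker have no earlier price to compare against, so their momentum is NaN. For the 20-day momentum mentioned above, call the function with its default window=20.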
I hope this dataset will help advance research in (quantitative) finance, algorithmic trading, economics, and more.