Context
Predicting the stock market is one of the most common projects for people learning ML and data science. After all, who wouldn't want to delegate the task of picking stocks to a model and reap the rewards? However, one of the most difficult and tedious steps in predicting which stocks to invest in is gathering the data. There are so many stocks to choose from, and it is important to get sufficient information on each. But what if you could skip this step and just download a dataset that has all that information readily available? Look no further: this dataset is meant to solve that problem.
Content
This dataset contains information on 4,447 stocks traded under Nasdaq across various exchanges. The first file, full_financial_stocks_raw.csv, covers all 4,447 stocks but has several null fields; I labeled it "raw" because the values inside the rows have had only minimal modifications. The second file, dividend_stocks_only.csv, is still a raw-ish dataset, but it only contains stocks that pay dividends to their shareholders. Interestingly, more information seems to be available for dividend-paying stocks, which explains why this file has significantly fewer rows with null values.
Update: In the next 24 hours, I will be uploading an optimized, feature-engineered dataset that has fewer columns overall and fewer rows with null values. This dataset is intended to be a fully cleaned option to directly feed into ML/DL models.
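To check how many rows in each file actually contain null values, something like the following sketch may help. It uses only Python's standard csv module and makes no assumptions about column names. What it does assume is how nulls are written in the files: empty cells or the strings "nan", "null" and "none". That encoding is a guess on my part, so adjust NULL_MARKERS if the files use something else. The helper name count_incomplete_rows is just illustrative.

```python
import csv
from pathlib import Path
from typing import TextIO, Tuple

# Assumed spellings of a missing value; adjust to match the actual files.
NULL_MARKERS = {"", "nan", "null", "none"}


def _is_null(value) -> bool:
    """True if a CSV cell is missing or looks like a null placeholder."""
    if value is None:  # row was shorter than the header
        return True
    return isinstance(value, str) and value.strip().lower() in NULL_MARKERS


def count_incomplete_rows(handle: TextIO) -> Tuple[int, int]:
    """Return (total_rows, rows_with_at_least_one_null) for an open CSV file."""
    total = 0
    incomplete = 0
    for row in csv.DictReader(handle):
        total += 1
        if any(_is_null(value) for value in row.values()):
            incomplete += 1
    return total, incomplete


if __name__ == "__main__":
    # File names as given in this description; missing files are skipped.
    for name in ("full_financial_stocks_raw.csv", "dividend_stocks_only.csv"):
        path = Path(name)
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            total, incomplete = count_incomplete_rows(handle)
        print(f"{name}: {incomplete} of {total} rows have at least one null")
```

Running this next to the two CSV files prints one summary line per file, which makes it easy to compare how complete the raw file is relative to the dividend-only file.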
Acknowledgements
I would like to thank the sources of my data: the Nasdaq Trader FTP site and the Yahoo Finance API.
Inspiration
Analyzing the stock market is one of the most intriguing endeavors I can think of, because the ways it can be influenced are so broad and distinct from one another. A news article can change how investors view a particular company, social media can directly move a company's share price, and there are numerous calculations and formulas that can indicate which stocks are worth investing in.