Content
This repository contains a collection of Russian literature in txt format (all files are UTF-8 encoded). In addition, each author has a csv file listing the year in which each work was written.
This dataset was created for an authorship attribution project (determining who wrote a given piece of text), but I'm sure you can use it for anything 😉.
What makes this dataset usable for any purpose is that the data is completely raw. The text has not been pre-processed in any way: author designations, chapter headings, and references to translations of foreign-language passages have not been removed.
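Since the files are raw, loading them is mostly a matter of reading everything as UTF-8. Below is a minimal sketch of how that might look in Python. It makes several assumptions that are not stated above: that the txt and csv files sit somewhere under a single root folder, that the csv files are also UTF-8, comma-separated, and start with a header row. The function names `load_texts` and `load_metadata` are made up for this example.

```python
import csv
from pathlib import Path


def load_texts(root):
    """Return {relative path: full text} for every .txt file under root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(root.rglob("*.txt"))
    }


def load_metadata(root):
    """Return {relative path: list of row dicts} for every .csv file under root.

    Assumption: each csv has a header row, which DictReader uses as the keys.
    """
    root = Path(root)
    tables = {}
    for p in sorted(root.rglob("*.csv")):
        with p.open(encoding="utf-8", newline="") as f:
            tables[p.relative_to(root).as_posix()] = list(csv.DictReader(f))
    return tables
```

The sketch does not hard-code any column names, so the keys of each row are whatever the header of that csv file says. Any cleanup (dropping author lines, chapter headings, translation references) is left to you, since that depends on the task.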
Acknowledgements
Thanks to Ilibrary, LitLib, Wikisource, and everyone else.