Decreto Rilancio DL 20200519 - Word Frequencies
Italy's 55bln EUR post-COVID19 recovery measures - word frequencies
@kaggle.robertolofaro_decreto_rilancio_dl_20200519_word_frequencies
Top-left: the tag cloud of the law in Italian
Bottom-left: the English version of the tag cloud
On the right: screenshot from the research-by-tag-cloud application that is currently offline (to avoid traffic overload, might go online later)
Along with the previously released dataset on UN SDG and World Bank selected indicators, this dataset contains the whole list of articles and subdivisions within the "Decreto Rilancio", D.L. 34, as issued on 2020-05-19.
This is the baseline dataset.
This implies that it does not yet contain any further amendments (e.g. on 2020-05-20 an errata corrige and some minor amendments were issued, both listed at the bottom of this introduction).
A new version of this dataset will be issued once those amendments have been implemented.
NOTE 2020-05-23: added a new file that extends the baseline frequencies to both the title and the content of each article of the decree. Also released a prototype search-by-tag-cloud mini-website that lets you pick among the most frequent words (as per the latest file) and retrieve the list of associated articles (again, created to support my writings and activities, and shared as part of my #datademocracy initiative).
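As a rough illustration of the idea described in the note above (word frequencies over both title and content, plus a lookup from a word to the articles containing it), here is a minimal Python sketch. It is not the code used to build this dataset or the mini-website: the sample articles, the `tokenize` and `build_index` helpers, and the tuple layout are all assumptions made for the example. In line with the dataset's choice, common Italian words are not filtered out.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample rows (article id, title, content); NOT taken from the dataset.
articles = [
    ("Art. 1", "Disposizioni urgenti in materia di salute", "Misure per la salute e il lavoro"),
    ("Art. 2", "Sostegno al lavoro", "Misure di sostegno al lavoro e alle imprese"),
]

def tokenize(text):
    # Lowercase, then keep runs of letters (accented Italian letters included).
    return re.findall(r"[^\W\d_]+", text.lower())

def build_index(rows):
    # Count every word across title + content, and remember which
    # articles each word appears in. No stop-word filtering is applied.
    frequencies = Counter()
    index = defaultdict(set)
    for article_id, title, content in rows:
        for word in tokenize(title + " " + content):
            frequencies[word] += 1
            index[word].add(article_id)
    return frequencies, index

frequencies, index = build_index(articles)

# Most frequent words first, then the articles associated with one of them.
print(frequencies.most_common(5))
print(sorted(index["lavoro"]))
```

Choosing a word from the frequency list and reading `index[word]` mirrors, in a very simplified way, what a tag-cloud search does: the tag is the word, and the result is the list of associated articles.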
You will see reposts of this dataset that claim "155bln" instead of "55bln": I stick to the latter for two reasons:
As noted within the Kernel that I released with this dataset, there is then an additional element, the "impact", including the TPG (Third Party Guarantor) role that the Italian State will take on towards private companies asking for loans
As summarized in an article (in Italian, translated here) released along with the initial draft:
"Beyond the debate on whether or not to count these resources, this explains the difference between the 55 billion cited by Conte, Di Maio and Zingaretti and the 155 billion cited by Gualtieri.
They are talking about two different things: the former refer to the money raised through deficit to finance the decree; the latter also takes into account the guarantees for liquidity to businesses.
For completeness, we note that according to the estimates in the Def, the "Cura Italia" decree will have an impact on the net balance to be financed for 2020 of about 25 billion euros (against a weight on net borrowing of almost 20 billion euros: the gap in this case was therefore much smaller)."
The dataset contains all the textual content of the Decreto Legge, in Italian (I translated only the tag cloud in English, shown along with the one in Italian).
The tag cloud list on the right-hand side is from a local application using the same tag cloud search framework I already used for the ECB Speeches tag cloud search, which has been updated weekly on Sundays since October 2019.
As the official website where the decree was published today crashed repeatedly, I unfortunately decided not to publish the application.
Anyway, as a service to fellow Italians (or foreigners) who might be interested in searching and commenting, I decided to release the "raw" word frequencies table.
By choice, instead of filtering out common Italian words, this dataset:
Locally, the database also contains the full text of the government decree. However, as that text is publicly available anyway and is not needed for the purpose of this dataset (supporting analysis, clustering, searches, etc.), it has not been included here.
PLEASE NOTE: I have no affiliation whatsoever with the Italian Government. I selected these data (along with others from other sources, e.g. Eurostat, OECD, World Bank, UN) just to support my publishing purposes on the use of Open Data for business and social projects and initiatives.
For more information on the concept, and on associated past and future datasets or publications, please visit Data Democracy.
Thanks to the Gazzetta Ufficiale for releasing not only the image-only PDF originally received from the Government, but also, on 2020-05-20, the text version (used to load the database).
Connecting different data points to identify potential correlations, as part of my knowledge update/learning process (and to complement my other publication activities)