Baselight

Paradise-Panama-Papers

Data Scientists United Against Corruption

@kaggle.zusmani_paradisepanamapapers


About this Dataset

Paradise-Panama-Papers

Context

The Paradise Papers is a cache of some 13GB of data containing 13.4 million confidential records of offshore investment by 120,000 people and companies in 19 tax jurisdictions (tax havens; see this video for an explanation). It was published by the International Consortium of Investigative Journalists (ICIJ) on November 5, 2017. Here is a brief video about the leak. The people named include Queen Elizabeth II, the President of Colombia (Juan Manuel Santos), the former Prime Minister of Pakistan (Shaukat Aziz), the U.S. Secretary of Commerce (Wilbur Ross) and many more. According to an estimate by the Boston Consulting Group, the amount of money involved is around $10 trillion. The leak also names many famous companies, including Facebook, Apple, Uber, Nike, Walmart, Allianz, Siemens, McDonald’s and Yahoo.

It also names many allies of U.S. President Donald Trump, including Rex Tillerson, Wilbur Ross, the Koch brothers, Paul Singer, Sheldon Adelson, Stephen Schwarzman, Thomas Barrack and Steve Wynn. The complete list of politicians involved is available here.

The Panama Papers is a cache of 38GB of data from the national corporate registry of the Bahamas. It names some of the world’s top politicians and influential persons as heads and directors of offshore companies registered in the Bahamas.

Offshore Leaks details 13,000 offshore accounts in a report.

I am calling on all data scientists to help me stop the corruption and reveal the patterns and linkages invisible to the untrained eye.

Content

The data is the effort of more than 100 journalists from 60+ countries.

The original data is available under a Creative Commons license and can be downloaded from this link.

I will keep updating the datasets with more leaks and data as they become available.

Acknowledgements

International Consortium of Investigative Journalists (ICIJ)

Paradise Papers Update

The Paradise Papers data has been uploaded as released by ICIJ on Nov 21, 2017. You can find the Paradise Papers zip file and six extracted files in CSV format, all starting with the prefix Paradise. Happy Coding!
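As a minimal sketch of how the extracted files could be located after download, the helper below lists CSV files whose names start with the Paradise prefix. The function name `find_paradise_csvs` and the directory argument are hypothetical, and the match is made case-insensitive on the assumption that file name casing may vary between mirrors.

```python
from pathlib import Path


def find_paradise_csvs(directory):
    """Return the sorted names of CSV files in `directory` that start with 'paradise'.

    The prefix check is case-insensitive; this is an assumption, not something
    the dataset description guarantees.
    """
    return sorted(
        path.name
        for path in Path(directory).iterdir()
        if path.is_file()
        and path.suffix.lower() == ".csv"
        and path.name.lower().startswith("paradise")
    )
```

Calling `find_paradise_csvs(".")` in the folder where the zip file was extracted should return the six CSV file names, if the files follow the naming described above.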

Inspiration

Some ideas worth exploring:

  1. How many companies and individuals are there in all of the leaked data?

  2. How many countries are involved?

  3. How much money is involved in total?

  4. What is the biggest and best tax haven?

  5. Can we compare corruption with the Human Development Index and argue that corruption correlates with poor conditions in a country?

  6. Who are the biggest cheaters and where do they live?

  7. What role do Fortune 500 companies play in this game?

I need your help to make this world corruption-free in the age of NLP and Big Data.

Tables

Paradise Papers Nodes Other

@kaggle.zusmani_paradisepanamapapers.paradise_papers_nodes_other
  • 56.98 KB
  • 2031 rows
  • 18 columns

CREATE TABLE paradise_papers_nodes_other (
  "labels_n" VARCHAR,
  "n_valid_until" VARCHAR,
  "n_country_codes" VARCHAR,
  "n_countries" VARCHAR,
  "n_node_id" BIGINT,
  "n_sourceid" VARCHAR,
  "n_address" VARCHAR,
  "n_name" VARCHAR,
  "n_jurisdiction_description" VARCHAR,
  "n_service_provider" VARCHAR,
  "n_jurisdiction" VARCHAR,
  "n_closed_date" VARCHAR,
  "n_incorporation_date" VARCHAR,
  "n_ibcruc" VARCHAR,
  "n_type" VARCHAR,
  "n_status" VARCHAR,
  "n_company_type" VARCHAR,
  "n_note" VARCHAR
);
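A minimal sketch of how this table could be queried, using Python's built-in `sqlite3` module with an in-memory database. It counts nodes per country, which relates to the "how many countries" idea listed under Inspiration. The schema is trimmed to the columns used, the sample rows are hypothetical and not taken from the real data, and the query assumes each `n_countries` value holds a single country name.

```python
import sqlite3

# In-memory database with a trimmed copy of the schema (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE paradise_papers_nodes_other (
      "labels_n" VARCHAR,
      "n_countries" VARCHAR,
      "n_node_id" BIGINT,
      "n_name" VARCHAR,
      "n_jurisdiction" VARCHAR
    )
    """
)

# Hypothetical rows for illustration only; they do not come from the dataset.
sample_rows = [
    ("Other", "Bermuda", 1, "Example Trust A", "BM"),
    ("Other", "Bermuda", 2, "Example Trust B", "BM"),
    ("Other", "Isle of Man", 3, "Example Fund C", "IM"),
    ("Other", None, 4, "Example Entity D", None),
]
conn.executemany(
    "INSERT INTO paradise_papers_nodes_other VALUES (?, ?, ?, ?, ?)",
    sample_rows,
)


def count_nodes_by_country(connection):
    """Return (country, node_count) pairs, most frequent first, skipping NULL countries."""
    query = """
        SELECT n_countries, COUNT(*) AS node_count
        FROM paradise_papers_nodes_other
        WHERE n_countries IS NOT NULL
        GROUP BY n_countries
        ORDER BY node_count DESC, n_countries ASC
    """
    return connection.execute(query).fetchall()


country_counts = count_nodes_by_country(conn)
for country, node_count in country_counts:
    print(f"{country}: {node_count}")
```

If the real `n_countries` values list several countries in one field, they would need to be split before grouping; the sketch does not handle that case.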
