Global Hotspots Of Sharks And Longline Fishing
Machine-Learning-Assisted Spatial Distribution of At-Risk Species
@kaggle.thedevastator_global_hotspots_of_sharks_and_longline_fishing
By [source]
This dataset provides a global assessment of hotspots for shark interactions with industrial longline fisheries. It uses machine-learning techniques to identify at-risk shark species and their spatial distribution patterns, highlighting risk areas for threatened shark populations. Its parameters, including catch size, catch units, fish group, and presence/absence of species, can be used to better understand which fishing activities pose a potential threat to sharks and which do not. This information can support the conservation of fragile ocean ecosystems by steering management strategies toward sustainability, helping to keep oceans healthy for generations to come.
This dataset provides valuable insights into the spatial distribution of shark interactions with industrial longline fisheries. It can be used by researchers and conservationists to understand potential risk areas for endangered shark populations in different parts of the world, as well as for developing targeted strategies and measures to protect them.
To use this dataset effectively, it is important to understand its structure and content. Each observation is described by columns including .pred_class (predicted class of the observation), pres_abs (presence or absence of the species), catch (catch data for the species), rfmo (Regional Fisheries Management Organization), year (year of the observation), latitude/longitude (location information), and a range of environmental variables such as sea surface temperature, sea surface height, and chlorophyll-a concentration. The catch data has been transformed using several methods so that it is easier to use in developing predictive models.
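
As a minimal sketch of reading one of the prediction files with the columns described above, the snippet below parses a few rows using Python's standard `csv` module. The sample rows are invented for illustration and are not taken from the dataset, and the `present`/`absent` coding of `pres_abs` is an assumption; in practice the in-memory string would be replaced by the opened CSV file.

```python
import csv
import io

# Hypothetical sample rows; all values are invented for illustration only.
# The "present"/"absent" coding of pres_abs is an assumption.
SAMPLE = """.pred_class,pres_abs,catch,rfmo,year,latitude,longitude
1,present,12.0,IOTC,2015,-12.5,55.5
0,absent,0.0,IOTC,2015,-17.5,60.5
1,present,3.0,IOTC,2016,-12.5,55.5
"""

def load_rows(handle):
    """Parse rows and convert the numeric columns used here to Python numbers."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "pred_class": int(raw[".pred_class"]),  # note the leading dot in the CSV header
            "pres_abs": raw["pres_abs"],
            "catch": float(raw["catch"]),
            "rfmo": raw["rfmo"],
            "year": int(raw["year"]),
            "latitude": float(raw["latitude"]),
            "longitude": float(raw["longitude"]),
        })
    return rows

rows = load_rows(io.StringIO(SAMPLE))
# Keep only the observations where the species was recorded as present.
present = [r for r in rows if r["pres_abs"] == "present"]
print(len(rows), len(present))
```

Converting types at load time keeps later filtering and aggregation free of string comparisons on numeric fields.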
In addition to these variables, the dataset includes price information associated with each observed interaction, as well as results from machine-learning models: random forest classification and regression models, their tuning settings (such as the number of variables sampled at each split and the minimum node size), and error scores such as root mean square error and mean absolute error. The results of these models can help identify potential hotspots for future interactions between sharks and industrial longline fishing operations, which may inform better policies for preserving threatened shark populations around the world.
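
To make the error scores mentioned above concrete, here is a small sketch that computes RMSE, MAE, and R² for a set of observed and predicted values. The numbers are invented, and R² is computed here as 1 − SS_res/SS_tot; this is an assumption, since the dataset's `rsq` column may use a different definition (for example, the squared correlation).

```python
import math

def regression_metrics(observed, predicted):
    """Return (rmse, mae, rsq) for paired observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    rsq = 1.0 - ss_res / ss_tot                    # assumed definition of R^2
    return rmse, mae, rsq

# Hypothetical values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 9.0]
rmse, mae, rsq = regression_metrics(observed, predicted)
print(round(rmse, 3), mae, rsq)
```

RMSE penalizes large errors more heavily than MAE, so comparing the two can hint at whether a model's error is dominated by a few large misses.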
- This dataset can be used to predict future patterns of shark interactions with industrial longline fisheries, as well as identify hotspots of activity.
- This dataset can provide valuable insight into how human activities and climate change may be impacting sharks and their environments.
- This dataset can help provide early warnings so that conservation efforts can focus on particular areas and protect threatened species from unsustainable exploitation or other anthropogenic threats (e.g., habitat degradation).
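
One way to approach the hotspot use case listed above is to total the predicted values per grid cell and rank the cells. The sketch below assumes that `.final_pred` is a numeric predicted catch and that latitude/longitude identify a grid cell; both are assumptions, and the records are invented.

```python
from collections import defaultdict

# Hypothetical records: (latitude, longitude, final_pred). Values are invented.
records = [
    (-12.5, 55.5, 4.0),
    (-12.5, 55.5, 6.0),
    (-17.5, 60.5, 1.5),
    (2.5, 70.5, 3.0),
]

def rank_hotspots(records, top_n=2):
    """Sum predictions per (latitude, longitude) cell and return the top cells."""
    totals = defaultdict(float)
    for lat, lon, pred in records:
        totals[(lat, lon)] += pred
    # Sort cells by total predicted value, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

hotspots = rank_hotspots(records)
for (lat, lon), total in hotspots:
    print(lat, lon, total)
```

Summing across years treats all years equally; grouping by `(year, latitude, longitude)` instead would show how hotspots shift over time.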
If you use this dataset in your research, please credit the original authors.
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: IOTC_ll_untuned_final_predict.csv
| Column name | Description |
|---|---|
| .pred_class | Predicted class of the species (String) |
| pres_abs | Presence or absence of the species in the region (Boolean) |
| catch | Total catch of the species (Integer) |
| rfmo | Regional Fisheries Management Organization (String) |
| year | Year of the data (Integer) |
| latitude | Latitude of the location (Float) |
| longitude | Longitude of the location (Float) |
| species_sciname | Scientific name of the species (String) |
| catch_units | Units of the catch (String) |
| gear_group | Type of fishing gear used (String) |
| spatial_notes | Notes about the spatial distribution of the species (String) |
| original_effort | Original effort of the fishing gear (Integer) |
| species_commonname | Common name of the species (String) |
| species_group | Group of the species (String) |
| species_resolution | Resolution of the species (String) |
| median_price_group | Median price of the group (Float) |
| median_price_species | Median price of the species (Float) |
| sdm | Species distribution model (SDM) value (Float) |
| zone | Zone of the location (String) |
| location_cluster | Cluster of the location (String) |
| mean_sst | Mean sea surface temperature (Float) |
| median_sst | Median sea surface temperature (Float) |
| min_sst | Minimum sea surface temperature (Float) |
| max_sst | Maximum sea surface temperature (Float) |
| sd_sst | Standard deviation of sea surface temperature (Float) |
| se_sst | Standard error of sea surface temperature (Float) |
| cv_sst | Coefficient of variation of sea surface temperature (Float) |
| mean_chla | Mean chlorophyll-a concentration (Float) |
| median_chla | Median chlorophyll-a concentration (Float) |
| min_chla | Minimum chlorophyll-a concentration (Float) |
| max_chla | Maximum chlorophyll-a concentration (Float) |
| min_ssh | Minimum sea surface height (Float) |
| max_ssh | Maximum sea surface height (Float) |
| sd_ssh | Standard deviation of sea surface height (Float) |
| se_ssh | Standard error of sea surface height (Float) |
| cv_ssh | Coefficient of variation of sea surface height (Float) |
| bycatch_total_effort_portugal_longline | Total bycatch effort of Portugal longline (Integer) |
| bycatch_total_effort_spain_longline | Total bycatch effort of Spain longline (Integer) |
| bycatch_total_effort_france_longline | Total bycatch effort of France longline (Integer) |
| bycatch_total_effort_india_longline | Total bycatch effort of India longline (Integer) |
| bycatch_total_effort_seychelles_longline | Total bycatch effort of Seychelles longline (Integer) |
| bycatch_total_effort_taiwan_longline | Total bycatch effort of Taiwan longline (Integer) |
| bycatch_total_effort_madagascar_longline | Total bycatch effort of Madagascar longline (Integer) |
| bycatch_total_effort_mauritius_longline | Total bycatch effort of Mauritius longline (Integer) |
| bycatch_total_effort_united_kingdom_longline | Total bycatch effort of United Kingdom longline (Integer) |
| bycatch_total_effort_australia_longline | Total bycatch effort of Australia longline (Integer) |
| bycatch_total_effort_mozambique_longline | Total bycatch effort of Mozambique longline (Integer) |
| bycatch_total_effort_malaysia_longline | Total bycatch effort of Malaysia longline (Integer) |
| bycatch_total_effort_indonesia_longline | Total bycatch effort of Indonesia longline (Integer) |
| bycatch_total_effort_kenya_longline | Total bycatch effort of Kenya longline (Integer) |
| .final_pred | Predicted class of the species (String) |
| bycatch_total_effort | Total bycatch effort (Integer) |
| bycatch_total_effort_china_longline | Total bycatch effort of China longline (Integer) |
| bycatch_total_effort_korea_longline | Total bycatch effort of Korea longline (Integer) |
| bycatch_total_effort_japan_longline | Total bycatch effort of Japan longline (Integer) |
| sd_chla | Standard deviation of chlorophyll-a concentration (Float) |
| se_chla | Standard error of chlorophyll-a concentration (Float) |
| cv_chla | Coefficient of variation of chlorophyll-a concentration (Float) |
| mean_ssh | Mean sea surface height (Float) |
| median_ssh | Median sea surface height (Float) |
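
The environmental columns above follow a common pattern of summary statistics (mean, median, min, max, sd, se, cv). The sketch below shows how such values relate to one another for a handful of invented sea surface temperature readings. It assumes the sample standard deviation, se = sd / √n, and cv = sd / mean; the dataset's exact conventions are not stated.

```python
import math
import statistics

def summarize(values):
    """Compute the summary statistics named in the table for one variable."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "se": sd / math.sqrt(n),   # standard error of the mean (assumed definition)
        "cv": sd / mean,           # coefficient of variation as a ratio (assumed)
    }

# Hypothetical sea surface temperature readings in degrees Celsius.
sst_samples = [26.0, 27.0, 28.0, 29.0, 30.0]
summary = summarize(sst_samples)
print(summary)
```
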
File: WCPFC_ll_models_others_results.csv
| Column name | Description |
|---|---|
| environmental_value | The environmental value associated with the area. (Float) |
| include_ssh | Whether or not sea surface height was included in the model. (Boolean) |
| price | The price of the data. (Float) |
| catch_transformation | The transformation applied to the catch data. (String) |
| mtry_class | The number of variables randomly sampled at each split in the classification model. (Integer) |
| min_n_class | The minimum number of observations a node must contain to be split further in the classification model. (Integer) |
| mtry_reg | The number of variables randomly sampled at each split in the regression model. (Integer) |
| min_n_reg | The minimum number of observations a node must contain to be split further in the regression model. (Integer) |
| rmse | The root mean square error. (Float) |
| rsq | The coefficient of determination. (Float) |
| mae | The mean absolute error. (Float) |
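
A typical use of a results table like this is to pick the tuning configuration with the lowest error. The following sketch does that over a few invented rows that use the column names above; the values are hypothetical and do not come from the file.

```python
# Hypothetical tuning results; all values are invented for illustration.
results = [
    {"include_ssh": True, "mtry_reg": 4, "min_n_reg": 5, "rmse": 2.1, "rsq": 0.61, "mae": 1.2},
    {"include_ssh": True, "mtry_reg": 8, "min_n_reg": 10, "rmse": 1.8, "rsq": 0.68, "mae": 1.0},
    {"include_ssh": False, "mtry_reg": 8, "min_n_reg": 5, "rmse": 1.9, "rsq": 0.66, "mae": 1.1},
]

def best_by_rmse(rows):
    """Return the row with the smallest root mean square error."""
    return min(rows, key=lambda row: row["rmse"])

best = best_by_rmse(results)
# Restrict the comparison to models that did not use sea surface height.
best_without_ssh = best_by_rmse([r for r in results if not r["include_ssh"]])
print(best["mtry_reg"], best["min_n_reg"], best["rmse"])
```

Comparing the best row with and without `include_ssh` gives a rough sense of how much sea surface height contributes to predictive accuracy.
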
CREATE TABLE iattc_ll_models_others_results (
"environmental_value" VARCHAR,
"include_ssh" BOOLEAN,
"price" VARCHAR,
"catch_transformation" VARCHAR,
"mtry_class" BIGINT,
"min_n_class" BIGINT,
"mtry_reg" BIGINT,
"min_n_reg" BIGINT,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE iattc_ll_untuned_final_predict (
"n__pred_class" BIGINT, -- .pred Class
"pres_abs" VARCHAR,
"catch" DOUBLE,
"rfmo" VARCHAR,
"year" BIGINT,
"latitude" DOUBLE,
"longitude" DOUBLE,
"species_sciname" VARCHAR,
"gear_group" VARCHAR,
"spatial_notes" VARCHAR,
"species_commonname" VARCHAR,
"species_group" VARCHAR,
"species_resolution" VARCHAR,
"median_price_group" DOUBLE,
"median_price_species" DOUBLE,
"catch_units" VARCHAR,
"original_effort" DOUBLE,
"sdm" DOUBLE,
"zone" VARCHAR,
"location_cluster" BIGINT,
"target_effort" DOUBLE,
"target_effort_belize_longline" DOUBLE,
"target_effort_china_longline" DOUBLE,
"target_effort_french_polynesia_longline" DOUBLE,
"target_effort_japan_longline" DOUBLE,
"target_effort_korea_longline" DOUBLE,
"target_effort_spain_longline" DOUBLE,
"target_effort_taipei_longline" DOUBLE,
"target_effort_united_states_of_america_longline" DOUBLE,
"target_effort_vanuatu_longline" DOUBLE,
"target_effort_panama_longline" DOUBLE,
"mean_sst" DOUBLE,
"median_sst" DOUBLE,
"min_sst" DOUBLE,
"max_sst" DOUBLE,
"sd_sst" DOUBLE,
"se_sst" DOUBLE,
"cv_sst" DOUBLE,
"mean_chla" DOUBLE,
"median_chla" DOUBLE,
"min_chla" DOUBLE,
"max_chla" DOUBLE,
"sd_chla" VARCHAR,
"se_chla" VARCHAR,
"cv_chla" VARCHAR,
"mean_ssh" DOUBLE,
"median_ssh" DOUBLE,
"min_ssh" DOUBLE,
"max_ssh" DOUBLE,
"sd_ssh" DOUBLE,
"se_ssh" DOUBLE,
"cv_ssh" DOUBLE,
"n__pred" DOUBLE, -- .pred
"n__final_pred" DOUBLE -- .final Pred
);

CREATE TABLE iccat_ll_models_others_results (
"environmental_value" VARCHAR,
"include_ssh" BOOLEAN,
"price" VARCHAR,
"catch_transformation" VARCHAR,
"mtry_class" BIGINT,
"min_n_class" BIGINT,
"mtry_reg" BIGINT,
"min_n_reg" BIGINT,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE iccat_ll_untuned_final_predict (
"n__pred_class" BIGINT, -- .pred Class
"pres_abs" VARCHAR,
"catch" DOUBLE,
"rfmo" VARCHAR,
"year" BIGINT,
"latitude" BIGINT,
"longitude" BIGINT,
"species_sciname" VARCHAR,
"catch_units" VARCHAR,
"gear_group" VARCHAR,
"spatial_notes" VARCHAR,
"original_effort" BIGINT,
"species_commonname" VARCHAR,
"species_group" VARCHAR,
"species_resolution" VARCHAR,
"median_price_group" DOUBLE,
"median_price_species" DOUBLE,
"sdm" DOUBLE,
"zone" VARCHAR,
"location_cluster" BIGINT,
"target_effort" DOUBLE,
"target_effort_belize_longline" DOUBLE,
"target_effort_china_longline" DOUBLE,
"target_effort_japan_longline" DOUBLE,
"target_effort_korea_longline" DOUBLE,
"target_effort_spain_longline" DOUBLE,
"target_effort_taipei_longline" DOUBLE,
"target_effort_united_states_of_america_longline" DOUBLE,
"target_effort_vanuatu_longline" DOUBLE,
"mean_sst" DOUBLE,
"median_sst" DOUBLE,
"min_sst" DOUBLE,
"max_sst" DOUBLE,
"sd_sst" DOUBLE,
"se_sst" DOUBLE,
"cv_sst" DOUBLE,
"mean_chla" DOUBLE,
"median_chla" DOUBLE,
"min_chla" DOUBLE,
"max_chla" DOUBLE,
"sd_chla" DOUBLE,
"se_chla" DOUBLE,
"cv_chla" DOUBLE,
"mean_ssh" DOUBLE,
"median_ssh" DOUBLE,
"min_ssh" DOUBLE,
"max_ssh" DOUBLE,
"sd_ssh" DOUBLE,
"se_ssh" DOUBLE,
"cv_ssh" DOUBLE,
"target_effort_barbados_longline" DOUBLE,
"target_effort_bermuda_longline" DOUBLE,
"target_effort_brazil_longline" DOUBLE,
"target_effort_canada_longline" DOUBLE,
"target_effort_malta_longline" DOUBLE,
"target_effort_maroc_longline" DOUBLE,
"target_effort_mexico_longline" DOUBLE,
"target_effort_namibia_longline" DOUBLE,
"target_effort_philippines_longline" DOUBLE,
"target_effort_st_vincent_and_grenadines_longline" DOUBLE,
"target_effort_trinidad_and_tobago_longline" DOUBLE,
"target_effort_uruguay_longline" DOUBLE,
"target_effort_venezuela_longline" DOUBLE,
"target_effort_turks_and_caicos_longline" DOUBLE,
"target_effort_great_britain_longline" DOUBLE,
"target_effort_cote_d_ivoire_longline" DOUBLE,
"target_effort_cyprus_longline" DOUBLE,
"n__pred" DOUBLE, -- .pred
"n__final_pred" DOUBLE -- .final Pred
);

CREATE TABLE iotc_ll_models_others_results (
"environmental_value" VARCHAR,
"include_ssh" BOOLEAN,
"price" VARCHAR,
"catch_transformation" VARCHAR,
"mtry_class" BIGINT,
"min_n_class" BIGINT,
"mtry_reg" BIGINT,
"min_n_reg" BIGINT,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE iotc_ll_untuned_final_predict (
"n__pred_class" BIGINT, -- .pred Class
"pres_abs" VARCHAR,
"catch" DOUBLE,
"rfmo" VARCHAR,
"year" BIGINT,
"latitude" DOUBLE,
"longitude" BIGINT,
"species_sciname" VARCHAR,
"catch_units" VARCHAR,
"gear_group" VARCHAR,
"spatial_notes" VARCHAR,
"original_effort" BIGINT,
"species_commonname" VARCHAR,
"species_group" VARCHAR,
"species_resolution" VARCHAR,
"median_price_group" DOUBLE,
"median_price_species" DOUBLE,
"sdm" DOUBLE,
"zone" VARCHAR,
"location_cluster" BIGINT,
"bycatch_total_effort" DOUBLE,
"bycatch_total_effort_china_longline" DOUBLE,
"bycatch_total_effort_korea_longline" DOUBLE,
"bycatch_total_effort_japan_longline" DOUBLE,
"mean_sst" DOUBLE,
"median_sst" DOUBLE,
"min_sst" DOUBLE,
"max_sst" DOUBLE,
"sd_sst" DOUBLE,
"se_sst" DOUBLE,
"cv_sst" DOUBLE,
"mean_chla" DOUBLE,
"median_chla" DOUBLE,
"min_chla" DOUBLE,
"max_chla" DOUBLE,
"sd_chla" DOUBLE,
"se_chla" DOUBLE,
"cv_chla" DOUBLE,
"mean_ssh" DOUBLE,
"median_ssh" DOUBLE,
"min_ssh" DOUBLE,
"max_ssh" DOUBLE,
"sd_ssh" DOUBLE,
"se_ssh" DOUBLE,
"cv_ssh" DOUBLE,
"bycatch_total_effort_portugal_longline" DOUBLE,
"bycatch_total_effort_spain_longline" DOUBLE,
"bycatch_total_effort_france_longline" DOUBLE,
"bycatch_total_effort_india_longline" DOUBLE,
"bycatch_total_effort_seychelles_longline" DOUBLE,
"bycatch_total_effort_taiwan_longline" DOUBLE,
"bycatch_total_effort_madagascar_longline" DOUBLE,
"bycatch_total_effort_mauritius_longline" DOUBLE,
"bycatch_total_effort_united_kingdom_longline" DOUBLE,
"bycatch_total_effort_australia_longline" DOUBLE,
"bycatch_total_effort_mozambique_longline" DOUBLE,
"bycatch_total_effort_malaysia_longline" DOUBLE,
"bycatch_total_effort_indonesia_longline" DOUBLE,
"bycatch_total_effort_kenya_longline" DOUBLE,
"n__pred" DOUBLE, -- .pred
"n__final_pred" DOUBLE -- .final Pred
);

CREATE TABLE n_1x1_count_all_rfmos_ll_effort_results (
"model" VARCHAR,
"effort_source" VARCHAR,
"rfmo" VARCHAR,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE n_1x1_mt_to_count_all_rfmos_ll_effort_results (
"model" VARCHAR,
"effort_source" VARCHAR,
"rfmo" VARCHAR,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE n_5x5_count_all_rfmos_ll_effort_results (
"model" VARCHAR,
"effort_source" VARCHAR,
"rfmo" VARCHAR,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE n_5x5_mt_to_count_all_rfmos_ll_effort_results (
"model" VARCHAR,
"effort_source" VARCHAR,
"rfmo" VARCHAR,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE wcpfc_ll_models_others_results (
"environmental_value" VARCHAR,
"include_ssh" BOOLEAN,
"price" VARCHAR,
"catch_transformation" VARCHAR,
"mtry_class" BIGINT,
"min_n_class" BIGINT,
"mtry_reg" BIGINT,
"min_n_reg" BIGINT,
"rmse" DOUBLE,
"rsq" DOUBLE,
"mae" DOUBLE
);

CREATE TABLE wcpfc_ll_untuned_final_predict (
"n__pred_class" BIGINT, -- .pred Class
"pres_abs" VARCHAR,
"catch" BIGINT,
"rfmo" VARCHAR,
"year" BIGINT,
"latitude" DOUBLE,
"longitude" DOUBLE,
"species_sciname" VARCHAR,
"catch_units" VARCHAR,
"gear_group" VARCHAR,
"spatial_notes" VARCHAR,
"original_effort" BIGINT,
"species_commonname" VARCHAR,
"species_group" VARCHAR,
"species_resolution" VARCHAR,
"median_price_group" DOUBLE,
"median_price_species" DOUBLE,
"sdm" DOUBLE,
"zone" VARCHAR,
"location_cluster" BIGINT,
"bycatch_total_effort" DOUBLE,
"mean_sst" DOUBLE,
"median_sst" DOUBLE,
"min_sst" DOUBLE,
"max_sst" DOUBLE,
"sd_sst" DOUBLE,
"se_sst" DOUBLE,
"cv_sst" DOUBLE,
"mean_chla" DOUBLE,
"median_chla" DOUBLE,
"min_chla" DOUBLE,
"max_chla" DOUBLE,
"sd_chla" DOUBLE,
"se_chla" DOUBLE,
"cv_chla" DOUBLE,
"mean_ssh" DOUBLE,
"median_ssh" DOUBLE,
"min_ssh" DOUBLE,
"max_ssh" DOUBLE,
"sd_ssh" DOUBLE,
"se_ssh" DOUBLE,
"cv_ssh" DOUBLE,
"bycatch_total_effort_na_longline" DOUBLE,
"n__pred" DOUBLE, -- .pred
"n__final_pred" DOUBLE -- .final Pred
);
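
As a rough sketch of how the schema above might be queried, the snippet below creates the `n_1x1_count_all_rfmos_ll_effort_results` table in an in-memory SQLite database via Python's standard `sqlite3` module and selects, for each RFMO, the row with the lowest RMSE. The inserted rows, including the `model` and `effort_source` values, are invented placeholders, and SQLite is only a stand-in for whichever database engine actually hosts these tables.

```python
import sqlite3

TABLE = "n_1x1_count_all_rfmos_ll_effort_results"

conn = sqlite3.connect(":memory:")
conn.execute(f"""
CREATE TABLE {TABLE} (
    "model" VARCHAR,
    "effort_source" VARCHAR,
    "rfmo" VARCHAR,
    "rmse" DOUBLE,
    "rsq" DOUBLE,
    "mae" DOUBLE
)
""")

# Hypothetical rows; model and effort_source labels are placeholders.
sample_rows = [
    ("model_a", "source_a", "IOTC", 2.0, 0.50, 1.0),
    ("model_a", "source_b", "IOTC", 1.5, 0.60, 0.9),
    ("model_a", "source_a", "ICCAT", 3.0, 0.40, 2.0),
]
conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

# For each RFMO, join back to the per-RFMO minimum RMSE to get the full row.
best_per_rfmo = conn.execute(f"""
SELECT r.rfmo, r.effort_source, r.rmse
FROM {TABLE} AS r
JOIN (SELECT rfmo, MIN(rmse) AS best_rmse FROM {TABLE} GROUP BY rfmo) AS b
  ON r.rfmo = b.rfmo AND r.rmse = b.best_rmse
ORDER BY r.rfmo
""").fetchall()
conn.close()
print(best_per_rfmo)
```

The join against a grouped subquery is used instead of engine-specific shortcuts so the same query shape should carry over to other SQL dialects.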