NYC STEW-MAP Staten Island Organizations' Website Hyperlink Webscrape

U.S. Environmental Protection Agency

@usgov.epa_gov_nyc_stew_map_staten_island_organizations_websi_c1cab08a

About this Dataset

The data are the result of web-scraping hyperlinks from the websites of a selection of environmental stewardship organizations identified in the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017). There are two data sets: 1) the original scrape, containing all hyperlinks within the websites and their associated attribute values (see "README" file); 2) a cleaned and reduced dataset formatted for network analysis.

For dataset 1: Organizations were selected from the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017), a publicly available spatial data set about environmental stewardship organizations working in New York City, USA (N = 719). To create a smaller, more manageable, and geographically bounded sample, all organizations that intersected the NYC borough of Staten Island (i.e., worked entirely within or overlapped it) were selected. Only organizations with working websites that the web scraper could access were retained for the study (n = 78). The websites were scraped between 09 and 17 June 2020 to a maximum search depth of ten using the snaWeb package (version 1.0.1, Stockton 2020) in the R computational language environment (R Core Team 2020).

For dataset 2: The complete scrape results were cleaned, reduced, and formatted as a standard edge-array (node1, node2, edge attribute) for network analysis. See "README" file for further details.
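As an illustration of how an edge-array of this shape might be used, the following minimal Python sketch builds a directed adjacency map from (node1, node2, edge attribute) triples and optionally drops edges found beyond a chosen search depth. It assumes the edge attribute is the scrape depth, which the description does not state explicitly. The function names, the `max_depth` parameter, and the sample rows are hypothetical and are not taken from the dataset or from the snaWeb package.

```python
from collections import defaultdict


def build_adjacency(edges, max_depth=None):
    """Build a directed adjacency map from (node1, node2, depth) triples.

    Edges whose depth exceeds max_depth are skipped; None keeps all edges.
    """
    adjacency = defaultdict(set)
    for source, target, depth in edges:
        if max_depth is not None and depth > max_depth:
            continue
        adjacency[source].add(target)
    return dict(adjacency)


def out_degree(adjacency):
    """Count the distinct targets linked from each source node."""
    return {node: len(targets) for node, targets in adjacency.items()}


# Hypothetical sample rows for illustration only (not real dataset values).
sample_edges = [
    ("org-a.example", "org-b.example", 1),
    ("org-a.example", "org-c.example", 3),
    ("org-b.example", "org-c.example", 2),
    ("org-a.example", "org-b.example", 4),
]

shallow = build_adjacency(sample_edges, max_depth=2)  # edges found at depth <= 2
full = build_adjacency(sample_edges)                  # all edges
```

Comparing `out_degree(shallow)` with `out_degree(full)` shows how the apparent network changes as deeper links are included, which is the kind of comparison the associated publication is concerned with.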

References:
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Version 4.0.3.

Stockton, T. (2020). snaWeb Package: An R package for finding and building social networks for a website, version 1.0.1.

USDA Forest Service. (2017). Stewardship Mapping and Assessment Project (STEW-MAP). New York City Data Set. Available online at https://www.nrs.fs.fed.us/STEW-MAP/data/.

This dataset is associated with the following publication:
Sayles, J., R. Furey, and M. Ten Brink. How deep to dig: effects of web-scraping search depth on hyperlink network analysis of environmental stewardship organizations. Applied Network Science. Springer Nature, New York, NY, 7: 36, (2022).
Organization: U.S. Environmental Protection Agency
Last updated: 2022-11-21T13:44:08.015343
Tags: decision-support-tools, environmental-governance, environmental-stewardship, hyperlink-networks, sna, social-network-analysis, web-scraping

Tables

@usgov.epa_gov_nyc_stew_map_staten_island_organizations_websi_c1cab08a.nyc_staten_island_stew_map_hyperlink_webscrape_reduced_9fecbda1
  • 36.1 KB
  • 2565 rows
  • 3 columns

CREATE TABLE nyc_staten_island_stew_map_hyperlink_webscrape_reduced_9fecbda1 (
  "searchurl_cleaned" VARCHAR,
  "root_with_recodes" VARCHAR,
  "depth" BIGINT
);
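One way to explore a table with this schema is sketched below in Python, using the standard-library sqlite3 module to load the schema into an in-memory database and count rows per depth value. This is only a sketch under stated assumptions: SQLite stands in for whatever engine actually hosts the table, the inserted rows are hypothetical placeholders rather than real dataset values, and the helper name `rows_per_depth` is invented for illustration.

```python
import sqlite3

TABLE = "nyc_staten_island_stew_map_hyperlink_webscrape_reduced_9fecbda1"

# Schema copied from the table definition above.
SCHEMA = f"""
CREATE TABLE {TABLE} (
  "searchurl_cleaned" VARCHAR,
  "root_with_recodes" VARCHAR,
  "depth" BIGINT
);
"""


def rows_per_depth(conn):
    """Return (depth, row_count) pairs ordered by depth."""
    query = (
        f'SELECT "depth", COUNT(*) FROM {TABLE} '
        'GROUP BY "depth" ORDER BY "depth"'
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Hypothetical placeholder rows (not real dataset values).
conn.executemany(
    f"INSERT INTO {TABLE} VALUES (?, ?, ?)",
    [
        ("org-a.example", "org-b.example", 1),
        ("org-a.example", "org-c.example", 1),
        ("org-b.example", "org-c.example", 2),
    ],
)

depth_counts = rows_per_depth(conn)
```

With the real data loaded in place of the placeholder rows, the same query would show how the 2565 rows are distributed across depth values.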