Probelab Network Crawls Dataset

Probelab networking datasets for different web3 networks

@probelab.nebula_crawls

About this Dataset

Nebula is Probelab's DHT crawler and monitor, designed to track the liveness and availability of peers. Nebula-based experiments aim to improve the resilience and reliability of distributed systems by building better tools for monitoring and managing decentralized peer-to-peer networks.

This dataset contains historical crawl data for several networks, including peer interactions, connection errors, protocols, multi-addresses, and timestamps. Crawls run every 2 hours, so the data shows how each network's structure and behavior change over time. It is useful for researchers, developers, and network analysts interested in peer connectivity, protocol usage, and network reliability. Both successful and failed connection attempts are recorded, along with detailed error information, which makes the data suitable for troubleshooting and optimizing network performance.

Each table in the dataset holds the crawl data for one network. All tables share the same 10-column schema.

Key Features

1. Crawl Identification

  • crawl_id: A unique ID identifying the crawl that generated this datapoint. Crawls are performed every 2 hours.

2. Peer Information

  • multi_hash: The multi-hash of the peer being crawled.
  • agent_version: The agent version that this peer advertised. Can be null.
  • peer_properties: Generic properties specific to certain networks.

3. Error Information

  • connect_error: The error when trying to establish a connection to the peer.
  • crawl_error: The error after successfully connecting to the peer and trying to drain its routing table.
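
The two error columns describe successive stages of a visit, so a row can be sorted into one of three outcomes. The following is a minimal sketch, assuming that a missing error is stored as NULL (None in Python) or an empty string; the function name and the outcome labels are hypothetical and not part of the dataset.

```python
from typing import Optional


def visit_outcome(connect_error: Optional[str], crawl_error: Optional[str]) -> str:
    """Classify one row by which error column is populated.

    Assumption: "no error" is represented as None or an empty string.
    """
    if connect_error:
        # The crawler never established a connection to the peer.
        return "connect_failed"
    if crawl_error:
        # Connected, but draining the peer's routing table failed.
        return "crawl_failed"
    return "ok"
```

Checking connect_error first mirrors the order of events: a crawl error can only occur after a connection has been established.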

4. Timestamps

  • visit_started_at: The timestamp when the crawler started visiting the peer.
  • visit_ended_at: The timestamp when the visit to the peer finished.
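
The difference between the two timestamps gives the time spent on a single row. A minimal sketch, assuming the timestamps have been exported as ISO 8601 strings (for example from a CSV download); the function name is hypothetical. If your client already returns datetime objects, subtract them directly.

```python
from datetime import datetime


def visit_duration_seconds(started_at: str, ended_at: str) -> float:
    """Return the elapsed time between the two timestamps in seconds.

    Assumption: both values are ISO 8601 strings such as
    "2024-05-01 12:00:00" and share the same time zone handling.
    """
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(ended_at)
    return (end - start).total_seconds()
```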

5. Protocols and Addresses

  • protocols: A sorted array of advertised protocols. Can be empty if the connection was not established or the libp2p identify exchange was not completed.
  • multi_addresses: A sorted array of advertised multi-addresses used to connect to the peer.
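
Both array columns are declared as VARCHAR in the table definitions, so they need to be decoded before use. The sketch below assumes the arrays are serialized as JSON text, which this page does not confirm; check a sample row first. The helper names are hypothetical. It counts the leading component of each multi-address (ip4, ip6, dns, and so on) as a rough view of how a peer is reachable.

```python
import json
from collections import Counter
from typing import List, Optional


def parse_array(value: Optional[str]) -> List[str]:
    """Decode an array column stored as VARCHAR.

    Assumption: the value is a JSON array of strings. NULL or an empty
    string is treated as an empty array.
    """
    if not value:
        return []
    return json.loads(value)


def address_families(multi_addresses: Optional[str]) -> Counter:
    """Count the first component of each multi-address, e.g. ip4 or ip6."""
    families = Counter()
    for addr in parse_array(multi_addresses):
        parts = addr.split("/")  # "/ip4/1.2.3.4/tcp/4001" -> ["", "ip4", ...]
        if len(parts) > 1 and parts[1]:
            families[parts[1]] += 1
    return families
```

The same parse_array helper applies to the protocols column, where an empty result means the connection or the identify exchange did not complete.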

Use Cases

  1. Network Analysis: Analyze the structure and behavior of each crawled network, including peer distribution and protocol usage.
  2. Error Analysis: Identify and troubleshoot common connection and crawl errors to improve network reliability.
  3. Protocol Usage: Study the adoption and usage of different protocols within each network.
  4. Peer Behavior: Investigate patterns in peer behavior, including connectivity and activity over time.
  5. Time-Series Analysis: Perform time-based analysis of network activity, error rates, and protocol adoption.
  6. Tool Development: Develop tools and applications that use the data to improve network monitoring and analysis.
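
As an illustration of the error and time-series use cases, the share of failed connection attempts can be computed per crawl. This is a minimal sketch that runs against an in-memory SQLite table holding a subset of the documented columns; the sample rows and error strings are made-up placeholders, not real data, and the query may need adjusting for the SQL dialect of the platform hosting the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Subset of the documented schema, enough for this example.
conn.execute(
    'CREATE TABLE ipfs ("crawl_id" BIGINT, "multi_hash" VARCHAR, '
    '"connect_error" VARCHAR, "crawl_error" VARCHAR)'
)

# Placeholder rows: peer names and error values are illustrative only.
rows = [
    (1, "peerA", None, None),
    (1, "peerB", "placeholder_connect_error", None),
    (2, "peerA", None, None),
    (2, "peerB", None, None),
    (2, "peerC", "placeholder_connect_error", None),
    (2, "peerD", None, "placeholder_crawl_error"),
]
conn.executemany("INSERT INTO ipfs VALUES (?, ?, ?, ?)", rows)

query = """
SELECT crawl_id,
       COUNT(*) AS visits,
       SUM(CASE WHEN connect_error IS NOT NULL THEN 1 ELSE 0 END) AS connect_failures
FROM ipfs
GROUP BY crawl_id
ORDER BY crawl_id
"""

# Map each crawl to the fraction of visits that failed to connect.
error_rates = {
    crawl_id: failures / visits
    for crawl_id, visits, failures in conn.execute(query)
}
```

Because crawls run every 2 hours, ordering the result by crawl_id gives a simple time series of the connection failure rate.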

Tables

Celestia Network Information

@probelab.nebula_crawls.celestia
  • 57.15 MB
  • 2826277 rows
  • 10 columns

CREATE TABLE celestia (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);

Ethereum Consensus Layer Network Information

@probelab.nebula_crawls.ethcl
  • 39.59 GB
  • 209551928 rows
  • 10 columns

CREATE TABLE ethcl (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);

Ethereum Execution Layer Network Information

@probelab.nebula_crawls.ethel
  • 80.58 GB
  • 408147260 rows
  • 10 columns

CREATE TABLE ethel (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);

Filecoin Network Information

@probelab.nebula_crawls.filecoin
  • 521.61 MB
  • 21821810 rows
  • 10 columns

CREATE TABLE filecoin (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);

IPFS Network Information

@probelab.nebula_crawls.ipfs
  • 8.45 GB
  • 88068529 rows
  • 10 columns

CREATE TABLE ipfs (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);

Polkadot Network Information

@probelab.nebula_crawls.polkadot
  • 2.85 GB
  • 154844910 rows
  • 10 columns

CREATE TABLE polkadot (
  "crawl_id" BIGINT,
  "multi_hash" VARCHAR,
  "agent_version" VARCHAR,
  "connect_error" VARCHAR,
  "crawl_error" VARCHAR,
  "peer_properties" VARCHAR,
  "visit_started_at" TIMESTAMP,
  "visit_ended_at" TIMESTAMP,
  "protocols" VARCHAR,
  "multi_addresses" VARCHAR
);
