Relato Business Graph Database
Visualizing Company Relationships & Market Trends
By Russell Jurney [source]
About this dataset
This dataset contains 373,663 links between businesses pulled from the web, providing an insightful reflection of intercompany relationships. It is sourced from a graph database hosted by the startup Relato, the origins of which stem from a Turk system in which users would log partnerships, example customers and more onto an autocomplete form over a database.
These edges are defined primarily using domain names as unique IDs - allowing for information to be collated without succumbing to the extremely complex problem of entity resolution of companies. This data was used to build lead generation systems and market visualization systems – enabling powerful insights into patterns emerging from business relations.
This largely unstructured dataset gives rise to many questions: How does the business graph operate? How do companies relate to one another? What other problems can this dataset be used for and how else can it be extended? By exploring such questions with this set we can enrich our understanding of corporate connections and discover potential opportunities for further research and marketing efforts.
Get involved with this project: Contribute new edges; add metadata about companies; or analyze this substantial source material alone or in conjunction with various other public datasets!
How to use the dataset
This dataset is a great resource for researchers and practitioners interested in understanding the business relationships between companies. The dataset contains 373,663 links between companies, including information about the type of link, time it was updated, and domains of the two companies. This can be used to identify new potential business partners or competitors by understanding connections within industries or networks of customers.
To get started, explore the columns provided in this dataset: update_time (the timestamp when an entry was last updated), domain (the hostname where a relationship was found for one company during mining), username (the user who last updated the entry), home_name (the name of the home company mentioned in the link information), link_name (the name of the linked company mentioned in the link information), type (the type of relationship between the home and linked company, such as a partnership), home_domain (the domain examined where the relationship was found for one company during mining), and link_domain (the domain of the second organisation). You can also load the data into a graph visualization tool such as Gephi to explore connection patterns.
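As a first pass over the columns, here is a minimal sketch using only the Python standard library. It assumes links.csv has a header row with the column names listed in this description. The sample rows, the example.com-style domains, the "customer" type value, and the timestamp layout are made up for illustration and are not taken from the real file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the links.csv layout; every value here is made up.
SAMPLE_LINKS_CSV = """update_time,username,domain,home_name,link_name,type,home_domain,link_domain
2016-01-05 10:00:00,alice,a.example,Alpha,Beta,partnership,a.example,b.example
2016-02-10 12:30:00,bob,a.example,Alpha,Gamma,customer,a.example,c.example
2016-04-20 09:15:00,alice,b.example,Beta,Gamma,partnership,b.example,c.example
"""


def load_links(file_obj):
    """Read link rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(file_obj))


def count_types(links):
    """Count how often each relationship type appears."""
    return Counter(row["type"] for row in links)


links = load_links(io.StringIO(SAMPLE_LINKS_CSV))
type_counts = count_types(links)
print(type_counts.most_common())
```

To run this against the real file, pass an open file handle instead of the in-memory sample, for example `load_links(open("links.csv", newline="", encoding="utf-8"))`; the encoding is an assumption.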
Say you want to explore connections within a certain industry: you could pull out the entries for that industry by filtering on column values. For example, to understand how tech companies are connected, search the columns for values such as 'Technology', 'Telecoms', and related words to narrow down the entries. Once you have identified those entries, use the links between firms to find details about their partners, customers, or potential investors.
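The keyword filtering described here could be sketched as follows. The helper names, the choice of columns to search, and the sample rows are assumptions for illustration. Plain substring matching is crude and will miss companies whose names or domains do not mention the industry.

```python
# Columns searched for keywords; this choice is an assumption, adjust as needed.
TEXT_COLUMNS = ("home_name", "link_name", "home_domain", "link_domain")


def matches_keywords(row, keywords, columns=TEXT_COLUMNS):
    """Return True if any keyword appears (case-insensitively) in the chosen columns."""
    text = " ".join((row.get(col) or "") for col in columns).lower()
    return any(keyword.lower() in text for keyword in keywords)


def filter_links(links, keywords):
    """Keep only the link rows that mention at least one keyword."""
    return [row for row in links if matches_keywords(row, keywords)]


# Hypothetical rows; the company names and domains are made up.
sample_links = [
    {"home_name": "Acme Telecoms", "link_name": "Lakeside Bakery",
     "home_domain": "acme.example", "link_domain": "lakeside.example"},
    {"home_name": "Northwind Technology", "link_name": "Acme Telecoms",
     "home_domain": "northwind.example", "link_domain": "acme.example"},
    {"home_name": "Lakeside Bakery", "link_name": "Harbor Florist",
     "home_domain": "lakeside.example", "link_domain": "harbor.example"},
]

tech_links = filter_links(sample_links, ["technology", "telecom"])
print(len(tech_links))
```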
Running graph analysis on your subset, for example looking at the most preferred connection points among firms, might help identify areas or markets for pursuing collaborations within your sector, or spot upcoming trends and risks around interdependencies among sectors over time. Breaking the data down by update time might also let you track shifting relationship dynamics from one quarter to the next. Finally, always double-check whether the observed patterns have enough evidence behind them: relationship data gathered manually from web information may be inaccurate or out of date, so be aware of the errors that commonly arise when extracting from internet sources, especially when done by hand.
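Two of the ideas above, finding the most connected firms and tracking updates by quarter, can be sketched with the standard library. This treats each link as an undirected edge between home_domain and link_domain, and it assumes update_time looks like `2016-01-05 10:00:00`. Both are assumptions; check the real file and adjust the format string if it differs. The sample rows are made up.

```python
from collections import Counter
from datetime import datetime


def degree_by_domain(links):
    """Count how many link rows each domain appears in (repeated rows count twice)."""
    degree = Counter()
    for row in links:
        degree[row["home_domain"]] += 1
        degree[row["link_domain"]] += 1
    return degree


def links_per_quarter(links, fmt="%Y-%m-%d %H:%M:%S"):
    """Count link rows per (year, quarter) of update_time; fmt is an assumed layout."""
    counts = Counter()
    for row in links:
        ts = datetime.strptime(row["update_time"], fmt)
        quarter = (ts.month - 1) // 3 + 1
        counts[(ts.year, quarter)] += 1
    return counts


# Hypothetical rows; domains and timestamps are made up.
sample_links = [
    {"update_time": "2016-01-05 10:00:00", "home_domain": "a.example", "link_domain": "b.example"},
    {"update_time": "2016-02-10 12:30:00", "home_domain": "a.example", "link_domain": "c.example"},
    {"update_time": "2016-04-20 09:15:00", "home_domain": "a.example", "link_domain": "d.example"},
]

degree = degree_by_domain(sample_links)
per_quarter = links_per_quarter(sample_links)
print(degree.most_common(1))
print(sorted(per_quarter.items()))
```

Because update_time records when a row was last edited rather than when a relationship began, quarterly counts reflect editing activity as much as real changes between companies.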
Hopefully these tips provide a starting point for exploring such graphs. Have fun data hunting!
Research Ideas
- Building an AI/Machine learning powered sales lead manager. By analyzing the graph, this tool could provide insights about a company's potential customers, partners, competitors and suppliers.
- Creating a visualization tool for marketers to understand their markets better by visualizing the connections between companies in different industries and sectors.
- Creating an interactive web-based search engine that allows users to quickly find information on the company they are looking for by exploring relationships between companies across different industries or sectors in real time.
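As one possible starting point for the lead-generation idea, the sketch below suggests companies two hops away from a given domain, ranked by how many neighbours they share with it. This is a simple common-neighbours heuristic chosen for illustration, not the method Relato used. It treats links as undirected and ignores the type column. The function names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def build_adjacency(links):
    """Map each domain to the set of domains it is linked with (undirected)."""
    adjacency = defaultdict(set)
    for row in links:
        a, b = row["home_domain"], row["link_domain"]
        if a and b and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def suggest_leads(adjacency, domain, top_n=5):
    """Rank domains two hops away by the number of neighbours shared with `domain`."""
    direct = adjacency.get(domain, set())
    scores = Counter()
    for neighbour in direct:
        for candidate in adjacency[neighbour]:
            if candidate != domain and candidate not in direct:
                scores[candidate] += 1
    return scores.most_common(top_n)


# Hypothetical rows; the domains are made up.
sample_links = [
    {"home_domain": "a.example", "link_domain": "b.example"},
    {"home_domain": "a.example", "link_domain": "c.example"},
    {"home_domain": "b.example", "link_domain": "d.example"},
    {"home_domain": "c.example", "link_domain": "d.example"},
    {"home_domain": "c.example", "link_domain": "e.example"},
]

adjacency = build_adjacency(sample_links)
leads = suggest_leads(adjacency, "a.example")
print(leads)
```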
Acknowledgements
If you use this dataset in your research, please credit the original authors.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: companies.csv
Column name | Description
update_time | The time a link was last updated. (DateTime)
domain | The domain name of one company involved in a given link. (String)
username | The user who entered or updated a given link. (String)
File: links.csv
Column name | Description
update_time | The time a link was last updated. (DateTime)
username | The user who entered or updated a given link. (String)
domain | The domain name of one company involved in a given link. (String)
home_name | The name of one company involved in a given link. (String)
link_name | The name of another company involved in a given link. (String)
type | The type of relationship between two companies. (String)
home_domain | The domain name of one company involved. (String)
link_domain | The domain name of another company involved. (String)
Acknowledgements
If you use this dataset in your research, please credit Russell Jurney.