Scicite (Classifying Citation Intents In Papers)
Classifying citation intents in academic papers
By Huggingface Hub
About this dataset
Scicite is a collection of labeled scholarly citations extracted from scientific articles in fields such as computer science, biomedicine and ecology. Each row holds the citation text (string), the name of the section it appears in (sectionName), a label (label), a flag marking key citations (isKeyCitation), a second label (label2), the start and end indices of the citation in the text (citeStart, citeEnd) and the source of the citation (source).
How to use the dataset
This dataset consists of three CSV files: train.csv, test.csv and validation.csv. All three share the same columns, which describe scholarly citations gathered from scientific articles. The data can be used in several ways:
- Extracting useful information from citations: The label attached to each citation helps in extracting specific information about the sources cited. In addition, isKeyCitation indicates whether the reference is a key citation that researchers or practitioners may want to examine in greater detail.
- Identifying relationships between citations: The sectionName column records the section in which each citation appears (for example, the introduction or the abstract). This helps reveal relationships between the parts of a paper and the references found in them, and so shows which connections scholars have drawn to earlier work.
- Improving accuracy in data gathering: With the string, citeStart and citeEnd columns, together with the source column, one can check whether a reference is repeated and verify the citation boundaries against the start and end indices.
- Validation purposes: The dataset can also serve as a reference when reviewing scholarly documents for peer review. Comparable sections found in unrelated documents provide reference points against which an author's citations can be checked.
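The boundary check and the repeated-reference check described above can be sketched with the Python standard library. This is a minimal illustration, not the dataset authors' tooling. It assumes that citeStart and citeEnd are character offsets into string with an exclusive end index, which the column descriptions do not confirm. The sample rows are made up for the example and do not come from the dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; they are NOT taken from the dataset.
SAMPLE_CSV = """string,citeStart,citeEnd
"We follow the protocol of Smith et al. (2010) here.",26,45
"We follow the protocol of Smith et al. (2010) here.",26,45
"Prior work reports similar results [3].",35,38
"Short text.",5,50
"""


def read_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def cited_span(row):
    """Return the text between citeStart and citeEnd, or None if the offsets
    do not fit inside the string (assumes character offsets, exclusive end)."""
    text = row["string"]
    start = int(float(row["citeStart"]))
    end = int(float(row["citeEnd"]))
    if not (0 <= start < end <= len(text)):
        return None
    return text[start:end]


def repeated_strings(rows):
    """Map each citation string that occurs more than once to its count."""
    counts = Counter(row["string"] for row in rows)
    return {text: n for text, n in counts.items() if n > 1}


rows = read_rows(io.StringIO(SAMPLE_CSV))
spans = [cited_span(row) for row in rows]
duplicates = repeated_strings(rows)
```

To run the same checks on the real data, pass a file handle opened on train.csv (with newline="" and an explicit encoding) to read_rows instead of the in-memory sample. A span of None flags a row whose indices should be inspected by hand.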
Research Ideas
- Developing a search engine to quickly find citations relevant to specific topics and research areas.
- Creating algorithms that can predict key citations and streamline the research process by automatically including only the most important references in a paper.
- Designing AI systems that can accurately classify, analyze and summarize scholarly works based on citation frequency, source type and assigned label.
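As a starting point for the first idea, a keyword search over the string column can be sketched with a small inverted index. This is a toy illustration under simple assumptions (lowercase alphanumeric tokens, every query word must match), not a full search engine. The function names and the sample texts are hypothetical and are not taken from the dataset.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(strings):
    """Map each token to the set of row positions whose text contains it."""
    index = defaultdict(set)
    for position, text in enumerate(strings):
        for token in tokenize(text):
            index[token].add(position)
    return index


def search(index, query):
    """Return sorted positions of texts that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


# Made-up citation contexts for illustration only.
docs = [
    "Gene expression was measured as described previously [12].",
    "Neural networks have been applied to gene prediction (Lee, 2015).",
    "Our results agree with earlier field surveys [4].",
]
index = build_index(docs)
matches = search(index, "gene expression")
```

In practice, docs would be the values of the string column, and the returned positions could be used to look up the label or sectionName of each matching row.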
Acknowledgements
If you use this dataset in your research, please credit the original authors.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
Columns
File: validation.csv
| Column name | Description |
| --- | --- |
| string | The string of text associated with the citation. (String) |
| sectionName | The name of the section the citation is found in. (String) |
| label | The label associated with the citation. (String) |
| isKeyCitation | A boolean value indicating whether the citation is a key citation. (Boolean) |
| label2 | The second label associated with the citation. (String) |
| citeEnd | The end index of the citation in the text. (Integer) |
| citeStart | The start index of the citation in the text. (Integer) |
| source | The source of the citation. (String) |
File: train.csv
| Column name | Description |
| --- | --- |
| string | The string of text associated with the citation. (String) |
| sectionName | The name of the section the citation is found in. (String) |
| label | The label associated with the citation. (String) |
| isKeyCitation | A boolean value indicating whether the citation is a key citation. (Boolean) |
| label2 | The second label associated with the citation. (String) |
| citeEnd | The end index of the citation in the text. (Integer) |
| citeStart | The start index of the citation in the text. (Integer) |
| source | The source of the citation. (String) |
File: test.csv
| Column name | Description |
| --- | --- |
| string | The string of text associated with the citation. (String) |
| sectionName | The name of the section the citation is found in. (String) |
| label | The label associated with the citation. (String) |
| isKeyCitation | A boolean value indicating whether the citation is a key citation. (Boolean) |
| label2 | The second label associated with the citation. (String) |
| citeEnd | The end index of the citation in the text. (Integer) |
| citeStart | The start index of the citation in the text. (Integer) |
| source | The source of the citation. (String) |
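Since all three files share these columns, a single typed record can represent a row from any of them. The sketch below is one possible mapping of the documented column types to Python types. The class name, the helper names and the parsing rules are assumptions: it treats "true" or "1" (case-insensitive) as a true flag and accepts integers written with a decimal part, because the exact text encoding of these values in the CSV files is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CitationRow:
    """One row of train.csv, test.csv or validation.csv (hypothetical type)."""
    string: str
    sectionName: str
    label: str
    isKeyCitation: bool
    label2: str
    citeEnd: Optional[int]
    citeStart: Optional[int]
    source: str


def to_int(value):
    """Parse an integer cell; return None for empty or non-numeric cells.
    Accepts values such as '12.0' (assumed possible, not confirmed)."""
    value = (value or "").strip()
    if not value:
        return None
    try:
        return int(float(value))
    except ValueError:
        return None


def to_bool(value):
    """Parse a boolean cell; assumes 'true' or '1' means True."""
    return (value or "").strip().lower() in {"true", "1"}


def parse_row(raw):
    """Convert a dict of raw CSV strings into a CitationRow."""
    return CitationRow(
        string=raw.get("string", ""),
        sectionName=raw.get("sectionName", ""),
        label=raw.get("label", ""),
        isKeyCitation=to_bool(raw.get("isKeyCitation")),
        label2=raw.get("label2", ""),
        citeEnd=to_int(raw.get("citeEnd")),
        citeStart=to_int(raw.get("citeStart")),
        source=raw.get("source", ""),
    )


# Made-up raw row for illustration only.
example = parse_row({"string": "See [1].", "isKeyCitation": "True",
                     "citeStart": "4.0", "citeEnd": ""})
```

The raw dicts can come from csv.DictReader opened on any of the three files. Missing cells become empty strings, None or False rather than raising an error, which keeps a first pass over the data from stopping on incomplete rows.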
Acknowledgements
If you use this dataset in your research, please credit Huggingface Hub.