370k English Words Corpus

Part of Speech Tagging

@kaggle.ruchi798_part_of_speech_tagging

About this Dataset

Context

Abbreviation  Meaning
CC            coordinating conjunction
CD            cardinal number
DT            determiner
EX            existential there
FW            foreign word
IN            preposition/subordinating conjunction
JJ            adjective (large)
JJR           adjective, comparative (larger)
JJS           adjective, superlative (largest)
LS            list item marker
MD            modal (could, will)
NN            noun, singular
NNS           noun, plural
NNP           proper noun, singular
NNPS          proper noun, plural
PDT           predeterminer
POS           possessive ending (parent's)
PRP           personal pronoun (hers, herself, him, himself)
PRP$          possessive pronoun (her, his, mine, my, our)
RB            adverb (occasionally, swiftly)
RBR           adverb, comparative (better)
RBS           adverb, superlative (best)
RP            particle (about)
SYM           symbol
TO            infinitive marker (to)
UH            interjection (goodbye)
VB            verb, base form (ask)
VBG           verb, gerund/present participle (judging)
VBD           verb, past tense (pleaded)
VBN           verb, past participle (reunified)
VBP           verb, present tense, not 3rd person singular (wrap)
VBZ           verb, present tense, 3rd person singular (bases)
WDT           wh-determiner (that, what)
WP            wh-pronoun (who)
WP$           possessive wh-pronoun (whose)
WRB           wh-adverb (how)

Penn Part of Speech Tags

Content

This is the distribution of part-of-speech tags in this dataset:

Methodology

I created this dataset by running the NLTK POS tagger on an existing corpus of English words.
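
A minimal sketch of how a dataset like this might be produced and summarised. It assumes each word is tagged on its own, as a one-token sentence, with `nltk.pos_tag`. The original notebook and corpus file are not shown here, so the helper names `tag_words` and `tag_distribution` and the optional `tagger` parameter are hypothetical.

```python
from collections import Counter


def tag_words(words, tagger=None):
    """Tag each word in isolation and return a list of (word, tag) pairs."""
    if tagger is None:
        # Assumption: NLTK is installed and its English perceptron tagger
        # model has already been fetched with nltk.download(...).
        import nltk
        tagger = nltk.pos_tag
    # Each word is passed as a one-token sentence, so the tagger has no
    # surrounding context to help it choose between possible tags.
    return [tagger([word])[0] for word in words]


def tag_distribution(tagged):
    """Count how often each tag occurs in a list of (word, tag) pairs."""
    return Counter(tag for _, tag in tagged)
```

Calling `tag_words(words)` without a `tagger` argument falls back to `nltk.pos_tag`, and `tag_distribution(...).most_common()` then lists the tags from most to least frequent. Because the words are tagged without context, ambiguous words (for example, ones that can be either a noun or a verb) will usually receive a single tag.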
