reja273/NLP

Sports Named Entity Recognition (Sports NER) Dataset

This dataset is designed for Named Entity Recognition (NER) tasks in the sports domain, focusing on extracting structured information from unstructured sports-related text. Named Entity Recognition is a fundamental task in Natural Language Processing (NLP) that identifies and classifies key elements such as names, locations, and events within text data.

The dataset contains annotated sentences covering a wide range of sports contexts, including match descriptions, player performances, tournament details, and rule-based scenarios.

Each text sample is labeled with domain-specific entity categories, including:

  • Player Name
  • Team Name
  • Tournament Name
  • Location
  • Equipment Name
  • Rules or Penalty
  • Common Sports Terms (CST)
  • Date and Time

These annotations enable the development of customized NER models tailored to sports analytics, where general-purpose NER systems may not perform well because of domain-specific terminology.
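
As a rough illustration of how the eight categories above could feed a token-classification model, the sketch below builds a BIO tag inventory and converts span annotations into per-token tags. The label strings (`PLAYER`, `TEAM`, and so on), the `(start, end, label)` span format, and the example sentence are all assumptions made for illustration; check the files on Mendeley Data for the tag names and layout the dataset actually uses.

```python
# Hypothetical label names derived from the category list in this README;
# the dataset's actual tag strings may differ.
ENTITY_LABELS = [
    "PLAYER", "TEAM", "TOURNAMENT", "LOCATION",
    "EQUIPMENT", "RULE_PENALTY", "CST", "DATE_TIME",
]


def bio_tag_set(labels):
    """Return the BIO tag inventory: 'O' plus a B- and I- tag per label."""
    tags = ["O"]
    for label in labels:
        tags.extend([f"B-{label}", f"I-{label}"])
    return tags


def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans (end exclusive) to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if label not in ENTITY_LABELS:
            raise ValueError(f"unknown label: {label}")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags


# Invented example sentence, not taken from the dataset.
tokens = ["Alex", "Rivera", "scored", "for", "Northside", "United", "on", "Sunday"]
spans = [(0, 2, "PLAYER"), (4, 6, "TEAM"), (7, 8, "DATE_TIME")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

BIO tagging is only one common convention for NER training data; if the dataset ships character offsets or a different scheme, the conversion step would need to change accordingly.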

The dataset can be used for various applications such as:

  • Sports analytics and information extraction
  • Intelligent sports news summarization
  • Chatbots and question-answering systems
  • Automated commentary analysis
  • Knowledge graph construction in the sports domain

This dataset is publicly available on Mendeley Data at the following URL: https://data.mendeley.com/datasets/rcf4kbxtf8/2

License: Creative Commons Attribution 4.0 (CC BY 4.0)

If you use this dataset in your research or project, please cite the original source.
