katharinawuensche/NLPdf

NLPdf

PDF Extractor using Natural Language Processing

Quickstart

  1. Download the repository.
  2. Install the requirements:
     pip3 install -r requirements.txt
  3. Load the language model for spaCy:
     python3 -m spacy download en
  4. Copy the PDF files to be cleaned into the directory "PDFs".
  5. Run the extraction tool:
     python3 run.py
  6. The output is written to the directory "output".
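
Before running the extraction tool, it can help to confirm that the input directory actually contains PDF files. The following is a minimal sketch of such a check and is not part of the repository: the helper name `find_input_pdfs` is hypothetical, and only the directory name "PDFs" comes from the steps above. It makes no assumption about how run.py reads its input.

```python
from pathlib import Path


def find_input_pdfs(pdf_dir="PDFs"):
    """Return the PDF files directly inside pdf_dir, sorted by path.

    Hypothetical helper for illustration; not part of NLPdf.
    Returns an empty list if the directory does not exist.
    """
    folder = Path(pdf_dir)
    if not folder.is_dir():
        return []
    # Match the suffix case-insensitively so "report.PDF" is also found.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    pdfs = find_input_pdfs()
    if not pdfs:
        print('No PDF files found in "PDFs"; copy some in before running run.py.')
    for pdf in pdfs:
        print(pdf.name)
```

Saved under a name of your choice in the repository root, the script lists the files that are in place for the extraction step, or prints a reminder if the directory is missing or empty.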
