Questions about pdfminer

Tab-separated data is mistaken for tables when parsing PDF to text

I am using pdfminer to convert a PDF to text. When the PDF contains tab-separated data, it is read column-wise instead of row-wise. For example, given the snippet below in a PDF: title1: text1 title2: text2 titl...

A_Matar

pdf

layout

pdfminer

Votes: 0

Answers: 1

Latest Answer

I faced the same issue. Try pdfplumber (https://pypi.org/project/pdfplumber/), which is built on pdfminer.six. This code worked perfectly for me: def pdf2txt(path): with pdfplumber.open(pa...

Nico Petermann
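
The column-wise output described in the question usually comes from text being grouped by layout block rather than by line. A minimal sketch of the row-wise alternative, assuming each word is available as a dict with `text`, `x0`, and `top` keys (the shape that pdfplumber's `extract_words()` returns): group words whose vertical positions are close, then sort each group from left to right. This is not the truncated code from the answer above; the function name `words_to_rows`, the `y_tolerance` default, and the sample coordinates are made up for illustration.

```python
def words_to_rows(words, y_tolerance=3.0):
    """Group word boxes into text rows, top to bottom, left to right.

    `words` is a list of dicts with "text", "x0" and "top" keys.
    Words whose "top" differs from the row's first word by at most
    `y_tolerance` are treated as belonging to the same row.
    """
    rows = []  # each entry: {"top": float, "words": [word, ...]}
    for word in sorted(words, key=lambda w: w["top"]):
        if rows and abs(word["top"] - rows[-1]["top"]) <= y_tolerance:
            rows[-1]["words"].append(word)
        else:
            rows.append({"top": word["top"], "words": [word]})
    # Within each row, order words by their left edge.
    return [
        " ".join(w["text"] for w in sorted(row["words"], key=lambda w: w["x0"]))
        for row in rows
    ]


# Hypothetical word boxes listed in column-wise order, as in the question.
sample = [
    {"text": "title1:", "x0": 50, "top": 100},
    {"text": "title2:", "x0": 50, "top": 120},
    {"text": "text1", "x0": 200, "top": 100.5},
    {"text": "text2", "x0": 200, "top": 120.4},
]
print(words_to_rows(sample))
```

With a real PDF, the `words` list could come from `page.extract_words()` on a page opened with `pdfplumber.open(path)`; the tolerance may need tuning per document.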

Extracting email address, first name and last name from multiple PDF files within a folder

I am trying to extract the following information from all PDF files within a folder for a work project (the PDF files are CVs): email address, first name, and last name. I have successfully managed to ext...

Berci Vagyok

python

pdf

text-extraction

pdfminer

Votes: 0

Answers: 1

Latest Answer

You can find the email address because it follows a pattern: match = re.search(r'[\w\.-]+@[a-z0-9\.-]+', text). But you also have to work out some logic for finding the first and last names in your PDF ...

pedro_bb7
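
The regular expression quoted in the answer can be wrapped in a small helper. The sketch below assumes the PDF text has already been extracted into a string; the function name `extract_email`, the `re.IGNORECASE` flag, and the trailing-dot cleanup are additions for illustration and are not part of the original answer. Names are not handled here, since they follow no fixed pattern.

```python
import re

# Pattern taken from the answer; IGNORECASE is an added assumption so that
# upper-case domains such as "Example.COM" also match.
EMAIL_RE = re.compile(r"[\w\.-]+@[a-z0-9\.-]+", re.IGNORECASE)


def extract_email(text):
    """Return the first email-like substring in `text`, or None."""
    match = EMAIL_RE.search(text)
    if match is None:
        return None
    # The pattern allows dots, so drop a sentence-ending dot if one was caught.
    return match.group(0).rstrip(".")


print(extract_email("Contact: jane.doe@example.com, phone 123"))
```

For a folder of CVs, the same helper could be applied to the extracted text of each file in turn.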

OnlyCoders © 2025 | All rights reserved