You need to extract a table from a PDF to Excel free of charge, and you've already tried the obvious route: select the table, copy, paste into a spreadsheet. What you got was a single column of jumbled text where rows and columns used to be. That's not a bug in your clipboard. It's how PDFs work, and it's why extracting tables from them is one of the most common frustrations in office work.

The good news: there are reliable ways to get a clean table out of a PDF and into Excel, and the first one requires no paid software and no installation. This guide covers three methods, starting with the one that works on the widest range of documents.

Why Tables Break When You Copy-Paste from PDF

A PDF doesn't store tables as rows and columns. Under the PDF specification (ISO 32000), a page is a stream of drawing instructions: text is painted as short runs of glyphs, each placed at a position on the page. Unless the file is a tagged PDF that carries extra logical structure, there's no concept of a "cell" or a "row" in the page content itself. What looks like a neatly aligned table on screen is actually hundreds of separate text fragments positioned to create the visual illusion of structure.

When you copy-paste, your PDF viewer reads those fragments in roughly left-to-right, top-to-bottom order and dumps them as a single text stream. Column boundaries vanish. Numbers that belonged in "Quantity" end up next to text from "Description." Multi-line cell content gets split across rows it doesn't belong to. The more complex the table layout, the worse the result.
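To make the flattening concrete, here is a minimal Python sketch. The fragment list and the `naive_copy` function are hypothetical illustrations, not how any particular viewer is implemented; they only show what happens when positioned text is sorted into reading order and joined.

```python
# Hypothetical text fragments as (x, y, text). For simplicity y grows
# downward here; in a real PDF the default y axis points up the page.
fragments = [
    (50, 100, "Description"), (300, 100, "Quantity"),
    (50, 120, "Steel bolts,"), (300, 120, "40"),
    (50, 132, "zinc plated"),  # second line of the same Description cell
    (50, 152, "Washers"), (300, 152, "250"),
]

def naive_copy(frags):
    """Sort top-to-bottom, then left-to-right, and join into one text stream."""
    ordered = sorted(frags, key=lambda f: (f[1], f[0]))
    return "\n".join(text for _, _, text in ordered)

pasted = naive_copy(fragments)
print(pasted)
```

Every fragment lands on its own line, so a spreadsheet receives one column, and the wrapped cell text "zinc plated" shows up after the quantity it has nothing to do with.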

This is why PDF copy-paste so often breaks your data. The format was designed for printing, not for data exchange. Any tool that successfully extracts tables from PDFs has to reconstruct the grid structure from visual positioning. In effect, it reads the document the way a person reads it, inferring where one column ends and another begins.
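A rough idea of what "reconstructing the grid from positions" means, as a Python sketch. This is a simplified clustering heuristic written for illustration; the function name `rebuild_grid` and the tolerance values are assumptions, and real extraction tools (including the AI approach described below) are far more involved.

```python
def rebuild_grid(frags, row_tol=3, col_tol=10):
    """Group (x, y, text) fragments into rows by y and columns by x."""
    # Column anchors: distinct x positions, merged when closer than col_tol.
    col_xs = []
    for x in sorted(f[0] for f in frags):
        if not col_xs or x - col_xs[-1] > col_tol:
            col_xs.append(x)

    rows = []  # each entry is (y of the row's first fragment, list of cells)
    for x, y, text in sorted(frags, key=lambda f: (f[1], f[0])):
        # Start a new row when the fragment sits clearly below the current one.
        if not rows or y - rows[-1][0] > row_tol:
            rows.append((y, [""] * len(col_xs)))
        # Assign the fragment to the nearest column anchor.
        col = min(range(len(col_xs)), key=lambda i: abs(col_xs[i] - x))
        cells = rows[-1][1]
        cells[col] = (cells[col] + " " + text).strip()
    return [cells for _, cells in rows]

sample = [
    (50, 100, "Item"), (300, 100, "Qty"),
    (50, 120, "Bolts"), (301, 121, "40"),  # slightly misaligned fragment
    (50, 140, "Washers"), (300, 140, "250"),
]
grid = rebuild_grid(sample)
```

Here `grid` holds three rows of two cells each. The sketch breaks down quickly on merged cells, wrapped text, and ragged columns, which is why simple position rules struggle on real documents.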

3 Ways to Extract a Table from PDF to Excel

Method 1: CleanTably AI (Best for Any PDF)

CleanTably uses AI to read the visual layout of your PDF and reconstruct the table structure. It works with digital PDFs, scanned documents, and photos of printed pages. Because it interprets the document visually rather than parsing tagged markup, it handles the irregular layouts that break simpler tools — merged cells, multi-level headers, tables that span multiple pages, and documents with mixed content (text paragraphs alongside tables).

This method requires no software installation and no account. You upload the file, wait about 20 seconds, and download an Excel spreadsheet with the table data in clean rows and columns. It's free for up to 20 documents per day.

Method 2: Excel Power Query (Simple PDFs Only)

If you have Excel for Microsoft 365 (desktop version), Power Query can import data from PDF files directly. Go to Data > Get Data > From File > From PDF. Excel will attempt to detect tables in the document and let you choose which one to import. Microsoft's Power Query documentation walks through the interface.

The limitation: Power Query works best on PDFs with simple, well-structured tables — clean gridlines, no merged cells, no multi-level headers. It relies on tagged PDF structure, so scanned documents and image-based PDFs won't work at all. If your PDF was generated from a database or a well-formatted report, Power Query is a solid option. For anything more complex, it tends to produce messy output that requires significant cleanup.

Method 3: Manual Copy-Paste Cleanup

If the table is small (under 20 rows) and the layout is simple, you can sometimes get away with copy-paste plus manual repair. Copy the table from the PDF, paste into Excel, then use Text to Columns (Data tab) to split the jumbled text into separate columns. You'll need to manually fix misaligned rows, re-merge split cells, and verify every number.
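If you prefer to script the splitting step, the sketch below does something analogous to Text to Columns with a space delimiter. It assumes the pasted lines separate columns with two or more spaces and that cell text never contains a double space; `split_columns` and the sample lines are made up for illustration.

```python
import re

def split_columns(line):
    """Split one pasted line on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())

pasted_lines = [
    "Steel bolts  40  12.50",
    "Washers      250  3.10",
]
table = [split_columns(line) for line in pasted_lines]
```

Single spaces inside a cell ("Steel bolts") survive, while wider gaps become column breaks. If the paste produced only single spaces between columns, this approach cannot tell cell text from column gaps and the manual repair described above is still needed.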

This works in a pinch for a single small table. It doesn't scale. If you're extracting tables regularly or working with anything beyond a basic grid, the time spent on manual cleanup adds up fast.

Step-by-Step: Extract a Table with CleanTably

  1. Get your PDF ready. Download it from whatever system generated it — an email attachment, a government portal, an accounting application. If you only have a paper document, take a clear photo with your phone (flat surface, even lighting, all edges visible).
  2. Go to cleantably.com.
  3. Upload the file. Drag the PDF, JPG, or PNG into the upload area, or click to browse your files. Multi-page PDFs are processed as a single document.
  4. Wait for processing. The AI reads the document, identifies table structures, and maps them into rows and columns. This typically takes 15–30 seconds depending on complexity.
  5. Download the Excel file. Click the download button to save the .xlsx file. Open it in Excel, Google Sheets, or any spreadsheet application.

Numbers come out as actual numeric values, not text strings, so formulas like SUM and AVERAGE work immediately. Column headers are preserved from the original table. If the PDF contains multiple tables, they appear in the spreadsheet in the same order they appear in the document.
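The difference between a number and a text string that looks like a number matters for any spreadsheet formula. The Python sketch below shows the kind of conversion involved when a value arrives as text. The `to_number` helper is hypothetical, assumes a comma thousands separator and a period decimal point, and does not describe how CleanTably itself works.

```python
def to_number(cell):
    """Return a float for number-like text such as "1,234.50" or "(45.00)".

    Text that does not parse as a number is returned unchanged.
    """
    text = cell.strip().replace(",", "")
    # Accounting style: parentheses mark a negative amount.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]
    try:
        value = float(text)
    except ValueError:
        return cell
    return -value if negative else value

amounts = [to_number(c) for c in ["1,234.50", "250", "(45.00)"]]
total = sum(amounts)
```

Summing the raw strings would fail, while the converted values add up like any numeric column.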

Try it now: upload a PDF with a table and get a clean spreadsheet in seconds. Free, no account required.

Tips for Complex Tables

Most tables extract cleanly without any intervention, but certain layouts deserve extra attention.

Merged cells. When a PDF table has cells that span multiple columns or rows (common in financial reports and government forms), the AI has to decide how to represent that merge in a flat spreadsheet grid. Usually it places the merged value in the top-left cell of the range. If your table has heavy merging, open the output and verify that values landed in the right columns.
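When a merged value sits only in the top-left cell, the cells below it in the flat grid are empty. If your analysis needs the value on every row, a fill-down pass is one way to get it. The sketch below is a generic illustration with made-up data; `fill_down` is a hypothetical helper, and it assumes empty cells are empty strings.

```python
def fill_down(rows, col):
    """Copy the last non-empty value in column `col` into empty cells below it."""
    filled, last = [], ""
    for row in rows:
        row = list(row)  # copy so the input rows are not modified
        if row[col] == "":
            row[col] = last
        else:
            last = row[col]
        filled.append(row)
    return filled

extracted = [
    ["Region", "City", "Sales"],
    ["North", "Oslo", "10"],
    ["", "Bergen", "7"],   # "North" was a merged cell spanning two rows
    ["South", "Rome", "12"],
]
flat = fill_down(extracted, col=0)
```

Only apply this to columns where an empty cell really means "same as above"; a genuinely blank cell would be overwritten.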

Multi-page tables. Tables that continue across page breaks are one of the hardest cases for extraction tools. Headers may repeat on each page, page numbers may intrude into the table area, and row alignment can shift between pages. CleanTably processes all pages together (up to 20) and attempts to stitch the table into a continuous dataset. For best results, upload the full multi-page document as a single PDF rather than splitting it into individual pages.

Scanned PDFs and photos. A scanned document is just an image embedded in a PDF wrapper — there's no text data to extract, only pixels. The AI reads the image the same way it reads a photograph of a printed page. Quality matters: 200 DPI or higher produces reliable results. Faded ink, skewed scans, or low-resolution photos can cause character misreads, especially in small-print columns like decimals and dates.
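To check whether a scan or photo meets the 200 DPI guideline, divide the image width in pixels by the physical width of the page in inches. A small Python sketch of that arithmetic follows; the function name is an illustration, and the estimate assumes the page fills the full width of the image.

```python
def effective_dpi(pixel_width, page_width_inches):
    """Estimate resolution as pixels per inch across the page width."""
    return pixel_width / page_width_inches

# A US Letter page is 8.5 inches wide.
letter_scan = effective_dpi(1700, 8.5)   # 200.0
small_photo = effective_dpi(1275, 8.5)   # 150.0
```

If the page occupies only part of the photo, use the width of the page in pixels rather than the width of the whole image, since the margins around it add nothing to the resolution of the text.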

Repeated headers across pages. Some multi-page tables repeat the header row on every page. The AI typically recognizes these as repeated headers and includes them once in the output. If you see duplicate header rows in your spreadsheet, just delete the extras; the data rows underneath are not affected.
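If duplicate header rows do slip through and the table is long, removing them by script is quicker than deleting by hand. A minimal Python sketch, assuming the repeats are exact copies of the first row; `drop_repeated_headers` and the sample rows are hypothetical.

```python
def drop_repeated_headers(rows):
    """Keep the first row as the header and drop later rows identical to it."""
    if not rows:
        return []
    header = rows[0]
    return [header] + [row for row in rows[1:] if row != header]

stitched = [
    ["Date", "Amount"],
    ["2024-01-03", "120.00"],
    ["Date", "Amount"],      # header repeated at the top of page 2
    ["2024-01-09", "75.50"],
]
cleaned = drop_repeated_headers(stitched)
```

A repeated header with a small difference, such as "Amount (cont.)", would not match exactly and would need a looser comparison.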

Tables mixed with text. Many PDFs contain paragraphs of text above, below, or alongside tables. The AI focuses on extracting tabular data and ignores surrounding text content. If the table is embedded in a dense document (like a 40-page annual report), the table data still gets extracted cleanly without pulling in the narrative sections.

Common Use Cases for PDF Table Extraction

Table extraction isn't one problem — it's a dozen different problems wearing the same hat. The specific challenges depend on what kind of document you're working with.

Financial statements and annual reports. These typically have well-structured tables but span many pages with subtotals, section headers, and footnotes mixed into the table flow. The key challenge is preserving the hierarchy — keeping "Total Revenue" separate from individual line items — while producing a flat spreadsheet that's ready for analysis.
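One practical check after extracting a statement is to confirm that the line items still add up to the stated total. The sketch below is an illustrative Python helper with invented figures; it assumes rows are (label, amount) pairs with amounts as plain decimal strings, and that the total row is identified by its label.

```python
from decimal import Decimal

def compare_total(rows, total_label="Total Revenue"):
    """Return (sum of line items, stated total or None) for (label, amount) rows."""
    items = sum(
        (Decimal(amount) for label, amount in rows if label != total_label),
        Decimal("0"),
    )
    stated = next(
        (Decimal(amount) for label, amount in rows if label == total_label),
        None,
    )
    return items, stated

statement = [
    ("Product sales", "1200.50"),
    ("Services", "300.25"),
    ("Total Revenue", "1500.75"),
]
computed, stated = compare_total(statement)
```

`Decimal` is used instead of floats so that cent-level amounts compare exactly. A mismatch between the two values usually points to a misread digit or a row that landed in the wrong column.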

Government and regulatory filings. Tax forms, customs declarations, and compliance reports are notorious for irregular table layouts. Cells span multiple columns, sections have different numbers of columns, and labels sit in unexpected positions. These are the documents where manual copy-paste fails most dramatically.

Scientific and research data. Lab results, survey data, and statistical tables often appear in published PDFs (journal articles, whitepapers) where the original dataset isn't available. Extracting the table into Excel lets you run your own analysis, create new charts, or incorporate the data into a larger study.

Invoices and purchase orders. Line-item tables on invoices — product descriptions, quantities, unit prices, totals — are the data that accounting teams need in their spreadsheets. The layout varies by vendor, which is why template-based extraction tools require a different template for every supplier. AI-based extraction handles the variation without configuration.

Legacy data migration. Organizations sitting on years of PDF reports often need that historical data in a usable format. Extracting tables from hundreds of archived PDFs into Excel is the first step toward loading that data into a modern database or BI tool.

Extract your table now

Upload any PDF and download the table as a clean Excel spreadsheet. Free. No software to install.


Frequently Asked Questions

Can I extract a table from a scanned PDF?

Yes. CleanTably's AI reads tables from digital PDFs, scanned PDFs, and image files (JPG, PNG). It interprets the visual layout the same way a person would, so scan quality matters more than file format.

What if my PDF has multiple tables on one page?

CleanTably detects and extracts all tables it finds in the document. Multiple tables on the same page are placed into separate sections of the spreadsheet so they stay organized.

Does it handle complex layouts like merged cells or nested headers?

The AI interprets visual structure rather than relying on tagged PDF markup, so it handles merged cells, multi-level headers, and irregular layouts better than rule-based tools. For very complex merges, spot-check the output against the original.

Is there a page limit?

CleanTably processes up to 20 pages per document. If your PDF is longer, it extracts the first 20 pages and includes a notice in the spreadsheet. You can split longer documents into batches.
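Working out the batches for a longer document is simple arithmetic. The Python sketch below only computes the page ranges; `page_batches` is a hypothetical helper, and actually splitting the PDF still requires a PDF tool of your choice.

```python
def page_batches(total_pages, batch_size=20):
    """Return inclusive 1-based (first, last) page ranges of at most batch_size pages."""
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]

batches = page_batches(53)
```

For a 53-page document this gives pages 1 to 20, 21 to 40, and 41 to 53. Try to place the cuts between tables where possible, since a table split across two uploads cannot be stitched back together automatically.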

Do I need to install any software?

No. CleanTably is a web tool that you use from your browser. Upload the PDF, wait for processing (typically 15–30 seconds), and download the Excel file. No desktop software, no plugins, no account required.