How to Export SharePoint Pages to Markdown (2026 Guide)

SharePoint is where Microsoft 365 teams store years of institutional knowledge. Meeting notes, project wikis, engineering docs, HR policies — it all accretes there. But when you need that content somewhere else — a static site, an AI knowledge base, a Markdown-first documentation system — SharePoint doesn’t have an export button.

This guide covers the realistic methods to get clean Markdown out of SharePoint pages.

Why Export SharePoint to Markdown?

  • Migrating off SharePoint — to Notion, Confluence, a Git-based docs repo, or a modern static site
  • Building an internal AI knowledge base — feeding a RAG system on Claude, ChatGPT Enterprise, or Microsoft Copilot with clean chunks
  • Archiving retiring sites — before a Microsoft 365 tenant rotation or license downgrade
  • Publishing internal docs externally — turning internal knowledge into a public documentation site
  • Cross-team portability — Markdown is readable in any tool, forever

Method 1: Minibase Chrome Extension (one page at a time)

Minibase converts any SharePoint page to clean Markdown with a single click.

What Minibase captures from SharePoint:

  • Page title, creation date, author, last-modified timestamp
  • Page body with heading structure preserved
  • Inline images as Markdown references
  • Tables rendered as Markdown tables
  • Embedded documents linked as Markdown URLs
  • Web parts: text blocks, quick links, news, lists — extracted as flattened content

What Minibase strips:

  • Site navigation, left-pane, and ribbon
  • Comment and @mention panels (the comments themselves are kept at the bottom)
  • Promoted-page callouts and admin-only badges

When it’s the right tool: exporting 1–20 pages, working with content you can only access via your browser session, quick one-off conversions.

Method 2: Microsoft Graph API

For bulk exports — entire sites or whole document libraries — the Microsoft Graph API is the canonical path.

Typical flow:

  1. Register an app in Microsoft Entra ID (formerly Azure AD) with Sites.Read.All and Files.Read.All permissions
  2. Get an access token via client-credentials or device-code flow
  3. Call /sites/{site-id}/pages to list pages
  4. For each page, call /sites/{site-id}/pages/{page-id}/microsoft.graph.sitePage?$expand=canvasLayout to get the content model (canvasLayout is only returned when you expand it)
  5. Walk the canvasLayout structure and emit Markdown
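
The listing and walking steps can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a complete exporter: `fetch_json` is a hypothetical callable you supply that performs an authenticated GET and returns the parsed JSON body, and each page dict is assumed to already include `canvasLayout`. The walker stops at the HTML fragments of text web parts; turning those into Markdown is left to a converter.

```python
from typing import Callable, Iterator

GRAPH = "https://graph.microsoft.com/v1.0"


def iter_site_pages(fetch_json: Callable[[str], dict], site_id: str) -> Iterator[dict]:
    """Yield every page in a site, following @odata.nextLink pagination.

    fetch_json is a hypothetical helper: it takes a URL, sends an
    authenticated GET, and returns the decoded JSON response.
    """
    url = f"{GRAPH}/sites/{site_id}/pages"
    while url:
        body = fetch_json(url)
        yield from body.get("value", [])
        url = body.get("@odata.nextLink")  # absent on the last batch


def canvas_html_fragments(page: dict) -> list[str]:
    """Collect innerHtml from text web parts: horizontal sections first,
    column by column, then the vertical section if the page has one."""
    layout = page.get("canvasLayout") or {}
    webparts = []
    for section in layout.get("horizontalSections", []):
        for column in section.get("columns", []):
            webparts.extend(column.get("webparts", []))
    vertical = layout.get("verticalSection") or {}
    webparts.extend(vertical.get("webparts", []))
    return [
        wp["innerHtml"]
        for wp in webparts
        if wp.get("@odata.type") == "#microsoft.graph.textWebPart"
        and wp.get("innerHtml")
    ]
```

Non-text web parts (quick links, news, and so on) are skipped here; they carry structured data rather than HTML, so a real pipeline would need a renderer per web part type.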

Pros: scalable, scriptable, works unattended, preserves structured metadata.

Cons: requires an app registration in Microsoft Entra ID (formerly Azure AD), gets throttled at large scale, and needs a developer.
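
Throttling usually shows up as an HTTP 429 response, often with a Retry-After header giving the number of seconds to wait. One way to handle it is sketched below; the function name, the set of retryable status codes, and the 60-second cap are illustrative assumptions, not values prescribed by Microsoft.

```python
import random


def retry_delay(status: int, headers: dict, attempt: int, cap: float = 60.0):
    """Return how many seconds to wait before retrying, or None if the
    response should not be retried.

    Assumes headers is a plain dict keyed exactly "Retry-After"; real HTTP
    clients often expose case-insensitive header mappings instead.
    """
    if status not in (429, 503, 504):
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # Honor the server's hint when it is given in seconds.
            return float(retry_after)
        except ValueError:
            pass  # HTTP-date form: fall back to exponential backoff
    # Exponential backoff with up to one second of jitter, capped.
    return min(cap, 2 ** attempt + random.uniform(0, 1))
```

A request loop would call this after each failed response, sleep for the returned duration, and give up after a fixed number of attempts.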

Open-source helpers: PnP PowerShell has Get-PnPPage, which returns a page object you can post-process to Markdown. The Office365-REST-Python-Client library offers similar access for Python shops.

Method 3: PowerShell + PnP.PowerShell

PnP.PowerShell, a community-maintained open-source module, is the go-to for SharePoint admins. A basic export loop:

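# Recent PnP.PowerShell releases may also need -ClientId for your own app registration
Connect-PnPOnline -Url https://tenant.sharepoint.com/sites/YourSite -Interactive
# Modern pages are list items in the Site Pages library
$pages = Get-PnPListItem -List "Site Pages" -PageSize 500
foreach ($page in $pages) {
  $name = [System.IO.Path]::GetFileNameWithoutExtension($page["FileLeafRef"])
  # CanvasContent1 holds the modern page body as HTML; it is empty on classic pages
  $html = $page["CanvasContent1"]
  if (-not $html) { continue }
  # Pipe $html through a converter (pandoc, turndown, etc.)
  $html | Out-File "$name.html" -Encoding utf8
}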

Then convert each HTML file to Markdown with Pandoc or Turndown.
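
For the Pandoc route, a batch conversion over the exported .html files might look like the sketch below. It assumes pandoc is installed and on your PATH; the function names and the choice of GitHub-flavored Markdown (gfm) as the output format are illustrative, so swap in whichever Pandoc writer suits your target.

```python
import subprocess
from pathlib import Path


def pandoc_command(html_path: Path) -> list[str]:
    """Build the pandoc invocation that converts one HTML file to a
    Markdown file with the same name and a .md extension."""
    md_path = html_path.with_suffix(".md")
    return [
        "pandoc",
        "--from", "html",
        "--to", "gfm",
        "--wrap=none",  # keep each paragraph on one line
        str(html_path),
        "--output", str(md_path),
    ]


def convert_directory(folder: str) -> int:
    """Convert every .html file in folder and return how many were processed.

    check=True raises CalledProcessError if pandoc exits with an error.
    """
    count = 0
    for html_path in sorted(Path(folder).glob("*.html")):
        subprocess.run(pandoc_command(html_path), check=True)
        count += 1
    return count
```

Calling `convert_directory(".")` from the folder the PowerShell loop wrote to would leave a .md file next to each .html file.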

When it shines: bulk exports on admin-accessible tenants, migration projects.

Method 4: SharePoint Online “Export to PDF” + OCR

The nuclear option: print every page to PDF, then OCR the PDFs into Markdown. Not recommended — you lose all structure, table layouts, and link fidelity. Mentioned only so you don’t waste time trying it.

What about OneNote?

OneNote is a sibling problem with a different answer:

  • Best path: open the OneNote web app (onenote.com) and use Minibase on each page — captures pages as clean Markdown
  • Alternative: Microsoft Graph API has /me/onenote/pages endpoints that return page content as HTML, which you convert to Markdown
  • Avoid: OneNote’s built-in “Print to PDF” — same structure loss as SharePoint’s
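
If shelling out to a converter isn't an option, the HTML-to-Markdown step can be approximated with Python's standard library. The sketch below is deliberately tiny and handles only headings, paragraphs, list items, and links; the class and function names are made up for illustration, and it is no substitute for Pandoc or Turndown on real page content with tables, images, or nested lists.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")
BLOCKS = HEADINGS + ("p", "div", "li")


class MiniMarkdown(HTMLParser):
    """Converts a small HTML subset; unknown tags are reduced to their text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished Markdown blocks
        self.buf = []      # pieces of the block being built
        self.prefix = ""   # "## ", "- ", or "" for the current block
        self.href = None

    def _flush(self):
        # Collapse whitespace and emit the current block if it has text.
        text = " ".join("".join(self.buf).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.buf, self.prefix = [], ""

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKS:
            self._flush()
            if tag in HEADINGS:
                self.prefix = "#" * int(tag[1]) + " "
            elif tag == "li":
                self.prefix = "- "
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.buf.append("[")

    def handle_endtag(self, tag):
        if tag == "a":
            self.buf.append(f"]({self.href or ''})")
            self.href = None
        elif tag in BLOCKS:
            self._flush()

    def handle_data(self, data):
        self.buf.append(data)


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block tag
    return "\n\n".join(parser.blocks)
```

Every block is separated by a blank line, so lists come out in the "loose" style; that is valid Markdown but may render with extra spacing.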

Handling SharePoint’s quirks

A few gotchas worth knowing:

  • Web parts can embed arbitrary content — news rotators, events, quick links. Minibase flattens these to plain Markdown; the Graph API gives you structured JSON that you can render your own way.
  • Modern vs. classic pages have very different HTML. Minibase auto-detects; scripts need to branch on the page type.
  • Permissions travel with the user. If you’re exporting for migration, make sure you have Site Owner access — some sections may be hidden to standard members.
  • Large images are inlined as base64 in some exports. Minibase gives you URL-linked references; strip the base64 from scripted exports if you hit file-size issues.
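
For the base64 case, a small post-processing pass over the generated Markdown is usually enough. The sketch below assumes the images appear as standard Markdown image syntax with a `data:image/...` URI; the function name and placeholder text are illustrative.

```python
import re

# Matches ![alt](data:image/...) and captures the alt text.
# Base64 payloads never contain ")", so [^)]* safely spans the whole URI.
DATA_URI_IMAGE = re.compile(r"!\[([^\]]*)\]\(data:image/[^)]*\)")


def strip_inline_images(markdown: str,
                        placeholder: str = "[image removed: {alt}]") -> str:
    """Replace inlined base64 images with a short text placeholder.

    Images that point at ordinary URLs are left untouched.
    """
    def replace(match):
        alt = match.group(1) or "untitled"
        return placeholder.format(alt=alt)

    return DATA_URI_IMAGE.sub(replace, markdown)
```

If you need the images themselves, a variation of the same pattern could decode each payload to a file and rewrite the reference to a relative path instead of dropping it.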

The practical choice for most teams

  • 1–20 pages: Minibase extension. Free tier handles 3/mo; Plus unlocks unlimited for $5.99.
  • 20–500 pages: PnP PowerShell + Pandoc conversion. Spin up a script, run it once.
  • 500+ pages: Graph API with a proper ingestion pipeline, scheduled.
  • You want the content in Claude or ChatGPT right now: Minibase — exports are already optimised for LLM context windows.

SharePoint content was never meant to stay locked there. Markdown is the portable format. Pick the method that matches your scale.

Written by

Save Team
