pmwiki-to-outline
End-to-end migration guide and tool for moving a PmWiki installation into Outline.
This document walks an admin through the full journey — from copying wiki.d off the PmWiki server to a working import in Outline. Tool internals (architecture, extending rules, tests) are covered at the bottom.
wiki.d/                        wiki-out/
  Main.HomePage          ─→      Main/
  Main.GettingStarted              HomePage.md
  Recipes.FooBar                   GettingStarted.md
                                 Recipes/
uploads/                           FooBar.md
  Main/                          attachments/
    image.png                      Main/image.png
TODO — Domain conversion is NOT YET IMPLEMENTED
Absolute-URL links that reference the old PmWiki server by its hostname are currently left unchanged in the output.
Prerequisites
- Local machine with Python 3.9+ installed (no third-party Python packages required — stdlib only).
- Access to the PmWiki server via SSH/SFTP to retrieve the `wiki.d/` and `uploads/` directories.
- Outline workspace access:
  - For bulk ZIP import, you need Outline admin rights (Settings → Preferences → Import).
  - Without admin: you can still drag-and-drop `.md` files into any Collection you have write access to, one folder at a time.
Step 1 — Copy the PmWiki data to your local machine
PmWiki keeps everything in two directories on the server. Typical paths:
- `wiki.d/`: page files (one file per page, named `Group.PageName`, no extension)
- `uploads/`: attachments (organized by group: `uploads/<Group>/<filename>`)
Paths may differ; check the PmWiki install's config.php for the $WorkDir and $UploadDir variables.
Copy both to the directory where you'll run the converter:
rsync -av user@pmwiki-server:/var/www/pmwiki/wiki.d/ ./wiki.d/
rsync -av user@pmwiki-server:/var/www/pmwiki/uploads/ ./uploads/
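Before converting, it can help to confirm that the copy really contains page files. The following is a minimal sketch, not part of the tool: `pages_per_group` is a hypothetical helper that counts `Group.PageName` files per group. It assumes that dotfiles and names containing a comma (which PmWiki appears to use for deleted pages) are not live pages.

```python
from collections import Counter
from pathlib import Path


def pages_per_group(wiki_dir):
    """Count page files per PmWiki group in a copied wiki.d directory."""
    counts = Counter()
    for entry in Path(wiki_dir).iterdir():
        # Skip directories and dotfiles; page files have no extension.
        if not entry.is_file() or entry.name.startswith("."):
            continue
        # Assumption: names with a comma are deleted-page leftovers, not live pages.
        if "," in entry.name:
            continue
        group, sep, page = entry.name.partition(".")
        if sep and group and page:
            counts[group] += 1
    return counts
```

Calling `pages_per_group("./wiki.d")` returns a `Counter` keyed by group name; an empty result suggests the path points at the wrong directory.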
Step 2 — Run the converter
python3 convert.py \
  --wiki-dir ./wiki.d \
  --uploads-dir ./uploads \
  --out ./wiki-out \
--report
Flags:
| Flag | Required | Description |
|---|---|---|
| `--wiki-dir` | yes | Path to PmWiki's `wiki.d` directory |
| `--uploads-dir` | no | Path to PmWiki's `uploads` directory (attachments). Omit if you have none. |
| `--out` | yes | Where to write the converted Markdown tree (must be empty, or pass `--overwrite`) |
| `--overwrite` | no | Allow writing into a non-empty `--out` directory |
| `--report` | no | After conversion, scan output for any surviving PmWiki syntax and print a per-category summary |
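To make the flag semantics concrete, here is a rough `argparse` sketch that mirrors the table. It is an illustration only and not the actual source of `convert.py`; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser reproducing the flags documented above."""
    parser = argparse.ArgumentParser(prog="convert.py")
    parser.add_argument("--wiki-dir", required=True,
                        help="path to PmWiki's wiki.d directory")
    parser.add_argument("--uploads-dir", default=None,
                        help="path to PmWiki's uploads directory (optional)")
    parser.add_argument("--out", required=True,
                        help="where to write the converted Markdown tree")
    parser.add_argument("--overwrite", action="store_true",
                        help="allow writing into a non-empty --out directory")
    parser.add_argument("--report", action="store_true",
                        help="print a summary of surviving PmWiki syntax")
    return parser
```

With this sketch, omitting `--uploads-dir` leaves `args.uploads_dir` as `None`, which matches the "omit if you have none" behaviour described in the table.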
The output tree mirrors Outline's Collection/document structure:
wiki-out/
├── Main/
│ ├── HomePage.md
│ ├── GettingStarted.md
│ └── ...
├── Recipes/
│ └── FooBar.md
└── attachments/
└── Main/
└── logo.png
Each top-level directory will become a Collection in Outline; the .md files within become documents. Attachments are referenced from page content as ../attachments/<Group>/<filename>.
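The naming rule can be sketched in a few lines. This is an illustration of the mapping described above under the assumption that page names contain exactly one meaningful `Group.` prefix; `output_path` and `attachment_ref` are hypothetical names, not functions from the converter.

```python
from pathlib import PurePosixPath


def output_path(page_name):
    """Map a page file name 'Group.PageName' to 'Group/PageName.md'."""
    group, sep, page = page_name.partition(".")
    if not (sep and group and page):
        raise ValueError(f"not a Group.PageName file name: {page_name!r}")
    return PurePosixPath(group) / f"{page}.md"


def attachment_ref(group, filename):
    """Relative link from a page inside <Group>/ to one of its attachments."""
    return f"../attachments/{group}/{filename}"
```

For example, `output_path("Main.HomePage")` yields `Main/HomePage.md`, and `attachment_ref("Main", "logo.png")` yields `../attachments/Main/logo.png`.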
Step 3 — Package for Outline import
cd wiki-out
zip -r ../wiki.zip .
cd ..
Outline's import accepts a ZIP whose top-level folders become Collections — which is exactly what we produce.
Size note: Outline's bulk import is capped at ~1.5 GB. For most wikis this is fine. If your ZIP exceeds that, split by Collection:
cd wiki-out
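for group in */; do
  # attachments/ is bundled into every archive, so skip it as a group of its own
  [ "$group" = "attachments/" ] && continue
  zip -r "../${group%/}.zip" "$group" attachments
done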
You'll import one group at a time.
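If you would rather stay in Python, the same per-Collection split can be sketched with the standard-library `zipfile` module. `zip_collection` is a hypothetical helper, not part of the tool; like the shell loop, it bundles the whole `attachments/` folder into each archive and returns the archive size in bytes so you can compare it against the import limit.

```python
import zipfile
from pathlib import Path


def zip_collection(out_root, group, zip_path):
    """Write one Collection folder plus attachments/ into zip_path."""
    out_root = Path(out_root)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for top in (group, "attachments"):
            base = out_root / top
            if not base.is_dir():
                continue
            for path in sorted(base.rglob("*")):
                if path.is_file():
                    # Keep paths relative to the output root so the
                    # top-level folders survive inside the archive.
                    zf.write(path, path.relative_to(out_root).as_posix())
    return Path(zip_path).stat().st_size
```

Write the archive outside the output tree (for example `../Main.zip`) so it is never picked up on a later run.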
Step 4 — Import into Outline
4a. Admin bulk import (recommended)
- In Outline: Settings → Preferences → Import
- Choose Markdown
- Upload `wiki.zip`
- Wait for the async import to finish (email notification, or refresh the Import page)
- Each top-level directory in the ZIP becomes a Collection; files within become documents
4b. Non-admin per-collection (drag-drop)
If you don't have admin rights but have write access to at least one Collection:
- Open the target Collection in Outline
- Drag-and-drop `.md` files (or a folder of them) directly into the Collection
- Repeat for each Collection
4c. API (for scripted re-imports)
If you expect to re-import (e.g. to refine the converter and re-run), scripting against the Outline API is worth it:
curl -X POST https://outline.example.com/api/documents.import \
-H "Authorization: Bearer $OUTLINE_TOKEN" \
-F "file=@Main/HomePage.md" \
-F "collectionId=$COLLECTION_ID" \
-F "publish=true"
The response includes a file operation ID; poll fileOperations.info to know when each import completes.
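For a stdlib-only scripted import, the curl call above can be mirrored with `urllib`. This is a sketch under assumptions: `build_import_request` is a hypothetical helper, the multipart body is assembled by hand, and only the endpoint and form fields shown in the curl example are used. It has not been checked against a live Outline instance.

```python
import urllib.request
import uuid
from pathlib import Path


def build_import_request(base_url, token, md_path, collection_id):
    """Build (but do not send) a multipart POST to documents.import."""
    md_path = Path(md_path)
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, same names as in the curl example.
    for name, value in {"collectionId": collection_id, "publish": "true"}.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The Markdown file itself.
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{md_path.name}"\r\n'
         'Content-Type: text/markdown\r\n\r\n').encode("utf-8")
        + md_path.read_bytes() + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{base_url}/api/documents.import",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` sends it; keeping construction separate from sending makes the script easy to dry-run.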
Step 5 — Verify after import
- Counts match: the number of documents in Outline matches the `.md` count in `wiki-out/` (`find wiki-out -name '*.md' | wc -l`).
- Spot-check 5–10 random pages: formatting, links resolve, images render.
- Broken-link pass: inside Outline, search for pages containing `(.md)` in prose. Any such literal pattern is a link that didn't resolve. The converter tries to match the output tree, so these should be rare.
- Attachment pass: open a page with images. The files should render, not show as broken images.
- TODO pass: search for `TODO:` inside Outline. These are the converter's visible markers for manual attention (see next section).
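The counts and the TODO pass can also be prepared locally before importing. The sketch below is an assumption-laden helper (`summarize_output` is not part of the tool): it counts `.md` files in the output tree and lists which files contain `TODO:` markers, so you know what to look for in Outline.

```python
from pathlib import Path


def summarize_output(out_dir):
    """Return (markdown_file_count, {relative_path: todo_marker_count})."""
    out_dir = Path(out_dir)
    md_files = sorted(out_dir.rglob("*.md"))
    todos = {}
    for path in md_files:
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = text.count("TODO:")
        if hits:
            todos[path.relative_to(out_dir).as_posix()] = hits
    return len(md_files), todos
```

The first value should equal the document count Outline reports after import; the second is a checklist of pages needing manual attention.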
Known lossy conversions
| Construct | What the converter does | Action for the reviewer |
|---|---|---|
| Edit history | Not preserved (Outline's import doesn't support per-document history anyway) | Optional: keep a separate git-history backup using pmwiki-to-git (see Optional section below) |
| Authorship | All docs land under the importing user | Create a dedicated "Import" user in Outline so migrated content is attributable |
| `(:pagelist ...:)` | Replaced with `<!-- TODO: (:pagelist ...:) -->` | Manually rebuild the page list or delete the TODO |
| `(:include OtherPage:)` | Replaced with `<!-- TODO: (:include OtherPage:) -->` | Inline the target content or delete |
| `(:redirect OtherPage:)` | Replaced with a visible pointer: `→ [OtherPage](./OtherPage.md) <!-- was (:redirect OtherPage:) -->` | Fine as-is, or edit the line |
| `(:Summary: X:)` | Stripped silently; summary text lost | If summaries matter: switch the converter to `<!-- Summary: X -->` comments |
| `(:input ...:)` form widgets | Stripped | No Outline equivalent; accept the loss |
| `(:table:)...(:tableend:)` advanced tables | Replaced with `<!-- TODO: PmWiki advanced table -->` + inner content (cell markers stripped) | Manually reconstruct as a Markdown pipe table |
| Long-tail custom PmWiki recipes (e.g. `(:workadventure-url:)`, `(:jitsi-url:)`, `(:e_preview:)`) | Left as literal text | Decide per-recipe: delete, or replace with meaningful content |
| Unpaired style markers outside the whitelist (e.g. `%Siteem%`, `%LOCALAPPDATA%`) | Preserved as literal text | Hand-fix on affected pages |
| URL-encoded UTF-8 bytes like `%C3%A4` inside URLs | Preserved (correct) | Not a bug; this is valid URL syntax |
| `[[SomePage]]` with unusual characters (e.g. `[`, `]` in page names) | Some edge cases survive as literal `[[...]]` | Manually replace with proper Markdown links |
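As an illustration of the TODO-comment treatment in the table, here is a minimal regex sketch. It is not the converter's actual rule: the pattern and the name `mark_todo` are assumptions, and it only handles directives that fit on a single line.

```python
import re

# Hypothetical pattern: a single-line (:pagelist ...:) or (:include ...:) directive.
_TODO_DIRECTIVE = re.compile(r"\(:(pagelist|include)\b([^\n]*?):\)")


def mark_todo(text):
    """Wrap unsupported directives in an HTML comment flagged with TODO."""
    return _TODO_DIRECTIVE.sub(
        lambda m: f"<!-- TODO: {m.group(0)} -->", text
    )
```

Because the original directive is kept verbatim inside the comment, a reviewer can see exactly what was there and rebuild it by hand.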
Re-running
The converter is deterministic: the same input always produces the same output. To re-run after refining the converter or cleaning up source pages:
rm -rf wiki-out/
python3 convert.py --wiki-dir ./wiki.d --uploads-dir ./uploads --out ./wiki-out --report
Re-import into a fresh Collection or fresh workspace to avoid duplicate content — Outline imports append, they don't replace.
Troubleshooting
- "Converted 0 page(s)": check that `--wiki-dir` points at the directory that contains files named `Group.PageName`, not at a parent directory.
- Parse errors listed after the "Converted N" line: the file names are shown; this usually indicates unusual characters in filenames. Rename the source file or skip it and add a note.
- Mojibake in output (e.g. `LÃ¶sung` instead of `Lösung`): the page file's charset isn't being detected correctly. The parser reads the header's `charset=` field if present (default UTF-8). Check a page file for a `charset=` line: if it says `iso-8859-1` or similar but the file actually contains UTF-8 bytes, the text is decoded with the wrong charset and comes out garbled; in the opposite case (Latin-1 bytes decoded as UTF-8) you get replacement characters instead. Open an issue if you hit this.
- Import fails with "file too large": split the ZIP by Collection (see Step 3).
- Pages appear but images don't render: confirm the `attachments/` folder was included in the ZIP at the top level (same level as the Collection folders).
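The mojibake case is easy to reproduce, which helps when diagnosing a suspicious page. The sketch below shows UTF-8 bytes being decoded as Latin-1 and then repaired; both function names are made up for illustration, and the repair only works when the garbling was a single wrong decode.

```python
def mojibake(text, wrong="latin-1"):
    """Simulate UTF-8 bytes being decoded with the wrong charset."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled, wrong="latin-1"):
    """Undo a single wrong decode by reversing it."""
    return garbled.encode(wrong).decode("utf-8")


print(mojibake("Lösung"))   # LÃ¶sung
print(repair("LÃ¶sung"))    # Lösung
```

If `repair` restores readable text for a garbled page, the page file's declared charset most likely does not match its actual bytes.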
Optional: keep a git history backup
PmWiki's per-page edit history is lost in the Outline migration. If you want a browsable backup with full history, run pmwiki-to-git separately — it's independent of this tool:
go install github.com/oxzi/pmwiki-pagefileformat-go/cmd/pmwiki-to-git@latest
git init pmwiki-git
pmwiki-to-git -pmwiki ./wiki.d -git pmwiki-git
You now have a git repo with one commit per page edit. Useful as a long-term archive alongside the Outline deployment.
The tool
Architecture
Conversion runs in phases (see converter/ — one module per phase):
| # | Module | What it handles | Status |
|---|---|---|---|
| 0 | `pagefile.py` | PmWiki page-file parsing, respecting the `charset=` header field (default UTF-8) | done |
| – | `pipeline.py` | Orchestration + code-block protection (`[@...@]` and `(:markup:)...(:markupend:)` stashed as placeholders before Phase 1 to survive unchanged through every phase) | done |
| 1 | `inline.py` | Headings, bold, italic, bullet/numbered lists, `(:comment:)` | done |
| 2 | `directives.py` | `(:title:)`, `(:if:)`/`(:else:)`/`(:ifend:)` variants, `(:include:)`, `(:redirect:)`, `(:pagelist:)`, `(:div:)`, display-mode flags (accepts both `(:name arg:)` and `(:name: arg:)` separators) | done |
| 3 | `links.py` | All `[[...]]` forms: `[[Page]]`, `[[Group.Page]]`, `[[Group/Page]]`, display text, anchors, external URLs, mailto | done |
| 4 | `attachments.py` | `Attach:file.ext` (bare and `[[...]]`-wrapped, with/without caption, same/cross-group) → Markdown image or link (image extensions rendered inline) | done |
| 5 | `tables.py` | `\|\|` simple tables → Markdown pipe tables; `(:table:)...(:tableend:)` → TODO comment + stripped inner content | done |
| 6 | `misc.py` | `%Group%`/`%Page%` PTVs, paired `%class%text%%` inline styles, lone-marker whitelist strip, strikethrough `{-...-}`, revision-insertion `{+...+}`, line continuation `\\`, horizontal rule `----` | done |
| 7 | `cleanup.py` | Whitespace normalization, blank line before headings, preserves Markdown hard breaks, leading/trailing whitespace stripped, trailing newline | done |
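The code-block protection step is the least obvious part of the pipeline, so here is a minimal sketch of the general stash-and-restore technique. It is not the code in `pipeline.py`: the placeholder format, the regex, and the names `protect` and `restore` are all assumptions.

```python
import re

# Hypothetical pattern for the two protected constructs named in the table.
_CODE = re.compile(r"\[@.*?@\]|\(:markup:\).*?\(:markupend:\)", re.DOTALL)


def protect(text):
    """Replace code spans with NUL-delimited placeholders; return (text, stash)."""
    stash = []

    def _stash(match):
        stash.append(match.group(0))
        return f"\x00CODE{len(stash) - 1}\x00"

    return _CODE.sub(_stash, text), stash


def restore(text, stash):
    """Put the stashed spans back after all phases have run."""
    return re.sub(r"\x00CODE(\d+)\x00", lambda m: stash[int(m.group(1))], text)
```

Any rule applied between `protect` and `restore` sees only the placeholder, so markup inside `[@...@]` is never rewritten by the inline or link phases.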
Extending
Each rule has a fixture pair in tests/fixtures/:
tests/fixtures/
headings.pmwiki ← PmWiki input
headings.expected.md ← expected Markdown output
To add a rule:
- Add a fixture pair (or extend an existing one) showing the input/output transformation.
- Add the regex to the right phase module.
- Run `pytest`. The test harness auto-discovers all fixture pairs.
Running the tests (requires pytest):
pip install pytest
pytest
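Fixture auto-discovery of this kind is usually a small glob over the fixtures directory. The sketch below shows one way it could work, based only on the `<name>.pmwiki` / `<name>.expected.md` naming shown above; `discover_fixture_pairs` is a hypothetical name and the real harness may differ.

```python
from pathlib import Path


def discover_fixture_pairs(fixtures_dir):
    """Return sorted (input, expected) path pairs for every complete fixture."""
    pairs = []
    for source in sorted(Path(fixtures_dir).glob("*.pmwiki")):
        expected = source.with_name(source.stem + ".expected.md")
        # Inputs without a matching expected file are skipped, not failed.
        if expected.exists():
            pairs.append((source, expected))
    return pairs
```

A list like this can be fed to `pytest.mark.parametrize` so that each fixture pair shows up as its own test case.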
Credits
Phase 1 regex rules derive from dohliam/pmdown (MIT). Page-file format reference: PmWiki PageFileFormat docs and oxzi/pmwiki-pagefileformat-go (the canonical Go implementation, including revision-history reconstruction which we don't need for Outline).
License
MIT.