Find accidental duplicate posts, pages, and products quietly hurting SEO

One command hashes and compares every post. It finds exact-match duplicates across content, titles, or both, without touching a single row. Completely read-only.

4 min read · May 2026 · find duplicate content

Duplicate content you never wrote is ranking against you

Duplicate content rarely arrives intentionally. A product import ran twice. A blog post was duplicated for editing and the original was never deleted. A WooCommerce variation ended up with the same description as the parent. Google sees two URLs with identical content and has to pick one to show. It may pick the copy you did not want, and ranking signals can end up split between the two.

You can't fix what you can't find. A 2,000-post catalog has no obvious "find duplicates" button in the WordPress admin. You either pay for a crawl tool that checks the rendered page, which misses draft and private posts, or you query the database directly with SQL that does not normalize whitespace, shortcodes, or block markup before comparing.

What most people do

Run a site crawl with Screaming Frog or Ahrefs. Crawlers only see published, publicly accessible URLs. Drafts, private posts, and password-protected pages are invisible. You find maybe half the problem.
Write a SQL query against wp_posts. A direct string comparison misses duplicates where one post has a trailing space, a different block comment, or a shortcode the other lacks. You get false negatives with no sign that anything was missed.
Install a duplicate-detection plugin. Most work on post save only, so historical duplicates already in the database are never scanned. The ones that do scan everything are slow, add admin menu clutter, and still need you to decide what to do with each match.
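
To make the SQL limitation above concrete, here is a minimal sketch of an exact-match query. It is an illustration only: an in-memory SQLite database stands in for the site's MySQL database, the wp_posts table is reduced to two columns, and the helper names (build_demo_db, exact_match_ids) are hypothetical. String comparison rules differ between database engines and collations, so treat the result as indicative rather than a statement about any particular server.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in for wp_posts (assumption: only two columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
    conn.executemany(
        "INSERT INTO wp_posts (ID, post_content) VALUES (?, ?)",
        [
            (1, "<p>Blue cotton shirt</p>"),
            (2, "<p>Blue cotton shirt</p>"),   # byte-for-byte copy of post 1
            (3, "<p>Blue cotton shirt</p> "),  # same text, trailing space
            (4, "<!-- wp:paragraph --><p>Blue cotton shirt</p><!-- /wp:paragraph -->"),
        ],
    )
    return conn


def exact_match_ids(conn: sqlite3.Connection) -> list:
    """Return IDs of posts whose raw content exactly equals another post's."""
    sql = """
        SELECT ID FROM wp_posts
        WHERE post_content IN (
            SELECT post_content FROM wp_posts
            GROUP BY post_content
            HAVING COUNT(*) > 1
        )
        ORDER BY ID
    """
    return [row[0] for row in conn.execute(sql)]


conn = build_demo_db()
print(exact_match_ids(conn))  # posts 3 and 4 read the same but are not reported
```

All four rows carry the same readable text, yet only the two byte-identical rows are matched. The trailing space and the block comment markers are enough to hide the other two.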

A better way: hash-based comparison across every post type

TrueCommander's find duplicate content command strips HTML tags, Gutenberg block markers, shortcodes, and extra whitespace before comparing. It normalizes to lowercase and hashes the result, so two posts are only flagged as duplicates when their actual readable content is identical, not just similar.
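
The normalization described above can be sketched roughly as follows. This is not TrueCommander's code: the regular expressions are simplified approximations, the function names are hypothetical, and SHA-256 is an assumed hash since the article does not name one.

```python
import hashlib
import re

# Rough patterns, assumed for illustration only.
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:.*?-->", re.DOTALL)  # Gutenberg block markers
HTML_TAG = re.compile(r"<[^>]+>")                             # any remaining HTML tag
SHORTCODE = re.compile(r"\[/?[A-Za-z][\w-]*[^\]]*\]")         # [tag attr="x"] and [/tag]
WHITESPACE = re.compile(r"\s+")


def normalize(content: str) -> str:
    """Strip block markers, tags, and shortcode tags, collapse whitespace, lowercase."""
    text = BLOCK_COMMENT.sub(" ", content)
    text = HTML_TAG.sub(" ", text)
    # Only the shortcode tags are removed; text between an opening and
    # closing shortcode is kept.
    text = SHORTCODE.sub(" ", text)
    text = WHITESPACE.sub(" ", text).strip()
    return text.lower()


def content_hash(content: str) -> str:
    """Hash the normalized text (SHA-256 is an assumption, not a documented choice)."""
    return hashlib.sha256(normalize(content).encode("utf-8")).hexdigest()


a = "<!-- wp:paragraph -->\n<p>Blue  Cotton Shirt</p>\n<!-- /wp:paragraph -->"
b = '<p>blue cotton shirt</p> [gallery ids="1,2"]'
print(content_hash(a) == content_hash(b))
```

The two sample strings differ in markup, case, spacing, and a shortcode, but normalize to the same readable text and therefore share a hash.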

TrueCommander example output:
12 duplicate groups found (content match), 28 posts affected
Scanned post and product types
HTML, blocks, and shortcodes stripped before hashing (normalized)
Read-only scan, nothing changed

This command is completely read-only. It scans and reports. It never deletes, redirects, or modifies any post. Use the results to decide which posts to merge, redirect, or remove, then act with other commands or the editor.

How it works

1. Normalize each post's content. HTML tags, Gutenberg block comment markers, shortcodes, and extra whitespace are stripped. The result is lowercased. Anything shorter than -min_length characters (default 100) is skipped in content and both modes.
2. Hash and group matching posts. Each normalized string is hashed. Posts that share a hash are grouped as duplicates. Using -by=both requires both the title hash and the content hash to match before two posts are grouped.
3. Report duplicate groups with post IDs and titles. Each group shows the post IDs and titles involved. Pipe those IDs into find and replace, redirect, or another command to act on them.
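
The grouping step can be sketched as below, assuming titles and content have already been normalized. The function name group_duplicates, the dict shape of each post, and the SHA-256 hash are all assumptions made for illustration; only the -by modes and the minimum-length rule come from the description above.

```python
import hashlib
from collections import defaultdict


def _digest(text: str) -> str:
    # SHA-256 is an assumed choice; the article does not name the hash.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def group_duplicates(posts, by="content", min_length=100):
    """Group post IDs whose normalized title and/or content hashes match.

    posts: iterable of dicts with 'id', 'title', 'content' (already normalized).
    """
    if by not in ("content", "title", "both"):
        raise ValueError(f"unknown mode: {by}")
    groups = defaultdict(list)
    for post in posts:
        # The length filter applies in content and both modes only.
        if by in ("content", "both") and len(post["content"]) < min_length:
            continue
        if by == "content":
            key = _digest(post["content"])
        elif by == "title":
            key = _digest(post["title"])
        else:  # both: title hash and content hash must match together
            key = (_digest(post["title"]), _digest(post["content"]))
        groups[key].append(post["id"])
    # A group is only a duplicate group when it holds more than one post.
    return [ids for ids in groups.values() if len(ids) > 1]


posts = [
    {"id": 1, "title": "blue shirt", "content": "soft cotton shirt in blue"},
    {"id": 2, "title": "blue shirt copy", "content": "soft cotton shirt in blue"},
    {"id": 3, "title": "blue shirt", "content": "a different description entirely"},
    {"id": 4, "title": "tiny", "content": "short"},
    {"id": 5, "title": "tiny", "content": "short"},
]
print(group_duplicates(posts, by="content", min_length=10))
print(group_duplicates(posts, by="title"))
print(group_duplicates(posts, by="both", min_length=10))
```

In this sample, content mode groups posts 1 and 2, title mode groups 1 with 3 and 4 with 5, and both mode finds nothing because no pair matches on title and content at once.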

Parameter: Details
-types: CSV of post types to scan, e.g. post,page,product. Revisions, attachments, and WooCommerce variations are always excluded even if listed.
-post: Boolean shortcut alias to include the post type. Default true.
-page: Boolean shortcut alias to include the page type. Default true.
-product: Boolean shortcut alias to include the product type. Default true.
-by: content (default) hashes normalized body text. title hashes normalized titles only. both requires both title and content to match.
-post_status: Default publish. Set to publish,draft,private to catch unpublished duplicates too.
-min_length: Skip posts whose normalized content is shorter than N characters. Default 100. Applies in content and both modes only.
-limit: Max posts to scan per run. Default 1000, max 10000.
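
As a rough illustration of the rules in the parameter list, the sketch below resolves a -types value and a -limit value. The function names are hypothetical. The slugs revision, attachment, and product_variation are the standard WordPress and WooCommerce post type names. Clamping an oversized limit to the maximum is an assumption, since the list does not say whether the command clamps or rejects it.

```python
# Post types that are dropped even when listed explicitly.
ALWAYS_EXCLUDED = {"revision", "attachment", "product_variation"}
DEFAULT_LIMIT = 1000
MAX_LIMIT = 10000


def resolve_types(types_csv: str) -> list:
    """Split a CSV of post types and drop the always-excluded ones."""
    requested = [t.strip() for t in types_csv.split(",") if t.strip()]
    return [t for t in requested if t not in ALWAYS_EXCLUDED]


def resolve_limit(limit=None) -> int:
    """Apply the default, and clamp to the maximum (clamping is an assumption)."""
    if limit is None:
        return DEFAULT_LIMIT
    return max(1, min(int(limit), MAX_LIMIT))


print(resolve_types("post, page,product_variation,revision"))
print(resolve_limit())
print(resolve_limit(50000))
```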

Real example

An agency migrated a client's WooCommerce store from one platform to another. The import script ran twice: once as a test and once for real. Nobody noticed the test run populated the database. Six months later, organic traffic to the product catalog has dropped 18% and Google Search Console shows dozens of "duplicate content" hints in the coverage report.

You run tp find duplicate content -product=true -by=content -post_status=publish,draft. The command comes back with 34 duplicate groups covering 68 products. Every group is a test-import product paired with its live counterpart. You take the list of IDs for the test-run copies, trash them in bulk from the editor, and set up 301 redirects for any that had already been indexed.

Two weeks later the coverage warnings are gone and the 18% traffic drop starts recovering. The whole diagnosis took under a minute.

Ready?

Find the duplicates that are hurting your rankings.

One of 91 commands. All included with every license.
