Skip to main content
Back to problems
Leetcode
Medium
Trees
Hash Maps
Strings
Google
Meta
Delete Duplicate Folders In System

Remove every folder that has an identical non-empty subtree somewhere else in the folder system.

Acceptance 100%
Problem Statement

Delete Duplicate Folders in System

You are given a folder hierarchy rooted at /. Each folder has a name and may contain zero or more subfolders.

A folder is considered duplicate if there exists another folder in the system whose entire subtree is structurally identical and has the same folder names in the same arrangement. Two folder subtrees are identical when:

  • their root folder names are the same, and
  • their children form the same set of subtrees recursively.

If a duplicate folder is found, all folders in that duplicate subtree must be deleted. The root folder / is never deleted.

Return the remaining folders after all duplicate subtrees are removed.

A typical way to solve this is to assign a canonical representation to each folder subtree, count how often each representation appears, and then remove all subtrees whose representation occurs more than once.

Input Format

  • A folder tree representation, typically given as paths or parent-child relationships.
  • Each folder name is a non-empty string.
  • The hierarchy is rooted at /.

Output Format

  • Return the folder hierarchy after deleting every duplicate subtree.
  • The exact output format depends on the platform representation, but it should preserve the remaining folder structure.

Constraints

  • The folder hierarchy forms a tree.
  • Folder names are lowercase English letters in the common formulation.
  • There may be up to 10410^4 folders in typical interview settings.
  • The root folder is not deleted.

Hints

  • Compare folders by their full subtree structure, not just by immediate children.
  • A postorder traversal helps because you can compute a folder's signature from its children.
  • Use a map from serialized subtree signature to frequency.
  • After counting all signatures, do a second traversal to remove every subtree with frequency greater than 1.

Input Format

  • A rooted folder tree, usually represented as a list of folder paths or nested child lists.
  • Each folder has a name and zero or more child folders.

Output Format

  • The folder tree after removing all duplicate folder subtrees.
  • The root / remains present.

Constraints

  • The structure is a rooted tree.
  • Duplicate detection is based on exact subtree equality.
  • The root folder is not deleted.
  • Folder names are non-empty strings.
Examples
Sample cases returned by the problem API.

Example 1

Input

[
  ["a","b"],
  ["c","b"],
  ["d","e"],
  ["d","f"],
  ["g","e"],
  ["g","f"]
]

Output

[
  ["a","b"],
  ["c","b"]
]

Explanation

The subtrees rooted at d and g are identical (e and f as children), so both are deleted. The folders a/b and c/b remain because their subtrees are not duplicated in a way that makes them removable here.

Example 2

Input

[
  ["x","y"],
  ["x","z"],
  ["p","y"],
  ["p","z"]
]

Output

[]

Explanation

The subtree rooted at x is identical to the subtree rooted at p, so both are removed. No folders remain below the root.

Premium problem context

Unlock deeper context for this problem

Premium adds guided hints, editorial links, similar variants, discussion resources, and concept maps so you can understand why a problem matters, not just solve it once.

Guided hints
Editorial and discussion links
Concept map and variants
Sign in to unlock
Track your progress
Sign in to bookmark this problem, save notes, and manage its revision plan.