Spec: Catalog Include / Exclude Rules

Context

Katalog currently supports global directory exclusions only — a single list that applies to every catalog during creation and update. This spec defines the evolution toward per-catalog include/exclude rules, drawing on LuckyBackup's mature filter system as a reference.


Current State

| Aspect | Current behaviour |
|---|---|
| Scope | Global: applies to all catalogs |
| Storage | parameter table, parameter_type = 'exclude_directory', parameter_value1 = 'All catalogs' |
| Where managed | Create tab (add/delete), Settings tab (view) |
| What is supported | Directory paths only (no wildcards, no file patterns) |
| Applied in | CatalogJobStoppable during catalog build |

LuckyBackup Reference — Feature Map

LuckyBackup is an rsync frontend; its filter system is the rsync include/exclude engine. The Exclude and Include tables below map each LuckyBackup feature to Katalog's context.

Exclude

| Feature | LuckyBackup | Katalog relevance |
|---|---|---|
| Exclude specific directory | ✅ user-defined | Already exists (global) → move to per-catalog |
| Exclude by filename pattern (*.tmp, *.bak) | ✅ rsync glob | New, high value |
| Template: temp folders (**/*tmp*/) | ✅ checkbox | New, quick-select |
| Template: cache folders (**/*cache*/) | ✅ checkbox | New, quick-select |
| Template: lost+found | ✅ checkbox | Useful for full-device catalogs |
| Template: system folders (/var /proc /dev /sys) | ✅ checkbox | Linux-specific, optional |
| Template: hidden virtual dirs (.gvfs) | ✅ checkbox | Partially covered by Include Hidden option |
| Read exclusions from a file | | Backlog |

Include

| Feature | LuckyBackup | Katalog relevance |
|---|---|---|
| "Only include" mode: whitelist specific paths/patterns | | New, useful for partial catalogs |
| "Normal include": override specific excludes | | Complex; backlog |

Patterns

LuckyBackup uses rsync glob syntax. Katalog will use a simplified subset (Qt QDir::match() style):

| Pattern | Meaning | Katalog support |
|---|---|---|
| *.ext | Any file with that extension | ✅ Phase 1 |
| name | Any file/folder named exactly name anywhere in tree | ✅ Phase 1 |
| /name | Anchored to catalog root | Phase 2 |
| name/ | Directories only | Phase 2 |
| **/name | Anywhere in tree (recursive) | Phase 2 |
| ? | Single character (not slash) | Phase 2 |
| [a-z] | Character class | Backlog |
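
As a rough illustration of the two Phase 1 forms, the sketch below tests a pattern against every component of a path relative to the catalog root, so that both *.ext and a bare name match anywhere in the tree. It is written in Python with fnmatch standing in for QDir::match(); the function name matches_phase1 is hypothetical, and case-sensitive matching is an assumption the spec does not settle.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_phase1(pattern: str, rel_path: str) -> bool:
    """Return True if any component of rel_path matches pattern.

    Only the Phase 1 forms are intended here: '*.ext' and a bare 'name'.
    Case-sensitive matching is an assumption, not part of the spec.
    """
    parts = PurePosixPath(rel_path).parts
    return any(fnmatchcase(part, pattern) for part in parts)


print(matches_phase1("*.tmp", "photos/2020/session.tmp"))  # True
print(matches_phase1("Thumbs.db", "photos/Thumbs.db"))     # True
print(matches_phase1("cache", "app/cache/index.bin"))      # True
print(matches_phase1("*.tmp", "photos/2020/image.jpg"))    # False
```

Note that fnmatch also understands ? and [a-z], which this spec defers to Phase 2 and the backlog, so a real Phase 1 implementation would probably need to reject or escape those characters.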

Proposed Scope — Phase 1

1. Per-catalog exclude folders

Make folder exclusions configurable per catalog. During a build or update, a catalog's own exclusion list is applied in addition to the global list (rules are additive; see Migration Notes).

Storage: extend parameter table with a catalog reference, or add a dedicated catalog_filter table:

| Column | Type | Description |
|---|---|---|
| filter_id | INTEGER PK | |
| filter_catalog_id | INTEGER FK → catalog | NULL = global |
| filter_type | TEXT | 'exclude_folder', 'exclude_pattern', 'include_only' |
| filter_value | TEXT | Path or glob pattern |
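
One possible shape for the dedicated table, sketched with Python's sqlite3 module against an in-memory database. The catalog table and its columns (catalog_id, catalog_name), the NOT NULL and CHECK constraints, the sample rows, and the helper filters_for are all assumptions made for illustration; only the four catalog_filter columns come from this spec.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Minimal stand-in for the existing catalog table (column names assumed).
    CREATE TABLE catalog (
        catalog_id   INTEGER PRIMARY KEY,
        catalog_name TEXT
    );
    CREATE TABLE catalog_filter (
        filter_id         INTEGER PRIMARY KEY,
        filter_catalog_id INTEGER REFERENCES catalog(catalog_id),  -- NULL = global
        filter_type       TEXT NOT NULL
            CHECK (filter_type IN ('exclude_folder', 'exclude_pattern', 'include_only')),
        filter_value      TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO catalog VALUES (1, 'Photos')")
conn.executemany(
    "INSERT INTO catalog_filter (filter_catalog_id, filter_type, filter_value)"
    " VALUES (?, ?, ?)",
    [
        (None, "exclude_folder", "/media/disk/lost+found"),  # global rule
        (1, "exclude_pattern", "*.tmp"),                     # rule for catalog 1
    ],
)


def filters_for(catalog_id: int) -> list:
    """Return (type, value) pairs for one catalog, its own rules first."""
    rows = conn.execute(
        """SELECT filter_type, filter_value FROM catalog_filter
           WHERE filter_catalog_id = ? OR filter_catalog_id IS NULL
           ORDER BY filter_catalog_id IS NULL, filter_id""",
        (catalog_id,),
    )
    return rows.fetchall()


print(filters_for(1))
# [('exclude_pattern', '*.tmp'), ('exclude_folder', '/media/disk/lost+found')]
```

Using NULL in filter_catalog_id for global rules lets a single query return both scopes for a given catalog.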

2. Per-catalog exclude file patterns

Exclude files whose name matches a pattern, either a full filename or an extension (for example *.tmp, *.log, Thumbs.db). To be implemented in CatalogJobStoppable using QDir::match().
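
A minimal sketch of how file patterns could be applied during traversal, assuming the filter is checked against each file name as the tree is walked. It is Python rather than Qt, with fnmatch standing in for QDir::match(); list_files and the throwaway directory tree are hypothetical.

```python
import os
import tempfile
from fnmatch import fnmatchcase


def list_files(root: str, exclude_patterns: list) -> list:
    """Walk root and return relative paths of files not matching any pattern."""
    kept = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatchcase(name, pat) for pat in exclude_patterns):
                continue  # excluded by a file pattern
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            kept.append(rel.replace(os.sep, "/"))
    return sorted(kept)


# Build a tiny throwaway tree to exercise the filter.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs"))
    for rel in ("docs/report.txt", "docs/report.tmp", "Thumbs.db", "build.log"):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")
    result = list_files(root, ["*.tmp", "*.log", "Thumbs.db"])

print(result)  # ['docs/report.txt']
```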

3. Quick-select templates

Checkboxes for common categories that generate standard patterns automatically:

  • Temp folders → adds pattern *tmp* (folder)
  • Cache folders → adds pattern *cache* (folder)
  • Backup files → adds pattern *.bak, *~
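
The checkbox-to-pattern mapping could be held in a small lookup table, sketched here in Python. The template keys, the (pattern, applies_to) tuple layout, and expand_templates are hypothetical; the patterns themselves are the ones listed above.

```python
# Hypothetical template table; keys and the (pattern, applies_to) layout are
# illustrative only. The patterns are those listed in this spec.
TEMPLATES = {
    "temp_folders": [("*tmp*", "folder")],
    "cache_folders": [("*cache*", "folder")],
    "backup_files": [("*.bak", "file"), ("*~", "file")],
}


def expand_templates(selected: list) -> list:
    """Return the de-duplicated rules generated by the ticked checkboxes."""
    rules = []
    for key in selected:
        for rule in TEMPLATES[key]:
            if rule not in rules:
                rules.append(rule)
    return rules


print(expand_templates(["temp_folders", "backup_files"]))
# [('*tmp*', 'folder'), ('*.bak', 'file'), ('*~', 'file')]
```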

Future / Backlog

  • "Only include" mode — whitelist: catalog only indexes files/folders matching the list
  • Normal include — override specific excludes (rsync-style ordered rules)
  • Read patterns from file — point to a .gitignore-style file
  • Pattern editor GUI — visual builder for non-trivial patterns
  • Size filter — skip files above/below a size threshold (e.g. skip files > 4 GB)
  • Date filter — only index files newer than a given date
  • Global template defaults — admin-level defaults applied when creating a new catalog

UI Placement

Selected approach: Devices tab → Catalog edit form — left/right split

The Devices_widget_EditCatalogFields widget (File Type, Metadata, Checksum, Hidden Files) is split into two side-by-side panels within the existing edit area:

Left panel — existing catalog settings (unchanged):

  • File Type, Include Metadata, Include Checksum, Include Hidden Files, Is Full Device

Right panel — new Include / Exclude section (per-catalog):

  • Excluded folders list
  • Add / remove entry (path or glob pattern)
  • [Phase 2] Quick-select template checkboxes
  • [Phase 2] "Only include" mode toggle + include list

Alternatives considered

| Option | Assessment |
|---|---|
| Separate sub-tab inside Devices tab | Requires TabWidget, adds navigation depth; rejected |
| Settings tab (global only) | Already exists for global list; keep as global view |
| Modal dialog via button on edit form | Simpler but hides the feature; rejected for Phase 1 |

Migration Notes

  • The new catalog_filter table is introduced in the current dev version (2.10, unreleased), so it can be created directly in the 2.10 migration; no additional migration step is needed.
  • Existing global exclusions in parameter remain as-is; per-catalog rules are additive.
  • CatalogJobStoppable needs to load both global and per-catalog rules before traversal, applying exclusions in order: catalog-level first, then global.
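
The evaluation order in the last note can be sketched as follows, assuming both rule lists have already been loaded before traversal. This is Python rather than the Qt code in CatalogJobStoppable; the Rule class, its fields, and first_matching_rule are hypothetical, and fnmatch stands in for QDir::match().

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    scope: str  # "catalog" or "global"
    kind: str   # "exclude_folder" or "exclude_pattern"
    value: str


def first_matching_rule(path: str, name: str, catalog_rules, global_rules) -> Optional[Rule]:
    """Return the first rule that excludes this entry, or None if it is kept.

    Catalog-level rules are checked before global ones. Because the rules are
    additive, the order changes which rule is reported, not the outcome.
    """
    for rule in [*catalog_rules, *global_rules]:
        if rule.kind == "exclude_folder":
            prefix = rule.value.rstrip("/") + "/"
            if path == rule.value or path.startswith(prefix):
                return rule
        elif rule.kind == "exclude_pattern" and fnmatchcase(name, rule.value):
            return rule
    return None


catalog_rules = [Rule("catalog", "exclude_pattern", "*.tmp")]
global_rules = [Rule("global", "exclude_folder", "/media/disk/lost+found")]

for path in ("/media/disk/a.tmp", "/media/disk/lost+found/x.jpg", "/media/disk/photo.jpg"):
    name = path.rsplit("/", 1)[-1]
    hit = first_matching_rule(path, name, catalog_rules, global_rules)
    print(path, "->", hit.scope if hit else "kept")
```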