About Unwatermark

An AI-powered watermark removal tool built as a deep exploration of computer vision, neural inpainting, and the real-world limits of AI precision.

The Journey

Unwatermark started as a straightforward idea: use AI to automatically detect and remove watermarks from images and presentations. What followed was a deep technical journey that taught us more about the boundaries of current AI capabilities than any tutorial or course could.

Detection Pipeline
Built a layered detection system: EasyOCR for text watermarks, Claude and GPT-4o Vision for logos, and Lang-SAM for pixel-level mask generation. Each layer handles cases the others miss.
Neural Inpainting
Integrated LaMa (Large Mask Inpainting) from Samsung Research -- a state-of-the-art model that fills the masked region with plausible synthesized content rather than just blurring the watermark out.
Multi-Format Support
Extended beyond single images to handle PDFs (render, clean, reassemble) and PowerPoint files (process embedded slide images in-place while preserving all formatting).
The Precision Problem
Discovered the fundamental challenge: watermark removal requires pixel-perfect precision, but general-purpose AI models (Vision APIs, segmentation models) operate at a much coarser resolution. A 5% bounding box error that's fine for "find the cat" is catastrophic for "remove only these 20 pixels without touching the adjacent text."
Content Preservation
Implemented extensive safeguards -- region size limits, dark-pixel protection, SAM margin caps, multi-pass scanning with heuristic blocking -- all to prevent the removal process from damaging content near the watermark. The guiding principle: better to leave a watermark than destroy content.
Current State
The tool works well for simple watermark cases and streams progress in real time while it processes a file. Complex cases (small watermarks near dense content) remain challenging -- an honest reflection of where AI precision stands today.
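
The Content Preservation safeguards described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name apply_safeguards, the thresholds max_area_fraction and dark_threshold, and the list-of-lists image representation are all assumptions made for illustration. It covers only two of the listed safeguards (region size limits and dark-pixel protection).

```python
def apply_safeguards(mask, gray, max_area_fraction=0.15, dark_threshold=60):
    """Return a safer copy of `mask`, or None if the region is rejected.

    mask: 2D list of bools, True where a detector flagged watermark pixels.
    gray: 2D list of ints (0-255), grayscale image of the same shape.
    Both thresholds are hypothetical values chosen for illustration.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    total = height * width
    flagged = sum(cell for row in mask for cell in row)

    # Region size limit: a mask covering too much of the image is more
    # likely a detection error than a watermark, so skip removal entirely.
    if total == 0 or flagged / total > max_area_fraction:
        return None

    # Dark-pixel protection: assume dark pixels are text or line art and
    # drop them from the mask so inpainting never touches them.
    return [
        [cell and gray[y][x] > dark_threshold for x, cell in enumerate(row)]
        for y, row in enumerate(mask)
    ]
```

Returning None for an oversized region follows the guiding principle stated above: when in doubt, leave the watermark in place rather than risk the content.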

Technology Stack

A serious AI pipeline, not a blur filter.

Claude Vision + GPT-4o
AI watermark detection -- analyzes images to locate watermarks, identify backgrounds, and recommend removal strategies
LaMa Inpainting
Samsung Research's large mask inpainting model -- a neural network that synthesizes plausible content for the region a watermark covered
Lang-SAM
Text-prompted segmentation for pixel-level mask generation -- traces the boundary between watermark and content more tightly than a bounding box can
EasyOCR
Deterministic text detection -- catches text watermarks like "NotebookLM", "Shutterstock", "DRAFT" reliably and locally
FastAPI + NDJSON
Web framework with streaming progress -- real-time updates as each slide or page is processed
Python + Pillow + NumPy
Image processing foundation -- mask generation, threshold refinement, morphological operations, format conversion
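
The NDJSON progress streaming listed above can be sketched with the standard library alone. The event fields (page, total, status, detail) and the process_page callback are hypothetical names chosen for illustration. In a FastAPI app, a generator like this would typically be passed to a StreamingResponse with the application/x-ndjson media type.

```python
import json


def stream_progress(pages, process_page):
    """Yield one NDJSON line per page as soon as that page is finished."""
    total = len(pages)
    for index, page in enumerate(pages, start=1):
        try:
            process_page(page)
            event = {"page": index, "total": total, "status": "done"}
        except Exception as exc:
            # Report the failure and keep going so one bad page
            # does not abort the whole file.
            event = {"page": index, "total": total,
                     "status": "error", "detail": str(exc)}
        # NDJSON: one single-line JSON object per newline-terminated line.
        yield json.dumps(event) + "\n"
    yield json.dumps({"status": "complete", "total": total}) + "\n"
```

Because each line is a complete JSON object, a client can parse and display every update as it arrives instead of waiting for the full response.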

What We Learned

Honest insights from building an AI-powered image editing tool.

Detection is harder than removal

LaMa inpainting produces excellent results when given an accurate mask. The entire quality challenge comes from detection precision -- knowing exactly which pixels are watermark and which are content.
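
One way to make "which pixels are watermark and which are content" measurable is to compare a predicted mask against a hand-labeled one. The sketch below assumes such a ground-truth mask exists; the function name mask_errors and the flat-list mask format are illustrative and not taken from the project.

```python
def mask_errors(predicted, truth):
    """Count the two failure modes of a watermark mask.

    predicted, truth: equal-length flat lists of bools, one per pixel.
    Returns (missed, damaged):
      missed  - watermark pixels the mask failed to cover (watermark survives)
      damaged - content pixels the mask wrongly covers (content gets inpainted)
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    missed = sum(1 for p, t in zip(predicted, truth) if t and not p)
    damaged = sum(1 for p, t in zip(predicted, truth) if p and not t)
    return missed, damaged
```

Keeping the two counts separate matters because they cost different things: a missed pixel leaves watermark residue, while a damaged pixel destroys content.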

API-based AI has precision limits

General-purpose Vision APIs return bounding boxes with ~3-5% error. That's fine for object detection, but catastrophic for pixel-level editing where even small errors damage adjacent content.
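
To put the 3-5% figure in pixels, the sketch below converts a relative bounding-box error into an absolute offset. It assumes the error is measured as a fraction of the image dimension, and the 1920-pixel width is an illustrative value rather than one from the project.

```python
def bbox_error_pixels(image_size_px, relative_error):
    """Convert a relative bounding-box error into an absolute pixel offset."""
    return round(image_size_px * relative_error)


for error in (0.03, 0.05):
    print(f"{error:.0%} of 1920 px = {bbox_error_pixels(1920, error)} px")
```

Under these assumptions the box edge can be off by roughly 58 to 96 pixels, which is larger than many small watermarks and easily wide enough to reach neighboring text.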

Each fix creates new edge cases

Tightening detection to prevent content damage means some watermarks survive. Loosening it catches more watermarks but risks content. Guard rails help, but the fundamental precision gap remains.

Demo vs. production quality

AI watermark removal looks impressive in demos on simple cases. Production-reliable results across varied real-world files require dedicated fine-tuned models and GPU infrastructure beyond what API wrappers can provide.

Built By

CushLabs AI Services

Unwatermark is a portfolio project by CushLabs AI Services, exploring the real capabilities and limitations of AI-powered image processing. Built with Claude Code as a pair-programming partner.