About Unwatermark

An AI-powered watermark removal tool built as a deep exploration of computer vision, neural inpainting, and the real-world limits of AI precision.

The Journey

Unwatermark started as a straightforward idea: use AI to automatically detect and remove watermarks from images and presentations. What followed taught us more about the limits of current AI than any tutorial or course could.

Detection Pipeline

Built a layered detection system: EasyOCR for text watermarks, Claude and GPT-4o Vision for logos, and Lang-SAM for pixel-level mask generation. Each layer handles cases the others miss.
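A minimal sketch of how such layering could be wired together. The detectors here are stand-in callables, not the real EasyOCR, Vision API, or Lang-SAM calls, and the names `Detection`, `run_layers`, and `overlaps` are hypothetical. The idea shown is only the merging rule: each later layer contributes the boxes that earlier layers did not already report.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Detection:
    box: Box
    source: str  # name of the layer that produced this box


# A detector takes an image (treated as opaque here) and returns boxes.
Detector = Callable[[object], List[Box]]


def overlaps(a: Box, b: Box) -> bool:
    """True when two boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def run_layers(image: object, layers: List[Tuple[str, Detector]]) -> List[Detection]:
    """Run every layer in order; keep only boxes no earlier layer already covers."""
    found: List[Detection] = []
    for name, detect in layers:
        for box in detect(image):
            if not any(overlaps(box, d.box) for d in found):
                found.append(Detection(box, name))
    return found
```

Putting the cheapest local layer first (OCR, in this project's case) would mean the remote Vision calls only add what it missed, though the actual ordering used in Unwatermark is not stated here.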

Neural Inpainting

Integrated LaMa (Large Mask Inpainting) from Samsung Research -- a state-of-the-art model that synthesizes plausible content for the region under the watermark rather than just blurring it out.

Multi-Format Support

Extended beyond single images to handle PDFs (render, clean, reassemble) and PowerPoint files (process embedded slide images in-place while preserving all formatting).
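The PowerPoint path can be sketched with the standard library alone, since a .pptx file is a ZIP archive whose embedded pictures typically sit under `ppt/media/`. This is an illustration under that assumption, not the project's actual code: `clean_pptx_images` and the `clean` callback are hypothetical names, and `clean` stands in for the whole detect-and-inpaint step.

```python
import io
import zipfile
from typing import Callable

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")


def clean_pptx_images(src: bytes, clean: Callable[[bytes], bytes]) -> bytes:
    """Return a copy of a .pptx with each embedded raster image passed through clean().

    Every other archive member (slide XML, layouts, relationships) is copied
    byte for byte, which is what keeps the formatting intact.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            is_image = (
                info.filename.startswith("ppt/media/")
                and info.filename.lower().endswith(IMAGE_SUFFIXES)
            )
            if is_image:
                data = clean(data)
            zout.writestr(info.filename, data)
    return out.getvalue()
```

Because the image keeps its original file name inside the archive, the slide XML that references it does not need to change.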

The Precision Problem

Discovered the fundamental challenge: watermark removal requires pixel-perfect precision, but general-purpose AI models (Vision APIs, segmentation models) operate at a much coarser resolution. A 5% bounding box error that's fine for "find the cat" is catastrophic for "remove only these 20 pixels without touching the adjacent text."
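To make the scale of the problem concrete, here is a small worked calculation. It assumes the error is measured relative to image width and uses a hypothetical 1920-pixel-wide slide with a 20-pixel-wide watermark; the function names are illustrative only.

```python
def box_error_pixels(image_width: int, relative_error: float) -> int:
    """Convert a relative bounding-box error into pixels for a given image width."""
    return round(image_width * relative_error)


def error_in_watermark_widths(error_px: int, watermark_width_px: int) -> float:
    """Express the positional error as a multiple of the watermark's own width."""
    return error_px / watermark_width_px


# Hypothetical numbers: a 1920 px wide slide and a 20 px wide watermark.
error_px = box_error_pixels(1920, 0.05)
ratio = error_in_watermark_widths(error_px, 20)
```

Under these assumptions the box can be off by 96 pixels, which is 4.8 times the width of the watermark itself, so the error alone is enough to cover neighboring text.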

Content Preservation

Implemented extensive safeguards -- region size limits, dark-pixel protection, SAM margin caps, multi-pass scanning with heuristic blocking -- all to prevent the removal process from damaging content near the watermark. The guiding principle: better to leave a watermark than destroy content.
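Two of these safeguards, dark-pixel protection and a region size limit, could look roughly like the sketch below. The thresholds are hypothetical placeholders, the image is a plain grayscale grid rather than a NumPy array, and the dark-pixel rule assumes dark content on a light background, so treat it as an illustration of the principle rather than the project's implementation.

```python
from typing import List

Gray = List[List[int]]   # grayscale pixels, 0 (black) to 255 (white)
Mask = List[List[bool]]  # True where a pixel is marked for removal

MAX_REGION_FRACTION = 0.05  # hypothetical: mask may cover at most 5% of the image
DARK_THRESHOLD = 80         # hypothetical: darker pixels are treated as content


def apply_safeguards(gray: Gray, mask: Mask) -> Mask:
    """Return a safer mask, or an empty one if the removal region is too large."""
    height, width = len(gray), len(gray[0])

    # Dark-pixel protection: drop mask pixels that look like text or line art.
    protected = [
        [mask[y][x] and gray[y][x] >= DARK_THRESHOLD for x in range(width)]
        for y in range(height)
    ]

    # Region size limit: if too much of the image is still masked, remove nothing.
    covered = sum(cell for row in protected for cell in row)
    if covered > MAX_REGION_FRACTION * width * height:
        return [[False] * width for _ in range(height)]
    return protected
```

Returning an empty mask when the limit is exceeded is the code form of the guiding principle: the watermark survives, and so does the content.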

Current State

The tool works well for simple watermark cases and provides a real-time streaming processing experience. Complex cases (small watermarks near dense content) remain challenging -- an honest reflection of where AI precision stands today.

Technology Stack

A serious AI pipeline, not a blur filter.

Claude Vision + GPT-4o

AI watermark detection -- analyzes images to locate watermarks, identify backgrounds, and recommend removal strategies

LaMa Inpainting

Samsung Research's large mask inpainting model -- fills the masked watermark region with plausible content synthesized by a neural network

Lang-SAM

Text-prompted segmentation for pixel-level mask generation -- aims to trace the boundary between watermark and content

EasyOCR

Deterministic text detection -- catches text watermarks like "NotebookLM", "Shutterstock", "DRAFT" reliably and locally

FastAPI + NDJSON

Web framework with streaming progress -- real-time updates as each slide or page is processed
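NDJSON is simply one JSON object per line, which makes it easy to stream. The generator below is a framework-free sketch of that idea; the event field names (`event`, `page`, `total`) and the `process` callback are assumptions made for illustration, not the tool's actual message schema.

```python
import json
from typing import Callable, Iterable, Iterator


def ndjson_progress(
    pages: Iterable[bytes], process: Callable[[bytes], bytes]
) -> Iterator[str]:
    """Yield one newline-terminated JSON object as each page finishes."""
    page_list = list(pages)
    total = len(page_list)
    for index, page in enumerate(page_list, start=1):
        process(page)  # stand-in for detect + inpaint on one slide or page
        yield json.dumps({"event": "progress", "page": index, "total": total}) + "\n"
    yield json.dumps({"event": "done", "total": total}) + "\n"
```

In FastAPI, a generator like this could be returned through `StreamingResponse` with `media_type="application/x-ndjson"`, so the browser receives each line as soon as its page is done.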

Python + Pillow + NumPy

Image processing foundation -- mask generation, threshold refinement, morphological operations, format conversion
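As one example of a morphological operation, dilation grows a mask outward by a few pixels, which is a common way to make sure soft watermark edges fall inside the removal region. The sketch below is written in plain Python for readability and is an assumption about the kind of operation involved; a NumPy-based version would be faster.

```python
from typing import List

Mask = List[List[bool]]  # True where a pixel is marked for removal


def dilate(mask: Mask, radius: int = 1) -> Mask:
    """Grow every True region by `radius` pixels in all directions (square kernel)."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Mark the square neighborhood around each True pixel, clipped to the image.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[ny][nx] = True
    return out
```

The radius is where the precision trade-off shows up: a larger value catches more of the watermark edge but also reaches further into adjacent content.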

What We Learned

Honest insights from building an AI-powered image editing tool.

Detection is harder than removal

LaMa inpainting produces excellent results when given an accurate mask. The entire quality challenge comes from detection precision -- knowing exactly which pixels are watermark and which are content.

API-based AI has precision limits

General-purpose Vision APIs return bounding boxes with ~3-5% error. That's fine for object detection, but catastrophic for pixel-level editing where even small errors damage adjacent content.

Each fix creates new edge cases

Tightening detection to prevent content damage means some watermarks survive. Loosening it catches more watermarks but risks content. Guard rails help, but the fundamental precision gap remains.

Demo vs. production quality

AI watermark removal demos impressively on simple cases. Production-reliable results across varied real-world files require dedicated fine-tuned models and GPU infrastructure beyond what API wrappers can provide.

Built By

CushLabs AI Services

Unwatermark is a portfolio project by CushLabs AI Services, exploring the real capabilities and limitations of AI-powered image processing. Built with Claude Code as a pair-programming partner.