A deep dive into how DotScramble protects your privacy, from injecting plausible-but-fake EXIF metadata to detecting and blurring faces and eyes with OpenCV Haar Cascades.
TL;DR: DotScramble is an open-source desktop app that protects image privacy through two main pillars: metadata obfuscation (stripping or spoofing EXIF data so your camera, GPS, and timestamp can't be traced back to you) and visual redaction (automatically detecting and blurring faces, eyes, license plates, and text). This post digs into the code behind both systems.
Why Metadata Is the Hidden Threat
When you take a photo, the file doesn't just contain pixels. Hidden inside is a block of EXIF metadata that silently stores:
| Field | Example Value | Risk |
|---|---|---|
| GPS Latitude/Longitude | 30.0444° N, 31.2357° E | Reveals your exact location |
| Camera Make/Model | Apple iPhone 15 Pro | Fingerprints your device |
| DateTimeOriginal | 2026:06:21 14:32:11 | Timestamps your movements |
| Software | iOS 17.5 | Exposes OS/platform |
The usual advice is to strip this metadata before sharing photos, but stripping alone can look suspicious. A completely blank EXIF block on a modern smartphone image is a red flag: it signals that the file was deliberately scrubbed. DotScramble takes a different approach: instead of silent stripping, it injects plausible-but-fake disinformation.
Part 1: The Metadata Spoofing Engine
The core of the system lives in core/metadata_spoofer.py.
The Camera Database: Intentionally Vintage
FAKE_CAMERAS: list[tuple[str, str, str]] = [
    ("Nokia", "3310", "Nokia Imaging 1.0"),
    ("Motorola", "RAZR V3", "Motorola Camera 1.2"),
    ("Samsung", "SCH-U340", "Samsung Digimax 2.1"),
    ("Casio", "QV-10A", "Casio Digital Camera"),
    ("Polaroid", "PDC 640", "Polaroid Software 1.1"),
    ("Kodak", "DC40", "Kodak EasyShare 3.0"),
    # ... and more
]
The list is intentionally filled with vintage and implausible devices. The original Nokia 3310 doesn't have a camera at all, which is exactly the point. Trackers fingerprint devices by cross-referencing camera model with image resolution, color profile, and noise patterns. Injecting a Nokia 3310 signature into a modern 12MP photo creates an irreconcilable contradiction that undermines metadata-based fingerprinting.
GPS Presets: Middle of Nowhere
GPS_PRESETS: dict[str, tuple[float, float]] = {
    "pacific": (4.2234, -157.4521),    # Middle of Pacific Ocean
    "antarctica": (-89.3312, 12.0000),  # Antarctica
    "arctic": (89.1122, -178.4343),     # Arctic Ocean
    "sahara": (23.4122, 10.9988),       # Remote Sahara (no cell towers)
    "amazon": (-5.3421, -63.2231),      # Deep Amazon basin
}
These aren't random: each preset was chosen because it's a location with no plausible human population or cell infrastructure, making it impossible to cross-reference the fake GPS against telecom tower data.
A small jitter is added to each preset on every run so the same preset never produces the exact same coordinates twice:
def _pick_gps(preset, custom, jitter=True):
    coords = GPS_PRESETS.get(preset, GPS_PRESETS["pacific"])
    if jitter:
        lat = coords[0] + random.uniform(-0.08, 0.08)
        lon = coords[1] + random.uniform(-0.08, 0.08)
        return (round(lat, 4), round(lon, 4))
    return coords
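To get a feel for how far a ±0.08° jitter actually moves a point, the sketch below converts it to kilometres for each preset. It assumes a spherical Earth with roughly 111.32 km per degree of latitude, and `jitter_span_km` is a hypothetical helper for illustration, not part of DotScramble.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude (assumption)

GPS_PRESETS = {
    "pacific": (4.2234, -157.4521),
    "antarctica": (-89.3312, 12.0000),
    "arctic": (89.1122, -178.4343),
    "sahara": (23.4122, 10.9988),
    "amazon": (-5.3421, -63.2231),
}


def jitter_span_km(lat: float, jitter_deg: float = 0.08) -> tuple[float, float]:
    """Return the maximum (north-south, east-west) offset in km for a jitter."""
    north_south = jitter_deg * KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west


for name, (lat, _lon) in GPS_PRESETS.items():
    ns, ew = jitter_span_km(lat)
    print(f"{name:<11} north-south ±{ns:.2f} km, east-west ±{ew:.2f} km")
```

Under this approximation the jitter is worth close to 9 km near the equator, while at the polar presets the east-west component shrinks to roughly a hundred metres, so most of the variation there comes from latitude.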
Converting GPS to EXIF Rationals
EXIF GPS doesn't store decimal degrees directly. It uses a DMS (Degrees, Minutes, Seconds) format with rational numbers (numerator/denominator pairs). Here's how DotScramble handles the conversion:
def _dms_rational(value: float) -> tuple[tuple[int, int], ...]:
    """Convert decimal degrees -> (deg, min, sec) as EXIF rational tuples."""
    abs_v = abs(value)
    deg = int(abs_v)
    m_f = (abs_v - deg) * 60
    mins = int(m_f)
    secs = round((m_f - mins) * 60 * 10_000)
    return ((deg, 1), (mins, 1), (secs, 10_000))
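A worked example makes the rational format easier to picture. The sketch below repeats the same arithmetic in a standalone function and prints two coordinates from this post in human-readable DMS form. `format_dms` is a hypothetical helper added purely for illustration.

```python
def dms_rational(value: float) -> tuple[tuple[int, int], ...]:
    """Same arithmetic as _dms_rational, reproduced so this runs standalone."""
    abs_v = abs(value)
    deg = int(abs_v)
    m_f = (abs_v - deg) * 60
    mins = int(m_f)
    secs = round((m_f - mins) * 60 * 10_000)
    return ((deg, 1), (mins, 1), (secs, 10_000))


def format_dms(value: float, positive: str, negative: str) -> str:
    """Render a coordinate as degrees, minutes, seconds plus a hemisphere letter."""
    (deg, _), (mins, _), (secs, den) = dms_rational(value)
    ref = negative if value < 0 else positive
    return f"{deg}° {mins}' {secs / den:.2f}\" {ref}"


print(dms_rational(30.0444))            # ((30, 1), (2, 1), (398400, 10000))
print(format_dms(30.0444, "N", "S"))    # 30° 2' 39.84" N
print(format_dms(-157.4521, "E", "W"))  # 157° 27' 7.56" W
```

The sign never reaches the rationals: it is carried only by the hemisphere letter, which is why the GPS IFD needs separate Ref tags.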
The result gets packaged into a full GPS IFD:
def _build_gps_ifd(lat: float, lon: float) -> dict:
    return {
        piexif.GPSIFD.GPSLatitudeRef: b"S" if lat < 0 else b"N",
        piexif.GPSIFD.GPSLatitude: _dms_rational(lat),
        piexif.GPSIFD.GPSLongitudeRef: b"W" if lon < 0 else b"E",
        piexif.GPSIFD.GPSLongitude: _dms_rational(lon),
        piexif.GPSIFD.GPSAltitude: (0, 1),
        piexif.GPSIFD.GPSMapDatum: b"WGS-84",
    }
Spoof Profiles: Opinionated Presets
Rather than exposing raw settings, DotScramble exposes three high-level profiles:
PROFILES = {
    "ghost": {
        # Maximum obfuscation: Nokia 3310 in Antarctica, year 2000
        "gps_preset": "antarctica",
        "camera": "Nokia 3310",
        "fake_datetime_mode": "epoch",
        "keep_copyright": False,
    },
    "troll": {
        # Plausibly wrong: recent vintage camera, random ocean
        "gps_preset": "pacific",
        "camera": "random",
        "fake_datetime_mode": "recent",
        "keep_copyright": False,
    },
    "artist": {
        # For photographers: strips location+device, preserves copyright
        "gps_preset": "atlantic",
        "camera": "random",
        "fake_datetime_mode": "random",
        "keep_copyright": True,
    },
}
Usage is dead simple from Python:
from core.metadata_spoofer import spoof
result = spoof("photo.jpg", profile="ghost")
print(result)
# {
# "format": "JPEG",
# "camera": "Nokia 3310",
# "gps": {"lat": -89.2891, "lon": 12.0412},
# "datetime": "2000:01:01 00:00:00",
# ...
# }
Or from the CLI:
dotscramble-spoof photo.jpg --profile ghost
dotscramble-spoof photo.jpg --gps-preset pacific --camera "Kodak DC40"
dotscramble-spoof photo.jpg --gps-custom 23.4 -54.2 --keep-copyright
dotscramble-spoof photo.jpg --dry-run --json
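For readers curious how flags like these can be wired up, here is a minimal `argparse` sketch. The flag names come from the commands above, but the parser itself is an assumption about one reasonable layout, not DotScramble's actual CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(prog="dotscramble-spoof")
    parser.add_argument("image", help="path to a JPEG or PNG")
    parser.add_argument("--profile", choices=["ghost", "troll", "artist"])
    parser.add_argument("--gps-preset")
    # Two floats; argparse accepts a negative number such as -54.2 as a value
    # because no option on this parser looks like a negative number.
    parser.add_argument("--gps-custom", nargs=2, type=float, metavar=("LAT", "LON"))
    parser.add_argument("--camera")
    parser.add_argument("--keep-copyright", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--json", action="store_true")
    return parser


args = build_parser().parse_args(
    ["photo.jpg", "--gps-custom", "23.4", "-54.2", "--keep-copyright"]
)
print(args.gps_custom, args.keep_copyright)  # [23.4, -54.2] True
```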
JPEG vs PNG: Different Metadata Formats
JPEG and PNG handle metadata completely differently:
JPEG uses binary EXIF chunks (IFD tables). DotScramble builds these using piexif:
exif_bytes = piexif.dump({
    "0th": zeroth_ifd,   # Camera make, model, software, datetime
    "Exif": exif_ifd,    # DateTimeOriginal, ISO, aperture, shutter
    "GPS": gps_ifd,      # Lat/lon in DMS rational format
})
img.save(output_path, "JPEG", exif=exif_bytes, quality=95)
PNG uses simple text chunks (key-value pairs), and those chunks have no standard keyword for GPS:
pnginfo = PngImagePlugin.PngInfo()
pnginfo.add_text("Software", software)
pnginfo.add_text("Author", make)
pnginfo.add_text("Creation Time", png_dt)
pnginfo.add_text("date:create", iso_dt)
pnginfo.add_text("date:modify", iso_dt)
img.save(output_path, "PNG", pnginfo=pnginfo)
Design Note: PNG's text chunks are completely human-readable with any hex editor. DotScramble still populates them because many social platforms and reverse image search engines parse them. Injecting noise is better than leaving them empty.
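To show just how readable those chunks are, the sketch below walks a PNG byte stream with nothing but the standard library and pulls out every uncompressed `tEXt` entry. The tiny in-memory PNG and the helper names (`make_chunk`, `read_text_chunks`) are illustrative assumptions and are not taken from DotScramble.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def read_text_chunks(png: bytes) -> dict[str, str]:
    """Return every uncompressed tEXt chunk as {keyword: text}."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    found, pos = {}, len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        length, kind = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            # Keyword and text are separated by a single NUL byte.
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found


# Build a 1x1 grayscale PNG in memory that carries two spoofed text chunks.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x00")  # one filter byte + one pixel
demo = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"tEXt", b"Software\x00Nokia Imaging 1.0")
    + make_chunk(b"tEXt", b"Author\x00Nokia")
    + make_chunk(b"IDAT", idat)
    + make_chunk(b"IEND", b"")
)
print(read_text_chunks(demo))
```

Anything a dozen lines of parsing can read, a platform's ingestion pipeline can read too, which is the argument for filling these fields with noise.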
Part 2: Custom Metadata Control and Per-Field EXIF Surgery
The spoofing profiles are great for quick use, but power users need surgical control. That's what the Custom Metadata Dialog provides: a per-field control panel where every EXIF tag can independently be set to one of four actions:
- Keep: preserve the original value exactly
- Strip: remove this field entirely
- Spoof: replace with a random plausible fake
- Custom: enter your own specific value
The spoof_custom() Function
This is the engine behind the dialog. It takes a field_actions dict and applies each action independently:
def spoof_custom(
    input_path: str,
    output_path: str | None = None,
    *,
    field_actions: dict,
) -> dict:
    """
    Apply per-field EXIF actions to a JPEG or PNG.
    field_actions keys: gps, make, model, software, datetime, copyright, exposure
    Values: "keep" | "strip" | "spoof" | {"value": str}
    (gps uses {"lat": .., "lon": ..} for custom)
    """
The resolver function handles all four action types elegantly:
def _resolve_text(key: str, fake_fn) -> str | None:
    action = field_actions.get(key, "keep")
    if action == "strip": return None
    if action == "keep": return current.get(key)
    if action == "spoof": return fake_fn()
    if isinstance(action, dict): return action.get("value", "")
    return current.get(key)

# Applied per-field:
r_make = _resolve_text("make", lambda: make)
r_model = _resolve_text("model", lambda: model)
r_software = _resolve_text("software", lambda: software)
r_datetime = _resolve_text("datetime", lambda: _fake_timestamp("random"))
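The resolver above relies on `field_actions` and `current` from its enclosing scope. To try the four actions in isolation, here is a standalone variant that takes both as arguments. The sample values and the `"Kodak"` fake are made up for the demo.

```python
def resolve_text(key, field_actions, current, fake_fn):
    """Return the new value for one field, or None when nothing should be written."""
    action = field_actions.get(key, "keep")
    if action == "strip":
        return None
    if action == "spoof":
        return fake_fn()
    if isinstance(action, dict):
        return action.get("value", "")
    # "keep" and any unrecognized action both preserve the current value.
    return current.get(key)


current = {
    "make": "Apple",
    "model": "iPhone 15 Pro",
    "software": "iOS 17.5",
    "datetime": "2026:06:21 14:32:11",
}
field_actions = {
    "make": "spoof",
    "model": {"value": "DC40"},
    "software": "strip",
    # "datetime" is omitted on purpose, so it falls back to "keep"
}

resolved = {
    key: resolve_text(key, field_actions, current, lambda: "Kodak")
    for key in current
}
print(resolved)
```

One thing worth noticing: a stripped field and a kept field that was never present both come back as `None`, so the caller only has to skip `None` values when writing the new EXIF block.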
Reading Existing EXIF to Pre-fill the UI
Before showing the dialog, DotScramble reads the current EXIF so the user can see what's actually in the file:
def read_exif_fields(image_path: str) -> dict:
    """Read current EXIF and return structured dict for pre-filling the dialog."""
    result = {k: None for k in
              ("gps", "make", "model", "software", "datetime", "copyright", "exposure")}
    exif_dict = piexif.load(image_path)
    zeroth = exif_dict.get("0th", {})
    exif = exif_dict.get("Exif", {})
    gps = exif_dict.get("GPS", {})
    result["make"] = _decode(zeroth, piexif.ImageIFD.Make)
    result["model"] = _decode(zeroth, piexif.ImageIFD.Model)
    result["datetime"] = _decode(zeroth, piexif.ImageIFD.DateTime)
    # ... GPS DMS -> decimal conversion, exposure parsing, etc.
    return result
The GPS conversion from DMS rational back to decimal is the tricky part:
def _dms_to_decimal(dms_tuple, ref: bytes) -> float | None:
    d = _rational(dms_tuple[0]) or 0
    m = _rational(dms_tuple[1]) or 0
    s = _rational(dms_tuple[2]) or 0
    val = d + m / 60 + s / 3600
    if ref in (b"S", b"W"):
        val = -val
    return round(val, 6)
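The `_rational` helper isn't shown in this post. A plausible reading is that it divides numerator by denominator and returns `None` for a zero denominator, which would explain the `or 0` fallbacks. The sketch below makes that assumption explicit (the `rational` function is a stand-in, not the project's code) and runs the conversion on the DMS tuple for 30.0444°.

```python
def rational(pair):
    """Hypothetical stand-in for _rational: (num, den) -> float, None if den is 0."""
    num, den = pair
    return num / den if den else None


def dms_to_decimal(dms_tuple, ref: bytes) -> float:
    """Convert EXIF DMS rationals plus a hemisphere reference to signed degrees."""
    d = rational(dms_tuple[0]) or 0
    m = rational(dms_tuple[1]) or 0
    s = rational(dms_tuple[2]) or 0
    val = d + m / 60 + s / 3600
    if ref in (b"S", b"W"):
        val = -val
    return round(val, 6)


cairo = ((30, 1), (2, 1), (398400, 10000))
print(dms_to_decimal(cairo, b"N"))  # 30.0444
print(dms_to_decimal(cairo, b"S"))  # -30.0444
# Malformed rationals with a zero denominator degrade to 0 instead of raising.
print(dms_to_decimal(((30, 1), (0, 0), (0, 0)), b"N"))  # 30.0
```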
Preset System for the Dialog
The Custom dialog also supports saving and loading named presets, backed by a JSON file:
# Example saved preset
{
    "gps": "strip",
    "make": "spoof",
    "model": "spoof",
    "software": "strip",
    "datetime": {"value": "2023:01:01 12:00:00"},
    "copyright": "keep",
    "exposure": "spoof"
}
Factory presets ship with the app (Ghost, Troll, Artist), and users can create their own. Factory presets are read-only and can't be deleted or overwritten.
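A JSON-backed preset store with read-only factory entries can be quite small. The sketch below is one way to do it, under stated assumptions: the file layout (a single JSON object keyed by preset name) and the function names are hypothetical, and only the three factory names come from this post.

```python
import json
import tempfile
from pathlib import Path

FACTORY_NAMES = {"Ghost", "Troll", "Artist"}  # read-only presets named in the post


def load_presets(path: Path) -> dict:
    """Return user presets from disk, or an empty dict if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_preset(path: Path, name: str, field_actions: dict) -> None:
    """Store one named preset, refusing to overwrite a factory preset."""
    if name in FACTORY_NAMES:
        raise PermissionError(f"{name!r} is a factory preset and is read-only")
    presets = load_presets(path)
    presets[name] = field_actions
    path.write_text(json.dumps(presets, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "presets.json"
    save_preset(store, "Travel", {
        "gps": "strip",
        "datetime": {"value": "2023:01:01 12:00:00"},
    })
    reloaded = load_presets(store)
    print(reloaded["Travel"]["gps"])  # strip
    try:
        save_preset(store, "Ghost", {"gps": "keep"})
    except PermissionError as err:
        print(err)
```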
Part 3: Advanced Face and Eye Blur
The visual redaction engine lives in core/image_processor.py.
The Detection Engine
DotScramble uses OpenCV's Haar Cascade classifiers, a classical machine learning approach that's fast, works offline, and has no cloud dependencies. The DetectionEngine class wraps the key classifiers:
class DetectionEngine:
    """Advanced detection algorithms"""

    @staticmethod
    def detect_faces(image):
        """Detect faces using Haar Cascade"""
        face_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
        )
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.1, 5, minSize=(30, 30))
        return faces

    @staticmethod
    def detect_eyes(image):
        """Detect eyes using Haar Cascade"""
        eye_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_eye.xml'
        )
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        eyes = eye_cascade.detectMultiScale(gray, 1.1, 5, minSize=(20, 20))
        return eyes
The parameters that matter:
- scaleFactor=1.1: how much the image is scaled down per detection pass (1.1 = 10% reduction per pass, slower but more accurate)
- minNeighbors=5: how many overlapping detections a region needs before it's accepted (higher = fewer false positives)
- minSize=(30, 30): minimum region size in pixels (filters out noise); the eye detector uses (20, 20)
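The "slower but more accurate" trade-off behind scaleFactor is easy to quantify with a back-of-the-envelope model: count how many window sizes fit between minSize and the image's shorter side when each step grows by the scale factor. This is a rough approximation and not OpenCV's exact loop, and `approx_scale_passes` is a hypothetical helper.

```python
import math


def approx_scale_passes(image_side: int, min_size: int, scale_factor: float) -> int:
    """Estimate how many window sizes fit between min_size and image_side."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    if min_size > image_side:
        return 0
    steps = math.log(image_side / min_size) / math.log(scale_factor)
    return math.floor(steps) + 1


# Shorter side of a 1920x1080 frame, with the face detector's minSize of 30.
for factor in (1.05, 1.1, 1.3):
    print(factor, approx_scale_passes(1080, 30, factor))
# 1.05 -> 74, 1.1 -> 38, 1.3 -> 14
```

By this estimate, going from 1.3 to 1.1 nearly triples the number of passes, which is where the extra runtime (and the extra chance of catching a face between scales) comes from.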
Applying the Blur: The gaussian_blur Method
Once regions are detected, DotScramble applies effects. The gaussian_blur implementation has extensive validation to handle edge cases like faces at image borders:
@staticmethod
def gaussian_blur(image, x, y, w, h, strength):
    """Apply Gaussian blur to region with validation"""
    img_h, img_w = image.shape[:2]
    # Clip region to image boundaries (handles faces at edges)
    x_end = min(x + w, img_w)
    y_end = min(y + h, img_h)
    w = x_end - x
    h = y_end - y
    # Ensure minimum region size
    if w < 3 or h < 3:
        raise ValueError(f"Region too small: {w}x{h}")
    # GaussianBlur requires odd kernel size
    if strength % 2 == 0:
        strength += 1
    strength = max(3, strength)
    region = image[y:y_end, x:x_end]
    blurred = cv2.GaussianBlur(region, (strength, strength), 0)
    return blurred
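The two validation rules in that method (clip the box, force an odd kernel) are plain integer arithmetic, so they can be checked without OpenCV. The sketch below isolates them. Note one deliberate difference, which is my assumption rather than something the snippet above does: `clip_region` also clamps a negative x or y to zero, since a detector box that starts off-image would otherwise turn into a negative slice index.

```python
def clip_region(x, y, w, h, img_w, img_h):
    """Clamp a box to the image and return (x, y, w, h) of the visible part."""
    x0, y0 = max(0, x), max(0, y)  # extra guard, not in the original snippet
    x1, y1 = min(x + w, img_w), min(y + h, img_h)
    return x0, y0, x1 - x0, y1 - y0


def odd_kernel(strength: int) -> int:
    """Return an odd kernel size of at least 3, as cv2.GaussianBlur expects."""
    if strength % 2 == 0:
        strength += 1
    return max(3, strength)


# A face hanging off the right and bottom edges of a 640x480 frame.
print(clip_region(600, 400, 100, 100, 640, 480))  # (600, 400, 40, 80)
# A face starting left of the frame.
print(clip_region(-10, 50, 100, 100, 640, 480))   # (0, 50, 90, 100)
print([odd_kernel(s) for s in (0, 15, 20)])       # [3, 15, 21]
```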
Beyond Blur: The Full Effect Suite
Blur is just one option. DotScramble supports 7 different privacy effects that can be applied to any detected region:
| Effect | Method | Best For |
|---|---|---|
| Gaussian Blur | gaussian_blur() | Professional, natural look |
| Pixelation | pixelate() | Classic censoring style |
| Black Bar | black_bar() | Strong, unambiguous redaction |
| Gradient Fade | gradient_fade() | Artistic/subtle censoring |
| Mosaic | mosaic_effect() | Decorative tile pattern |
| Frosted Glass | frosted_glass() | Translucent glass aesthetic |
| Oil Paint | oil_paint() | Artistic painting effect |

The pixelation effect is a neat double-resize trick: shrink down, then scale back up with nearest-neighbor interpolation:
@staticmethod
def pixelate(image, x, y, w, h, pixel_size):
    img_h, img_w = image.shape[:2]
    # Clip region to image boundaries, as in gaussian_blur
    x_end = min(x + w, img_w)
    y_end = min(y + h, img_h)
    region = image[y:y_end, x:x_end]
    region_h, region_w = region.shape[:2]
    pixel_size = max(1, pixel_size)
    # Shrink -> enlarge = pixelation
    temp_h = max(1, region_h // pixel_size)
    temp_w = max(1, region_w // pixel_size)
    temp = cv2.resize(region, (temp_w, temp_h), interpolation=cv2.INTER_LINEAR)
    pixelated = cv2.resize(temp, (region_w, region_h), interpolation=cv2.INTER_NEAREST)
    return pixelated
The frosted glass effect combines PIL and OpenCV:
@staticmethod
def frosted_glass(image, x, y, w, h, strength=15):
    img_h, img_w = image.shape[:2]
    # Clip region to image boundaries, as in gaussian_blur
    x_end = min(x + w, img_w)
    y_end = min(y + h, img_h)
    region = image[y:y_end, x:x_end]
    # Convert to PIL for advanced filtering
    pil_region = Image.fromarray(cv2.cvtColor(region, cv2.COLOR_BGR2RGB))
    blurred = pil_region.filter(ImageFilter.GaussianBlur(strength))
    enhanced = ImageEnhance.Brightness(blurred).enhance(1.1)
    final = enhanced.filter(ImageFilter.EDGE_ENHANCE)
    # Back to OpenCV format
    return cv2.cvtColor(np.array(final), cv2.COLOR_RGB2BGR)
Opacity Blending: Partial Redaction
All effects support an opacity parameter that blends the processed region with the original:
@staticmethod
def apply_opacity(original, processed, opacity):
    """Blend processed region with original based on opacity"""
    if original.shape != processed.shape:
        processed = cv2.resize(processed, (original.shape[1], original.shape[0]))
    alpha = opacity / 100.0
    return cv2.addWeighted(processed, alpha, original, 1 - alpha, 0)
This is useful for subtle watermark-style redaction (e.g., 50% opacity blur) vs. hard censoring (100%).
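Per pixel, the weighted sum works out to `processed * alpha + original * (1 - alpha)`, clamped to the 0-255 range. Here is the same arithmetic for a single 8-bit channel value in plain Python, as a sketch of the math rather than a replacement for `cv2.addWeighted` (`blend_pixel` is a hypothetical helper).

```python
def blend_pixel(original: int, processed: int, opacity: float) -> int:
    """Blend one 8-bit channel value; opacity is a percentage from 0 to 100."""
    alpha = opacity / 100.0
    value = processed * alpha + original * (1 - alpha)
    return max(0, min(255, round(value)))


# A bright original pixel (200) against its blurred counterpart (100).
for opacity in (0, 50, 100):
    print(opacity, blend_pixel(200, 100, opacity))
# 0 -> 200 (untouched), 50 -> 150 (halfway), 100 -> 100 (fully redacted)
```

Keep in mind that anything below 100% leaves a fraction of the original pixels in the output, so partial opacity is a stylistic choice and not a redaction guarantee.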
The Full Workflow
1. Load Image
2. Select Detection Mode
   - Face Detection (Haar Cascade)
   - Eye Detection (Haar Cascade)
   - Full Body Detection
   - License Plate (contour analysis)
   - Text Detection (Tesseract OCR)
   - Manual Selection (draw regions)
   - Full Image
3. Choose Effect + Strength + Opacity
4. Apply Metadata Control
   - Quick Profile (ghost / troll / artist)
   - Custom Per-Field (keep/strip/spoof/custom)
5. Save / Batch Export
Why Haar Cascades Instead of Deep Learning?
I get this question a lot. DotScramble intentionally chose Haar Cascades over modern deep learning detectors like YOLO or MediaPipe face mesh. The reasons:
- No cloud, no internet: Haar Cascades run entirely offline. Privacy tools should never phone home.
- Zero model download on first run: the XML classifiers are bundled with OpenCV. No 100MB model download.
- Fast on CPU: users don't need a GPU. The app runs fine on a 5-year-old laptop.
- Good enough for the use case: the goal is privacy, not 99.9% recall. If a face is missed, Manual Selection is always available as a fallback.
That said, AI-powered tracking (YOLOv8 or similar) is on the roadmap for v1.4.0 to improve detection of faces at angles, in poor lighting, or partially occluded.
Getting Started
# Clone
git clone https://github.com/kareem2099/DotScramble.git
cd DotScramble
# Install dependencies
pip install -r requirements.txt
# pip install opencv-python numpy Pillow pytesseract piexif PySide6
# Optional: Tesseract for text detection
sudo apt install tesseract-ocr
# Run
python src/main.py
Or grab the standalone Linux executable (no Python required) from one of these:
| Platform | Link |
|---|---|
| GitHub Releases | Latest Release |
| OpenDesktop / Pling | opendesktop.org/p/2362477 |
Conclusion
DotScramble's metadata system takes an "active disinformation" approach rather than passive stripping: the result is metadata that looks legitimate while being completely meaningless. Combined with the visual redaction engine, it provides a comprehensive privacy toolkit that works entirely offline.
The code is Apache 2.0 licensed and open to contributions. If you're interested in adding AI tracking, GPU acceleration, or a web interface, pull requests are very welcome.
GitHub: github.com/kareem2099/DotScramble
OpenDesktop: opendesktop.org/p/2362477
Made with ❤️ for privacy protection













