Open Source Signal #010 / Сигнал відкритих джерел

Editorial frame

What this is: A weekly tool radar for public-interest OSINT. Each item explains what the tool or workflow does, why it matters for Ukrainian accountability work, how to use it safely, and where its limits are.

What this is not: Doxxing, stalking, credential hunting, leaked-database abuse, private-person deanonymization, unsafe facial recognition, live targeting, or tools that facilitate harm against private people.

Rubric map

🧰 Toolbox

🛡️ Investigator OPSEC

🤖 AI Verification

🗺️ Map Room

🧰 Toolbox / Інструментарій

#01

Browsertrix Crawler preserves bounded web evidence as WACZ archives

Source: Webrecorder / Browsertrix Crawler GitHub · latest release v1.12.4, 1 April 2026

What happened

Browsertrix Crawler is a standalone, browser-based, high-fidelity crawler that runs customizable web-archive crawls in a Docker container. The GitHub release page lists v1.12.4 as the latest release, with fixes for collection crawl IDs, WACZ reference handling and seed URLs that contain hash fragments.

Why it matters

This is useful when one page is not enough: occupation-administration sites, propaganda portals, official claim pages, local authority archives, recruitment pages or small thematic sections can be captured as bounded evidence packages before they are edited or removed.

How to use it

Practical fields: does — browser-based web crawling and WACZ/WARC-style preservation; use case — bounded public-source captures; input — seed URL or crawl configuration; output — crawl collection and WACZ archive; license/pricing — AGPL-3.0, free/open source; original source — Webrecorder GitHub and docs. Use specific release tags in production, define scope before crawling, save the crawl config, record operator/time/source list, hash the archive and keep raw captures separate from working copies.
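The record-keeping steps above (operator, time, source list, archive hash) can be sketched in a few lines of Python. This is a minimal illustration, not part of Browsertrix Crawler: the function names, the record fields and the idea of a sidecar JSON file are all assumptions made for the example.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_capture_record(archive_path, operator, seed_urls, crawler_version):
    """Describe one finished crawl: who ran it, when, on what, and the archive hash.

    All field names here are hypothetical; adapt them to your own case log.
    """
    archive = Path(archive_path)
    return {
        "archive_file": archive.name,
        "archive_sha256": sha256_file(archive),
        "archive_bytes": archive.stat().st_size,
        "operator": operator,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "seed_urls": sorted(seed_urls),
        "crawler_version": crawler_version,
    }


def write_capture_record(record, out_path):
    """Save the record as JSON beside the raw archive, never inside it."""
    text = json.dumps(record, indent=2, ensure_ascii=False)
    Path(out_path).write_text(text, encoding="utf-8")
```

Writing the record to a separate file keeps the WACZ byte-for-byte unchanged, so the stored hash stays valid for later comparison against working copies.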

Limits

This is not a chain-of-custody system by itself. Crawling must stay narrow: do not vacuum private-person data, private groups, credentials or live operational details. Browser state, cookies, logged-in sessions, translation extensions and crawl scope can contaminate evidence faster than a bureaucrat can misplace a goat.

Browsertrix Crawler зберігає обмежені вебмасиви як WACZ-архіви

Джерело: Webrecorder / Browsertrix Crawler GitHub · останній реліз v1.12.4, 1 квітня 2026

Що сталося

Browsertrix Crawler — це автономний browser-based high-fidelity crawler для контрольованого вебархівування в Docker-контейнері. На GitHub останнім стабільним релізом позначено v1.12.4 із виправленнями щодо crawl IDs у колекціях, WACZ-посилань і seed URL із hash-фрагментами.

Чому це важливо

Це потрібно там, де однієї сторінки замало: сайти окупаційних адміністрацій, пропагандистські портали, офіційні заяви, архіви місцевих органів, рекрутингові сторінки або невеликі тематичні розділи можна зберегти як обмежений доказовий пакет до редагування чи видалення.

Як це застосувати

Практичні поля: що робить — browser-based crawl і WACZ/WARC-style збереження; кейс — контрольована фіксація відкритих вебджерел; input — seed URL або crawl config; output — crawl collection і WACZ-архів; ліцензія/ціна — AGPL-3.0, free/open source; першоджерело — Webrecorder GitHub і документація. У продакшені використовуйте конкретний release tag, задавайте scope до запуску, зберігайте config, фіксуйте оператора/час/список URL, хешуйте архів і відділяйте raw capture від робочих копій.

Обмеження

Це не повноцінна chain-of-custody система. Crawl має бути вузьким: не пилососьте приватні дані, закриті групи, credentials або live operational details. Стан браузера, cookies, logged-in sessions, перекладачі й неправильний scope можуть зіпсувати доказ швидше, ніж бюрократ загубить козу.

browsertrix · web-archiving · wacz · evidence-preservation · accountability-osint

🛡️ Investigator OPSEC / Безпека дослідника

#02

Dangerzone turns suspicious documents into safer flat PDFs

Source: Freedom of the Press Foundation / Dangerzone · accessed 23 May 2026

What happened

Dangerzone is an open-source tool that takes potentially dangerous PDFs, Office documents or images and converts them into safer PDFs. Its current 0.10.0 release adds automatic sandbox updates and embeds Podman in the Windows and macOS builds, reducing Docker Desktop friction for non-Linux users.

Why it matters

OSINT teams receive attachments: PDFs, scans, office files, screenshots, supposed leaks, court documents and media packs. Opening unknown files directly is bad hygiene; for investigators, journalists and volunteers it can become a compromise route.

How to use it

Practical fields: does — opens a suspicious document in an isolated conversion environment, rasterizes it and rebuilds a safe PDF; use case — intake of unknown documents before reading or sharing; input — PDF, Office, ODF, EPUB and common image formats; output — safe PDF with optional OCR; license/pricing — AGPL-3.0 repository, free/open source; original source — Dangerzone site and GitHub. Put it before manual review in the intake workflow: receive file, hash original, convert with Dangerzone, inspect only the safe PDF, keep the original quarantined.
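The "hash original, keep the original quarantined" part of that intake sequence can be sketched as follows. This is an assumption-laden illustration and not a Dangerzone feature: the function name, the `.quarantined` suffix and the returned fields are hypothetical, and the conversion step itself is left to Dangerzone.

```python
import hashlib
import os
import shutil
import stat
from pathlib import Path


def quarantine_original(src_path, quarantine_dir):
    """Copy an incoming file into quarantine under its SHA-256 and make it read-only.

    Only raw bytes are read; the file is never parsed or rendered here.
    The whole file is read into memory, which is fine for typical attachments.
    """
    src = Path(src_path)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    target_dir = Path(quarantine_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # The original extension is dropped on purpose so that a stray double-click
    # does not launch the default viewer for that file type.
    target = target_dir / f"{digest}.quarantined"
    if not target.exists():
        shutil.copyfile(src, target)
        os.chmod(target, stat.S_IRUSR)  # owner read-only
    return {
        "original_name": src.name,
        "sha256": digest,
        "quarantined_as": str(target),
    }
```

Naming the quarantined copy after its hash also deduplicates repeat submissions of the same file, while the returned record preserves the name it arrived under.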

Limits

Dangerzone reduces risk; it does not make hostile files harmless by divine decree. Do not upload sensitive evidence to random online converters. Keep the original file isolated, avoid opening it outside a sandbox, and remember that rasterization discards the original's metadata and active content, so the safe PDF is a reading copy and not a substitute for the original.

Dangerzone перетворює підозрілі документи на безпечніші flat PDFs

Джерело: Freedom of the Press Foundation / Dangerzone · переглянуто 23 травня 2026

Що сталося

Dangerzone — open-source інструмент для перетворення потенційно небезпечних PDF, Office-документів або зображень на безпечніші PDF. Поточний реліз 0.10.0 додає автоматичні оновлення sandbox і вбудовує Podman у збірки для Windows та macOS, зменшуючи залежність від Docker Desktop.

Чому це важливо

OSINT-команди постійно отримують вкладення: PDF, скани, office files, скріншоти, нібито витоки, судові документи й медіапаки. Відкривати невідомий файл напряму — погана гігієна; для розслідувачів, журналістів і волонтерів це може стати шляхом компрометації.

Як це застосувати

Практичні поля: що робить — відкриває підозрілий документ в ізольованому conversion environment, растеризує його й збирає safe PDF; кейс — прийом невідомих документів до читання чи пересилання; input — PDF, Office, ODF, EPUB і поширені image formats; output — safe PDF з optional OCR; ліцензія/ціна — AGPL-3.0 repository, free/open source; першоджерело — сайт Dangerzone і GitHub. Ставте його на етап intake: отримали файл, захешували оригінал, конвертували через Dangerzone, читаєте safe PDF, оригінал тримаєте в карантині.

Обмеження

Dangerzone знижує ризик, але не робить ворожий файл безпечним актом божественної канцелярії. Не завантажуйте чутливі докази в випадкові онлайн-конвертери. Оригінал тримайте ізольовано, не відкривайте його поза sandbox і пам’ятайте, що метадані та active content можуть втрачатися під час конвертації.

dangerzone · investigator-opsec · malicious-documents · evidence-intake · sandboxing

🧰 Toolbox / Інструментарій

#03

ExifTool remains the metadata triage workhorse

Source: Phil Harvey / ExifTool version history · Version 13.58, 5 May 2026

What happened

ExifTool’s official version history lists 13.58 on 5 May 2026 and notes that 13.55 is the most recent production release. Recent changes include timed GPS decoding for INNOVV N2 TS videos, QuickTime metadata updates, new Exif 3.1 tags in 13.56 and Garmin FIT-related improvements.

Why it matters

In war-crimes verification, metadata is rarely proof by itself, but it is often the first sanity check: timestamps, GPS, camera model, editing traces, embedded previews, timed video metadata and contradictions between a claim and the file.

How to use it

Practical fields: does — extracts, compares and exports metadata from images, video, audio, PDFs and many other formats; use case — media triage and provenance checks; input — original media copy; output — text, JSON or CSV metadata report; license/pricing — free/open-source under GPL/Artistic terms; original source — exiftool.org. Run it on copies, export JSON, hash raw files, compare timestamps with upload context, geolocation, shadows, weather and independent sources.
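Once a JSON report exists (for example from `exiftool -json` run on a copy), comparing its timestamps with the claimed time of an event can be sketched as below. The tag list, the 12-hour tolerance and the function names are assumptions for illustration; the sketch also ignores time zones, which EXIF timestamps frequently omit, so a flagged gap is a prompt for review and not a finding.

```python
import json
from datetime import datetime, timedelta

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"
# Illustrative selection of tags; real files may carry others.
TIME_TAGS = ("DateTimeOriginal", "CreateDate", "ModifyDate")


def parse_exif_time(value):
    """Parse 'YYYY:MM:DD HH:MM:SS', ignoring any trailing sub-seconds or zone."""
    try:
        return datetime.strptime(str(value)[:19], EXIF_FORMAT)
    except ValueError:
        return None


def timestamp_gaps(report_json, claimed_time, tolerance=timedelta(hours=12)):
    """List (file, tag, reason) for timestamps that are unreadable or far from the claim."""
    findings = []
    for entry in json.loads(report_json):
        for tag in TIME_TAGS:
            if tag not in entry:
                continue
            parsed = parse_exif_time(entry[tag])
            if parsed is None:
                findings.append((entry.get("SourceFile"), tag, "unparseable"))
            elif abs(parsed - claimed_time) > tolerance:
                findings.append((entry.get("SourceFile"), tag, "outside tolerance"))
    return findings
```

An empty result means only that the metadata does not contradict the claim within the chosen tolerance; it still needs the contextual checks listed above.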

Limits

Metadata can be stripped, forged or shifted by platforms and editing software. Do not publish raw GPS or device identifiers that expose civilians, witnesses, POW relatives or investigators. Treat metadata as corroboration, not as a courtroom angel with a trumpet.
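One way to keep raw GPS and device identifiers out of anything published is to filter the report before it leaves the team. The sketch below is a hypothetical helper, and the marker list is an assumption that will not catch every sensitive tag; a human should still read what remains.

```python
# Substrings that mark a tag as sensitive; an illustrative, incomplete list.
SENSITIVE_MARKERS = ("gps", "serial", "owner", "artist")


def redact_for_publication(entry, markers=SENSITIVE_MARKERS):
    """Split one metadata record into a publishable dict and a list of removed tags."""
    public = {}
    removed = []
    for tag, value in entry.items():
        if any(marker in tag.lower() for marker in markers):
            removed.append(tag)
        else:
            public[tag] = value
    return public, sorted(removed)
```

The unredacted report stays with the raw files; note that a file path kept in the public copy can itself reveal names or locations.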

ExifTool залишається базовим інструментом metadata triage

Джерело: Phil Harvey / ExifTool version history · версія 13.58, 5 травня 2026

Що сталося

Офіційна історія версій ExifTool містить 13.58 від 5 травня 2026 року й окремо зазначає, що 13.55 є останнім production release. Серед свіжих змін — decoding timed GPS для INNOVV N2 TS videos, оновлення QuickTime metadata, теги Exif 3.1 у 13.56 та покращення щодо Garmin FIT.

Чому це важливо

У верифікації воєнних злочинів метадані рідко є доказом самі по собі, але часто дають першу перевірку здорового глузду: timestamps, GPS, модель камери, сліди редагування, embedded previews, timed video metadata і розбіжності між заявою та файлом.

Як це застосувати

Практичні поля: що робить — витягує, порівнює й експортує метадані з фото, відео, audio, PDF та багатьох інших форматів; кейс — media triage і provenance checks; input — копія оригінального медіа; output — text, JSON або CSV metadata report; ліцензія/ціна — free/open-source GPL/Artistic; першоджерело — exiftool.org. Запускайте на копіях, експортуйте JSON, хешуйте raw files, звіряйте timestamps із контекстом завантаження, геолокацією, тінями, погодою та незалежними джерелами.

Обмеження

Метадані можуть бути стерті, підроблені або змінені платформами й редакторами. Не публікуйте raw GPS або device identifiers, якщо це може викрити цивільних, свідків, родини полонених чи дослідників. Метадані — це corroboration, а не ангел у суді з фанфарами.

exiftool · metadata · media-verification · gps · evidence-triage

🤖 AI Verification / ШІ та верифікація

#04

GeoSearch is a useful AI-geolocation workflow only if humans keep the final vote

Source: arXiv / SIGIR 2026 paper · submitted 28 April 2026

What happened

The GeoSearch paper proposes an open-world image geolocalization framework that integrates web-scale reverse image search into a retrieval-augmented pipeline. It adds textual evidence from web pages, image matching and confidence-based gating; the authors state that code and data are publicly available for reproducibility.

Why it matters

The useful part is not “AI guessed the place.” The useful part is a disciplined workflow: retrieve candidates, collect supporting web evidence, filter weak matches and send the result back to a human investigator. That is safer than treating visual similarity as revelation from the Ministry of Coordinates.

How to use it

Practical fields: does — research workflow for visual geolocation using reverse-image retrieval, image matching and confidence gating; use case — candidate generation for public-interest geolocation; input — image needing location hypothesis; output — candidate locations and supporting web evidence; license/pricing — verify the linked code repository before operational use; original source — arXiv paper. Use it to generate hypotheses, then verify with landmarks, terrain, road geometry, shadows, weather, satellite imagery, upload context and independent sources.
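The "generate hypotheses, then verify" discipline can be illustrated with a toy gating step. This is not the GeoSearch authors' code or algorithm: the `Candidate` structure, the thresholds and the status string are all assumptions, chosen only to show that weak matches get filtered and that nothing leaves the function marked as verified.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A hypothetical location hypothesis produced by some retrieval step."""
    label: str
    lat: float
    lon: float
    score: float
    evidence_urls: list = field(default_factory=list)


def gate_candidates(candidates, min_score=0.6, min_evidence=1):
    """Keep candidates that clear both thresholds; everything kept still needs a human."""
    kept, dropped = [], []
    for candidate in candidates:
        if candidate.score >= min_score and len(candidate.evidence_urls) >= min_evidence:
            kept.append(candidate)
        else:
            dropped.append(candidate)
    kept.sort(key=lambda c: c.score, reverse=True)
    return {
        "status": "needs_human_verification",
        "for_review": kept,
        "dropped": dropped,
    }
```

Requiring at least one supporting web source alongside the score means a visually similar but undocumented location never reaches the review queue on confidence alone.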

Limits

Do not use AI geolocation to locate private homes, shelters, children, witnesses, POW families or private individuals. Model confidence is not verification. Benchmarks can hide leakage, web results can be wrong, and visually similar places can live on opposite ends of the map like two very committed coconuts.

GeoSearch корисний для AI-геолокації лише якщо фінальне рішення лишається за людиною

Джерело: arXiv / SIGIR 2026 paper · подано 28 квітня 2026

Що сталося

Стаття GeoSearch пропонує open-world image geolocalization framework, який інтегрує web-scale reverse image search у retrieval-augmented pipeline. Метод додає текстові докази з вебсторінок, image matching і confidence-based gating; автори зазначають, що код і дані доступні для відтворення.

Чому це важливо

Корисне тут не “ШІ вгадав місце”. Корисна схема: зібрати кандидатів, підтягнути вебдокази, відфільтрувати слабкі збіги й повернути результат людині-верифікатору. Це безпечніше, ніж сприймати візуальну схожість як одкровення Міністерства Координат.

Як це застосувати

Практичні поля: що робить — research workflow для visual geolocation через reverse-image retrieval, image matching і confidence gating; кейс — генерація кандидатів для public-interest geolocation; input — зображення, яке потребує location hypothesis; output — candidate locations і supporting web evidence; ліцензія/ціна — перевіряти linked code repository перед operational use; першоджерело — arXiv paper. Використовуйте для гіпотез, а далі перевіряйте landmarks, рельєф, дорожню геометрію, тіні, погоду, супутникові знімки, upload context і незалежні джерела.

Обмеження

Не використовуйте AI-геолокацію для пошуку приватних будинків, укриттів, дітей, свідків, родин полонених або приватних осіб. Model confidence не є верифікацією. Benchmarks можуть мати leakage, вебрезультати можуть брехати, а схожі місця іноді живуть на протилежних кінцях мапи, як два дуже вперті кокоси.

geosearch · ai-verification · geolocation · reverse-image-search · human-in-the-loop

🗺️ Map Room / Картографічна кімната

#05

Ames Stereo Pipeline brings terrain discipline to satellite verification

Source: NASA Ames Stereo Pipeline documentation · documentation accessed 23 May 2026

What happened

Ames Stereo Pipeline is NASA’s free and open-source suite for geodesy and stereogrammetry. The documentation says it processes images from satellites, rovers, aerial cameras, low-cost satellites and historical imagery, producing digital terrain models, ortho-projected images, 3D models and camera networks.

Why it matters

Satellite verification in war reporting is often reduced to before/after images and a red circle. ASP points toward a more rigorous layer: terrain, orthorectification, camera geometry and uncertainty around damage or landscape change.

How to use it

Practical fields: does — stereo photogrammetry, terrain generation, orthoimage production and camera adjustment; use case — terrain-aware satellite analysis for damage, crater, fortification or infrastructure-change verification; input — suitable stereo/satellite/aerial imagery and camera metadata; output — DTM/DEM, point cloud, mesh, orthoimage; license/pricing — open-source software, imagery access may be paid/licensed separately; original source — NASA ASP docs and GitHub. Pair outputs with QGIS, source-image metadata and a written uncertainty note.
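A written uncertainty note can start from a simple check: is an elevation change between two terrain models larger than their combined vertical error? The sketch below is a deliberate simplification and not an ASP function. It assumes the two models are already co-registered, that their errors are independent, and that each has a known 1-sigma vertical uncertainty in metres; the factor `k` is an arbitrary illustrative threshold.

```python
import math


def change_is_significant(before_m, after_m, sigma_before_m, sigma_after_m, k=2.0):
    """Return (delta, combined_sigma, significant) for one elevation comparison.

    combined_sigma uses root-sum-of-squares, which holds only for independent errors.
    """
    delta = after_m - before_m
    combined_sigma = math.sqrt(sigma_before_m ** 2 + sigma_after_m ** 2)
    return delta, combined_sigma, abs(delta) > k * combined_sigma


def uncertainty_note(before_m, after_m, sigma_before_m, sigma_after_m, k=2.0):
    """Format the comparison as one sentence for a caption or methods note."""
    delta, sigma, significant = change_is_significant(
        before_m, after_m, sigma_before_m, sigma_after_m, k
    )
    verdict = "exceeds" if significant else "does not exceed"
    return (
        f"Elevation change {delta:+.2f} m {verdict} "
        f"{k:g} x combined vertical uncertainty ({sigma:.2f} m)."
    )
```

A change that does not clear the threshold is better reported as "not resolvable with this data" than as "no change".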

Limits

This is not a one-click OSINT toy. It requires suitable imagery, technical setup and domain judgement. Bad DEMs with confident captions are cartographic witchcraft; regrettably, not even the entertaining kind.

Ames Stereo Pipeline додає дисципліну рельєфу до супутникової верифікації

Джерело: NASA Ames Stereo Pipeline documentation · документацію переглянуто 23 травня 2026

Що сталося

Ames Stereo Pipeline — безкоштовний open-source suite NASA для geodesy і stereogrammetry. Документація описує роботу із супутниковими знімками, rover imagery, aerial cameras, low-cost satellites та історичними зображеннями; на виході — digital terrain models, ortho-projected images, 3D models і camera networks.

Чому це важливо

Супутникова верифікація у воєнних матеріалах часто зводиться до before/after і червоного кружечка. ASP дає серйозніший шар: рельєф, orthorectification, геометрія камери й uncertainty щодо пошкоджень або змін ландшафту.

Як це застосувати

Практичні поля: що робить — stereo photogrammetry, terrain generation, orthoimage production і camera adjustment; кейс — terrain-aware satellite analysis для перевірки руйнувань, воронок, укріплень або змін інфраструктури; input — придатні stereo/satellite/aerial imagery і camera metadata; output — DTM/DEM, point cloud, mesh, orthoimage; ліцензія/ціна — open-source software, imagery може бути платним або ліцензійним; першоджерело — NASA ASP docs і GitHub. Поєднуйте результати з QGIS, metadata знімків і окремою note про uncertainty.

Обмеження

Це не one-click OSINT toy. Потрібні придатні знімки, технічний setup і фахове тлумачення. Погані DEM із самовпевненими підписами — це картографічне чаклунство; на жаль, навіть не смішне.

ames-stereo-pipeline · satellite-imagery · geoint · terrain-analysis · damage-verification

🤖 AI Verification / ШІ та верифікація

#06

c2patool checks provenance metadata, but absence of C2PA proves almost nothing

Source: Content Authenticity Initiative / c2pa-rs · c2patool-v0.26.59, 12 May 2026

What happened

The CAI documentation describes c2patool as a command-line tool for working with C2PA manifests in audio, image and video files: it can read summary JSON, read low-level manifest data and add C2PA manifests. The active c2pa-rs repository lists c2patool-v0.26.59 as the latest release, dated 12 May 2026.

Why it matters

C2PA is becoming part of media-authenticity workflows. For OSINT, it is useful as a provenance check: who or what signed the file, whether the manifest validates, and what edit or creation history is present. It is not a magic fake detector.

How to use it

Practical fields: does — reads and validates C2PA manifests and exports structured reports; use case — provenance triage for media received from newsrooms, official sources, cameras, platforms or AI tools; input — supported image, video or audio file; output — summary JSON or low-level manifest report; license/pricing — c2pa-rs is MIT/Apache-2.0 and open source; original source — CAI docs and c2pa-rs GitHub. Run it before publication when a file claims provenance or AI-origin labeling, and archive both the original file and the JSON report.
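Archived JSON reports are easier to compare across a case if each one is reduced to a short triage summary. The sketch below assumes a report shaped like a manifest store with `active_manifest`, `manifests` and an optional `validation_status` list of reported problems; those field names and their meaning are assumptions that should be checked against the output of the c2patool version actually in use, and the verdict labels are invented for the example.

```python
import json

NO_MANIFEST_NOTE = "No C2PA manifest read; this is not evidence of fakery."
MANIFEST_NOTE = "A manifest is a signed provenance statement, not proof the depicted event is true."


def triage_c2pa_report(report_text):
    """Reduce a JSON report (or an empty one) to a verdict, issue codes and signer hint."""
    if not report_text or not report_text.strip():
        return {"verdict": "no_manifest_read", "issues": [], "note": NO_MANIFEST_NOTE}
    report = json.loads(report_text)
    active = report.get("active_manifest")  # assumed field name
    if not active:
        return {"verdict": "no_manifest_read", "issues": [], "note": NO_MANIFEST_NOTE}
    # Assumed to hold reported validation problems, each with a "code".
    issues = [item.get("code", "unknown") for item in report.get("validation_status", [])]
    manifest = report.get("manifests", {}).get(active, {})
    return {
        "verdict": "validation_issues" if issues else "manifest_without_reported_issues",
        "issues": issues,
        "claim_generator": manifest.get("claim_generator"),
        "note": MANIFEST_NOTE,
    }
```

Neither verdict settles authenticity: the summary only tells a reviewer which files carry a provenance statement worth reading in full.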

Limits

C2PA can be stripped by platforms, screenshots and recompression. A valid credential proves a signed provenance statement, not that the depicted event is true. An absent credential is not evidence of fakery. Do not outsource editorial judgement to a metadata badge wearing a tiny official hat.

c2patool перевіряє provenance metadata, але відсутність C2PA майже нічого не доводить

Джерело: Content Authenticity Initiative / c2pa-rs · c2patool-v0.26.59, 12 травня 2026

Що сталося

Документація CAI описує c2patool як command-line tool для роботи з C2PA manifests в audio, image та video files: він читає summary JSON, low-level manifest data і може додавати C2PA manifests. Активний репозиторій c2pa-rs позначає c2patool-v0.26.59 як latest release від 12 травня 2026 року.

Чому це важливо

C2PA дедалі частіше стає частиною media-authenticity workflows. Для OSINT це корисно як provenance check: хто або що підписало файл, чи проходить manifest validation і яка історія створення чи редагування є всередині. Це не магічний детектор фейків.

Як це застосувати

Практичні поля: що робить — читає й validates C2PA manifests, експортує structured reports; кейс — provenance triage для медіа з newsroom, official sources, cameras, platforms або AI tools; input — підтримуваний image, video або audio file; output — summary JSON або low-level manifest report; ліцензія/ціна — c2pa-rs має MIT/Apache-2.0 і є open source; першоджерело — CAI docs і c2pa-rs GitHub. Запускайте перед публікацією, якщо файл заявляє provenance або AI-origin labeling, і архівуйте original file разом із JSON report.

Обмеження

C2PA може зникати після platform upload, screenshots і recompression. Валідний credential доводить підписану provenance statement, а не те, що подія на зображенні правдива. Відсутність credential не доводить фейк. Не віддавайте редакційне судження metadata badge у маленькому офіційному капелюсі.

c2pa · content-credentials · media-provenance · ai-verification · metadata