Innumbra — Dark Web Meta-Search Engine for OSINT Research

5,583 Sites · 1,174 Online · 5,250 Entities · 9,448 Connections
📊 AI WEEKLY DIGEST
The dark web monitoring index has indexed 1,768 total sites, with 312 currently online and 1,456 currently offline. It recorded 1,515 new sites this week, 1,000 of them offline. The top categories for indexed sites are marketplace (31), maas (29), blog (26), email (20), and ransomware-panel (16). The top CTI types are Ransomware (480), Market (70), Forum (34), and MaaS (4). The most active regions are Europe (14), Germany (6), U
Generated: 2026-02-27 01:41

What Is Innumbra?

Innumbra is a free, metadata-only dark web meta-search engine that indexes .onion hidden services on the Tor network. It is designed for security researchers, journalists, and threat intelligence analysts who need to discover, monitor, and analyze Tor hidden services without accessing their content directly.

Innumbra extracts and indexes: site titles, HTTP server headers, technology stacks (CMS, frameworks, databases), page language, uptime history, content hashes for mirror detection, OSINT entities (cryptocurrency wallets, PGP keys, email addresses, messaging handles), link relationships between sites, warrant canaries, law enforcement seizure banners, and threat intelligence classifications. No actual page content is stored, cached, or proxied.

Last updated: February 27, 2026 · 5,583 services indexed · 1,174 currently online

How to Search the Dark Web With Innumbra

Innumbra supports keyword search, advanced filter operators, and direct entity lookups. You can combine any of these in a single query to narrow results precisely.

Keywords: marketplace bitcoin
Entity search: bc1qxy2k...xyz or user@mail.com
Status filter: status:online status:offline
CTI type: cti:ransomware cti:marketplace
Technology: tech:nginx tech:wordpress
Language: lang:ru lang:de lang:en
Date range: after:2025-01 before:2026-01
Combine freely: forum status:online lang:en

Entity pivot search: Paste any entity value directly into the search bar — a Bitcoin address, email, PGP fingerprint, Jabber ID, or clearnet domain. Innumbra will show every indexed .onion site that references that identifier, enabling cross-site pivot analysis for OSINT investigations.
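
To make the query syntax above concrete, here is a minimal sketch of how such a query could be split into keywords and filters. The filter prefixes come from the operator list; the parsing rules, the `ParsedQuery` type, and the `parse_query` function are illustrative assumptions, not Innumbra's actual parser.

```python
from dataclasses import dataclass, field

# Filter prefixes taken from the operator list; the parsing rules are assumed.
KNOWN_FILTERS = {"status", "cti", "tech", "lang", "after", "before"}


@dataclass
class ParsedQuery:
    keywords: list[str] = field(default_factory=list)
    filters: dict[str, list[str]] = field(default_factory=dict)


def parse_query(query: str) -> ParsedQuery:
    """Split a query into free-text keywords and key:value filters."""
    parsed = ParsedQuery()
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and value and key.lower() in KNOWN_FILTERS:
            parsed.filters.setdefault(key.lower(), []).append(value)
        else:
            # Plain words, wallet addresses, and emails all fall through here.
            parsed.keywords.append(token)
    return parsed


result = parse_query("forum status:online lang:en")
print(result.keywords)  # ['forum']
print(result.filters)   # {'status': ['online'], 'lang': ['en']}
```

In this sketch an entity value such as an email address is treated as an ordinary keyword, since it contains no recognized filter prefix.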

What OSINT Entities Does Innumbra Extract?

The automated crawler extracts 17+ categories of identifiers from every indexed page. These entities enable cross-referencing between sites — for example, searching a single Bitcoin wallet address reveals every .onion service that shares it, a technique used in threat intelligence and law enforcement investigations.

💰 bitcoin — BTC wallet addresses
💎 ethereum — ETH wallet addresses
🔑 monero — XMR wallet addresses
🐕 dogecoin — DOGE addresses
🛡 zcash — ZEC transparent + shielded
📧 email — Email addresses
🔐 pgp_fingerprint — PGP key fingerprints
💬 jabber — Jabber/XMPP IDs
telegram — Telegram handles
📞 tox_id — Tox messenger IDs
🌐 clearnet_ref — Clearnet dependencies
🧅 onion_link — Linked .onion services
🔗 i2p_address — I2P eepsite addresses
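
Cross-referencing by entity is essentially an inverted index lookup. The sketch below shows the idea under simple assumptions: the `EntityIndex` class, its method names, and the sample site names and identifiers are hypothetical placeholders, not Innumbra's data model.

```python
from collections import defaultdict


class EntityIndex:
    """Maps each extracted entity value to the set of sites that reference it."""

    def __init__(self) -> None:
        self._sites_by_entity: dict[str, set[str]] = defaultdict(set)

    def add(self, site: str, entity_value: str) -> None:
        self._sites_by_entity[entity_value].add(site)

    def pivot(self, entity_value: str) -> list[str]:
        """Return every site that references the given entity, sorted."""
        return sorted(self._sites_by_entity.get(entity_value, set()))


index = EntityIndex()
# Placeholder site names and identifiers, not real data.
index.add("site-a.onion", "bc1q-example-wallet")
index.add("site-b.onion", "bc1q-example-wallet")
index.add("site-b.onion", "admin@example.org")
print(index.pivot("bc1q-example-wallet"))  # ['site-a.onion', 'site-b.onion']
```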

How Does Innumbra Discover Dark Web Sites?

Innumbra uses a four-phase discovery pipeline that combines public intelligence feeds, automated crawling, and community submissions:

  1. Public directory imports — Bulk ingestion from Ahmia's hidden service index (18,000+ addresses), with CSAM blocklist filtering applied before any address enters the database.
  2. Threat intelligence feeds — Automated imports from deepdarkCTI, ransomwatch, RansomLook, and curated CISA advisory sources covering ransomware data leak sites, dark web marketplaces, and forums.
  3. Automated link crawling — The crawler follows outbound links from indexed sites to discover new .onion addresses, building a link graph that maps relationships between hidden services.
  4. User submissions — Community-submitted .onion addresses are queued for verification and indexing after blocklist screening.

Every indexed site undergoes periodic status checks with retry logic (3 attempts per check, fresh Tor circuit per attempt). Sites confirmed offline across 5 consecutive check cycles are pruned from the active index. Technology fingerprinting, entity extraction, content hashing, and operator correlation run as scheduled background tasks.
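
The retry and pruning rules described above can be sketched as follows. The two constants come from the text; the `probe` and `new_circuit` callables are hypothetical stand-ins for a real Tor request and circuit rotation, which are deliberately left out of this sketch.

```python
from typing import Callable

ATTEMPTS_PER_CHECK = 3       # from the text: 3 attempts per check
OFFLINE_CYCLES_TO_PRUNE = 5  # from the text: 5 consecutive offline cycles


def check_site(probe: Callable[[], bool], new_circuit: Callable[[], None]) -> bool:
    """Return True if any attempt succeeds, requesting a fresh circuit before each."""
    for _ in range(ATTEMPTS_PER_CHECK):
        new_circuit()
        if probe():
            return True
    return False


def update_offline_streak(streak: int, online: bool) -> tuple[int, bool]:
    """Return (new_streak, should_prune) after one check cycle."""
    new_streak = 0 if online else streak + 1
    return new_streak, new_streak >= OFFLINE_CYCLES_TO_PRUNE


# Simulated probe that fails twice and then succeeds.
responses = iter([False, False, True])
circuits_used = []
online = check_site(lambda: next(responses), lambda: circuits_used.append(1))
print(online, len(circuits_used))  # True 3
```

A single successful check resets the streak to zero, so a site is only pruned after five failed cycles in a row.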

What Analysis Features Does Innumbra Provide?

Link graph visualization: Interactive force-directed graph showing how .onion sites link to each other, revealing clusters and hub sites.
Mirror detection: Content hashing identifies sites serving identical content across different .onion addresses.
Operator correlation: Identifies sites likely run by the same entity using shared crypto wallets, PGP keys, favicon hashes, and link topology.
Warrant canary tracking: Detects and monitors PGP-signed warrant canary statements across indexed sites.
Seizure detection: Identifies law enforcement seizure banners in 12 languages from 13+ agencies (FBI, Europol, BKA, etc.).
Technology fingerprinting: Identifies web servers, CMS platforms, frameworks, programming languages, and database technologies from HTTP headers and HTML signatures.
Change detection: Tracks title changes, status transitions (online↔offline), and content modifications with timestamped history.

Technical Deep Dive

How Does Operator Correlation Assign Confidence Scores?

Innumbra's operator correlator uses a proprietary probabilistic model to combine independent signals into a single confidence score. Each signal type has a base weight reflecting its discriminative power:

PGP key fingerprints: 0.98 (nearly unique per operator)
Email addresses: 0.95 (strong identity signal)
Bitcoin addresses: 0.90 (payment infrastructure)
Content hash: 0.85 (identical page content)
Server fingerprint combo: 0.75 (same server + framework + language)
Favicon hash: 0.70 (same visual identity)
Fuzzy content hash: 0.65 (near-duplicate content)

Default/common values (e.g., "admin@localhost", standard Apache error pages) are blocklisted to prevent false positives. Independent signal probabilities are combined into a single composite confidence score.
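
The model itself is described as proprietary, so the exact combination rule is not published. One common way to combine independent probabilities is a noisy-OR, shown here purely as an assumed illustration: the base weights are the ones listed above, while `combined_confidence` and the signal keys are hypothetical names.

```python
from math import prod

# Base weights copied from the list above.
BASE_WEIGHTS = {
    "pgp_fingerprint": 0.98,
    "email": 0.95,
    "bitcoin": 0.90,
    "content_hash": 0.85,
    "server_fingerprint": 0.75,
    "favicon_hash": 0.70,
    "fuzzy_content_hash": 0.65,
}


def combined_confidence(shared_signals: list[str]) -> float:
    """Noisy-OR: probability that at least one independent signal is a true link."""
    return 1.0 - prod(1.0 - BASE_WEIGHTS[name] for name in shared_signals)


score = combined_confidence(["favicon_hash", "server_fingerprint"])
print(round(score, 3))  # 0.925
```

Under this rule, two moderate signals (0.70 and 0.75) reinforce each other: 1 - (0.30 × 0.25) = 0.925, which is higher than either signal alone.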

What Entity Types Does the Crawler Extract?

The entity extraction pipeline uses pattern matching optimized for each identifier format. Current coverage includes 17+ entity types across 5 categories:

Cryptocurrency
Bitcoin (1/3/bc1 prefixes), Ethereum (0x, 40 hex), Monero (4/8 prefix, 95 chars), Dogecoin (D prefix), Zcash (t1/zs transparent + shielded)
Communication
Email addresses, Jabber/XMPP JIDs, Telegram handles (@username), Tox IDs (64 hex chars)
Cryptographic
PGP key fingerprints (40 hex chars, space-separated groups)
Network
.onion links (v2 16-char and v3 56-char), I2P eepsite addresses (.i2p), clearnet domain references
Image-derived (OCR)
All of the above extracted from images via automated OCR, catching entities hidden in screenshots and image-based text
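
As a rough illustration of format-specific pattern matching, the sketch below covers four of the identifier formats described above. The regular expressions are simplified assumptions written for this example, not Innumbra's production patterns, and they do not validate checksums.

```python
import re

# Simplified, assumed patterns based on the formats listed above.
PATTERNS = {
    "ethereum": re.compile(r"\b0x[0-9a-fA-F]{40}\b"),
    "onion_link": re.compile(r"\b(?:[a-z2-7]{56}|[a-z2-7]{16})\.onion\b"),
    "pgp_fingerprint": re.compile(r"\b(?:[0-9A-Fa-f]{4} ?){9}[0-9A-Fa-f]{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def extract_entities(text: str) -> dict[str, set[str]]:
    """Return the distinct matches for each entity type found in the text."""
    found: dict[str, set[str]] = {}
    for entity_type, pattern in PATTERNS.items():
        matches = {m.group(0) for m in pattern.finditer(text)}
        if matches:
            found[entity_type] = matches
    return found


# Placeholder values that only have the right shape; none are real identifiers.
eth = "0x" + "ab" * 20
onion = "q" * 56 + ".onion"
pgp = "0123 4567 89AB CDEF 0123 4567 89AB CDEF 0123 4567"
text = f"Pay to {eth}, mirror at http://{onion}/ and key {pgp}. Contact user@mail.com, thanks"

for entity_type, values in extract_entities(text).items():
    print(entity_type, sorted(values))
```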

How Is the OPSEC Score Calculated?

Each site starts at a base score of 50/100. The scoring engine applies penalties and rewards based on observable indicators:

Penalties
IP address in headers: -25
Clearnet domain references: -20
Server version exposed: -15
Framework/runtime exposed: -10
Insecure cookies (no Secure/HttpOnly): -10
Each missing security header: -5
Rewards
Content-Security-Policy present: +8
X-Content-Type-Options present: +5
Secure + HttpOnly cookies: +5

Score ranges: 0–29 (poor), 30–69 (moderate), 70–100 (strong). The OPSEC badge appears on every site dossier page.
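
The scoring rules above translate directly into arithmetic. The sketch below models only the listed indicators; the `Observations` fields, the function names, and the clamping to the 0–100 range are assumptions made for illustration.

```python
from dataclasses import dataclass

BASE_SCORE = 50


@dataclass
class Observations:
    ip_in_headers: bool = False
    clearnet_refs: bool = False
    server_version_exposed: bool = False
    framework_exposed: bool = False
    insecure_cookies: bool = False
    missing_security_headers: int = 0
    has_csp: bool = False
    has_x_content_type_options: bool = False
    secure_httponly_cookies: bool = False


def opsec_score(obs: Observations) -> int:
    """Apply the listed penalties and rewards to the base score."""
    penalties = [
        (obs.ip_in_headers, 25),
        (obs.clearnet_refs, 20),
        (obs.server_version_exposed, 15),
        (obs.framework_exposed, 10),
        (obs.insecure_cookies, 10),
    ]
    rewards = [
        (obs.has_csp, 8),
        (obs.has_x_content_type_options, 5),
        (obs.secure_httponly_cookies, 5),
    ]
    score = BASE_SCORE
    score -= sum(points for hit, points in penalties if hit)
    score -= 5 * obs.missing_security_headers
    score += sum(points for hit, points in rewards if hit)
    return max(0, min(100, score))  # clamping to 0-100 is an assumption


def band(score: int) -> str:
    """Map a score to the ranges given above."""
    if score < 30:
        return "poor"
    return "moderate" if score < 70 else "strong"


example = Observations(server_version_exposed=True, missing_security_headers=2, has_csp=True)
print(opsec_score(example), band(opsec_score(example)))  # 33 moderate
```

Here the example works out as 50 - 15 - 10 + 8 = 33. With only the three rewards listed, the highest reachable score is 50 + 8 + 5 + 5 = 68, so the published list of rewards is presumably not exhaustive.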

Frequently Asked Questions

What is Innumbra and who is it for?

Innumbra is a free dark web meta-search engine that indexes .onion hidden services on the Tor network. It is built for OSINT (Open Source Intelligence) professionals, cybersecurity researchers, journalists investigating dark web activity, and threat intelligence analysts. Unlike content search engines, Innumbra indexes only metadata — titles, technology stacks, uptime status, extracted entities, and structural relationships — without storing any page content.

How is Innumbra different from other dark web search engines?

Most dark web search engines index page content for full-text search. Innumbra takes a different approach: it indexes metadata only, focusing on structural intelligence. This includes technology fingerprinting (identifying web servers, CMS platforms, and frameworks), entity extraction (cryptocurrency wallets, PGP keys, email addresses), operator correlation (linking sites to the same operator via shared identifiers), mirror detection (identifying duplicate sites via content hashing), and uptime monitoring with historical tracking.

What data sources does Innumbra use?

Innumbra aggregates from multiple intelligence sources: Ahmia's public hidden service index, deepdarkCTI threat feeds (markets, forums, ransomware groups), ransomwatch and RansomLook ransomware tracking APIs, curated onion directories, curated CTI repositories, clearnet aggregator sites, and automated crawl discovery. All imported addresses are screened against the Ahmia CSAM blocklist before indexing.

Does Innumbra store or cache dark web page content?

No. Innumbra is strictly a metadata-only index. It stores site titles, HTTP response headers, technology fingerprints, uptime history, content hashes (for mirror detection), and extracted entity identifiers. It does not store, cache, proxy, or reproduce any actual page content from .onion hidden services.

How does operator correlation work?

Operator correlation identifies .onion sites likely operated by the same person or group. Innumbra analyzes multiple signals with weighted confidence scores: shared cryptocurrency wallet addresses (0.90 confidence), shared PGP key fingerprints (0.98), shared email addresses (0.95), matching favicon image hashes (0.70), identical content hashes (0.85), matching server fingerprints (0.75), and link topology patterns. Sites exceeding the confidence threshold are grouped into operator clusters.
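
Grouping sites into clusters once pairwise scores exist can be done with a union-find structure. This is a sketch under stated assumptions: the threshold value is not published, so `threshold` is a caller-supplied parameter, and `cluster_sites` and the sample scores are hypothetical.

```python
def cluster_sites(
    pair_scores: dict[tuple[str, str], float], threshold: float
) -> list[list[str]]:
    """Group sites connected by any pair whose score meets the threshold."""
    parent: dict[str, str] = {}

    def find(site: str) -> str:
        parent.setdefault(site, site)
        while parent[site] != site:
            parent[site] = parent[parent[site]]  # path halving
            site = parent[site]
        return site

    for (a, b), score in pair_scores.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_b] = root_a

    clusters: dict[str, list[str]] = {}
    for site in parent:
        clusters.setdefault(find(site), []).append(site)
    # Only groups with at least two sites count as operator clusters.
    return sorted(sorted(members) for members in clusters.values() if len(members) > 1)


# Placeholder site names and scores.
scores = {("a.onion", "b.onion"): 0.98, ("b.onion", "c.onion"): 0.90, ("c.onion", "d.onion"): 0.40}
print(cluster_sites(scores, threshold=0.8))  # [['a.onion', 'b.onion', 'c.onion']]
```

Note that clustering is transitive in this sketch: a.onion and c.onion end up together through b.onion even though they share no direct signal.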

Is Innumbra free to use?

Yes. Innumbra is completely free to use with no accounts, rate limits, or paywalls. The search engine, entity explorer, link graph visualization, site comparison tool, and all analysis features are openly accessible. The platform also provides an API, RSS feeds, an llms.txt file for AI systems, and a sitemap for search engine indexing.

How does Innumbra detect mirror sites and clones?

Innumbra uses two complementary methods for mirror detection. Exact content hashing produces a fingerprint for each page, and two sites with identical hashes are confirmed mirrors. Fuzzy hashing detects near-duplicates even when sites differ slightly (e.g., different headers but same body content). Sites sharing the same content hash are automatically grouped on the mirror detection page, and near-duplicate detection runs across the entire index.
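
The two methods can be illustrated with a small sketch. The hashing algorithms Innumbra uses are not named, so SHA-256 for exact matching and Jaccard similarity over word shingles for near-duplicates are stand-ins chosen for this example. A metadata-only index would compare stored digests computed at crawl time rather than page bodies as done here.

```python
import hashlib
from collections import defaultdict


def content_hash(body: str) -> str:
    """Exact fingerprint: identical bodies give identical digests."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def group_exact_mirrors(pages: dict[str, str]) -> list[list[str]]:
    """Group sites whose bodies hash to the same value."""
    groups: dict[str, list[str]] = defaultdict(list)
    for site, body in pages.items():
        groups[content_hash(body)].append(site)
    return [sorted(sites) for sites in groups.values() if len(sites) > 1]


def shingles(body: str, size: int = 3) -> set[tuple[str, ...]]:
    """Overlapping word n-grams used as the unit of comparison."""
    words = body.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(1, len(words) - size + 1))}


def jaccard(a: str, b: str) -> float:
    """Share of shingles the two bodies have in common."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Placeholder pages, not real data.
pages = {
    "a.onion": "welcome to the example market",
    "b.onion": "welcome to the example market",
    "c.onion": "welcome to the example market today",
}
print(group_exact_mirrors(pages))                  # [['a.onion', 'b.onion']]
print(jaccard(pages["a.onion"], pages["c.onion"]))  # 0.75
```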

What is an OPSEC score and how is it calculated?

Innumbra rates each site's operational security on a 0–100 scale. The score starts at 50 and applies penalties for security failures: exposing server software versions (-15), clearnet domain references (-20), leaking IP addresses in headers (-25), missing security headers (-5 each), insecure cookies (-10), and exposing framework details (-10). Rewards are given for good practices: security headers like Content-Security-Policy (+8), X-Content-Type-Options (+5), and proper cookie flags (+5). Sites with scores below 30 are considered high-risk; a score of 70 or above indicates strong operational security.

Can I use Innumbra to trace a Bitcoin address across multiple dark web sites?

Yes. Innumbra extracts cryptocurrency wallet addresses (Bitcoin, Ethereum, Monero, Dogecoin, Zcash) from every indexed page. Paste any wallet address directly into the search bar to find every .onion site that references it. This entity pivot search is one of the most powerful features for threat intelligence and financial investigations — if the same Bitcoin address appears on three different dark web markets, those services may share an operator or payment processor. The entity explorer shows all extracted entities with cross-site relationships.

How does Innumbra handle illegal content and CSAM?

Innumbra applies the Ahmia CSAM blocklist to every address before it enters the index. Known child abuse sites are permanently excluded at the import stage. Innumbra does not store, cache, or proxy any page content — it indexes only metadata (titles, headers, technology fingerprints, entity identifiers). Users can report sites for removal via the content removal page. The platform is designed exclusively for legitimate OSINT research, journalism, and threat intelligence work.