Methodology

Where every signal comes from.

GeoQ doesn't own a secret dataset. We derive, refresh and classify public and open-source inputs on a stated cadence, attach an evidence label to each one, and publish the exact risk weights below. This page is the whole method — sources, cadence, weights — so you can check our work, not take it on faith.

A signal is one input, not a verdict. Cadence is a refresh schedule, not a claim that data is current to the second.

Datasets, sources & refresh cadence

| Dataset | Fields set | Source | Cadence | Evidence | Notes |
| --- | --- | --- | --- | --- | --- |
| IP geolocation | geo.country, geo.region, geo.city, geo.latitude, geo.longitude, geo.timezone | DB-IP (CC BY 4.0) | Monthly (DB-IP free release) | inferred | Geolocation is an estimate, not a GPS fix. Accuracy is highest at country level and drops toward city level. |
| Datacenter / cloud ranges | connection_type === "datacenter", datacenter_provider | AWS, Google Cloud, Microsoft Azure published IP ranges | Daily | authoritative | Providers publish their own ranges. We classify the IP against them; we never guess residential or mobile. |
| Satellite ranges | connection_type === "satellite" | Operator-published / BGP-derived ASN ranges (e.g. Starlink) | Daily | authoritative | Satellite is a connection_type value, not a boolean. Satellite ASNs can carry mixed traffic; read the limit on the detect page. |
| Tor exit list | is_tor | Tor Project public exit list | Hourly | authoritative | The exit list is published by the Tor Project. Membership is a fact, not an inference. |
| VPN ranges | is_vpn | Self-maintained list of known commercial-VPN ranges | Daily | inferred | Commercial-VPN ranges shift. We treat this as an inference, not ground truth. |
| Proxy ranges (beta) | is_proxy | Open / anonymising-proxy lists; residential-proxy detection is beta | Daily | beta | Residential-proxy detection is beta, so weight it accordingly. Spur leads this category; we do not claim parity. |
| Relay ranges | is_relay, relay_provider: "icloud" | Apple's published iCloud Private Relay egress ranges | Daily | authoritative | Apple publishes these so operators can recognise relay traffic. A benign network kind: it caps the score (see weights). |
| Public-resolver ranges | is_public_resolver | Published resolver ranges (e.g. 8.8.8.8, 1.1.1.1) | Daily | authoritative | A benign network kind. Recognising it stops public DNS resolvers scoring as fraud. |
| Spamhaus DROP | is_drop_listed | The Spamhaus Project DROP lists | Daily (on each DROP update) | authoritative | We retain only the current published ranges and refresh on update; we do not redistribute the lists. |
| Routing health (BGP) | is_announced, is_bogon | RouteViews + RIPE NCC RIS public BGP data; bogon constants | Daily (BGP); monthly (bogon constants) | authoritative | Derived from public BGP tables. is_announced means a covering prefix is visible in the global routing table. |
| RPKI validation | rpki ("valid", "invalid" or "unknown") | RPKI repositories, validated with a self-run Routinator | Daily | authoritative | We run our own validator against the published trust anchors. Only the invalid state scores. |
| RIR allocation | allocation_date, allocation_age_days, registration_country | ARIN, RIPE NCC, APNIC, LACNIC, AFRINIC delegated statistics | Weekly | authoritative | Derived from the RIRs' published delegated-statistics files. |
| Recent abuse (beta) | recent_abuse | Emerging Threats open lists; CINS Army list | Daily | beta | Beta and demand-gated: it scores zero today. Surfaced as a signal you can read, not yet a contributor. |
| Verified crawlers | is_verified_bot, verified_bot_name | Operator-published crawler ranges (Googlebot, Bingbot) | Daily | authoritative | Identifies a good crawler you must not block. It carries zero risk weight; it is not bad-bot detection. |

What the evidence labels mean

| Label | Meaning |
| --- | --- |
| authoritative | Sourced from a list the network or registry publishes about itself (Apple relay ranges, RIR allocations, the Tor exit list). Membership is a fact. |
| inferred | Derived from lists that shift over time (commercial-VPN ranges, geolocation). Treat as an estimate, not ground truth. |
| beta | Surfaced so you can read it, but not yet trusted enough to score. Weight it yourself; we don't. |

Every response carries an evidence object with one of these labels per signal, so you can decide how much to trust each one. See the response schema, and the glossary for the terms behind each signal.
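As a rough sketch of how a client might act on those labels, the snippet below keeps only the signals that fired and whose evidence label it is willing to trust. The response shape (a `signals` map next to an `evidence` map keyed by the same names) and the `trusted_signals` helper are assumptions for illustration, not the documented schema.

```python
# Hypothetical response fragment: the real schema may differ.
response = {
    "signals": {"is_tor": True, "is_vpn": True, "is_proxy": True},
    "evidence": {
        "is_tor": "authoritative",
        "is_vpn": "inferred",
        "is_proxy": "beta",
    },
}


def trusted_signals(resp, accepted=("authoritative",)):
    """Return the names of fired signals whose evidence label is accepted."""
    return sorted(
        name
        for name, fired in resp["signals"].items()
        if fired and resp["evidence"].get(name) in accepted
    )


# Strict policy: act on authoritative signals only.
strict = trusted_signals(response)
# Looser policy: also accept inferred signals, but still ignore beta.
lenient = trusted_signals(response, accepted=("authoritative", "inferred"))
```

Here `strict` contains only `is_tor`, while `lenient` also includes `is_vpn`; the beta `is_proxy` signal is excluded under both policies.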

Published risk weights

The risk score is the sum of the weights of the signals that fired, capped at 100: min(100, Σ weights). There is no machine-learning black box. After the sum, a benign network kind (relay, satellite or public resolver) caps the score at 20. Full worked examples and the reproducible code are on the risk-score methodology page.

| Signal | Weight | Why |
| --- | --- | --- |
| is_tor | +45 | Tor exit node: strong anonymisation |
| is_proxy | +40 | Open/anonymising proxy (residential-proxy detection in beta) |
| is_drop_listed | +40 | IP is on the Spamhaus DROP list (do-not-route, known hostile) |
| connection_type === "datacenter" | +35 | Hosting / cloud range, not a residential ISP |
| is_bogon | +30 | Bogon: unallocated or reserved space that should never source traffic |
| is_vpn | +30 | Known commercial VPN range |
| rpki === "invalid" | +20 | Route origin fails RPKI validation (only "invalid" scores) |

Suppressor: if is_relay is true, connection_type === "satellite", or is_public_resolver is true, the score is capped at 20 and benign_network_kind is added to reasons[]. It's a cap, not a negative weight: a score already below 20 is left unchanged. recent_abuse is beta and weighted zero today; is_verified_bot identifies a good crawler and carries no weight.
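The formula is small enough to sketch in a few lines. The weights and the cap of 20 below come from the table and suppressor rule above; the function name, the input dictionary shape and the reason labels other than `benign_network_kind` are assumptions made for illustration, and the reference code on the risk-score methodology page may differ.

```python
# (reason label, weight, predicate) for each scoring signal in the table.
# Labels other than "benign_network_kind" are illustrative assumptions.
RULES = [
    ("is_tor", 45, lambda s: bool(s.get("is_tor"))),
    ("is_proxy", 40, lambda s: bool(s.get("is_proxy"))),
    ("is_drop_listed", 40, lambda s: bool(s.get("is_drop_listed"))),
    ("datacenter", 35, lambda s: s.get("connection_type") == "datacenter"),
    ("is_bogon", 30, lambda s: bool(s.get("is_bogon"))),
    ("is_vpn", 30, lambda s: bool(s.get("is_vpn"))),
    ("rpki_invalid", 20, lambda s: s.get("rpki") == "invalid"),
]
BENIGN_CAP = 20


def risk_score(signals):
    """Return (score, reasons) for a dict of signal values."""
    fired = [(label, weight) for label, weight, test in RULES if test(signals)]
    reasons = [label for label, _ in fired]
    # Sum the weights of the signals that fired, then cap at 100.
    score = min(100, sum(weight for _, weight in fired))
    benign = (
        bool(signals.get("is_relay"))
        or signals.get("connection_type") == "satellite"
        or bool(signals.get("is_public_resolver"))
    )
    if benign:
        # A cap, not a negative weight: scores below 20 are unchanged.
        score = min(score, BENIGN_CAP)
        reasons.append("benign_network_kind")
    return score, reasons


# Tor exit on a cloud range that is also a known VPN: 45 + 35 + 30 = 110.
hostile = risk_score({"is_tor": True, "connection_type": "datacenter", "is_vpn": True})
# iCloud Private Relay egress that also matches a VPN range: 30, then capped.
relay = risk_score({"is_relay": True, "is_vpn": True})
```

Under these assumptions `hostile` scores 100 (the sum of 110 is clamped) and `relay` scores 20 with `benign_network_kind` appended to its reasons.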

How we keep it honest

Refreshed on a cadence you can see

Each dataset above has a stated refresh schedule. We don't say "real-time" or "always up to date" — we tell you how often we pull, and you can plan around it.
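One way to plan around a stated cadence is to bound how long a cached lookup is reused. The sketch below is illustrative only: the dataset keys, the `cache_is_fresh` helper and the mapping of "monthly" to 31 days are assumptions, not part of the API.

```python
from datetime import datetime, timedelta, timezone

# Cadence names from the table, mapped to an interval.
CADENCE = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),  # assumption: upper bound for a month
}

# Hypothetical dataset keys; the cadences are the ones stated above.
DATASET_CADENCE = {
    "tor_exit_list": "hourly",
    "vpn_ranges": "daily",
    "rir_allocation": "weekly",
    "ip_geolocation": "monthly",
}


def cache_is_fresh(dataset, cached_at, now):
    """True while a cached answer is younger than one refresh interval."""
    return now - cached_at < CADENCE[DATASET_CADENCE[dataset]]


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
two_hours_ago = now - timedelta(hours=2)
tor_ok = cache_is_fresh("tor_exit_list", two_hours_ago, now)
vpn_ok = cache_is_fresh("vpn_ranges", two_hours_ago, now)
```

A two-hour-old answer is past the hourly Tor interval but still within the daily VPN interval, so a client following this policy would re-query only the former.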

Derived in the open

Every score ships with reasons[] and per-signal evidence. The inputs are public; the weights are published here; the formula is reproducible.

Built to reduce false positives

Relay, satellite and public-resolver IPs are recognised and capped at 20 — so Apple and Starlink users don't get scored like a hostile datacenter. See the false-positive guide.

Fails closed, never empty

Hand-curated lists carry a coverage canary and a staleness check. A degraded or empty dataset fails the build rather than silently shipping bad data.
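A fail-closed gate of this kind might look like the sketch below. It is an interpretation, not GeoQ's build code: it treats a "coverage canary" as a known address that the list must still cover, and the names `check_dataset` and `DatasetCheckError` are hypothetical. The addresses are documentation ranges.

```python
import ipaddress
from datetime import datetime, timedelta, timezone


class DatasetCheckError(Exception):
    """Raised when a dataset must not ship."""


def check_dataset(cidrs, canary_ips, fetched_at, max_age, now):
    """Raise DatasetCheckError if the list is empty, stale or missing a canary."""
    if not cidrs:
        raise DatasetCheckError("dataset is empty")
    if now - fetched_at > max_age:
        raise DatasetCheckError("dataset is stale")
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    for ip in canary_ips:
        address = ipaddress.ip_address(ip)
        # Each canary must fall inside at least one listed range.
        if not any(address in network for network in networks):
            raise DatasetCheckError(f"canary {ip} is not covered")


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
# Passes silently: non-empty, one hour old, canary covered.
check_dataset(
    ["192.0.2.0/24", "198.51.100.0/24"],
    ["192.0.2.10"],
    fetched_at=now - timedelta(hours=1),
    max_age=timedelta(days=1),
    now=now,
)
```

Raising an exception, rather than returning a flag, is what makes the check fail closed: a build step that calls it stops unless every condition holds.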

Start with the free tier. No card.

5,000 lookups a day, every signal, the same transparent risk score. Upgrade only when you outgrow it.