Sources & Licensing
Last updated: 21 May 2026
databook.dataint.net (the "Site") is built on data published by third parties. This page explains where the data comes from, under what licenses we use it, and how we attribute it. It is both a transparency statement and our record of license compliance.
1. Our principle
We use only data we are licensed to use. Every dataset behind the Site carries a machine-readable license, and our publishing pipeline applies a deny-by-default gate: a dataset is published in full only if its license is on an explicit allowlist; otherwise we use only individual facts from it, with attribution (see §3). A dataset whose license is unknown or restricted is not published.
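The gate described above can be sketched roughly as follows. This is a minimal illustration, not the Site's actual pipeline: the license identifiers, the two sets, and the names `Decision` and `gate` are all assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    PUBLISH_FULL = "publish_full"  # full content, with attribution (see section 2)
    FACTS_ONLY = "facts_only"      # individual facts with attribution (see section 3)
    DENY = "deny"                  # not published at all


# Hypothetical identifiers, chosen for illustration only.
FULL_CONTENT_ALLOWLIST = frozenset(
    {"CC0-1.0", "CC-BY-4.0", "CC-BY-3.0-IGO", "OGL-UK-3.0", "etalab-2.0"}
)
FACTS_ONLY_LICENSES = frozenset(
    {"CC-BY-SA-4.0", "CC-BY-NC-SA-4.0", "ODbL-1.0"}
)


def gate(license_id: Optional[str]) -> Decision:
    """Deny by default: only explicitly listed licenses get through."""
    if license_id is None:
        return Decision.DENY
    key = license_id.strip()
    if key in FULL_CONTENT_ALLOWLIST:
        return Decision.PUBLISH_FULL
    if key in FACTS_ONLY_LICENSES:
        return Decision.FACTS_ONLY
    # Missing, empty, misspelled, or unrecognized licenses all land here.
    return Decision.DENY
```

The point of the design is that the final branch is the default: a new or mistyped license value never reaches publication unless someone adds it to a list on purpose.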
2. License families we publish (full content)
Data in these families is presented in full, with attribution:
- Public domain / CC0 — e.g. many government and scientific datasets, and Natural Earth (cartographic geometry).
- Creative Commons Attribution (CC BY 4.0 / CC BY 3.0 IGO) — used with attribution, as the license requires.
- Government open-data licenses — e.g. the UK Open Government Licence v3, France Etalab 2.0, and equivalent national open-data terms.
For CC BY and government open licenses, attribution is a license condition, not a courtesy — we provide it as described in §4.
3. "Facts only" sources (attribution + restraint)
Some sources are under licenses we do not treat as freely republishable for an ad-supported site. These fall in three distinct categories:
- (a) Share-alike licenses (e.g. CC BY-SA) whose copyleft requirement would propagate to derivative works of the dataset.
- (b) Non-commercial licenses (e.g. CC BY-NC, CC BY-NC-SA) that restrict commercial reuse — and an ad-supported site is treated as commercial.
- (c) Database licenses (e.g. the Open Database License / ODbL) that, under the EU sui generis database right, control the extraction or re-utilization of substantial portions of the database.
From any source in these categories we use individual facts with attribution — for example, "according to the source, the value is X" — and we do not reproduce or redistribute the source database, dataset, or substantial extracts of it. This reflects the principle that facts themselves are not owned, while the database/compilation may be.
4. How we attribute
Attribution is delivered in three layers, so each fact is traceable:
- On every value — a provenance stamp showing the source name and the edition/reference year.
- On every page — a "Sources" list of all sources cited on that page (generated from page metadata).
- On this page — a consolidated list of all sources and their licenses (see §5).
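As a rough sketch of the first two layers, assuming each displayed value carries its source name and reference year as metadata (the `Value` type and both function names are hypothetical, not the Site's actual code):

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Value:
    value: float
    source: str          # source name shown in the stamp
    reference_year: int  # edition / reference year shown in the stamp


def provenance_stamp(v: Value) -> str:
    """Layer 1: the stamp attached to a single value."""
    return f"{v.source}, {v.reference_year}"


def page_sources(values: Iterable[Value]) -> List[str]:
    """Layer 2: the de-duplicated, sorted 'Sources' list for one page."""
    return sorted({v.source for v in values})
```

Because the page-level list is derived from the same metadata as the per-value stamps, a source cannot appear on a value without also appearing in the page's "Sources" list.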
5. Source register
Generated from each dataset's license metadata field across 2,046 populated datasets in the Warehouse, grouped by license family and contributing publisher. Counts are dataset counts, not record counts. "Facts only" tagging indicates sources under share-alike, non-commercial, or database licenses (§3) from which we use individual facts with attribution, never re-publishing the underlying dataset.
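The grouping described above could be produced along these lines. This is a sketch under assumptions: the record shape (a dict with `license` and `publisher` keys), the "Unknown" fallback, and the function names are invented for illustration and do not describe the Warehouse's real schema.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def build_register(datasets: Iterable[Mapping[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count datasets (not records) per license family and publisher."""
    register = defaultdict(Counter)
    for d in datasets:
        family = d.get("license") or "Unknown"      # hypothetical fallback label
        publisher = d.get("publisher") or "Unknown"
        register[family][publisher] += 1
    return {family: dict(counts) for family, counts in register.items()}


def family_totals(register: Mapping[str, Mapping[str, int]]) -> Dict[str, int]:
    """Per-family dataset totals, as shown in each family heading."""
    return {family: sum(counts.values()) for family, counts in register.items()}
```

Each dataset contributes exactly one count, so the per-family totals always sum to the number of datasets passed in.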
CC BY 4.0 — 904 datasets
- The World Bank Group (477) — WDI, Climate Change Knowledge Portal, EdData and related programmes
- International Labour Organization (ILO) (232) — ILOSTAT
- V-Dem Institute — University of Gothenburg (177)
- Other contributors (18): GeoNames, European Commission, Flanders Marine Institute (VLIZ), World Inequality Lab, Global Carbon Project, OECD DCD, Ember, Max Planck Institute for Evolutionary Anthropology, Global Energy Monitor, IDB Research Department, WRI / University of Maryland (GLAD), Groningen Growth & Development Centre, EC JRC EDGAR team, geoBoundaries (William & Mary geoLab), FAO Rome, FAO Land & Water Division, IDMC Geneva.
CC BY 3.0 IGO — 239 datasets
- UN DESA — Population Division (238) — World Population Prospects
- UNDP — Human Development Report Office (1)
CC BY-NC-SA 4.0 — 194 datasets — facts only (§3)
- Inter-Parliamentary Union (IPU) — Parline (194)
Government open licenses — 70 datasets
- T.C. Ticaret Bakanlığı (Turkish Ministry of Trade) (28)
- Bundesanstalt für Materialforschung und -prüfung (BAM) (6)
- U.S. federal agencies (16) — Census Bureau, BTS, FAA, FHWA, FRA, USITC, ITA, FFIEC/Federal Reserve, FDIC, OFAC, NGA, NOAA/EPA, ODNI, US DoT/PHMSA, US DoT/BTS
- Other Turkish ministries / agencies (7) — PTT, İçişleri Bakanlığı, Ulaştırma ve Altyapı Bakanlığı, Gümrükler Genel Müdürlüğü, DHMİ, KGM
- EU institutions (4) — DG AGRI, DG MOVE, ECB, DG TAXUD
- China — GACC, China Railway (12306) (2)
- Other (7) — EMSA/Equasis, CIA, BAM, …
Public domain — 7 datasets
- NASA Goddard Space Flight Center — Earth Observatory
- U.S. Geological Survey — Earthquake Hazards Program
- CIA / U.S. Central Intelligence Agency
- OurAirports Community (David Megginson)
- USGS National Minerals Information Center (Reston VA)
- U.S. Library of Congress, Federal Research Division
- IANA — Internet Assigned Numbers Authority (ICANN)
CC0 — 1 dataset
- CIA / U.S. Central Intelligence Agency — Factbook-derived data (facts drawn from a public-domain U.S. government work)
UK Open Government Licence v3 — 1 dataset
- UK Foreign, Commonwealth & Development Office (FCDO)
Etalab 2.0 (FR) — 1 dataset
- CEPII (Centre d'Études Prospectives et d'Informations Internationales); reconciled from UN Comtrade
CC BY-SA 4.0 — 3 datasets — facts only (§3)
- Global Fishing Watch (2)
- UNESCO Institute for Statistics (1)
ODbL — 2 datasets — facts only (§3)
- OpenStreetMap Contributors
- Trainline EU
CC BY-NC-SA 3.0 IGO — 1 dataset — facts only (§3)
- UNESCO World Heritage Centre (data.unesco.org)
Non-commercial — 1 dataset — facts only (§3)
- United Nations Educational, Scientific and Cultural Organization (UNESCO)
Proprietary — 4 datasets — facts only (§3); used under research/academic terms with attribution
- Stockholm International Peace Research Institute (SIPRI) (2)
- BIC / SMDG (1)
- International Chamber of Commerce (ICC) (1)
Mixed — 2 datasets
- GADM (University of California, Davis)
- OECD International Transport Forum (ITF)
Open (generic — UN/UNECE/IGO sectoral terms) — 616 datasets
Datasets under generic "open" terms set by UN family bodies and equivalent IGOs. We treat these as requiring attribution and as freely republishable where the publisher's terms allow.
- UNECE (241) — UN/LOCODE and related trade-facilitation references
- UNECE / OurAirports Community (235)
- geoBoundaries / William & Mary geoLab (80)
- UN Statistics Division (UNSD) (12)
- Eurostat / European Commission (4)
- International IDEA (Stockholm) (4)
- SMDG (3)
- Long-tail (37) — ECB, Council of the EU, EBA, EPC, IMF, ILO, UNHCR, ITU, BGS/UKRI, UN OLA, EIA, UNODC, NACIS, CelesTrak/US Space Surveillance, WTO, ICAO, Unicode, ISO/SIL, SIX Group, GLEIF, IMO, IATA, OpenFlights, Debian iso-codes, OONI, NBS (China, third-party reconciled), and others.
6. Trademarks and third-party rights
Country names, flags, organization names, and standard codes referenced on the Site belong to their respective owners and are used for identification and reference only; their use does not imply affiliation or endorsement.
7. Corrections
If you believe a source is mis-attributed or a license is misapplied, please tell us via the Notice & Takedown / Correction process.