Data dictionary & business glossary
Programmatically extracted from PDTF Schemas v3 (the base transaction + 18 main overlays — extension overlays not yet included) and merged with the two OPDA Glossary spreadsheets. The first deliverable of the semantic-modelling workstream.
- 16 canonical schemas (5 redundant ones excluded — see below)
- 8,458 property-path entries across canonical schemas
- 1,557 unique leaf property names — the actual PDTF vocabulary
- 389 cross-context concepts (appear in 3+ schemas — need ontology reconciliation)
- 754 context-specific concepts (appear in 1 schema only)
- 54 business terms in the merged OPDA Glossary
- 554 concepts in the generated SKOS scheme
Browse all properties
All 8,458 property-path entries from the canonical schemas. Search by name or path, filter by bounded context, source schema, or JSON type. Click any column header to sort.
Tip: try tenure, EPC or address in the search box;
or pin a single overlay (e.g. ta6) with the source filter to see exactly
what it adds.
Deliverables
All files generated to source/00-deliverables/semantic-models/.
| File | What it is | Audience |
|---|---|---|
| data-dictionary-canonical.json (≈ 1.8 MB) | Machine-readable: 8,458 property-path entries across the 16 canonical schemas. Use this for headline counts. | Engineers, code generators |
| data-dictionary.json (3.0 MB) | Original extraction including derived/superseded schemas (14,287 entries). Kept for transparency; do not use for counts. | Audit only |
| data-dictionary.md (≈ 110 KB) | Human-readable: per-schema tables, deduplicated to unique leaf names, organised by bounded context. | Engineers, BAs, reviewers |
| audit.json (≈ 2 KB) | What the inflation was, why, and the corrected figures. Read this to understand the count discrepancy. | Anyone reviewing numbers |
| glossary-merged.json (20 KB) | Merge of the two OPDA Glossary.xlsx files (PoC + Technical WG), deduplicated to 54 unique terms. | Engineers, modelling |
| business-glossary.md (23 KB) | Three-section human glossary: OPDA Glossary terms A–Z, top-level PDTF concepts from schema annotations, external vocabulary (W3C VC, DID, ToIP). | Stakeholders, working groups, new joiners |
| business-glossary.ttl (164 KB) | SKOS Concept Scheme with 554 concepts. Loads directly into Protégé / GraphDB / TopBraid for the ontology work. | Ontology engineers |
Canonical schema inventory
| Schema | Bounded context | Unique leaves |
|---|---|---|
| pdtf-transaction.json | Base (spans all contexts) | 1,557 |
| baspi5.json | Estate Agency | 318 |
| rds.json | Property Data Services | 196 |
| piq.json | Surveying | 184 |
| ta6.json | Conveyancing | 178 |
| nts2.json | Estate Agency | 160 |
| lpe1.json | Conveyancing | 136 |
| con29R.json | Property Data Services | 125 |
| ntsl2.json | Estate Agency | 124 |
| ta7.json | Conveyancing | 98 |
| ta10.json | Conveyancing | 90 |
| fme1.json | Mortgage Lending | 78 |
| oc1.json | Property Data Services | 68 |
| con29DW.json | Property Data Services | 34 |
| sr24.json | Property Data Services | 7 |
| llc1.json | Property Data Services | 3 |
Excluded as redundant: combined.json (derived merge), skeleton.json (template),
baspi4.json (superseded by baspi5), nts.json (superseded by nts2),
ntsl.json (superseded by ntsl2).
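The exclusion can be pictured as a simple filter over the extracted entries. The sketch below is illustrative only: it assumes each entry is a dict whose source field holds the schema file name (or a path ending in it), and the sample entries are hypothetical, not taken from the real dictionary.

```python
from pathlib import PurePosixPath

# Redundant schema files, as listed above.
REDUNDANT = {"combined.json", "skeleton.json", "baspi4.json", "nts.json", "ntsl.json"}


def canonical_entries(entries):
    """Keep only entries whose source schema is not in the redundant set."""
    kept = []
    for entry in entries:
        # Assumption: `source` is a file name or a path ending in one.
        filename = PurePosixPath(entry["source"]).name
        if filename not in REDUNDANT:
            kept.append(entry)
    return kept


# Hypothetical entries, for illustration only.
sample = [
    {"path": "propertyPack.address", "source": "pdtf-transaction.json"},
    {"path": "propertyPack.address", "source": "combined.json"},
    {"path": "propertyPack.heating", "source": "overlays/baspi5.json"},
    {"path": "propertyPack.heating", "source": "overlays/baspi4.json"},
]

canonical = canonical_entries(sample)
print(len(sample), "->", len(canonical))
```

Applied to the full extraction, a filter of this kind is what would separate the canonical entry count from the larger original one.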
How it was built
Pipeline diagram (sources → processing → outputs).

Sources:
- pdtf-transaction.json (base transaction schema)
- 18 v3 overlays: BASPI, TA6/7/10, NTS, CON29R/DW, PIQ, OC1, LLC1, LPE1, FME1, RDS
- Glossary.xlsx (Trust Framework PoC)
- Glossary.xlsx (Technical Working Group)
- External vocabularies: W3C VCDM 2.0, W3C DID 1.0, ToIP Foundation

Processing: all sources feed a single stage made up of the property walker, the glossary merger and the SKOS generator.

Outputs:
- data-dictionary.json (14,287 entries)
- data-dictionary.md (per-source tables)
- glossary-merged.json (54 terms, deduplicated)
- business-glossary.md (tri-source glossary)
- business-glossary.ttl (SKOS, 554 concepts)
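To make the property-walker stage more concrete, here is a minimal sketch of how such a walker might traverse a JSON Schema and emit one entry per property path. It is not the generator's actual code: the function name, the entry fields (path, type, leaf, source) and the sample schema are assumptions, and it deliberately ignores $ref, oneOf/anyOf/allOf and other constructs a real walker would need to handle.

```python
def walk_properties(schema, prefix=""):
    """Yield (path, json_type) for every property reachable from `schema`."""
    for name, sub in schema.get("properties", {}).items():
        if not isinstance(sub, dict):
            continue  # skip boolean subschemas
        path = f"{prefix}.{name}" if prefix else name
        yield path, sub.get("type", "unknown")
        # Recurse into nested objects and arrays.
        yield from walk_properties(sub, path)
    items = schema.get("items")
    if isinstance(items, dict):
        # Mark array membership with "[]" and descend into the item schema.
        yield from walk_properties(items, prefix + "[]")


# Hypothetical schema fragment, for illustration only.
sample_schema = {
    "type": "object",
    "properties": {
        "propertyPack": {
            "type": "object",
            "properties": {
                "address": {
                    "type": "object",
                    "properties": {"postcode": {"type": "string"}},
                },
                "documents": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {"name": {"type": "string"}},
                    },
                },
            },
        }
    },
}

entries = [
    # Assumption: the "leaf" name is the last dot-separated path segment.
    {"path": p, "type": t, "leaf": p.split(".")[-1], "source": "sample.json"}
    for p, t in walk_properties(sample_schema)
]
for entry in entries:
    print(entry["path"], entry["type"])
```

Running the walker once per schema file and tagging each entry with its source would produce the kind of flat property-path list the dictionary files contain.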
Source citations
Every entry in data-dictionary.json carries a source
field naming the schema file it was extracted from. Every entry in
glossary-merged.json carries a sources array naming
which of the two Glossary.xlsx files contributed to it.
The primary sources, with paths:
- source/03-standards/schemas/src/schemas/v3/pdtf-transaction.json (base transaction schema)
- source/03-standards/schemas/src/schemas/v3/combined.json (base + overlays merged)
- source/03-standards/schemas/src/schemas/v3/skeleton.json (skeleton for new transactions)
- source/03-standards/schemas/src/schemas/v3/overlays/*.json (BASPI v4 + v5, TA6, TA7, TA10, NTS, NTS2, NTSL, NTSL2, CON29R, CON29DW, PIQ, RDS, OC1, LLC1, LPE1, FME1, sr24)
- source/06-research/trust-framework-poc/Glossary.xlsx
- source/04-governance-bodies/working-groups/technical/Glossary.xlsx
- External: W3C VCDM 2.0, W3C DID 1.0, Trust Over IP Foundation
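The sources array on each glossary entry suggests a merge that deduplicates terms while remembering where each came from. The sketch below shows one way that could work, assuming the spreadsheet rows have already been loaded as dicts with term and definition keys. The function name, the short source labels, the case-insensitive matching and the "first definition wins" rule are all assumptions, not documented behaviour of the merger.

```python
def merge_glossaries(*named_glossaries):
    """Merge (source_name, rows) pairs, deduplicating on the term text."""
    merged = {}
    for source_name, rows in named_glossaries:
        for row in rows:
            # Assumption: terms match case-insensitively, ignoring padding.
            key = row["term"].strip().lower()
            entry = merged.setdefault(
                key,
                {
                    "term": row["term"].strip(),
                    "definition": row["definition"],  # first definition wins
                    "sources": [],
                },
            )
            if source_name not in entry["sources"]:
                entry["sources"].append(source_name)
    return sorted(merged.values(), key=lambda e: e["term"].lower())


# Hypothetical rows, for illustration only.
poc_rows = [
    {"term": "Claim", "definition": "A statement made about a subject."},
    {"term": "Issuer", "definition": "A party that issues credentials."},
]
twg_rows = [
    {"term": "claim ", "definition": "An assertion about a property."},
    {"term": "Verifier", "definition": "A party that checks credentials."},
]

merged = merge_glossaries(("poc", poc_rows), ("twg", twg_rows))
for entry in merged:
    print(entry["term"], entry["sources"])
```

Where the two spreadsheets define the same term differently, a real merger would probably want to keep both definitions for review rather than silently keeping the first.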
Schema coverage
The 21 v3 root-level + main-overlay JSON files the extractor reads, and the
bounded context each one serves. (The 16 NTS2 extension overlays under
overlays/extensions/ are documented separately on the
PDTF overlays page and not yet processed by
this generator.)
| Schema | Bounded context | What it adds to the base |
|---|---|---|
| pdtf-transaction.json | Base (all contexts) | The core transaction object: participants, property, claims, lifecycle |
| combined.json | Combined view | Base + all overlays merged for tooling |
| skeleton.json | Bootstrapping | Skeleton structure for a new empty transaction |
| baspi4.json / baspi5.json | Estate Agency | BASPI v4 + v5 (HBSG) |
| nts.json / nts2.json | Estate Agency | NTS Material Info Sales (v1 + v2 successor) |
| ntsl.json / ntsl2.json | Estate Agency | NTS Material Info Lettings (v1 + v2) |
| ta6.json | Conveyancing | Law Society TA6 Property Information Form |
| ta7.json | Conveyancing | Law Society TA7 Leasehold Information Form |
| ta10.json | Conveyancing | Law Society TA10 Fittings & Contents Form |
| lpe1.json | Conveyancing | LPE1 Leasehold Property Enquiry |
| fme1.json | Mortgage Lending | Form for Mortgage Enquiries |
| piq.json | Surveying | Property Information Questionnaire |
| rds.json | Property Data Services | Residential Data Schema |
| con29R.json | Property Data Services | CON29R Residential local-authority search |
| con29DW.json | Property Data Services | CON29 Drainage & Water search |
| oc1.json | Property Data Services | OC1 Office Copy entries (HMLR title) |
| llc1.json | Property Data Services | LLC1 Local Land Charges search |
| sr24.json | Property Data Services | sr24 (small overlay) |
The 313 cross-context concepts
313 property names appear in three or more overlays — these are the cross-context vocabulary that needs explicit reconciliation in the ontology (same word, possibly different meanings in different contexts). Top examples (sorted by spread):
address, name, date, amount,
type, description, reference,
status, provider, title,
property, id, required, code,
category …
Full list with which overlays contain each: see the
"Cross-context concepts" section of
source/00-deliverables/semantic-models/business-glossary.md.
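The "three or more" rule is easy to express in code. The following sketch groups leaf names by the set of schemas they occur in and splits them into cross-context and single-schema names. The entry fields (leaf, source) and the sample data are assumptions for illustration; the real dictionary entries may be shaped differently.

```python
from collections import defaultdict


def classify_leaves(entries, threshold=3):
    """Return (cross_context, single_schema) lists of leaf names."""
    schemas_by_leaf = defaultdict(set)
    for entry in entries:
        schemas_by_leaf[entry["leaf"]].add(entry["source"])

    # Cross-context: seen in `threshold` or more schemas, widest spread first.
    cross = sorted(
        (leaf for leaf, srcs in schemas_by_leaf.items() if len(srcs) >= threshold),
        key=lambda leaf: (-len(schemas_by_leaf[leaf]), leaf),
    )
    # Context-specific: seen in exactly one schema.
    single = sorted(leaf for leaf, srcs in schemas_by_leaf.items() if len(srcs) == 1)
    return cross, single


# Hypothetical entries, for illustration only.
sample_entries = [
    {"leaf": "address", "source": "ta6.json"},
    {"leaf": "address", "source": "baspi5.json"},
    {"leaf": "address", "source": "nts2.json"},
    {"leaf": "name", "source": "ta6.json"},
    {"leaf": "name", "source": "baspi5.json"},
    {"leaf": "name", "source": "nts2.json"},
    {"leaf": "name", "source": "lpe1.json"},
    {"leaf": "tenure", "source": "baspi5.json"},
    {"leaf": "tenure", "source": "nts2.json"},
    {"leaf": "boilerAge", "source": "baspi5.json"},
]

cross, single = classify_leaves(sample_entries)
print("cross-context:", cross)
print("single-schema:", single)
```

Note that names appearing in exactly two schemas fall into neither bucket under this rule, so the two counts need not add up to the total number of unique leaf names.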
What to do with these deliverables
- Have an editorial pass over business-glossary.md with the Technical Working Group: reconcile the OPDA Glossary terms (which are open-banking/trust-framework-flavoured) with the property-data terms extracted from the schemas. The mix today is asymmetric.
- Define the upper ontology: pick the core classes (Transaction, Property, Participant, Claim, Document, Form, Search) and align with W3C VC. Use business-glossary.ttl as the seed.
- Per-overlay JSON-LD contexts: for each of the 18 main overlays (and eventually the 16 extension overlays), author a @context that maps the JSON property names to ontology terms. Start with baspi5.json as the worked example.
- Generate SHACL shapes from the JSON Schema validation rules. Most rules (required, enum, pattern, type) translate mechanically.
- Disambiguate the 313 cross-context names — for each, decide whether they refer to the same concept (single ontology class) or to context-specific concepts (separate classes with mapping).
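As a rough illustration of the SHACL step, the sketch below turns the four rule kinds named above (required, enum, pattern, type) into a Turtle node shape. Everything project-specific here is an assumption: the pdtf namespace IRI is a placeholder, the shape and class naming is invented, the sample schema is hypothetical, and mapping JSON number to xsd:decimal is only one possible choice. Only the SHACL terms themselves (sh:NodeShape, sh:property, sh:path, sh:datatype, sh:minCount, sh:in, sh:pattern) come from the W3C vocabulary.

```python
import json

# Assumed mapping from JSON Schema primitive types to XSD datatypes.
XSD_BY_JSON_TYPE = {
    "string": "xsd:string",
    "integer": "xsd:integer",
    "number": "xsd:decimal",
    "boolean": "xsd:boolean",
}

PREFIXES = (
    "@prefix sh: <http://www.w3.org/ns/shacl#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
    "@prefix pdtf: <https://example.org/pdtf#> .\n"  # placeholder namespace
)


def shacl_shape(class_name, schema):
    """Render a Turtle sh:NodeShape from a flat JSON Schema object."""
    required = set(schema.get("required", []))
    parts = [
        f"pdtf:{class_name}Shape a sh:NodeShape ;\n"
        f"    sh:targetClass pdtf:{class_name}"
    ]
    for name, sub in schema.get("properties", {}).items():
        lines = [f"sh:path pdtf:{name}"]
        json_type = sub.get("type")
        if isinstance(json_type, str) and json_type in XSD_BY_JSON_TYPE:
            lines.append(f"sh:datatype {XSD_BY_JSON_TYPE[json_type]}")
        if name in required:
            lines.append("sh:minCount 1")
        if "enum" in sub:
            # json.dumps gives double-quoted, escaped literals.
            values = " ".join(json.dumps(v) for v in sub["enum"])
            lines.append(f"sh:in ( {values} )")
        if "pattern" in sub:
            lines.append(f"sh:pattern {json.dumps(sub['pattern'])}")
        body = " ;\n        ".join(lines)
        parts.append(f"    sh:property [\n        {body}\n    ]")
    return PREFIXES + "\n" + " ;\n".join(parts) + " .\n"


# Hypothetical schema fragment, for illustration only.
sample = {
    "required": ["tenure"],
    "properties": {
        "tenure": {"type": "string", "enum": ["Freehold", "Leasehold"]},
        "postcode": {"type": "string", "pattern": "^[A-Z0-9 ]+$"},
    },
}

turtle = shacl_shape("Property", sample)
print(turtle)
```

Nested objects, arrays and conditional rules are left out of this sketch; those are the parts of the translation that are unlikely to be purely mechanical.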