Congress Public Records Slice Methodology

What This Release Is

This bundle is a neutral, review-oriented slice of public-record linkages built from a fixed House-wide run of the dataset. It is designed for exploration and bounded verification, not for assigning guilt, wrongdoing, intent, or causality.

What This Shows

Public records can be normalized into a reproducible graph of House trades, committees, bills, votes, campaign finance, lobbying visibility, and community project funding.
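As an illustration of what "normalized into a reproducible graph" can mean in practice, the sketch below derives node IDs from canonical JSON so that the same input records always yield the same graph, regardless of input order. This is a minimal sketch under stated assumptions, not the release's actual pipeline: the record layout, the `stable_id` and `build_graph` helpers, and the `Edge` fields are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical edge shape; the release's real schema is not described here.
    source: str
    target: str
    relation: str
    evidence: str  # reference to the public record that supports the link


def stable_id(kind: str, fields: dict) -> str:
    """Derive an ID from canonical JSON so equal records map to equal IDs."""
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{kind}:{canonical}".encode("utf-8")).hexdigest()
    return f"{kind}:{digest[:16]}"


def build_graph(records):
    """Build (nodes, edges) from linkage records; output is order-independent."""
    nodes = {}
    edges = set()
    for rec in records:
        src = stable_id(rec["source_kind"], rec["source_fields"])
        dst = stable_id(rec["target_kind"], rec["target_fields"])
        nodes[src] = rec["source_fields"]
        nodes[dst] = rec["target_fields"]
        edges.add(Edge(src, dst, rec["relation"], rec["evidence"]))
    # Sort edges so two runs over the same records produce identical output.
    ordered = sorted(
        edges, key=lambda e: (e.source, e.target, e.relation, e.evidence)
    )
    return nodes, ordered
```

Hashing a canonical serialization, rather than relying on insertion order or auto-incremented IDs, is one common way to make a rebuilt graph comparable with a previously published one.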

What This Does Not Prove

This slice does not prove illegality, corruption, intent, or causality. It shows only deterministic overlap, timing, and linkage strength derived from official public records.
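To make the descriptive nature of a timing measure concrete, the sketch below computes a signed day gap between two dated records and checks whether it falls inside a window. It is an assumption-laden illustration only: the `day_gap` and `within_window` helpers and the window parameter are hypothetical, and the release does not state how its timing values are calculated.

```python
from datetime import date


def day_gap(event_a: date, event_b: date) -> int:
    """Signed number of days from event_a to event_b (positive if b is later)."""
    return (event_b - event_a).days


def within_window(event_a: date, event_b: date, window_days: int) -> bool:
    """True if the two events fall within window_days of each other."""
    return abs(day_gap(event_a, event_b)) <= window_days
```

A result from a function like this is a fact about two dates. It says nothing about why the events occurred close together.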

Source Groups

  • House Clerk financial disclosures and Periodic Transaction Reports (PTRs)
  • House Clerk member directory and committee list
  • GovInfo BILLSTATUS bulk data
  • House Clerk roll-call vote XML
  • FEC public bulk downloads
  • Lobbying Disclosure Act (LDA) public search pages
  • House member community project funding disclosure pages

Public Release Notes

  • This release is a slice of public-record data, not a complete accounting of all potentially relevant data.
  • Future releases may update or expand this slice as source recovery, parsing, and evidence linkage improve.
  • This release does not assign guilt, wrongdoing, intent, or causality to any person or organization.
  • The release shows public-record overlaps, timing, and linkage strength, not proof of illegality or corruption.
  • Some rows remain review-tier or include unresolved official source references and should be read with those labels in mind.
  • The public package includes verification summaries and SHA-backed artifact indexes but not the full internal raw corpus. External verification is therefore bounded by what is published here.
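A reader who wants to check published artifacts against a SHA-backed index could do so along the following lines. This is a minimal sketch, assuming the index can be loaded as a mapping from relative file path to a SHA-256 hex digest. The release does not specify the hash variant or the index layout here, so both are assumptions, and `sha256_of` and `verify_index` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_index(root: Path, index: dict[str, str]) -> dict[str, str]:
    """Return a status per indexed file: 'ok', 'mismatch', or 'missing'.

    Assumes `index` maps relative paths to SHA-256 hex digests.
    """
    results = {}
    for rel, expected in sorted(index.items()):
        candidate = root / rel
        if not candidate.is_file():
            results[rel] = "missing"
        elif sha256_of(candidate) == expected.lower():
            results[rel] = "ok"
        else:
            results[rel] = "mismatch"
    return results
```

A check like this confirms only that the published files match the published index. It cannot confirm anything about the internal raw corpus that is not part of the package.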