Methodology
This section of the PUDL documentation describes the more involved data processing methodologies that are unique to PUDL and that often affect multiple tables or datasets. For example:
- Estimating per-unit heat rates (thermal efficiency)
- Allocating reported fuel consumption and net generation to individual generators
- Estimating generator capacity factors
- Estimating CapEx and O&M costs by plant based on FERC Form 1 data
- Matching FERC & EIA plants and utilities
- Extending EIA’s boiler-generator association to cover more units
- Matching EIA Utilities and SEC Companies
- Estimating state-level hourly electricity demand based on overlapping utility service territories
It’s primarily intended to help end users of the data understand how the data was produced, even if they aren’t digging into the code itself. We’ve only just started fleshing it out, in response to our 2025 PUDL User Survey.
- Entity Resolution
- Timeseries Imputation
- SEC 10-K Ownership Data Extraction Modeling
- Overview
- Extracting Ownership Data From Exhibit 21 Attachments
- Assigning subsidiary_company_id_sec10k to Extracted Subsidiary Companies
- Matching Subsidiary Companies to a Central Index Key
- Matching SEC Filing Companies to EIA Utilities
- Matching SEC Subsidiary Companies to EIA Utilities
- Assumptions
- Future Improvements