📚 Database Schema Documentation

Comprehensive guide to how curve data is organized in the GST Forecast system

📊 How Curve Data is Organized

🏗️ CurveDefinition (Reusable Template)

Purpose: Defines what kind of curve this is. Templates are reused across many forecast runs.

Key Fields:
market → "ERCOT", "CAISO", "PJM"
location → "Houston", "SP15", "North Hub"
product → "Revenue_Optimized", "LMP"
curveType → "REVENUE", "TB4", "ENERGY_ARB"
batteryDuration → "HOUR_4", "HOUR_2"
scenario → "BASE", "BULL", "P50"
units → "$/MWh", "MW"
degradationType → "NONE", "DATE", "PERCENTAGE"
granularity → "HOURLY", "MONTHLY", "ANNUAL"
Example: "ERCOT Houston Revenue_Optimized TB4 Battery, Base Case scenario"
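As an illustration only, the key fields above could be modeled like this. The class below is a hypothetical sketch, not the system's actual model: the snake_case names are a Python rendering of the documented field names, and treating the listed granularity and degradation values as the complete set is an assumption.

```python
from dataclasses import dataclass

# Values copied from the examples in this document. Treating them as the
# complete set of allowed values is an assumption.
GRANULARITIES = {"HOURLY", "MONTHLY", "ANNUAL"}
DEGRADATION_TYPES = {"NONE", "DATE", "PERCENTAGE"}


@dataclass(frozen=True)
class CurveDefinition:
    """Hypothetical in-memory shape of a reusable curve template."""
    market: str
    location: str
    product: str
    curve_type: str
    battery_duration: str
    scenario: str
    units: str
    degradation_type: str
    granularity: str

    def __post_init__(self) -> None:
        # Reject values outside the sets listed in the documentation.
        if self.granularity not in GRANULARITIES:
            raise ValueError(f"unknown granularity: {self.granularity}")
        if self.degradation_type not in DEGRADATION_TYPES:
            raise ValueError(f"unknown degradation type: {self.degradation_type}")

    def label(self) -> str:
        # Short human-readable name, similar in spirit to the example above.
        return (f"{self.market} {self.location} {self.product} "
                f"{self.curve_type} ({self.scenario})")


houston_tb4 = CurveDefinition(
    market="ERCOT", location="Houston", product="Revenue_Optimized",
    curve_type="TB4", battery_duration="HOUR_4", scenario="BASE",
    units="$/MWh", degradation_type="NONE", granularity="HOURLY",
)
```

Making the class frozen reflects the idea that a template is shared across many runs and should not change underneath them.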

📈 CurveInstance (Specific Run)

Purpose: A specific forecast run of a definition template. Records when this version was created and by whom.

Key Fields:
curveDefinitionId → Links to template above
instanceVersion → "v1.0", "v2.1", "final"
createdBy → "Aurora", "Gridstor"
modelType → "Fundamental", "Statistical"
deliveryPeriodStart → "2025-01-01"
deliveryPeriodEnd → "2025-12-31"
forecastRunDate → When created
Example: "March 2025 v2.1 by Aurora, from Jan-Dec 2025"
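A minimal sketch of how one run might be represented, assuming integer IDs and date-typed delivery fields (neither is confirmed by this document). The class, the `covers` helper, and the exact run day are hypothetical and only illustrate the fields listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CurveInstance:
    """Hypothetical in-memory shape of one forecast run."""
    curve_definition_id: int  # assumed to be an integer key
    instance_version: str
    created_by: str
    model_type: str
    delivery_period_start: date
    delivery_period_end: date
    forecast_run_date: date

    def __post_init__(self) -> None:
        # A delivery period that ends before it starts is almost certainly a typo.
        if self.delivery_period_end < self.delivery_period_start:
            raise ValueError("deliveryPeriodEnd is before deliveryPeriodStart")

    def covers(self, day: date) -> bool:
        # True when the given day falls inside the delivery period (inclusive).
        return self.delivery_period_start <= day <= self.delivery_period_end


run = CurveInstance(
    curve_definition_id=1,
    instance_version="v2.1",
    created_by="Aurora",
    model_type="Fundamental",
    delivery_period_start=date(2025, 1, 1),
    delivery_period_end=date(2025, 12, 31),
    forecast_run_date=date(2025, 3, 1),  # the day of the month is made up
)
```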

💾 PriceForecast (Raw Data)

Purpose: Individual data points with consolidated p-value columns. One row per timestamp with all confidence levels.

Key Fields:
curveInstanceId → Links to specific run above
timestamp → "2025-01-01 00:00:00"
valueP5 → $42.50 (5th percentile)
valueP25 → $45.25 (25th percentile)
valueP50 → $48.75 (median/base case)
valueP75 → $52.10 (75th percentile)
valueP95 → $55.20 (95th percentile)
flags → ["outlier", "holiday", "estimated"]
Structure: One row per timestamp holding all five p-values. A full year of hourly data (8,760 hours) takes 8,760 rows instead of the 43,800 (8,760 × 5) needed if each p-value had its own row.
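The row layout and the row-count arithmetic can be sketched as follows. The class is hypothetical (snake_case renderings of the documented fields, an assumed integer ID), and the ordering check is an added sanity test rather than something the system is known to enforce. With the five percentiles listed above, storing one row per p-value would take 8,760 × 5 rows for a non-leap year of hourly data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PriceForecast:
    """Hypothetical shape of one consolidated data point."""
    curve_instance_id: int  # assumed to be an integer key
    timestamp: datetime
    value_p5: float
    value_p25: float
    value_p50: float
    value_p75: float
    value_p95: float
    flags: tuple = ()

    def percentiles(self) -> list:
        # Percentile values in ascending percentile order.
        return [self.value_p5, self.value_p25, self.value_p50,
                self.value_p75, self.value_p95]

    def is_ordered(self) -> bool:
        # Higher percentiles should never fall below lower ones.
        vals = self.percentiles()
        return all(a <= b for a, b in zip(vals, vals[1:]))


row = PriceForecast(1, datetime(2025, 1, 1, 0, 0, 0),
                    42.50, 45.25, 48.75, 52.10, 55.20, ("holiday",))

HOURS_PER_YEAR = 8760                                  # non-leap year
wide_rows = HOURS_PER_YEAR                             # one row per timestamp
long_rows = HOURS_PER_YEAR * len(row.percentiles())    # one row per p-value
```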

📈 Data Flow

1. Definition (Template)
2. Instance (Run Details)
3. Data (Time Series)

🎯 Schema Design Principles

✅ "Apples to Apples" Comparisons

All data within a CurveDefinition should be directly comparable:

  • Same market, location, and product
  • Same battery duration and granularity
  • Different vendors/models/p-values are comparable
  • Different scenarios can be compared within context
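One way to read the rules above is as a key made of the fields that must match. The helper below is a hypothetical sketch of that reading, using plain dictionaries keyed by the documented field names. It is not a check the system is known to run.

```python
# Fields that must be identical for two curves to be compared directly,
# taken from the first two bullets above.
COMPARABLE_FIELDS = ("market", "location", "product",
                     "batteryDuration", "granularity")


def comparable(a: dict, b: dict) -> bool:
    # Scenario, vendor, and model may differ; only the key fields must match.
    return all(a[field] == b[field] for field in COMPARABLE_FIELDS)


houston_base = {"market": "ERCOT", "location": "Houston",
                "product": "Revenue_Optimized", "batteryDuration": "HOUR_4",
                "granularity": "HOURLY", "scenario": "BASE"}
houston_bull = dict(houston_base, scenario="BULL")
sp15_base = dict(houston_base, market="CAISO", location="SP15")
```

Here `comparable(houston_base, houston_bull)` is true because only the scenario differs, while `comparable(houston_base, sp15_base)` is false because market and location differ.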

📊 P-Value Consolidation

Instead of separate instances per p-value, we store all confidence levels in one row:

  • One CurveInstance per forecast run
  • P5, P25, P50, P75, P95 as columns in PriceForecast
  • More efficient storage and querying
  • Easier to visualize confidence bands
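To show why the consolidated row helps with confidence bands, here is a small sketch that reads band edges straight from one row. The dictionary layout and the `band` helper are hypothetical; only the column names come from this document.

```python
def band(row: dict) -> dict:
    # Everything needed to draw one time step of a fan chart is in one row,
    # so no join or grouping across p-values is required.
    return {
        "outer": (row["valueP5"], row["valueP95"]),
        "inner": (row["valueP25"], row["valueP75"]),
        "median": row["valueP50"],
    }


sample_row = {
    "timestamp": "2025-01-01 00:00:00",
    "valueP5": 42.50, "valueP25": 45.25, "valueP50": 48.75,
    "valueP75": 52.10, "valueP95": 55.20,
}
bands = band(sample_row)
```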

🛠️ API Endpoints

Definition Management

POST /api/curve-upload/create-definition
GET /api/curves/definitions
PUT /api/admin/edit-curve-definition
DELETE /api/admin/edit-curve-definition

Instance Management

POST /api/curve-upload/create-instance
GET /api/curves/list
DELETE /api/admin/delete-curve-instance

Data Upload

POST /api/curve-upload/upload-data
GET /api/curves/data
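The sketch below only builds request objects for two of the endpoints above and does not send them. Everything beyond the paths and HTTP methods is an assumption: the host is a placeholder, and the JSON body shape and the `curveInstanceId` query parameter are guesses based on the field names in this document, not a documented contract.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://forecast.example.com"  # placeholder host, not the real one


def create_instance_request(payload: dict) -> Request:
    # Assumes the endpoint accepts a JSON body; this is not confirmed.
    body = json.dumps(payload).encode("utf-8")
    return Request(
        BASE_URL + "/api/curve-upload/create-instance",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def curve_data_request(curve_instance_id: int) -> Request:
    # The query parameter name is an assumption.
    query = urlencode({"curveInstanceId": curve_instance_id})
    return Request(f"{BASE_URL}/api/curves/data?{query}", method="GET")


post_req = create_instance_request({
    "curveDefinitionId": 1,
    "instanceVersion": "v2.1",
    "createdBy": "Aurora",
    "modelType": "Fundamental",
    "deliveryPeriodStart": "2025-01-01",
    "deliveryPeriodEnd": "2025-12-31",
})
get_req = curve_data_request(42)
```

Passing either object to `urllib.request.urlopen` would send it. Check the real request schema before doing so.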

📄 CSV Upload Format

Required Columns

timestamp → ISO 8601 format: 2024-01-01T00:00:00Z
value → Numeric price/revenue value

Optional Columns

pvalue → Confidence level: 5, 25, 50, 75, 95 (default: 50)
units → Price units: $/MWh, MW, etc. (default: $/MWh)

Example CSV

timestamp,value,pvalue,units
2024-01-01T00:00:00Z,42.50,5,$/MWh
2024-01-01T00:00:00Z,45.25,25,$/MWh
2024-01-01T00:00:00Z,48.75,50,$/MWh
2024-01-01T00:00:00Z,52.10,75,$/MWh
2024-01-01T00:00:00Z,55.20,95,$/MWh
2024-01-01T01:00:00Z,43.10,5,$/MWh
2024-01-01T01:00:00Z,46.80,25,$/MWh
...
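The upload format has one row per p-value, while PriceForecast stores one row per timestamp. A minimal sketch of that pivot, using the complete rows from the example above, might look like this. The function name and output shape are hypothetical, and how the real upload endpoint performs this step is not documented here.

```python
import csv
import io
from collections import defaultdict

ALLOWED_PVALUES = {5, 25, 50, 75, 95}

SAMPLE_CSV = """timestamp,value,pvalue,units
2024-01-01T00:00:00Z,42.50,5,$/MWh
2024-01-01T00:00:00Z,45.25,25,$/MWh
2024-01-01T00:00:00Z,48.75,50,$/MWh
2024-01-01T00:00:00Z,52.10,75,$/MWh
2024-01-01T00:00:00Z,55.20,95,$/MWh
2024-01-01T01:00:00Z,43.10,5,$/MWh
2024-01-01T01:00:00Z,46.80,25,$/MWh
"""


def pivot_to_wide(csv_text: str) -> dict:
    """Group long-format rows into one dict of valueP* columns per timestamp."""
    wide = defaultdict(dict)
    for record in csv.DictReader(io.StringIO(csv_text)):
        # pvalue is optional and defaults to 50, as stated above.
        pvalue = int(record.get("pvalue") or 50)
        if pvalue not in ALLOWED_PVALUES:
            raise ValueError(f"unsupported pvalue: {pvalue}")
        wide[record["timestamp"]][f"valueP{pvalue}"] = float(record["value"])
    return dict(wide)


wide_rows = pivot_to_wide(SAMPLE_CSV)
```

The seven input rows collapse into two timestamp entries. The second entry holds only `valueP5` and `valueP25` because the sample is cut off at that point.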