# PROJECT OVERVIEW
**App Name:** CrystalForecasting
**Problem:** Businesses and analysts struggle with accurate and scalable time-series forecasting. While advanced models like Google's TimesFM exist, they often require significant technical expertise to implement, tune, and integrate into workflows. This leads to missed opportunities, inefficient resource allocation, and unreliable future planning.
**Solution:** CrystalForecasting is a user-friendly SaaS platform that leverages Google's state-of-the-art TimesFM foundation model to provide powerful time-series forecasting capabilities. It democratizes access to advanced AI forecasting by offering an intuitive interface for data upload, model configuration, and result visualization, enabling users of all technical backgrounds to make data-driven predictions with confidence.
**Value Proposition:** Accurate forecasts without the modeling overhead. CrystalForecasting turns raw time-series data into forecasts that teams can act on, supporting smarter decisions, more efficient operations, and growth planning.
**Target Audience:** Financial analysts, business intelligence professionals, data scientists, e-commerce managers, operations planners, and businesses of all sizes looking to make reliable future projections from their time-series data.
---
# TECH STACK
* **Frontend:** Next.js (App Router), React, TypeScript, Tailwind CSS, shadcn/ui (for pre-built components)
* **Backend:** Next.js Route Handlers (the App Router form of API routes) for the MVP; move to a separate Node.js/Python backend only if complexity demands it.
* **Database:** PostgreSQL
* **ORM:** Drizzle ORM (for type-safe database interactions)
* **Authentication:** NextAuth.js (or Clerk for a managed solution if preferred for speed)
* **Charting Library:** Chart.js or Recharts (for interactive visualizations)
* **State Management:** Zustand or React Context API (for global state)
* **Deployment:** Vercel (recommended for Next.js)
* **Other:** Axios (for API requests), date-fns (for date manipulation)
---
# DATABASE SCHEMA (Drizzle ORM - PostgreSQL)
```sql
-- Users table for authentication
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255),
  email VARCHAR(255) UNIQUE NOT NULL,
  emailVerified TIMESTAMP(3) NULL,
  image TEXT NULL,
  created_at TIMESTAMP(3) DEFAULT NOW(),
  updated_at TIMESTAMP(3) DEFAULT NOW()
);
-- Account table for NextAuth.js or similar
CREATE TABLE accounts (
  provider VARCHAR(255) NOT NULL,
  providerAccountId VARCHAR(255) NOT NULL,
  userId INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  type VARCHAR(255) NOT NULL,
  refresh_token TEXT NULL,
  access_token TEXT NULL,
  expires_at BIGINT NULL,
  token_type TEXT NULL,
  scope TEXT NULL,
  id_token TEXT NULL,
  session_state TEXT NULL,
  PRIMARY KEY (provider, providerAccountId)
);
-- Sessions table for NextAuth.js or similar
CREATE TABLE sessions (
  sessionToken TEXT PRIMARY KEY,
  userId INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  expires TIMESTAMP(3) NOT NULL
);
-- Verification Tokens table for NextAuth.js or similar
CREATE TABLE verification_tokens (
  identifier TEXT NOT NULL,
  token TEXT NOT NULL,
  expires TIMESTAMP(3) NOT NULL, -- Timestamp, consistent with sessions.expires
  PRIMARY KEY (identifier, token)
);
-- Datasets table to store user-uploaded data information
CREATE TABLE datasets (
  id SERIAL PRIMARY KEY,
  userId INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  name VARCHAR(255) NOT NULL, -- User-defined name for the dataset
  original_filename VARCHAR(255) NOT NULL, -- Original file name
  storage_path VARCHAR(255) NOT NULL, -- Path to stored data (e.g., S3, local disk)
  num_rows INT,
  num_columns INT,
  uploaded_at TIMESTAMP(3) DEFAULT NOW(),
  created_at TIMESTAMP(3) DEFAULT NOW(),
  updated_at TIMESTAMP(3) DEFAULT NOW()
);
-- ForecastJobs table to manage forecasting tasks
CREATE TABLE forecast_jobs (
  id SERIAL PRIMARY KEY,
  userId INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  datasetId INT NOT NULL REFERENCES datasets(id) ON DELETE CASCADE,
  target_column VARCHAR(255) NOT NULL, -- The column to forecast
  timestamp_column VARCHAR(255) NOT NULL, -- The column containing timestamps
  forecast_horizon INT NOT NULL, -- How far into the future to predict (e.g., 100 steps)
  model_parameters JSONB NULL, -- Stores any specific model parameters (e.g., context length, frequency)
  status VARCHAR(50) NOT NULL DEFAULT 'PENDING', -- PENDING, PROCESSING, COMPLETED, FAILED
  created_at TIMESTAMP(3) DEFAULT NOW(),
  updated_at TIMESTAMP(3) DEFAULT NOW(),
  completed_at TIMESTAMP(3) NULL
);
-- ForecastResults table to store the actual forecast data
CREATE TABLE forecast_results (
  id SERIAL PRIMARY KEY,
  jobId INT NOT NULL REFERENCES forecast_jobs(id) ON DELETE CASCADE,
  timestamp TIMESTAMP(3) NOT NULL, -- Timestamp for the forecast point
  forecast_value DOUBLE PRECISION NOT NULL, -- The predicted value
  -- Optional: for quantile forecasts
  forecast_lower_bound DOUBLE PRECISION NULL,
  forecast_upper_bound DOUBLE PRECISION NULL,
  -- Optional: for covariate support
  covariates JSONB NULL,
  created_at TIMESTAMP(3) DEFAULT NOW()
);
-- Add indexes for performance
CREATE INDEX idx_datasets_userid ON datasets(userId);
CREATE INDEX idx_forecast_jobs_userid ON forecast_jobs(userId);
CREATE INDEX idx_forecast_jobs_datasetid ON forecast_jobs(datasetId);
CREATE INDEX idx_forecast_results_jobid ON forecast_results(jobId);
CREATE INDEX idx_forecast_results_timestamp ON forecast_results(timestamp);
```
---
# CORE FEATURES & USER FLOW
1. **User Authentication:**
* **Flow:** User lands on the homepage -> Clicks 'Sign Up' or 'Log In' -> Presented with options (e.g., Google OAuth, Email/Password) -> Upon successful authentication, redirected to their dashboard.
* **Details:** Use NextAuth.js for robust authentication. Store user credentials and session information securely in the `users`, `accounts`, `sessions`, and `verification_tokens` tables. Implement password hashing for email/password signups.
2. **Dataset Upload:**
* **Flow:** User navigates to 'Datasets' page -> Clicks 'Upload New Dataset' -> Selects a file (CSV, Excel) from their device -> Enters a descriptive name for the dataset -> Clicks 'Upload' -> System validates the file format and basic structure (presence of timestamp-like column and at least one numerical column) -> File is stored (e.g., in S3 or local storage) and its metadata (name, filename, path, row/column count) is saved in the `datasets` table -> User sees the uploaded dataset in their list.
* **Details:** In App Router route handlers, read the multipart upload with the standard `request.formData()` API; a library like `multer` applies only if a separate Express-style Node.js backend is used. Implement server-side validation. Handle large file uploads efficiently (consider chunking or background processing for very large files).
3. **Forecasting Job Creation:**
* **Flow:** User selects an uploaded dataset from their 'Datasets' list -> Clicks 'Create Forecast' -> Presented with a form:
* Select 'Target Column' (e.g., 'Sales', 'Revenue').
* Select 'Timestamp Column' (e.g., 'Date', 'Timestamp').
* Specify 'Forecast Horizon' (e.g., '100' steps).
* (Optional) Advanced Settings: Specify context length, frequency indicator (if needed), covariate columns.
* (Optional) Quantile Forecast toggle and horizon.
* User clicks 'Start Forecasting' -> A new record is created in the `forecast_jobs` table with 'PENDING' status -> The system queues the job for processing.
* **Details:** Frontend form with dropdowns populated from the selected dataset's columns. Input validation for horizon. JSONB storage for flexible `model_parameters`.
4. **Model Execution (Backend/API Route):**
* **Flow:** A background job processor (or a Next.js API route triggered by the queue) picks up 'PENDING' jobs from `forecast_jobs`.
* Fetches the dataset from its storage path.
* Prepares the data according to TimesFM requirements (timestamp alignment, column selection).
* Instantiates and runs the TimesFM model (e.g., using a Python script called from the API route, or directly via a compatible JS library if available and performant enough for the MVP). This is the most complex part and might involve inter-process communication or a separate microservice.
* Stores the prediction results (timestamp, forecast_value, optional bounds/covariates) in the `forecast_results` table.
* Updates the `forecast_jobs` status to 'COMPLETED' or 'FAILED' and sets `completed_at` timestamp.
* **Details:** This part heavily relies on integrating with the TimesFM model. For MVP, calling a pre-packaged Python inference script from a Next.js API route might be the most feasible approach. Ensure error handling and logging. Consider resource management for model execution.
5. **Results Visualization:**
* **Flow:** User navigates to the 'Forecasts' page or views a specific forecast job -> Sees a list of completed forecast jobs -> Clicks on a job -> Displays an interactive chart showing:
* Historical data (from the original dataset).
* The forecasted values.
* (Optional) Confidence intervals/quantiles.
* User can toggle different columns/views, zoom/pan, and export the chart or data.
* **Details:** Use a charting library like Recharts or Chart.js. Fetch data from `forecast_results` based on `jobId`. Provide options to export data as CSV via a dedicated API endpoint.
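The job lifecycle described in step 4 (PENDING, then PROCESSING, then COMPLETED or FAILED, with `completed_at` set at the end) can be guarded with a small transition table. The following is a minimal sketch, not a definitive implementation: the names `transitionJob`, `canTransition`, and `ForecastJobState` are hypothetical, and allowing PENDING to go straight to FAILED (for example, when the dataset file is missing) is an assumption.

```typescript
// Job statuses as listed in the forecast_jobs.status column comment.
type JobStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// Assumed lifecycle: PENDING -> PROCESSING -> COMPLETED | FAILED.
// PENDING -> FAILED is an assumption; terminal states allow no further moves.
const ALLOWED_TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  PENDING: ["PROCESSING", "FAILED"],
  PROCESSING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: [],
};

// Hypothetical in-memory view of the two fields the lifecycle touches.
interface ForecastJobState {
  status: JobStatus;
  completedAt: Date | null;
}

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Returns a new state object; throws on an illegal transition.
// completedAt is stamped when the job reaches COMPLETED or FAILED.
function transitionJob(
  job: ForecastJobState,
  to: JobStatus,
  now: Date = new Date(),
): ForecastJobState {
  if (!canTransition(job.status, to)) {
    throw new Error(`Illegal job transition: ${job.status} -> ${to}`);
  }
  const isTerminal = to === "COMPLETED" || to === "FAILED";
  return { status: to, completedAt: isTerminal ? now : job.completedAt };
}
```

A job processor would call `transitionJob(job, "PROCESSING")` when it picks up a job and persist the returned state to `forecast_jobs`; rejecting illegal transitions helps prevent a finished job from being processed twice.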
---
# API & DATA FETCHING
* **Authentication Endpoints:** Handled by NextAuth.js (e.g., `/api/auth/[...nextauth]`).
* **Dataset Upload:** `POST /api/datasets`
* Request Body: `FormData` containing the file and dataset name.
* Response: `{ success: true, datasetId: number, message: string }` or `{ success: false, message: string }`.
* **Get Datasets:** `GET /api/datasets`
* Response: `[{ id: number, name: string, uploaded_at: string, ... }, ...]` (for the logged-in user).
* **Delete Dataset:** `DELETE /api/datasets/:id`
* Response: `{ success: true }` or `{ success: false }`.
* **Create Forecast Job:** `POST /api/forecast-jobs`
* Request Body: `{ datasetId: number, targetColumn: string, timestampColumn: string, forecastHorizon: number, ...modelParameters }`
* Response: `{ success: true, jobId: number, message: string }`.
* **Get Forecast Jobs:** `GET /api/forecast-jobs` (with query params for filtering by status, datasetId, etc.)
* Response: `[{ id: number, datasetName: string, targetColumn: string, status: string, created_at: string, ... }, ...]`.
* **Get Specific Forecast Job Details & Results:** `GET /api/forecast-jobs/:id`
* Response: `{ jobDetails: {...}, results: [{ timestamp: string, forecast_value: number, ... }, ...] }`.
* **Export Forecast Data:** `GET /api/forecast-jobs/:id/export?format=csv`
* Response: CSV file download.
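To illustrate the `POST /api/forecast-jobs` contract above, here is a hedged sketch of server-side validation for the request body. It uses no external libraries; the names `validateCreateForecastJob` and `CreateForecastJobRequest` are hypothetical, and treating every extra key as a model parameter is an assumption based on the `...modelParameters` spread in the request shape.

```typescript
// Shape of a validated request, following the documented body fields.
interface CreateForecastJobRequest {
  datasetId: number;
  targetColumn: string;
  timestampColumn: string;
  forecastHorizon: number;
  // Assumption: all remaining keys are passed through as model parameters.
  modelParameters: Record<string, unknown>;
}

type ValidationResult =
  | { ok: true; value: CreateForecastJobRequest }
  | { ok: false; errors: string[] };

function isPositiveInteger(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0;
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

function validateCreateForecastJob(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return { ok: false, errors: ["Request body must be a JSON object."] };
  }
  const { datasetId, targetColumn, timestampColumn, forecastHorizon, ...modelParameters } =
    body as Record<string, unknown>;

  // Collect every problem so the client can fix them in one round trip.
  const errors: string[] = [];
  if (!isPositiveInteger(datasetId)) errors.push("datasetId must be a positive integer.");
  if (!isNonEmptyString(targetColumn)) errors.push("targetColumn must be a non-empty string.");
  if (!isNonEmptyString(timestampColumn)) errors.push("timestampColumn must be a non-empty string.");
  if (!isPositiveInteger(forecastHorizon)) errors.push("forecastHorizon must be a positive integer.");

  // The guards are repeated so TypeScript narrows each field for the return value.
  if (
    errors.length === 0 &&
    isPositiveInteger(datasetId) &&
    isNonEmptyString(targetColumn) &&
    isNonEmptyString(timestampColumn) &&
    isPositiveInteger(forecastHorizon)
  ) {
    return {
      ok: true,
      value: { datasetId, targetColumn, timestampColumn, forecastHorizon, modelParameters },
    };
  }
  return { ok: false, errors };
}
```

A route handler could return the `errors` array with a 400 status in the documented `{ success: false, message }` shape, and insert the `forecast_jobs` row only when `ok` is true.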
**Data Fetching Strategy:**
* Use client-side fetching (e.g., with `useEffect` and `useState`, or libraries like SWR/React Query) for dynamic data like datasets list, job status updates, and forecast results.
* Server components in Next.js App Router can be used for initial data loading on pages like the dashboard or dataset list where initial state is needed, fetching directly from the DB.
* API routes handle mutations, client-initiated reads, and the TimesFM model execution logic.
---
# UI/UX DESIGN & VISUAL IDENTITY
* **Design Style:** Modern, Clean, Professional, Trustworthy.
* **Color Palette:**
* Primary: Deep Blue (`#0A2540`)
* Secondary: Teal (`#30C5D4`)
* Accent: Light Grey (`#F8FAFC`), White (`#FFFFFF`)
* Text: Dark Grey (`#334155`)
* Success: Green (`#22C55E`)
* Warning/Processing: Orange (`#F97316`)
* Error: Red (`#EF4444`)
* **Typography:** Sans-serif, modern font. 'Inter' or 'Inter Variable' is a good choice. Use varying weights for hierarchy.
* Headings: Inter Bold (e.g., 36px, 24px, 18px)
* Body Text: Inter Regular (e.g., 16px, 14px)
* **Layout:** Use a consistent grid system (e.g., 12-column grid). Sidebar navigation for core sections (Dashboard, Datasets, Forecasts, Settings). Main content area with clear cards and sections.
* **Key Elements:** Use subtle shadows for cards, clear visual hierarchy, ample whitespace, and intuitive icons.
* **Responsive Rules:** Mobile-first approach. Ensure usability on screens from 375px width up to large desktops. Navigation might collapse into a hamburger menu on smaller screens. Tables should be horizontally scrollable or adapt content gracefully.
---
# COMPONENT BREAKDOWN (Next.js App Router Structure)
```
app/
├── api/
│   ├── auth/[...nextauth]/route.ts  # Auth routes
│   ├── datasets/
│   │   ├── route.ts  # Create and list datasets (POST, GET)
│   │   └── [id]/route.ts  # Delete a specific dataset (DELETE)
│   ├── forecast-jobs/
│   │   ├── route.ts  # Create and list forecast jobs (POST, GET)
│   │   └── [id]/
│   │       ├── route.ts  # Get job details & results (GET)
│   │       └── export/
│   │           └── route.ts  # Export results (GET)
│   └── ... (other API logic, e.g., model execution trigger)
├── auth/
│   ├── layout.tsx  # Auth layout
│   └── signin/
│       └── page.tsx  # Sign-in page
├── components/
│   ├── ui/
│   │   ├── button.tsx  # shadcn/ui Button
│   │   ├── input.tsx  # shadcn/ui Input
│   │   ├── card.tsx  # shadcn/ui Card
│   │   ├── table.tsx  # shadcn/ui Table wrapper
│   │   ├── chart.tsx  # Chart component (using Recharts/Chart.js)
│   │   ├── sidebar.tsx  # Navigation sidebar
│   │   ├── header.tsx  # Top header/navbar
│   │   ├── spinner.tsx  # Loading indicator
│   │   └── ... (other shadcn/ui components)
│   ├── datasets/
│   │   ├── DatasetList.tsx  # Displays list of datasets
│   │   ├── DatasetUploader.tsx  # Handles file upload UI
│   │   └── DatasetListItem.tsx  # Individual item in the dataset list
│   ├── forecasts/
│   │   ├── ForecastJobList.tsx  # Displays list of forecast jobs
│   │   ├── ForecastJobDetails.tsx  # Shows details and results of a single job
│   │   ├── ForecastForm.tsx  # Form for creating a new forecast job
│   │   └── ForecastChart.tsx  # Component rendering the forecast chart
│   └── ... (common components like AuthProvider, Layout)
├── dashboard/
│   └── page.tsx  # Dashboard overview (e.g., summary stats, recent activity)
├── datasets/
│   └── page.tsx  # Main datasets page (list and upload)
├── forecasts/
│   └── page.tsx  # Main forecasts page (list and details)
├── layout.tsx  # Root layout (includes providers, global styles)
├── page.tsx  # Landing page
└── providers.tsx  # Context providers (e.g., AuthContext)
tsconfig.json  # Project root (outside app/)
tailwind.config.ts  # Project root (outside app/)
```
**State Management:**
* Global state (e.g., user authentication status) managed via Context API or Zustand.
* Component-level state for forms, UI interactions, etc., managed using `useState`, `useReducer`.
* Server/Client data fetching and caching handled by Next.js data fetching patterns (Server Components, Route Handlers) and potentially SWR/React Query on the client.
---
# ANIMATIONS
* **Page Transitions:** Subtle fade-in/out transitions between pages, using CSS transitions or a library like `Framer Motion`.
* **Hover Effects:** Slight scale-up or color change on interactive elements like buttons and cards.
* **Loading States:** Use `Spinner` components or skeleton loaders while data is being fetched or models are processing. Animate status changes (e.g., PENDING -> PROCESSING -> COMPLETED).
* **Chart Animations:** Smooth transitions when data updates or charts are loaded.
* **Form Feedback:** Subtle animations for input validation success/error states.
---
# EDGE CASES & VALIDATIONS
* **Authentication:** Secure handling of sign-up, login, password reset, and session management. Prevent unauthorized access to data.
* **Data Upload:** Handle invalid file formats, empty files, files with missing required columns (timestamp, numerical), extremely large files (show progress, handle errors gracefully), duplicate file names.
* **Forecasting:**
* **Empty Datasets:** Prevent forecast job creation if dataset is empty or has insufficient data.
* **Invalid Column Selection:** Ensure target and timestamp columns are valid.
* **Non-numeric Target Column:** Throw error during job creation or processing.
* **Invalid Timestamp Format:** Attempt to parse common formats, otherwise flag as an error.
* **Model Execution Errors:** Catch errors during TimesFM execution, update job status to 'FAILED', log the error details for debugging, and provide user feedback.
* **Zero/Negative Horizon:** Validate forecast horizon input.
* **API Errors:** Implement consistent error handling across all API routes. Return meaningful error messages and status codes.
* **Empty States:** Design informative empty states for the Dashboard, Datasets list, and Forecasts list (e.g., "No datasets uploaded yet. Click here to upload your first dataset.").
* **Rate Limiting:** Consider basic rate limiting on API endpoints to prevent abuse.
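Several of the forecasting checks above (empty dataset, invalid column selection, non-numeric target, invalid timestamp format) can run on parsed rows before a job is queued. The sketch below is one possible shape under stated assumptions: rows are already parsed into string-valued records, only ISO-8601-style timestamps are accepted (the spec's "common formats" would need more parsers), and the names `validateForecastColumns`, `parseTimestamp`, and `isNumeric` are hypothetical.

```typescript
// Assumption: each CSV row has been parsed into a column-name -> raw string map.
type Row = Record<string, string>;

interface ColumnCheck {
  ok: boolean;
  errors: string[];
}

// Accepts ISO-8601-style values such as "2024-01-01" or "2024-07-26T10:00:00Z".
// Other formats are treated as errors in this sketch.
function parseTimestamp(raw: string): Date | null {
  const trimmed = raw.trim();
  if (!/^\d{4}-\d{2}-\d{2}/.test(trimmed)) return null;
  const parsed = new Date(trimmed);
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

function isNumeric(raw: string): boolean {
  const trimmed = raw.trim();
  return trimmed !== "" && Number.isFinite(Number(trimmed));
}

function validateForecastColumns(
  rows: Row[],
  timestampColumn: string,
  targetColumn: string,
): ColumnCheck {
  if (rows.length === 0) {
    return { ok: false, errors: ["Dataset is empty."] };
  }
  const errors: string[] = [];
  const columns = Object.keys(rows[0]);
  for (const name of [timestampColumn, targetColumn]) {
    if (!columns.includes(name)) errors.push(`Column "${name}" not found.`);
  }
  // Skip the per-row checks when a selected column does not exist at all.
  if (errors.length > 0) return { ok: false, errors };

  rows.forEach((row, index) => {
    const timestamp = row[timestampColumn] ?? "";
    const target = row[targetColumn] ?? "";
    if (parseTimestamp(timestamp) === null) {
      errors.push(`Row ${index + 1}: unparseable timestamp "${timestamp}".`);
    }
    if (!isNumeric(target)) {
      errors.push(`Row ${index + 1}: non-numeric target value "${target}".`);
    }
  });
  return { ok: errors.length === 0, errors };
}
```

Running this at job creation surfaces problems immediately, while the same check at processing time can set the job to 'FAILED' with the collected messages.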
---
# SAMPLE DATA (for Mocking & Initial UI State)
**1. User Data:**
```json
{
"id": 1,
"name": "Alice Smith",
"email": "alice.smith@example.com",
"image": "https://example.com/avatar.jpg"
}
```
**2. Dataset Metadata:**
```json
[
{
"id": 101,
"userId": 1,
"name": "Q3 Sales Data",
"original_filename": "q3_sales.csv",
"storage_path": "s3://crystalforecasting-data/user1/q3_sales.csv",
"num_rows": 500,
"num_columns": 5,
"uploaded_at": "2024-07-26T10:00:00Z"
},
{
"id": 102,
"userId": 1,
"name": "Website Traffic - July",
"original_filename": "july_traffic.xlsx",
"storage_path": "s3://crystalforecasting-data/user1/july_traffic.xlsx",
"num_rows": 300,
"num_columns": 3,
"uploaded_at": "2024-07-25T15:30:00Z"
}
]
```
**3. Forecast Job List (with status):**
```json
[
{
"id": 501,
"datasetName": "Q3 Sales Data",
"targetColumn": "SalesAmount",
"timestampColumn": "SaleDate",
"status": "COMPLETED",
"forecast_horizon": 30,
"created_at": "2024-07-26T11:00:00Z",
"completed_at": "2024-07-26T11:05:00Z"
},
{
"id": 502,
"datasetName": "Website Traffic - July",
"targetColumn": "UniqueVisitors",
"timestampColumn": "Day",
"status": "PROCESSING",
"forecast_horizon": 15,
"created_at": "2024-07-26T14:00:00Z",
"completed_at": null
},
{
"id": 503,
"datasetName": "Q3 Sales Data",
"targetColumn": "UnitsSold",
"timestampColumn": "SaleDate",
"status": "FAILED",
"forecast_horizon": 30,
"created_at": "2024-07-26T16:00:00Z",
"completed_at": "2024-07-26T16:01:00Z"
}
]
```
**4. Forecast Results (for Job ID 501; first two points shown, with example quantile bounds):**
```json
[
  {
    "timestamp": "2024-09-01T00:00:00Z",
    "forecast_value": 15500.75,
    "forecast_lower_bound": 14800.50,
    "forecast_upper_bound": 16201.00
  },
  {
    "timestamp": "2024-09-02T00:00:00Z",
    "forecast_value": 15750.20,
    "forecast_lower_bound": 15000.10,
    "forecast_upper_bound": 16500.30
  }
]
```
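As an illustration of how result rows like the ones above might be serialized for the `GET /api/forecast-jobs/:id/export?format=csv` endpoint, here is a minimal sketch. The function names `forecastToCsv` and `escapeCsvField` and the choice of exported columns are assumptions; the spec does not fix the CSV layout.

```typescript
// Mirrors the fields shown in the sample forecast results.
interface ForecastPoint {
  timestamp: string;
  forecast_value: number;
  forecast_lower_bound?: number | null;
  forecast_upper_bound?: number | null;
}

// Assumed column order for the export.
const CSV_COLUMNS = [
  "timestamp",
  "forecast_value",
  "forecast_lower_bound",
  "forecast_upper_bound",
] as const;

// Quotes a field when it contains a comma, quote, or line break,
// doubling any embedded quotes (RFC 4180 style).
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Builds a CSV document with a header row; missing bounds become empty fields.
function forecastToCsv(points: ForecastPoint[]): string {
  const header = CSV_COLUMNS.join(",");
  const lines = points.map((point) =>
    CSV_COLUMNS.map((column) => {
      const value = point[column];
      return value === undefined || value === null ? "" : escapeCsvField(String(value));
    }).join(","),
  );
  return [header, ...lines].join("\r\n") + "\r\n";
}
```

A route handler could return this string with a `Content-Type: text/csv` header and a `Content-Disposition: attachment` header so the browser downloads it as a file.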
**5. Sample CSV Data (for upload):**
```csv
"SaleDate","UnitsSold","SalesAmount"
"2024-01-01",100,12000.50
"2024-01-02",110,13500.75
"2024-01-03",105,12800.00
...
```
**6. Sample Model Parameters (JSONB; `frequency` is `"D"` for daily, `"M"` for monthly, etc.):**
```json
{
  "context_length": 1024,
  "frequency": "D"
}
```