PROJECT OVERVIEW:
Disk Health is a proactive SaaS monitoring solution designed to prevent production server downtime caused by disk space exhaustion. It continuously monitors disk usage across multiple servers, identifies potential issues before they become critical, and provides actionable insights and automated solutions. The core value proposition is to save businesses time, prevent data loss, and ensure uninterrupted service by intelligently managing server storage.
TECH STACK:
- Framework: Next.js (App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- UI Components: shadcn/ui
- ORM: Drizzle ORM
- Database: PostgreSQL (or a suitable alternative such as SQLite for local development)
- Authentication: NextAuth.js (or Clerk/Supabase Auth)
- State Management: Zustand or React Context API
- Real-time Updates: WebSockets (e.g., Socket.IO or native WebSockets)
- Charting: Chart.js or Recharts
- API Layer: Server Actions and Route Handlers
- Deployment: Vercel or similar platform
- Monitoring Agent: A lightweight Go or Python agent to collect disk metrics from servers and send them to the backend.
DATABASE SCHEMA:
1. `users` table:
- `id` (UUID, Primary Key)
- `name` (VARCHAR)
- `email` (VARCHAR, Unique)
- `emailVerified` (TIMESTAMP)
- `image` (VARCHAR, optional)
- `createdAt` (TIMESTAMP)
- `updatedAt` (TIMESTAMP)
2. `accounts` table (for NextAuth.js integration):
- `id` (BIGINT, Primary Key)
- `userId` (UUID, Foreign Key to users.id)
- `type` (VARCHAR)
- `provider` (VARCHAR)
- `providerAccountId` (VARCHAR)
- `refresh_token` (TEXT, optional)
- `access_token` (TEXT, optional)
- `expires_at` (BIGINT, optional)
- `token_type` (VARCHAR, optional)
- `scope` (VARCHAR, optional)
- `id_token` (TEXT, optional)
- `session_state` (VARCHAR, optional)
3. `servers` table:
- `id` (UUID, Primary Key)
- `userId` (UUID, Foreign Key to users.id)
- `name` (VARCHAR, e.g., 'Production Web Server')
- `hostname` (VARCHAR, unique per user)
- `ipAddress` (VARCHAR)
- `agentPort` (INTEGER, Port where the monitoring agent is listening)
- `location` (VARCHAR, optional, e.g., 'New York', 'Frankfurt')
- `diskCapacityGB` (INTEGER, Total disk size in GB)
- `monitoringEnabled` (BOOLEAN, default: true)
- `createdAt` (TIMESTAMP)
- `updatedAt` (TIMESTAMP)
4. `diskMetrics` table:
- `id` (UUID, Primary Key)
- `serverId` (UUID, Foreign Key to servers.id)
- `timestamp` (TIMESTAMP)
- `usedPercent` (FLOAT, e.g., 95.5)
- `usedGB` (FLOAT)
- `availableGB` (FLOAT)
- `inodesUsedPercent` (FLOAT, optional)
- `inodesAvailable` (BIGINT, optional)
5. `alerts` table:
- `id` (UUID, Primary Key)
- `serverId` (UUID, Foreign Key to servers.id)
- `timestamp` (TIMESTAMP)
- `alertType` (VARCHAR, e.g., 'DISK_FULL', 'DISK_HIGH_USAGE', 'INODES_HIGH')
- `details` (JSONB, e.g., `{"path": "/var/log", "usage": "98%"}`)
- `status` (VARCHAR, one of 'triggered', 'resolved', 'acknowledged')
- `resolvedAt` (TIMESTAMP, optional)
6. `userSettings` table:
- `userId` (UUID, Primary Key, Foreign Key to users.id)
- `alertThresholdPercent` (INTEGER, default: 85)
- `notificationEmail` (VARCHAR, defaults to the user's `users.email`)
- `webhookUrl` (VARCHAR, optional)
- `autoResolveTimeMinutes` (INTEGER, optional)
- `updatedAt` (TIMESTAMP)
CORE FEATURES & USER FLOW:
1. **User Authentication & Onboarding:**
* **Flow:** User signs up via email/password or social login (Google, GitHub). After successful authentication, they are directed to a welcome/setup page.
* **Onboarding:** Users are prompted to add their first server by providing a server name, IP address, and the port for the monitoring agent. The system then provides a unique token or configuration details for setting up the agent on the target server.
* **Edge Case:** User already has an account; redirect to login.
2. **Server Registration & Agent Setup:**
* **Flow:** User navigates to 'Add Server'. They input server details (Name, IP, Agent Port). The system generates a configuration for the monitoring agent.
* **Agent Side:** The user installs a small agent (e.g., a Go binary or Python script) on their production server. This agent is configured with the provided token/details and runs periodically (e.g., every minute) to collect disk metrics.
* **Data Transmission:** The agent sends collected metrics (disk usage %, GB used, GB available) to the Disk Health backend API via HTTP POST requests.
* **Validation:** Backend validates incoming data, checks server registration, and stores metrics.
* **Edge Case:** Agent fails to connect or send data; mark server as 'unreachable' or 'stale' in the UI.
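The 'unreachable'/'stale' edge case above can be derived from the age of the last metric received for a server. The following is a minimal sketch, assuming the agent reports about once a minute (as suggested above); the function name and the cut-offs of 3 and 10 missed reports are illustrative assumptions, not part of the specification.

```typescript
// Freshness states; 'stale' and 'unreachable' mirror the wording of the edge case.
type Freshness = "ok" | "stale" | "unreachable";

// Assumed tuning values (not defined by the spec).
const REPORT_INTERVAL_MS = 60 * 1000; // agent reports roughly every minute
const STALE_AFTER_MISSED = 3; // 3 missed reports -> 'stale'
const UNREACHABLE_AFTER_MISSED = 10; // 10 missed reports -> 'unreachable'

// Hypothetical helper: classify a server by the age of its latest metric.
function classifyFreshness(lastMetricAt: Date | null, now: Date): Freshness {
  // A server that has never reported is treated as unreachable.
  if (lastMetricAt === null) return "unreachable";
  const ageMs = now.getTime() - lastMetricAt.getTime();
  if (ageMs >= UNREACHABLE_AFTER_MISSED * REPORT_INTERVAL_MS) return "unreachable";
  if (ageMs >= STALE_AFTER_MISSED * REPORT_INTERVAL_MS) return "stale";
  return "ok";
}
```

Because `now` is passed in rather than read inside the function, the same helper can run on the server when building the dashboard response or on the client when re-rendering a 'last checked' label.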
3. **Real-time Disk Monitoring Dashboard:**
* **Flow:** Authenticated users land on the main dashboard displaying all their registered servers. Each server entry shows its name, IP, current disk usage percentage (visually represented, e.g., a colored bar or gauge), and last reported status.
* **Details View:** Clicking a server expands to show more details: capacity, used/available GB, historical usage chart (last 24h, 7d, 30d), and recent alerts for that server.
* **Real-time Updates:** Disk usage data is pushed to the client via WebSockets, updating the dashboard in near real-time without requiring manual refresh.
* **Visuals:** Use clear color coding for disk usage (e.g., Green < 70%, Yellow 70-85%, Red > 85%).
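The color bands above translate directly into a small pure function. This sketch assumes the boundaries are read literally (70% and 85% both fall in the yellow band); the function and type names are illustrative.

```typescript
type UsageColor = "green" | "yellow" | "red";

// Maps a usage percentage to the color bands described above:
// green below 70, yellow from 70 to 85 inclusive, red above 85.
function usageColor(usedPercent: number): UsageColor {
  if (usedPercent > 85) return "red";
  if (usedPercent >= 70) return "yellow";
  return "green";
}
```

A component such as `ServerStatusIndicator` could map the returned value to whatever Tailwind classes the design settles on.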
4. **Alerting System:**
* **Flow:** The backend continuously analyzes incoming `diskMetrics`. If `usedPercent` exceeds the user-defined `alertThresholdPercent` (default 85%), an alert of type `DISK_HIGH_USAGE` is triggered.
* **Triggering:** A new record is created in the `alerts` table with status 'triggered'.
* **Notifications:** The system sends notifications via configured channels: primary email (`userSettings.notificationEmail`) and optionally a webhook (`userSettings.webhookUrl`).
* **Alert Details:** Notifications include server name, current usage, threshold, and a link to the dashboard.
* **Resolution:** If disk usage stays below a recovery threshold (e.g., 75%) for a sustained period, or if the user resolves the alert manually, the alert status is updated to 'resolved'.
* **Edge Case:** Alert storms (too many rapid triggers/resolutions) should be handled (e.g., debouncing or grouping).
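One way to combine the trigger, resolution, and alert-storm rules above is a small state machine with hysteresis: trigger once when usage exceeds the alert threshold, then resolve only after several consecutive readings below the recovery threshold. This is a sketch under stated assumptions: the `AlertState` shape, the `sustainedReadings` setting, and the function name are hypothetical, and "sustained period" is approximated by a count of readings rather than wall-clock time.

```typescript
interface AlertState {
  active: boolean; // an unresolved DISK_HIGH_USAGE alert exists
  belowRecoveryCount: number; // consecutive readings below the recovery threshold
}

interface AlertConfig {
  alertThresholdPercent: number; // from userSettings.alertThresholdPercent
  recoveryThresholdPercent: number; // assumed setting, e.g. 75
  sustainedReadings: number; // assumed: readings required before resolving
}

type AlertAction = "trigger" | "resolve" | "none";

// Pure transition function: takes the previous state and one reading,
// returns the next state and what the caller should do.
function evaluateReading(
  state: AlertState,
  usedPercent: number,
  cfg: AlertConfig
): { state: AlertState; action: AlertAction } {
  if (!state.active) {
    if (usedPercent > cfg.alertThresholdPercent) {
      return { state: { active: true, belowRecoveryCount: 0 }, action: "trigger" };
    }
    return { state, action: "none" };
  }
  if (usedPercent < cfg.recoveryThresholdPercent) {
    const count = state.belowRecoveryCount + 1;
    if (count >= cfg.sustainedReadings) {
      return { state: { active: false, belowRecoveryCount: 0 }, action: "resolve" };
    }
    return { state: { active: true, belowRecoveryCount: count }, action: "none" };
  }
  // Still above the recovery threshold: reset the streak and do not re-trigger.
  return { state: { active: true, belowRecoveryCount: 0 }, action: "none" };
}
```

Because an active alert never produces a second "trigger", usage that hovers around the threshold yields one alert rather than a burst of them. The caller would insert into `alerts` on "trigger" and set `resolvedAt` on "resolve".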
5. **Issue Identification & Analysis:**
* **Flow:** Within the server details view, users can trigger a "Scan Large Files" action. This may require running a command on the server through the agent, which is a security risk unless handled carefully; alternatively, the agent could report the top N files/directories where the OS allows it. The backend can also analyze historical metric patterns to suggest likely causes (e.g., rapid growth correlating with log file sizes).
* **MVP Simplification:** For MVP, focus on identifying the top N largest directories based on data readily available from the agent or OS commands executed securely. Present a sorted list (e.g., `du -sh /* | sort -rh | head -n 10`).
* **Output:** Display a list of top directories/files contributing to high disk usage.
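If the agent returns the raw text of a `du -sh` run, the backend needs to turn it into a sorted list before display. The sketch below assumes GNU-style output (a human-readable size, a tab, then the path, with binary unit suffixes K/M/G/T/P); the function names and the `DirUsage` shape are hypothetical.

```typescript
interface DirUsage {
  path: string;
  bytes: number;
}

// Binary multipliers for the suffixes `du -h` prints.
const UNIT_FACTORS: Record<string, number> = {
  "": 1,
  K: 1024,
  M: 1024 * 1024,
  G: 1024 * 1024 * 1024,
  T: 1024 * 1024 * 1024 * 1024,
  P: 1024 * 1024 * 1024 * 1024 * 1024,
};

// Converts a size such as "1.5G" or "200M" to bytes; returns null if unparseable.
function parseHumanSize(text: string): number | null {
  const match = /^(\d+(?:\.\d+)?)([KMGTP]?)$/.exec(text.trim());
  if (!match) return null;
  return Math.round(parseFloat(match[1]) * UNIT_FACTORS[match[2]]);
}

// Parses "<size>\t<path>" lines, skips anything else, and returns the N largest.
function parseDuOutput(output: string, topN: number = 10): DirUsage[] {
  const rows: DirUsage[] = [];
  for (const line of output.split("\n")) {
    const tab = line.indexOf("\t");
    if (tab === -1) continue;
    const bytes = parseHumanSize(line.slice(0, tab));
    const path = line.slice(tab + 1).trim();
    if (bytes === null || path === "") continue;
    rows.push({ path, bytes });
  }
  return rows.sort((a, b) => b.bytes - a.bytes).slice(0, topN);
}
```

Sorting again on the backend means the result does not depend on the agent having piped through `sort -rh`, and unparseable lines are dropped rather than surfaced to the UI.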
6. **User Settings & Management:**
* **Flow:** Users can access a 'Settings' page to manage their profile, notification preferences (thresholds, email, webhook), and registered servers.
* **Server Management:** Add new servers, edit existing server details, or delete servers. Deleting a server stops monitoring and removes its associated metrics and alerts.
* **Account Management:** Update profile information.
API & DATA FETCHING:
- **`/api/servers` (POST):** Register a new server. Request body: `{ name: string, ipAddress: string, agentPort: number }`. Response: Newly created server object or error.
- **`/api/servers/[id]` (PUT):** Update server details. Request body: `{ name?: string, ipAddress?: string, monitoringEnabled?: boolean }`. Response: Updated server object.
- **`/api/servers/[id]` (DELETE):** Delete a server.
- **`/api/metrics` (POST, from Agent):** Endpoint for the agent to send metrics. Request body: `{ serverId: string, timestamp: string, usedPercent: number, usedGB: number, availableGB: number }`. Backend logic triggers alert checks here.
- **`/api/dashboard/servers` (GET):** Fetch list of all servers and their latest `diskMetrics` for the dashboard. Use Server Actions or Route Handlers.
- **`/api/servers/[id]/metrics/history` (GET):** Fetch historical `diskMetrics` for a specific server for charting. Query parameters for time range (e.g., `?range=24h`).
- **`/api/alerts` (GET):** Fetch alerts for the logged-in user, with filtering options (e.g., `?status=triggered`).
- **`/api/settings` (GET/PUT):** Fetch and update user settings.
- **WebSocket Endpoint (`/ws`):** Handles real-time data push. Server pushes new `diskMetrics` for relevant servers to connected clients.
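Since `/api/metrics` accepts data from agents running outside the application, its request body should be checked before anything is stored. The following is a dependency-free sketch of such a check for the body shape listed above; the helper names and the error messages are illustrative, and a schema library could fill the same role.

```typescript
// Shape of the body the agent POSTs to /api/metrics (fields as listed above).
interface MetricPayload {
  serverId: string;
  timestamp: string;
  usedPercent: number;
  usedGB: number;
  availableGB: number;
}

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && isFinite(value);
}

// Hypothetical helper: returns a list of problems; an empty list means the body is acceptable.
function metricPayloadErrors(body: unknown): string[] {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return ["body must be a JSON object"];
  }
  const fields = body as Record<string, unknown>;
  const errors: string[] = [];

  const serverId = fields["serverId"];
  if (typeof serverId !== "string" || serverId.length === 0) {
    errors.push("serverId must be a non-empty string");
  }
  const timestamp = fields["timestamp"];
  if (typeof timestamp !== "string" || isNaN(Date.parse(timestamp))) {
    errors.push("timestamp must be a parseable date string");
  }
  const usedPercent = fields["usedPercent"];
  if (!isFiniteNumber(usedPercent) || usedPercent < 0 || usedPercent > 100) {
    errors.push("usedPercent must be a number between 0 and 100");
  }
  for (const key of ["usedGB", "availableGB"]) {
    const value = fields[key];
    if (!isFiniteNumber(value) || value < 0) {
      errors.push(key + " must be a non-negative number");
    }
  }
  return errors;
}

// Type guard built on the check above.
function isMetricPayload(body: unknown): body is MetricPayload {
  return metricPayloadErrors(body).length === 0;
}
```

A Route Handler could respond with HTTP 400 and the returned messages when the list is non-empty. Confirming that `serverId` belongs to a registered server is a separate database lookup and is not covered here.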
COMPONENT BREAKDOWN (Next.js App Router Structure):
- **`app/layout.tsx`:** Root layout (includes Providers and the `<html>`/`<body>` shell; page metadata is set via the `metadata` export).
- **`app/page.tsx`:** Landing Page (if applicable before login).
- **`app/(auth)/login/page.tsx`:** Login page.
- **`app/(auth)/signup/page.tsx`:** Signup page.
- **`app/dashboard/layout.tsx`:** Authenticated dashboard layout (e.g., includes sidebar, header).
- **`app/dashboard/page.tsx`:** Main Dashboard - Lists all servers with status overview. Fetches data via `/api/dashboard/servers` (SSR/RSC) and updates via WebSocket.
- `components/ServerList/ServerListItem.tsx`: Displays individual server summary.
- `components/ServerList/ServerStatusIndicator.tsx`: Visual status indicator.
- `components/Charts/DiskUsageChart.tsx`: Renders historical usage chart.
- **`app/dashboard/servers/[serverId]/page.tsx`:** Server Detail Page - Shows detailed metrics, historical chart, alerts, and file analysis options for a single server. Fetches data via `/api/servers/[id]/metrics/history` and `/api/alerts`.
- `components/Charts/DetailedDiskUsageChart.tsx`: More detailed chart.
- `components/Alerts/AlertsTable.tsx`: Table displaying alerts for the server.
- `components/ServerManagement/ScanFilesButton.tsx`: Button to trigger file scan.
- **`app/settings/page.tsx`:** User Settings Page - Forms for profile, alert thresholds, notification preferences.
- `components/Settings/ProfileForm.tsx`
- `components/Settings/AlertSettingsForm.tsx`
- `components/Settings/NotificationForm.tsx`
- **`app/servers/new/page.tsx`:** Add New Server Page - Form for registering a new server.
- **`components/ui/Button.tsx`, `components/ui/Input.tsx`, etc.:** Reusable UI components from shadcn/ui.
- **`components/Layout/Sidebar.tsx`, `components/Layout/Navbar.tsx`:** Navigation components.
- **`lib/hooks/useWebSocket.ts`:** Custom hook for managing WebSocket connection and data.
- **`lib/utils/validators.ts`:** Input validation functions.
- **`app/api/.../route.ts`:** API Route Handlers. Server Actions are defined separately, in functions or files marked `'use server'`.
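The state-update part of `useWebSocket` can be kept as a pure function, which makes it usable from either Zustand or React Context and easy to test without a socket. This is a sketch only: the `DashboardState` shape (latest metric keyed by `serverId`) and the function name are assumptions, and the message shape mirrors the `diskMetrics` fields.

```typescript
interface LatestMetric {
  serverId: string;
  timestamp: string; // ISO 8601, as sent by the agent
  usedPercent: number;
  usedGB: number;
  availableGB: number;
}

// Assumed client-side state: the most recent metric per server.
type DashboardState = Record<string, LatestMetric>;

// Merges one pushed metric into the state without mutating it.
// Messages that are not newer than what is already held are ignored,
// so out-of-order delivery cannot roll the dashboard backwards.
function applyMetricUpdate(state: DashboardState, incoming: LatestMetric): DashboardState {
  const current = state[incoming.serverId];
  if (current !== undefined && Date.parse(current.timestamp) >= Date.parse(incoming.timestamp)) {
    return state;
  }
  return { ...state, [incoming.serverId]: incoming };
}
```

Returning the same object when nothing changed lets React or Zustand skip a re-render for stale messages.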
UI/UX DESIGN & VISUAL IDENTITY:
- **Style:** Clean, Modern, Professional.
- **Color Palette:** Primary: `#4F46E5` (Indigo-600), Secondary: `#6366F1` (Indigo-500), Accent: `#2563EB` (Blue-600). Neutrals: `#F3F4F6` (Gray-100) for backgrounds, `#1F2937` (Gray-800) for dark text, `#6B7280` (Gray-500) for secondary text. Alert Colors: Warning `#F59E0B` (Amber-500), Danger `#EF4444` (Red-500).
- **Typography:** Inter or Roboto font family. Clear hierarchy with distinct heading sizes (e.g., `h1`, `h2`, `h3`).
- **Layout:** Responsive grid system (Tailwind CSS). Dashboard uses a sidebar navigation. Cards for server summaries. Clear data tables. Generous whitespace.
- **Visual Elements:** Subtle gradients on buttons or headers. Use of icons (e.g., from lucide-react) for clarity. Gauges or progress bars for disk usage. Line charts for historical data.
- **Responsiveness:** Mobile-first approach, ensuring usability on all screen sizes. Sidebar collapses into a hamburger menu on smaller screens.
ANIMATIONS:
- **Transitions:** Smooth `transition` and `ease-in-out` for hover effects on buttons and interactive elements (Tailwind CSS defaults).
- **Loading States:** Use `react-loading-skeleton` or shadcn/ui's `Skeleton` component for data fetching placeholders. Subtle fade-in animations for content appearing.
- **Chart Animations:** Default animations provided by Chart.js/Recharts, or disable for performance if needed.
- **WebSocket Updates:** Smoothly animate the updating values on gauges or charts when new data arrives.
EDGE CASES:
- **No Servers Registered:** Display a clear call-to-action on the dashboard prompting the user to add their first server.
- **Agent Not Connected/Stale Data:** Clearly indicate servers with missing or old data (e.g., a 'warning' icon or 'greyed out' status). Set a TTL on metrics and display a 'last checked' timestamp.
- **Authentication Errors:** Gracefully handle expired sessions, redirect to login. Prevent access to protected routes without authentication.
- **API Errors:** Implement global error handling for API requests. Display user-friendly messages (e.g., 'Failed to load server list. Please try again.'). Use React Context or a global state manager for error display.
- **Validation:** Implement robust client-side and server-side validation for all form inputs (server details, settings).
- **High Load:** Ensure the backend can handle a large number of metric updates per second, especially if many users have many servers. Consider database indexing and potentially message queues for processing.
- **Alert Thresholds:** Allow users to set flexible thresholds (e.g., 50% to 99%).
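For the threshold range above, server-side handling might normalize whatever the settings form submits into an integer within the allowed bounds. In this sketch, the 50 to 99 bounds and the default of 85 come from this document; the function name and the decision to clamp out-of-range values rather than reject them are assumptions.

```typescript
const MIN_THRESHOLD_PERCENT = 50;
const MAX_THRESHOLD_PERCENT = 99;
const DEFAULT_THRESHOLD_PERCENT = 85; // default of userSettings.alertThresholdPercent

// Accepts a number or numeric string and returns an integer in [50, 99].
// Missing or non-numeric input falls back to the default.
function normalizeThreshold(input: unknown): number {
  let value: number;
  if (typeof input === "number") {
    value = input;
  } else if (typeof input === "string" && input.trim() !== "") {
    value = Number(input);
  } else {
    return DEFAULT_THRESHOLD_PERCENT;
  }
  if (!isFinite(value)) return DEFAULT_THRESHOLD_PERCENT;
  // The column is an INTEGER, so round before clamping.
  return Math.min(MAX_THRESHOLD_PERCENT, Math.max(MIN_THRESHOLD_PERCENT, Math.round(value)));
}
```

If silent clamping is considered surprising, the same bounds can drive a validation error instead.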
SAMPLE DATA (for `diskMetrics` and `alerts`):
**`diskMetrics` Examples:**
1. `{ "serverId": "uuid-server-1", "timestamp": "2023-10-27T10:00:00Z", "usedPercent": 75.2, "usedGB": 150.4, "availableGB": 49.6 }`
2. `{ "serverId": "uuid-server-1", "timestamp": "2023-10-27T10:01:00Z", "usedPercent": 75.3, "usedGB": 150.6, "availableGB": 49.4 }`
3. `{ "serverId": "uuid-server-2", "timestamp": "2023-10-27T10:00:30Z", "usedPercent": 91.5, "usedGB": 366.0, "availableGB": 34.0 }`
4. `{ "serverId": "uuid-server-2", "timestamp": "2023-10-27T10:01:30Z", "usedPercent": 91.8, "usedGB": 367.2, "availableGB": 32.8 }`
5. `{ "serverId": "uuid-server-3", "timestamp": "2023-10-27T10:02:00Z", "usedPercent": 45.0, "usedGB": 90.0, "availableGB": 110.0 }`
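The sample values above are internally consistent if `usedPercent` is taken as used space divided by used plus available space, rounded to one decimal place; for example, 150.4 / (150.4 + 49.6) = 75.2%. The helper below sketches that derivation. Its name is illustrative, and whether the agent computes the percentage this way or reads it from the OS is an assumption.

```typescript
// Derives a usage percentage (one decimal place) from used and available GB.
// Returns 0 when both figures are zero to avoid dividing by zero.
function usedPercentFrom(usedGB: number, availableGB: number): number {
  const total = usedGB + availableGB;
  if (total <= 0) return 0;
  return Math.round((usedGB / total) * 1000) / 10;
}
```

The backend could use the same calculation as a sanity check on incoming metrics, flagging payloads whose reported percentage disagrees with the GB figures.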
**`alerts` Examples:**
1. `{ "id": "uuid-alert-1", "serverId": "uuid-server-2", "timestamp": "2023-10-27T10:01:30Z", "alertType": "DISK_HIGH_USAGE", "details": {"path": "/", "usage": "91.8%"}, "status": "triggered", "resolvedAt": null }`
2. `{ "id": "uuid-alert-2", "serverId": "uuid-server-2", "timestamp": "2023-10-27T10:05:00Z", "alertType": "DISK_HIGH_USAGE", "details": {"path": "/", "usage": "88.0%"}, "status": "resolved", "resolvedAt": "2023-10-27T10:05:00Z" }`
3. `{ "id": "uuid-alert-3", "serverId": "uuid-server-1", "timestamp": "2023-10-27T09:55:00Z", "alertType": "DISK_HIGH_USAGE", "details": {"path": "/data", "usage": "86.1%"}, "status": "triggered", "resolvedAt": null }`
TURKISH TRANSLATIONS:
- App Title: Disk Sağlığı
- Dashboard: Pano
- Servers: Sunucular
- Add Server: Sunucu Ekle
- Settings: Ayarlar
- Server Name: Sunucu Adı
- IP Address: IP Adresi
- Disk Usage: Disk Kullanımı
- Capacity: Kapasite
- Used: Kullanılan
- Available: Mevcut
- Status: Durum
- Alerts: Uyarılar
- Triggered: Tetiklendi
- Resolved: Çözüldü
- Monitoring Enabled: İzleme Etkin
- Last Checked: Son Kontrol
- High Disk Usage Threshold: Yüksek Disk Kullanım Eşiği
- Notification Settings: Bildirim Ayarları
- Email: E-posta
- Webhook URL: Webhook URL'si
- Scan Large Files: Büyük Dosyaları Tara
- Analysis: Analiz
- System Storage Insufficient: Sistem Depolama Alanı Yetersiz
- Unexpected Reply: Beklenmedik Yanıt
- History: Geçmiş
- Real-time: Gerçek Zamanlı
- Log Files: Günlük Dosyaları
- Disk Full: Disk Dolu
- System Maintenance: Sistem Bakımı
- Proactive Monitoring: Proaktif İzleme
- Prevent Downtime: Kesintiyi Önle
- Save Changes: Değişiklikleri Kaydet
- Update Server: Sunucuyu Güncelle
- Delete Server: Sunucuyu Sil
- Are you sure?: Emin misiniz?
- Server added successfully: Sunucu başarıyla eklendi.
- Settings updated: Ayarlar güncellendi.
- No servers found. Add one to get started!: Henüz sunucu bulunamadı. Başlamak için bir tane ekleyin!
- Agent configuration copied to clipboard. Install it on your server!: Agent yapılandırması panoya kopyalandı. Sunucunuza kurun!
- An error occurred: Bir hata oluştu.
- Login to continue: Devam etmek için giriş yapın.
- Connect your server: Sunucunuzu Bağlayın
- Get Started: Başlayın
- Disk Health Monitoring: Disk Sağlığı İzleme
- Monitor your servers' disk space proactively and prevent downtime.: Sunucularınızın disk alanını proaktif olarak izleyin ve kesintileri önleyin.
- Features: Özellikler
- Pricing: Fiyatlandırma
- Login: Giriş Yap
- Sign Up: Kaydol
- Server Details: Sunucu Detayları
- Usage Over Time: Zaman İçindeki Kullanım
- Alert History: Uyarı Geçmişi
- Top Directories: En Büyük Dizinler
- Notification Preferences: Bildirim Tercihleri
- Profile: Profil
- Logout: Çıkış Yap
- Hourly: Saatlik
- Daily: Günlük
- Weekly: Haftalık
- Monthly: Aylık
- Custom Range: Özel Aralık
- Bytes: Bayt
- Gigabytes: Gigabayt
- Percentage: Yüzde
- System Overview: Sistem Genel Bakışı
- Real-time Metrics: Gerçek Zamanlı Metrikler
- Automated Alerts: Otomatik Uyarılar
- File System Analysis: Dosya Sistemi Analizi
- Secure Connection: Güvenli Bağlantı
- User-friendly Interface: Kullanıcı Dostu Arayüz
- Affordable Plans: Uygun Fiyatlı Planlar
- Start Free Trial: Ücretsiz Denemeyi Başlat