PROJECT OVERVIEW:
The primary goal of Lemonade OS is to democratize powerful local Large Language Models (LLMs) through a user-friendly, performant, and private desktop application. It builds on the speed and efficiency of AMD's Lemonade backend, letting users run advanced AI models (GPT-4-class chat models, coding assistants, and image generators) directly on their own PCs (Windows, macOS, Linux) with minimal setup. The core value proposition is a 'local-first' AI experience that is fast, open source, private by default, and easy to integrate with hundreds of existing AI applications through an OpenAI-compatible API. By eliminating the complexity of LLM setup and management, it makes cutting-edge AI accessible to developers, researchers, and enthusiasts on their personal hardware.
TECH STACK:
- Frontend Framework: React, scaffolded with Vite (preferred; Create React App still works but is deprecated and slower to build)
- UI Library: Tailwind CSS for utility-first styling and rapid UI development
- State Management: Zustand or Jotai for efficient and simple global state management, suitable for a single-page application.
- Routing: React Router DOM for handling navigation within the SPA.
- Core Logic: JavaScript/TypeScript for application logic, interacting with a native backend (assumed to be Lemonade by AMD).
- Backend Communication: Fetch API or Axios for making requests to the local Lemonade server.
- Icons: Heroicons or similar for a clean, consistent icon set.
- Build Tools: Vite (preferred for speed) or Webpack.
CORE FEATURES:
1. **One-Click Installer & Auto-Configuration**:
* **User Flow**: Upon launching the application for the first time, the user is presented with a simple "Install" or "Get Started" button. Clicking this initiates the download and installation of the necessary Lemonade backend components and dependencies. The application automatically detects the user's hardware (CPU, GPU, NPU) and optimally configures the Lemonade server settings (e.g., `--no-mmap`, appropriate context sizes, GPU offloading). If dependencies like NVIDIA drivers or specific libraries are missing, the installer attempts to guide the user or provide download links.
* **Details**: This involves downloading a lightweight (~2MB) native executable for the Lemonade backend and any required drivers/libraries. Configuration is done by writing to a local config file or directly via command-line arguments when starting the server.
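The auto-configuration step can be sketched as a pure mapping from detected hardware to launch flags. The flag names below (`--gpu-layers`, `--ctx-size`) are llama.cpp-style assumptions for illustration, not the documented Lemonade CLI:

```javascript
// Hypothetical mapping from detected hardware to server launch flags.
// Flag names are assumptions, not the real Lemonade CLI.
function buildServerArgs({ vramMB, ramMB, hasGpu }) {
  const args = ['--port', '8080'];
  if (hasGpu && vramMB >= 8000) {
    args.push('--gpu-layers', 'max'); // offload as much as VRAM allows
  } else {
    args.push('--gpu-layers', '0'); // CPU-only fallback for weak/absent GPUs
  }
  // Keep the context modest on low-RAM machines to avoid OOM at startup.
  args.push('--ctx-size', ramMB >= 16000 ? '8192' : '4096');
  return args;
}
```

The same function doubles as the source of truth for the "Quick Settings" defaults shown later.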
2. **Model Management & Download**:
* **User Flow**: Within the application, there's a "Models" section. Users can browse a curated list of compatible LLMs (e.g., `gpt-oss-120b`, `Qwen-Coder-Next`, Stable Diffusion variants) with descriptions, VRAM requirements, and performance benchmarks. A "Download" button initiates the download process, showing progress and status. Users can also manually add models by providing a download URL or path to a local file.
* **Details**: This requires an API endpoint or a local JSON file that lists available models. Downloads should be robust, supporting resumable downloads. Model metadata (name, size, type, compatibility) needs to be stored locally.
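Resumable downloads hinge on HTTP Range requests: ask the server for `bytes=N-` and append the response to the partial file. A minimal sketch using the Fetch API (`buildRangeHeader` and `downloadModel` are hypothetical helpers, not part of any Lemonade API):

```javascript
// Build the Range header for resuming a partial download. An open-ended
// range asks the server for everything after the given byte offset.
function buildRangeHeader(bytesAlreadyDownloaded) {
  return bytesAlreadyDownloaded > 0
    ? { Range: `bytes=${bytesAlreadyDownloaded}-` }
    : {};
}

async function downloadModel(url, bytesAlreadyDownloaded, onProgress) {
  const res = await fetch(url, { headers: buildRangeHeader(bytesAlreadyDownloaded) });
  // 206 Partial Content means the server honored the Range; 200 means a full restart.
  const resuming = res.status === 206;
  const total =
    Number(res.headers.get('Content-Length') || 0) +
    (resuming ? bytesAlreadyDownloaded : 0);
  let received = resuming ? bytesAlreadyDownloaded : 0;
  const reader = res.body.getReader();
  const chunks = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.length;
    if (total) onProgress(Math.round((received / total) * 100));
  }
  return chunks; // caller appends these to the partial file on disk
}
```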
3. **Local LLM Server Control**:
* **User Flow**: A central "Server" tab allows users to start, stop, and restart the Lemonade backend server. Users can view the server status (Running/Stopped/Error), access the server logs in real-time, and see the configured parameters (like port number, API endpoint, loaded model). A "Quick Settings" option allows tweaking basic parameters like RAM allocation or context size without deep diving.
* **Details**: This involves executing shell commands to start/stop the native Lemonade executable and stream its output (logs) to the frontend. It also requires managing the server process lifecycle.
4. **OpenAI API Compatible Interface**:
* **User Flow**: Once the server is running, the application displays the local API endpoint (e.g., `http://localhost:8080/v1`). Users can copy this endpoint with a button click. Any application that supports the OpenAI API standard (like Open WebUI, n8n, etc.) can be configured to use this local endpoint, enabling private, offline AI interactions.
* **Details**: The Lemonade backend itself provides this compatibility. The OS app just needs to clearly display the endpoint and potentially offer easy integration wizards for popular companion apps.
5. **Integrated Chat Interface**:
* **User Flow**: A simple, built-in chat interface allows users to immediately test their loaded LLM. Users type messages, send them to the local server via the API, and display the LLM's response. This serves as a quick way to verify the setup and experiment with the model's capabilities (e.g., text generation, Q&A, simple tool use).
* **Details**: This is a standard chat UI component. It sends POST requests to the `/chat/completions` endpoint of the local Lemonade API.
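A minimal sketch of that request, assuming the OpenAI response shape (`choices[0].message`); `model` is whatever the local server has loaded:

```javascript
// Sketch of the request the built-in chat UI sends to the local server.
// Assumes an OpenAI-shaped response; error handling is deliberately minimal.
async function sendChatMessage(apiEndpoint, model, messages) {
  const res = await fetch(`${apiEndpoint}/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages }),
  });
  if (!res.ok) throw new Error(`Chat request failed with status ${res.status}`);
  const data = await res.json();
  // OpenAI-style responses nest the reply under choices[0].message.
  return data.choices[0].message;
}
```

The returned `{ role, content }` object can then be appended to the chat state (e.g. via the store's `addChatMessage`).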
6. **Image Generation Preview**:
* **User Flow**: If an image generation model (like Stable Diffusion) is loaded, a dedicated section allows users to input prompts (e.g., "A pitcher of lemonade in the style of a renaissance painting"). Clicking "Generate" sends the prompt to the LLM server and displays the resulting image. Basic parameters like image size or steps could be adjustable.
* **Details**: This requires a separate API endpoint (likely `/images/generations` or similar, mimicking OpenAI's structure if possible, or a custom one) on the Lemonade backend. The frontend displays the image URL or base64 data.
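If the backend mirrors OpenAI's response shape, each image entry carries either a `url` or a `b64_json` payload. A small hypothetical helper to turn either into an `<img src>` value without touching disk:

```javascript
// Convert an OpenAI-style image entry ({ url } or { b64_json }) into a
// value usable directly as an <img src>. Assumes PNG output.
function imageResponseToSrc(imageEntry) {
  if (imageEntry.url) return imageEntry.url;
  // A data URL lets the frontend render base64 payloads in-memory.
  return `data:image/png;base64,${imageEntry.b64_json}`;
}
```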
UI/UX DESIGN:
- **Layout**: Single-Page Application (SPA) layout. A persistent sidebar on the left contains navigation for: Dashboard (Overview, Server Status), Models (Browse, Download, Manage), Chat, Image Generation, Settings, and Community Links (GitHub, Discord). The main content area dynamically updates based on the selected navigation item.
- **Color Palette**: A clean, modern, and slightly tech-focused palette. Primary: Dark charcoal or deep blue (#1A202C or #2D3748). Accent: A bright, energetic color like a vibrant cyan (#00FFFF) or a refreshing lemonade yellow (#FFDA63) for interactive elements, buttons, and highlights. Secondary: Light grays (#E2E8F0, #A0AEC0) for text and borders. Use subtle gradients for backgrounds.
- **Typography**: Sans-serif fonts for readability. Headings: Inter or Poppins (Semi-bold). Body Text: Inter or Roboto (Regular). Ensure good contrast and legible font sizes (e.g., 16px base). Use responsive font scaling.
- **Responsive Design**: Mobile-first approach. Sidebar collapses into a hamburger menu on smaller screens. Main content adjusts to fit the viewport. Ensure all elements are touch-friendly and usable on various screen sizes (desktop, tablet, mobile).
- **Interactions**: Smooth transitions between sections. Clear visual feedback on button clicks and hover states. Loading indicators (spinners, progress bars) are crucial for download and generation processes.
COMPONENT BREAKDOWN:
- `App.jsx`: Main application component, sets up routing and overall layout.
- `Sidebar.jsx`: Navigation menu component. Receives `activeItem` prop, renders links, handles clicks. Uses `Link` from `react-router-dom`.
- `Dashboard.jsx`: Displays server status, quick stats (loaded model, VRAM usage), quick links.
- `ServerStatusCard.jsx`: Shows server connection status, API endpoint. Uses `useQuery` or similar to poll status.
- `QuickSettings.jsx`: Simple form for adjusting context size, etc.
- `ModelsPage.jsx`: Manages model browsing, downloading, and local management.
- `ModelList.jsx`: Renders a list of available models. Receives `models` array and `onDownload` callback.
- `ModelListItem.jsx`: Displays individual model info (name, size, description), download button. Receives `model` object and `onDownload` function.
- `DownloadProgress.jsx`: Shows download progress bar and status.
- `ChatPage.jsx`: Interface for interacting with the LLM via chat.
- `ChatWindow.jsx`: Displays messages (user and AI). Receives `messages` array.
- `ChatInput.jsx`: Text input field and send button. Receives `onSendMessage` callback.
- `ImageGenerationPage.jsx`: Interface for generating images.
- `PromptInput.jsx`: Text area for image prompts.
- `ImagePreview.jsx`: Displays generated images. Receives `imageUrl` or `imageData`.
- `SettingsPage.jsx`: Application settings (e.g., theme, installation path).
- `Button.jsx`: Reusable button component with hover effects. Props: `onClick`, `children`, `variant`, `isLoading`.
- `Input.jsx`: Reusable input field component. Props: `value`, `onChange`, `placeholder`.
- `Modal.jsx`: Reusable modal/dialog component. Props: `isOpen`, `onClose`, `children`.
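Several of these components poll the local server, `ServerStatusCard.jsx` in particular. A framework-agnostic sketch of that polling loop; in React the returned cleanup function belongs in a `useEffect`, and `checkStatus` is a hypothetical callback (or is replaced outright by a polling `useQuery`):

```javascript
// Poll a status callback on an interval, reporting results via onStatus.
// Returns a cleanup function suitable for a React effect's teardown.
function startPolling(checkStatus, intervalMs, onStatus) {
  let cancelled = false;
  const tick = async () => {
    if (cancelled) return;
    try {
      onStatus(await checkStatus());
    } catch (err) {
      onStatus('error'); // an unreachable server counts as an error state
    }
  };
  tick(); // poll immediately so the UI isn't blank for one interval
  const timer = setInterval(tick, intervalMs);
  return () => {
    cancelled = true;
    clearInterval(timer);
  };
}
```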
DATA MODEL:
- **State Structure (using Zustand)**:
```javascript
// store.js
import { create } from 'zustand';
export const useStore = create((set) => ({
  // Server state
  serverStatus: 'stopped', // 'stopped' | 'starting' | 'running' | 'error'
  apiEndpoint: 'http://localhost:8080/v1', // default
  loadedModel: null,
  serverLogs: [],
  contextSize: 4096,
  gpuEnabled: true,
  setServerStatus: (status) => set({ serverStatus: status }),
  setApiEndpoint: (endpoint) => set({ apiEndpoint: endpoint }),
  setLoadedModel: (model) => set({ loadedModel: model }),
  // Keep only the most recent 50 log lines to bound memory use.
  addLog: (log) => set((state) => ({ serverLogs: [...state.serverLogs, log].slice(-50) })),

  // Model management
  availableModels: [], // array of model objects
  downloadingModel: null, // { name: string, progress: number }
  setAvailableModels: (models) => set({ availableModels: models }),
  setDownloadingModel: (modelInfo) => set({ downloadingModel: modelInfo }),

  // Chat state
  chatMessages: [],
  setChatMessages: (messages) => set({ chatMessages: messages }),
  addChatMessage: (message) => set((state) => ({ chatMessages: [...state.chatMessages, message] })),

  // Image generation state
  imagePrompt: '',
  generatedImages: [], // array of image data or URLs
  setImagePrompt: (prompt) => set({ imagePrompt: prompt }),
  addGeneratedImage: (image) => set((state) => ({ generatedImages: [image, ...state.generatedImages] })),
}));
```
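The updaters above rely on Zustand's contract that `set` shallow-merges partial state into the store, and accepts either an object or a function of the previous state. A minimal stand-in for `create` (illustrative only, not the real library) makes that contract explicit:

```javascript
// Minimal stand-in for zustand's create(): `set` shallow-merges partial
// state, and updaters may pass a function of the previous state.
// Not the real library; just the contract the store above depends on.
function createStore(init) {
  let state;
  const set = (partial) => {
    const next = typeof partial === 'function' ? partial(state) : partial;
    state = { ...state, ...next }; // shallow merge, like zustand's set()
  };
  const get = () => state;
  state = init(set, get);
  return { getState: get, setState: set };
}
```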
- **Mock Data Formats**:
* **Model Object**: `{ id: 'uuid-123', name: 'Qwen-Coder-Next', provider: 'HuggingFace', sizeGB: 25.5, vramRequiredMB: 28000, description: 'Advanced code generation model.', downloadUrl: 'http://example.com/models/qwen-coder.gguf', status: 'downloaded' | 'downloading' | 'available' | 'error', progress: 0 }`
* **Chat Message Object**: `{ id: 'msg-uuid', role: 'user' | 'assistant', content: 'Hello there!', timestamp: '2023-10-27T10:00:00Z' }`
* **Image Object**: `{ id: 'img-uuid', prompt: 'A renaissance pitcher of lemonade', url: 'blob:http://localhost:3000/generated-image-1.png', timestamp: '2023-10-27T10:05:00Z' }`
* **Server Log Entry**: `{ timestamp: '2023-10-27T09:59:00Z', message: '[INFO] Server started successfully on port 8080', level: 'info' | 'warn' | 'error' }`
ANIMATIONS & INTERACTIONS:
- **Page Transitions**: Subtle fade-in/fade-out or slide-in animations when navigating between sections using `react-router-dom` and libraries like `Framer Motion` or CSS transitions.
- **Button Hovers**: Slight scale-up or background color change on buttons when hovered.
- **Loading States**: Use spinners (`react-loader-spinner` or custom SVG) for downloads, model loading, and API requests. Progress bars for downloads. Skeleton loaders for lists while data is being fetched.
- **Micro-interactions**: Subtle animations on item expansion/collapse, successful action confirmations (e.g., a small checkmark animation), input field focus states.
- **Chat Bubbles**: Slight fade-in animation for new chat messages.
- **Server Status**: Visual cues (color changes: green for running, red for error, yellow for starting) and subtle pulsing animations for the 'Running' status indicator.
EDGE CASES:
- **Initial State**: When no models are downloaded, the Models page should show a clear message and a prominent "Download Models" button. The Chat and Image Generation pages should indicate that no model is loaded and guide the user to the Models section.
- **Server Errors**: If the Lemonade server fails to start or crashes, display a user-friendly error message in the Dashboard, attempt to provide a reason (from logs), and offer troubleshooting steps or a "Restart Server" option.
- **Download Failures**: Handle interrupted or failed model downloads gracefully. Allow users to retry. Display clear error messages.
- **API Errors**: Catch errors from the Lemonade API (e.g., invalid prompt, model incompatibilities, out-of-memory errors) and display them to the user within the relevant interface (Chat or Image Generation).
- **Hardware Incompatibility**: If the user's hardware is insufficient (e.g., low VRAM for desired models), provide clear warnings in the Models section. The auto-config should aim for stability.
- **Permissions**: Ensure the application has the necessary file system permissions for installation and model storage, guiding the user if issues arise.
- **Accessibility (a11y)**: Use semantic HTML, ensure keyboard navigability, provide ARIA attributes where necessary, sufficient color contrast, and focus management.
- **Network Issues**: For downloading models, handle potential network interruptions.
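Multi-gigabyte downloads make transient network failures routine, so wrapping the download (or any flaky call) in retry logic pays off. A hypothetical wrapper with exponential backoff:

```javascript
// Retry an async operation with exponential backoff: 500ms, 1s, 2s, ...
// Parameter names and defaults are illustrative choices.
async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Pairing this with the Range-based resume described under Model Management means a retry continues from the last received byte instead of starting over.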
SAMPLE DATA:
1. **Model List (API Response or Local JSON)**:
```json
[
{
"id": "model-gpt4-mini-gguf",
"name": "GPT-4 Mini (GGUF)",
"provider": "Community",
"sizeGB": 15.2,
"vramRequiredMB": 16000,
"description": "A powerful GPT-4 equivalent model optimized for local use. Requires significant VRAM.",
"downloadUrl": "https://huggingface.co/TheBloke/GPT-4-Mini-GGUF/resolve/main/gpt-4-mini.gguf",
"tags": ["chat", "text-generation", "openai-compatible"]
},
{
"id": "model-qwen-coder-next-gguf",
"name": "Qwen-Coder-Next (GGUF)",
"provider": "HuggingFace",
"sizeGB": 30.1,
"vramRequiredMB": 32000,
"description": "State-of-the-art code generation and completion model.",
"downloadUrl": "https://huggingface.co/Qwen/Qwen-Coder-Next-GGUF/resolve/main/qwen-coder-next.gguf",
"tags": ["code", "completion", "tool-use"]
},
{
"id": "model-sdxl-turbo-safetensors",
"name": "Stable Diffusion XL Turbo",
"provider": "StabilityAI",
"sizeGB": 6.2,
"vramRequiredMB": 8000,
"description": "Fast, high-quality text-to-image generation model.",
"downloadUrl": "https://huggingface.co/stabilityai/sdxl-turbo/resolve/main/sdxl_turbo.safetensors",
"tags": ["image-generation", "stable-diffusion"]
},
{
"id": "model-phi-2-gguf",
"name": "Microsoft Phi-2",
"provider": "Microsoft",
"sizeGB": 2.7,
"vramRequiredMB": 4000,
"description": "A small but capable model for reasoning and language understanding tasks.",
"downloadUrl": "https://huggingface.co/microsoft/phi-2-gguf/resolve/main/phi-2.gguf",
"tags": ["chat", "reasoning", "small-model"]
}
]
```
2. **Chat Message (User Input)**:
```json
{
"id": "user-msg-1701234567",
"role": "user",
"content": "Explain the concept of local LLMs in simple terms.",
"timestamp": "2023-11-29T15:09:27Z"
}
```
3. **Chat Message (Assistant Response - Mock)**:
```json
{
"id": "assistant-msg-1701234568",
"role": "assistant",
"content": "Local LLMs are large artificial intelligence language models that run directly on your personal computer's hardware, rather than connecting to a remote server. This offers benefits like enhanced privacy (your data stays local), potentially faster responses (no network latency), and offline usability.",
"timestamp": "2023-11-29T15:09:35Z"
}
```
4. **Image Generation Prompt**: "A steampunk cat wearing a top hat, digital art"
5. **Generated Image Data (Mock - could be URL or base64)**:
```json
{
"id": "img-gen-1701234600",
"prompt": "A steampunk cat wearing a top hat, digital art",
"url": "/api/images/generated/steampunk_cat_1.png", // Relative path or blob URL
"timestamp": "2023-11-29T15:10:00Z"
}
```
6. **Server Status Update (Log Entry)**:
```json
{
"timestamp": "2023-11-29T15:00:05Z",
"message": "[INFO] Lemonade server started. Listening on http://localhost:8080. Model loaded: Qwen-Coder-Next (GGUF).",
"level": "info"
}
```
7. **Error Log Entry**:
```json
{
"timestamp": "2023-11-29T15:05:15Z",
"message": "[ERROR] Failed to allocate memory for context. Consider increasing RAM or using a smaller model.",
"level": "error"
}
```
8. **Download Progress Update**: `{ modelId: 'model-gpt4-mini-gguf', progress: 75, status: 'downloading' }`
DEPLOYMENT NOTES:
- **Build Command**: `npm run build` works for both Vite and CRA; Vite emits static assets to `dist/`, CRA to `build/`.
- **Environment Variables**: Use `.env` files for configuration. With Vite, client-visible variables must be prefixed `VITE_` (read via `import.meta.env`); CRA uses the `REACT_APP_` prefix (e.g. `REACT_APP_API_BASE_URL`). For production, the bundled frontend is served statically and the backend runs as a separate native process.
- **Serving the Frontend**: The built React application (HTML, CSS, JS files) can be served by any static file server (like Nginx, Caddy, or even the Lemonade backend itself if it has static file serving capabilities). Ensure the server is configured to handle client-side routing correctly (e.g., redirect all non-file requests to `index.html`).
- **Native Backend Integration**: The core challenge is managing the native Lemonade backend process. The Electron framework could be considered if a truly integrated desktop app experience (with auto-updates, system tray, etc.) is desired, as it allows bundling a Node.js process with a React frontend. Alternatively, a simpler setup involves the user installing the React app (e.g., via a web server) and the Lemonade backend separately, with the React app just providing instructions and an interface to manage the standalone backend.
- **Performance Optimizations**: Code splitting (automatic with Vite/CRA), lazy loading components, memoization (React.memo, useMemo), efficient state management. Optimize image sizes for the UI. Ensure efficient polling for server status.
- **Packaging**: For a distributable app, consider tools like Electron Forge, Tauri, or packaging the static frontend with the native backend installer for each OS.
- **Update Strategy**: Plan for how both the frontend application and the Lemonade backend will be updated. Auto-updates (via Electron) or clear manual update instructions are necessary.