PROJECT OVERVIEW:
The application, named 'Kod Ustası: Yapay Zeka Kod Üretimi' (AI Code Generation Master), is a SaaS platform designed to significantly improve the code generation capabilities of Large Language Models (LLMs). It addresses the challenge highlighted in the Hacker News post 'Embarrassingly simple self-distillation improves code generation' (arXiv:2604.01193). Instead of relying on external verifiers, teacher models, or complex reinforcement learning, Kod Ustası implements a 'Simple Self-Distillation' (SSD) technique. Users can select a pre-trained LLM, configure specific temperature and truncation parameters for output sampling, and then fine-tune the model on its own generated code samples. This process enhances the model's performance, particularly on harder coding problems, by optimizing its token distribution for precision and exploration. The core value proposition is providing an accessible, efficient, and effective method for developers and AI researchers to boost their LLM's code generation accuracy without requiring deep ML expertise or extensive computational resources for training from scratch. The platform aims to democratize access to highly performant code-generating AI models.
TECH STACK:
- Frontend: React.js (using Vite for fast development)
- Styling: Tailwind CSS for rapid and utility-first UI development
- State Management: Zustand for efficient global state management
- Routing: React Router DOM for client-side navigation
- API Communication: Axios for making HTTP requests to a potential backend (initially mocked)
- UI Components: Headless UI for accessible and unstyled components, potentially Radix UI for more advanced needs.
- Animations: Framer Motion for smooth UI transitions and micro-interactions.
- Data Fetching: React Query (TanStack Query) for server state management and caching if a backend is introduced later.
- Forms: React Hook Form for efficient form handling and validation.
CORE FEATURES:
1. **Model Selection & Configuration:**
* **User Flow:** The user lands on the dashboard. They see a list of available pre-trained LLMs (e.g., Qwen3-30B-Instruct, Llama-4B, Llama-8B-Instruct). They click 'Select Model' next to their choice. A modal or dedicated section appears where they can adjust the 'Temperature' slider (e.g., 0.1 to 1.0) and 'Truncation' value (e.g., integer representing tokens, or a percentage). Default values based on the research paper will be provided. After configuration, they click 'Proceed to Training'.
* **Details:** This involves fetching a list of available models (initially mocked) and providing input fields/sliders for SSD parameters. Input validation will ensure parameters are within acceptable ranges.
2. **Self-Distillation Training:**
* **User Flow:** After configuring parameters, the user clicks 'Start Training'. A loading/progress indicator appears, showing the estimated time or the current stage of the fine-tuning process (e.g., 'Sampling Outputs', 'Fine-tuning Model', 'Evaluating Performance'). The intent is that training runs in the background, so the user can check progress later without keeping the tab open; in the MVP, progress is simulated.
* **Details:** This is the core backend process (simulated in MVP using mock responses). It involves sampling LLM outputs with the specified parameters and then performing supervised fine-tuning on these samples. The UI will display a clear progress bar and status updates.
3. **Performance Dashboard & Analysis:**
* **User Flow:** Once training is complete, the user is redirected to a dashboard. It displays the initial performance score (e.g., pass@1 on LiveCodeBench) and the improved score after self-distillation. A simple chart might show the distribution of gains (e.g., improvements on harder vs. easier problems, if data is available). Key metrics like accuracy improvement and time saved are highlighted.
* **Details:** This section visualizes the results. It will show before/after metrics, potentially simple comparison charts, and summary statistics. For the MVP, this data will be mock data representing plausible improvements based on the paper.
4. **Code Generation Demo:**
* **User Flow:** A dedicated 'Demo' section allows users to input a prompt (e.g., 'Write a Python function to calculate factorial') and see the code generated by the fine-tuned model in real-time (or near real-time). This serves as a direct showcase of the model's improved capability.
* **Details:** A simple text input for prompts and a code display area (using a syntax highlighting library like Prism.js or Highlight.js). The generated code will be based on the mock fine-tuned model's output.
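The simulated training flow described in feature 2 can be sketched as a small generator that yields `trainingStatus` snapshots. This is a minimal sketch, not a prescribed implementation: the stage labels come from the user flow above, while the progress thresholds (30/80/100), the `step` parameter, and the names `TRAINING_STAGES`, `stageForProgress`, and `simulateTraining` are assumptions made for illustration.

```javascript
// Hypothetical stage table: labels come from the user flow,
// the progress thresholds are assumptions made for this sketch.
const TRAINING_STAGES = [
  { upTo: 30, message: 'Sampling Outputs' },
  { upTo: 80, message: 'Fine-tuning Model' },
  { upTo: 100, message: 'Evaluating Performance' },
];

// Map a 0-100 progress value to the stage label shown in the UI.
function stageForProgress(progress) {
  const clamped = Math.min(100, Math.max(0, progress));
  return TRAINING_STAGES.find((stage) => clamped <= stage.upTo).message;
}

// Yield successive status snapshots, ending with the completed state.
function* simulateTraining(step = 10) {
  if (!(step > 0)) {
    throw new RangeError('step must be a positive number');
  }
  for (let progress = 0; progress < 100; progress += step) {
    yield { isRunning: true, progress, statusMessage: stageForProgress(progress) };
  }
  yield { isRunning: false, progress: 100, statusMessage: 'Training complete!' };
}
```

In a component, the snapshots could be consumed one per timer tick and written to the store, so `TrainingProgress` only ever receives `status` and `progress` props.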
UI/UX DESIGN:
- **Layout:** A clean, single-page application (SPA) layout. A persistent sidebar navigation (collapsible) for accessing Dashboard, Model Training, and Demo sections. The main content area will display the relevant components for each section.
- **Color Palette:** A modern, tech-focused palette. Primary: Blue (#3B82F6 or similar) for calls to action. Secondary: Dark gray/charcoal (#1F2937) for backgrounds. Accent: Lighter blues/cyans (#38BDF8, #06B6D4) for highlights and active states. Text: Off-white (#E5E7EB) on dark backgrounds.
- **Typography:** A clean, readable sans-serif font like Inter or Poppins for all text. Clear hierarchy using font weights and sizes (e.g., H1 for page titles, H2 for section titles, standard body text). Code blocks will use a monospaced font (e.g., Fira Code, Source Code Pro).
- **Responsive Design:** Mobile-first approach. Sidebar collapses into a hamburger menu on smaller screens. Main content adjusts to fit screen width. Components will stack vertically and resize gracefully. Focus on usability across desktops, tablets, and mobile devices.
- **Interactions:** Subtle hover effects on buttons and links. Smooth transitions for modals and section changes using Framer Motion. Clear loading states (spinners, skeleton screens) for any simulated asynchronous operations.
COMPONENT BREAKDOWN:
- `App.js`: Main application component, sets up routing and global layout.
- `Layout.js`: Contains the sidebar navigation and the main content area.
- `Sidebar.js`: Navigation links (Dashboard, Train, Demo), collapsible functionality.
* Props: `isOpen` (boolean), `onClose` (function).
- `ModelSelection.js`: Displays available models, handles selection and configuration modal.
* Props: `models` (array of objects), `onSelectModel` (function).
- `Configurator.js`: UI elements for Temperature and Truncation sliders/inputs.
* Props: `config` (object), `onChange` (function).
- `TrainingProgress.js`: Displays loading state and progress updates during the simulated training.
* Props: `status` (string), `progress` (number).
- `Dashboard.js`: Displays performance metrics (before/after scores, charts).
* Props: `performanceData` (object).
- `CodeDemo.js`: Input area for prompts and display area for generated code.
* Props: `onGenerate` (function), `generatedCode` (string), `isLoading` (boolean).
- `Button.js`: Reusable button component.
- `Slider.js`: Reusable slider component.
- `Card.js`: Reusable card component for displaying information.
- `Chart.js`: (Optional) Component for displaying performance charts (e.g., a wrapper around the Chart.js library; note that the file name coincides with the library's name).
DATA MODEL:
- **`models` State:** Array of objects, each representing an LLM.
```javascript
[
  { id: 'qwen30b', name: 'Qwen3 30B Instruct', description: 'State-of-the-art model from Alibaba', defaultTemp: 0.7, defaultTrunc: 512 },
  { id: 'llama8b', name: 'Llama 8B Instruct', description: 'Powerful open-source model', defaultTemp: 0.6, defaultTrunc: 256 },
  // ... more models
]
```
- **`trainingConfig` State:** Object holding current configuration.
```javascript
{
  selectedModelId: 'qwen30b',
  temperature: 0.7,
  truncation: 512
}
```
- **`trainingStatus` State:** Object for tracking the training process.
```javascript
{
  isRunning: false,
  progress: 0, // 0-100
  statusMessage: 'Idle'
}
```
- **`performanceData` State:** Object holding pre- and post-training metrics.
```javascript
{
  initialPassRate: 42.4,
  improvedPassRate: 55.3,
  gain: 12.9,
  analysis: 'Gains concentrated on harder problems.'
}
```
- **`demo` State:** For the code generation demo.
```javascript
{
  prompt: '',
  generatedCode: '// Your generated code will appear here...',
  isLoading: false
}
```
- **Local Storage:** Used to persist `trainingConfig` and potentially `performanceData` between sessions for the MVP.
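Persisting `trainingConfig` between sessions could look roughly like the following. This is a sketch under stated assumptions: the storage key `kodUstasi.trainingConfig`, the `DEFAULT_CONFIG` fallback, and the helper names are hypothetical, and the storage object is passed in as a parameter (in the browser it would be `window.localStorage`) so the logic can be exercised outside a browser.

```javascript
// Hypothetical storage key; any stable, app-specific string would do.
const STORAGE_KEY = 'kodUstasi.trainingConfig';

// Assumed defaults, mirroring the `trainingConfig` example above.
const DEFAULT_CONFIG = { selectedModelId: 'qwen30b', temperature: 0.7, truncation: 512 };

// Serialize the config as JSON; `storage` only needs the Web Storage
// methods setItem/getItem.
function saveTrainingConfig(storage, config) {
  storage.setItem(STORAGE_KEY, JSON.stringify(config));
}

// Load the config, falling back to defaults when nothing is stored
// or the stored value is not a valid JSON object.
function loadTrainingConfig(storage) {
  try {
    const raw = storage.getItem(STORAGE_KEY);
    if (raw === null) return { ...DEFAULT_CONFIG };
    const parsed = JSON.parse(raw);
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      return { ...DEFAULT_CONFIG };
    }
    // Stored values override defaults; missing keys keep their defaults.
    return { ...DEFAULT_CONFIG, ...parsed };
  } catch {
    return { ...DEFAULT_CONFIG };
  }
}
```

Falling back to defaults instead of throwing keeps a corrupted or outdated entry from blocking the dashboard on load.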
ANIMATIONS & INTERACTIONS:
- **Hover Effects:** Subtle background color changes or slight scaling on interactive elements like buttons and navigation links.
- **Page Transitions:** Smooth fading or sliding transitions between different sections (Dashboard, Training, Demo) using Framer Motion's `AnimatePresence` and `motion.div`.
- **Modal Animations:** Modals (e.g., for configuration) will slide in/out from the top or fade in/out.
- **Loading Indicators:** When starting training or generating demo code, a progress spinner or bar will be displayed. For list loading, skeleton loaders can be used.
- **Micro-interactions:** Button click feedback (slight press effect), input focus highlights.
EDGE CASES:
- **No Models Available:** Display a friendly message stating that no pre-trained models are currently available and suggesting that the user check back later.
- **Invalid Configuration:** Prevent training from starting if configuration parameters are outside the valid range. Provide clear inline validation messages using React Hook Form.
- **Training Failure (Simulated):** If the simulated training fails, display an informative error message and reset the training status.
- **Empty State (Dashboard):** If no training has been completed yet, the dashboard should show placeholder text or instructions on how to start.
- **Empty State (Demo):** The code display area should have placeholder text before any code is generated.
- **Accessibility (a11y):** Use semantic HTML5 elements. Ensure sufficient color contrast. All interactive elements must be keyboard-navigable and have clear focus states. Use ARIA attributes where necessary (e.g., for sliders, modals).
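The "Invalid Configuration" case above can be illustrated with a plain validation function. This is a sketch, not the spec's required implementation: the temperature range is taken from the example in the user flow (0.1 to 1.0), truncation is assumed to be an integer token count, and the upper bound of 4096 as well as the names `LIMITS` and `validateTrainingConfig` are hypothetical. With React Hook Form, comparable checks could instead be expressed as `min`/`max`/`validate` rules passed to `register`.

```javascript
// Assumed limits: temperature from the user-flow example (0.1 to 1.0);
// truncation treated as a token count with a hypothetical upper bound.
const LIMITS = {
  temperature: { min: 0.1, max: 1.0 },
  truncation: { min: 1, max: 4096 },
};

// Return { isValid, errors }, where `errors` maps field names to messages
// suitable for inline display next to each input.
function validateTrainingConfig(config, models) {
  const errors = {};

  if (!models.some((model) => model.id === config.selectedModelId)) {
    errors.selectedModelId = 'Select one of the available models.';
  }

  const { temperature, truncation } = config;
  if (
    typeof temperature !== 'number' ||
    Number.isNaN(temperature) ||
    temperature < LIMITS.temperature.min ||
    temperature > LIMITS.temperature.max
  ) {
    errors.temperature =
      `Temperature must be between ${LIMITS.temperature.min} and ${LIMITS.temperature.max}.`;
  }

  if (
    !Number.isInteger(truncation) ||
    truncation < LIMITS.truncation.min ||
    truncation > LIMITS.truncation.max
  ) {
    errors.truncation =
      `Truncation must be a whole number between ${LIMITS.truncation.min} and ${LIMITS.truncation.max}.`;
  }

  return { isValid: Object.keys(errors).length === 0, errors };
}
```

The 'Start Training' button would stay disabled while `isValid` is `false`.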
SAMPLE DATA:
1. **Mock Model List:**
```json
[
  { "id": "qwen30b", "name": "Qwen3 30B Instruct", "description": "High-performance model by Alibaba.", "defaultTemp": 0.7, "defaultTrunc": 512 },
  { "id": "llama8b", "name": "Llama 8B Instruct", "description": "Versatile model from Meta.", "defaultTemp": 0.6, "defaultTrunc": 256 },
  { "id": "codellama7b", "name": "CodeLlama 7B", "description": "Specialized code model.", "defaultTemp": 0.5, "defaultTrunc": 300 }
]
```
2. **Mock Training Configuration:**
```json
{ "selectedModelId": "qwen30b", "temperature": 0.75, "truncation": 512 }
```
3. **Mock Training Status (Initial):**
```json
{ "isRunning": false, "progress": 0, "statusMessage": "Idle. Select a model and configure parameters to begin." }
```
4. **Mock Training Status (In Progress):**
```json
{ "isRunning": true, "progress": 45, "statusMessage": "Fine-tuning model on generated samples..." }
```
5. **Mock Training Status (Completed):**
```json
{ "isRunning": false, "progress": 100, "statusMessage": "Training complete!" }
```
6. **Mock Performance Data (Post-Training):**
```json
{
  "initialPassRate": 42.4,
  "improvedPassRate": 55.3,
  "gain": 12.9,
  "modelUsed": "Qwen3 30B Instruct",
  "config": { "temperature": 0.75, "truncation": 512 },
  "analysis": "Significant improvement observed, especially on complex algorithmic tasks. Gains concentrated on harder problems."
}
```
7. **Mock Generated Code (Demo):**
```javascript
// JavaScript function to check for palindrome
function isPalindrome(str) {
  // Ignore case and any non-alphanumeric characters
  const cleanedStr = str.toLowerCase().replace(/[^a-z0-9]/g, '');
  const reversedStr = cleanedStr.split('').reverse().join('');
  return cleanedStr === reversedStr;
}

// Example usage:
console.log(isPalindrome("A man, a plan, a canal: Panama")); // true
console.log(isPalindrome("race a car")); // false
```
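The `gain` field in the mock performance data (item 6) is the difference between the improved and initial pass rates. A minimal sketch of deriving it rather than hard-coding it, assuming a hypothetical helper `buildPerformanceData` that is not part of the spec; the result is rounded to one decimal place because floating-point subtraction of values such as 55.3 and 42.4 may not yield exactly 12.9:

```javascript
// Hypothetical helper: derive the dashboard's `gain` from the two pass rates.
// Rounding to one decimal place hides floating-point noise in the subtraction.
function buildPerformanceData(initialPassRate, improvedPassRate) {
  const gain = Math.round((improvedPassRate - initialPassRate) * 10) / 10;
  return { initialPassRate, improvedPassRate, gain };
}

// Usage with the mock values from the sample data
const performanceData = buildPerformanceData(42.4, 55.3);
console.log(performanceData.gain);
```

Deriving the value keeps `gain` consistent with the two rates if the mock numbers are changed later.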
DEPLOYMENT NOTES:
- **Build Tool:** Vite is recommended for its speed. The `npm run build` command generates optimized static assets.
- **Environment Variables:** Use `.env` files to manage API endpoints (if a backend is added) or feature flags, for example `VITE_API_URL`. Vite exposes only variables prefixed with `VITE_` to client code, via `import.meta.env`.
- **Performance Optimizations:** Code splitting using React.lazy and Suspense. Image optimization (if any). Bundle analysis using tools like `rollup-plugin-visualizer`.
- **Hosting:** Can be deployed on static hosting platforms like Netlify, Vercel, or GitHub Pages.
- **State Management:** Ensure Zustand selectors are optimized to prevent unnecessary re-renders.
- **Error Handling:** Implement global error boundaries in React to catch rendering errors and display user-friendly fallback UIs. Use `try...catch` blocks for any simulated API calls.
- **Mocking:** For the MVP, use a library like `msw` (Mock Service Worker) or simple mock functions to simulate API responses for training and performance data, allowing the frontend to be developed and tested independently.