PROJECT OVERVIEW:
The user problem stems from a Hacker News post reporting that AI code assistants such as Claude Code have become unusable for complex engineering tasks after recent updates. The post cites ignored instructions, incorrect "simplest fixes," and output that does the opposite of what was requested, in contrast to the tools' performance in January. This project, 'Kod Güven' (Code Trust), is a single-page application (SPA) that gives developers a platform to monitor, analyze, and improve the reliability of AI-generated code. Its core value proposition is to restore trust in AI coding assistants by making their performance transparent, collecting structured feedback, and providing tools that help generate more accurate and reliable code for complex engineering tasks. It acts as a meta-tool that evaluates and guides the use of other AI coding tools.
TECH STACK:
- Frontend Framework: React.js (using Vite for a fast development server and build process)
- Styling: Tailwind CSS for rapid UI development and consistent design.
- State Management: Zustand for efficient and simple global state management.
- Routing: React Router DOM for client-side navigation.
- API Client: Axios for making HTTP requests to a potential backend (or mock API).
- Utility Libraries: date-fns for date/time manipulation, Lodash (optional) for utility functions.
- Icons: Heroicons or similar for clear and accessible iconography.
CORE FEATURES:
1. **IDE/VCS Integration (Conceptual for MVP):** Full integration is complex, so the MVP simulates it by letting users paste code snippets manually or link to public GitHub URLs (e.g., a repository or an issue). The UI will mark the areas where deeper integration would occur. A 'Connect Tools' button will be present, leading to a placeholder that explains the future integration.
* *User Flow:* User clicks 'Connect Tools' -> Sees options for VS Code, GitHub, etc. -> Clicks on a placeholder 'Connect VS Code' -> A modal explains that this feature will allow real-time code analysis and feedback within the IDE. For MVP, it guides the user to manually input/paste code.
2. **AI Model Performance Tracker:** Allows users to select from a list of popular AI models (e.g., Claude Opus, GPT-4, Copilot) and input specific engineering tasks or prompts. The system will then present historical performance data (simulated in MVP) based on community feedback and predefined benchmarks.
* *User Flow:* User navigates to 'Track Performance' -> Selects 'Claude Opus' from a dropdown -> Enters a sample prompt like 'Refactor this complex algorithm...' -> Clicks 'Analyze' -> A dashboard displays metrics like 'Accuracy Score', 'Instruction Following Rate', and 'Regression Trend' (February updates compared with January). Mock data will show a dip for the February updates.
3. **Feedback Loop Mechanism:** A core feature where users can submit AI-generated code snippets and provide detailed feedback. This feedback is categorized (e.g., 'Ignored Instruction', 'Incorrect Fix', 'Opposite of Request', 'Works Correctly') and contributes to the model's 'Trust Score'.
* *User Flow:* User pastes AI-generated code into a text area -> Selects the AI model and enters the prompt used -> Chooses feedback tags (e.g., 'Ignored Instruction', 'Incorrect Fix') -> Adds optional comments -> Clicks 'Submit Feedback' -> A success message appears, and the feedback is added to the user's history and contributes to global scores.
4. **Trust Score Indicator:** A visual representation (e.g., a gauge or percentage) showing the aggregated trust level for different AI models and versions, based on the submitted feedback.
* *User Flow:* On the main dashboard or performance tracking page, users see a clear score (e.g., 'Claude Opus (Feb Build): 65% Trust Score') with a visual indicator. Hovering over it provides details on how the score is calculated (e.g., 'Based on 1500+ feedback submissions in the last 30 days').
5. **Optimized Prompt Engineering Assistant:** Provides templates and guidance for crafting effective prompts for complex engineering tasks. Users can browse categories like 'Code Refactoring', 'Bug Fixing', 'Algorithm Optimization' and select/modify prompts known to yield better results.
* *User Flow:* User navigates to 'Prompt Engineering' -> Selects 'Bug Fixing' category -> Sees a list of effective prompts like 'Identify and fix potential race conditions in the following C++ code...' -> User can copy the prompt, or click 'Customize' to modify it before using it with their AI assistant.
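The Feedback Loop and Trust Score features above imply an aggregation step from tagged feedback to a percentage. The spec does not define the formula, so the following is only a minimal sketch under stated assumptions: the score is the share of recent entries tagged solely 'Works Correctly', and the window is 30 days (taken from the tooltip example). The names `computeTrustScore`, `FeedbackEntry`, and `WINDOW_DAYS` are hypothetical.

```typescript
// Feedback tags as listed in the Feedback Loop feature description.
type FeedbackTag =
  | "Ignored Instruction"
  | "Incorrect Fix"
  | "Opposite of Request"
  | "Works Correctly";

// Only the fields needed for scoring (hypothetical subset of a feedback entry).
interface FeedbackEntry {
  modelId: string;
  feedbackType: FeedbackTag[];
  timestamp: string; // ISO 8601
}

// Assumption: mirrors the "last 30 days" wording in the tooltip example.
const WINDOW_DAYS = 30;

// Assumed rule: an entry counts as positive only if every tag is "Works Correctly".
// Returns score = null when there is no recent feedback, so the UI can show an empty state.
function computeTrustScore(
  entries: FeedbackEntry[],
  modelId: string,
  now: Date,
): { score: number | null; sampleSize: number } {
  const cutoff = now.getTime() - WINDOW_DAYS * 24 * 60 * 60 * 1000;
  const recent = entries.filter((e) => {
    const t = Date.parse(e.timestamp); // NaN for malformed dates, which fails both comparisons
    return e.modelId === modelId && t >= cutoff && t <= now.getTime();
  });
  if (recent.length === 0) {
    return { score: null, sampleSize: 0 };
  }
  const positive = recent.filter(
    (e) =>
      e.feedbackType.length > 0 &&
      e.feedbackType.every((tag) => tag === "Works Correctly"),
  ).length;
  return {
    score: Math.round((positive / recent.length) * 100),
    sampleSize: recent.length,
  };
}
```

Returning `sampleSize` alongside the score lets the hover tooltip report how many submissions the percentage is based on. A weighted rule (for example, penalizing 'Opposite of Request' more heavily) would be an equally valid choice; the spec leaves this open.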
UI/UX DESIGN:
- **Layout:** A clean, modern single-page application layout. A persistent sidebar for navigation (Dashboard, Track Performance, Submit Feedback, Prompt Engineering, Settings). The main content area displays the selected feature dynamically. Emphasis on clarity and information hierarchy.
- **Color Palette:** Primary: Dark blue (#1E3A8A) for backgrounds/nav. Secondary: Light gray (#E5E7EB) for content backgrounds. Accent: Teal (#06B6D4) for interactive elements, buttons, and scores. Warning/Error: Red (#EF4444). Success: Green (#10B981). Use subtle gradients where appropriate.
- **Typography:** Sans-serif font family (e.g., Inter, Roboto) for readability. Clear hierarchy using font weights (bold for headings, regular for body text) and sizes.
- **Responsive Design:** Mobile-first approach. Sidebar collapses into a hamburger menu on smaller screens. Content sections reflow and stack vertically. Ensure usability across devices (desktop, tablet, mobile).
- **Components:** Use consistent styling for buttons, input fields, cards, modals, tables, and charts.
DATA MODEL:
- **State Structure (Zustand):**
* `authStore`: { `isAuthenticated`, `user` }
* `modelStore`: { `models`: [{ `id`, `name`, `version`, `trustScore`, `performanceMetrics` }], `selectedModel` }
* `feedbackStore`: { `feedbacks`: [{ `id`, `userId`, `modelId`, `prompt`, `generatedCode`, `feedbackType`, `comments`, `timestamp` }], `isSubmitting`, `error` }
* `promptStore`: { `categories`: [{ `category`, `prompts`: [{ `id`, `title`, `template`, `description` }] }] }
- **Mock Data Format:** JSON objects representing models, feedback entries, and prompt templates.
* Example `Model`: `{ "id": "claude-opus-feb2024", "name": "Claude Opus", "version": "Feb 2024", "trustScore": 65, "performanceMetrics": { "accuracy": 0.70, "instructionFollowing": 0.60, "regressionDetected": true } }`
* Example `Feedback`: `{ "id": "fbk_123", "userId": "user_abc", "modelId": "claude-opus-feb2024", "prompt": "Refactor the linked list implementation to use generics.", "generatedCode": "class Node {\n constructor(public data: T) {}\n}\n...", "feedbackType": ["Incorrect Fix", "Ignored Instruction"], "comments": "The code did not implement generics and introduced syntax errors.", "timestamp": "2024-04-15T10:30:00Z" }`
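The model and feedback shapes above can be written down as types, with a runtime guard for mock JSON that arrives untyped. This is a sketch only: the field names come from the examples above, while `isModel`, `isRecord`, and `detectRegression` are hypothetical helpers. The regression rule (a drop of more than 0.05 in either metric between two versions) is an assumption, since the spec treats `regressionDetected` as given mock data.

```typescript
interface PerformanceMetrics {
  accuracy: number; // fraction in [0, 1]
  instructionFollowing: number; // fraction in [0, 1]
  regressionDetected: boolean;
}

interface Model {
  id: string;
  name: string;
  version: string;
  trustScore: number; // percentage, 0 to 100
  performanceMetrics: PerformanceMetrics;
}

interface Feedback {
  id: string;
  userId: string;
  modelId: string;
  prompt: string;
  generatedCode: string;
  feedbackType: string[];
  comments: string;
  timestamp: string; // ISO 8601
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Runtime check for a parsed JSON value before it is put into the model store.
function isModel(value: unknown): value is Model {
  if (!isRecord(value)) return false;
  const metrics = value["performanceMetrics"];
  return (
    typeof value["id"] === "string" &&
    typeof value["name"] === "string" &&
    typeof value["version"] === "string" &&
    typeof value["trustScore"] === "number" &&
    isRecord(metrics) &&
    typeof metrics["accuracy"] === "number" &&
    typeof metrics["instructionFollowing"] === "number" &&
    typeof metrics["regressionDetected"] === "boolean"
  );
}

// Assumed rule: flag a regression when either metric drops by more than `tolerance`.
function detectRegression(
  previous: PerformanceMetrics,
  current: PerformanceMetrics,
  tolerance = 0.05,
): boolean {
  return (
    previous.accuracy - current.accuracy > tolerance ||
    previous.instructionFollowing - current.instructionFollowing > tolerance
  );
}
```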
COMPONENT BREAKDOWN:
- **`App.jsx`:** Main application component, sets up routing and global layout.
- **`Layout.jsx`:** Contains the overall page structure, including the sidebar and header.
* `props`: `children`
- **`Sidebar.jsx`:** Navigation menu component.
* `props`: `navItems` (array of objects: { `name`, `path`, `icon` })
- **`Dashboard.jsx`:** Main overview page showing key metrics and summary information.
* `props`: (None - fetches data via stores)
* Components: `TrustScoreGauge.jsx`, `PerformanceSummary.jsx`, `RecentFeedback.jsx`
- **`TrackPerformance.jsx`:** Interface for selecting models and viewing detailed performance metrics.
* `props`: (None)
* Components: `ModelSelector.jsx`, `PromptInputArea.jsx`, `PerformanceChart.jsx`
- **`SubmitFeedback.jsx`:** Form for users to submit their feedback on AI-generated code.
* `props`: (None)
* Components: `CodeEditor.jsx` (textarea with syntax highlighting), `FeedbackForm.jsx` (dropdowns, checkboxes, text input).
- **`PromptEngineering.jsx`:** Displays prompt templates and guides.
* `props`: (None)
* Components: `PromptCategoryList.jsx`, `PromptTemplateCard.jsx`
- **`TrustScoreGauge.jsx`:** Visualizes the trust score.
* `props`: `score` (number), `modelName` (string)
- **`PerformanceChart.jsx`:** Displays performance metrics using a charting library.
* `props`: `data` (array), `metricType` (string)
- **`ModelSelector.jsx`:** Dropdown or list for choosing AI models.
* `props`: `models` (array), `onSelect` (function)
- **`CodeEditor.jsx`:** A textarea component, potentially enhanced with basic syntax highlighting for code snippets.
* `props`: `value`, `onChange`, `placeholder`
ANIMATIONS & INTERACTIONS:
- **Page Transitions:** Subtle fade-in/fade-out transitions between different sections using `Framer Motion` or CSS transitions.
- **Hover Effects:** Buttons and interactive elements will have subtle hover effects (e.g., slight scale up, color change).
- **Loading States:** Use spinners or skeleton loaders (`react-loading-skeleton`) when fetching data or submitting feedback. The 'Submit Feedback' button should show a loading state.
- **Micro-interactions:** Smooth scrolling, form validation feedback (e.g., input border color change), success/error message animations.
- **Gauge Animation:** The `TrustScoreGauge` should animate smoothly when the score updates.
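If the gauge animation is done without an animation library, the smooth update can be reduced to an eased interpolation between the old and new score. The sketch below is a dependency-free alternative to `Framer Motion`, not its API; `easeOutCubic`, `interpolateScore`, and the 600 ms default duration are assumptions chosen for illustration.

```typescript
// Ease-out cubic: fast start, gentle stop. Input is clamped to [0, 1].
function easeOutCubic(t: number): number {
  const clamped = Math.min(1, Math.max(0, t));
  return 1 - Math.pow(1 - clamped, 3);
}

// Score to display `elapsedMs` after the target changed from `from` to `to`.
// The 600 ms default is an assumed duration, not a value from the spec.
function interpolateScore(
  from: number,
  to: number,
  elapsedMs: number,
  durationMs = 600,
): number {
  if (durationMs <= 0) return to;
  return from + (to - from) * easeOutCubic(elapsedMs / durationMs);
}
```

Inside `TrustScoreGauge`, this function would typically be called on each `requestAnimationFrame` tick with the time elapsed since the score changed, stopping once `elapsedMs` reaches `durationMs`.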
EDGE CASES:
- **Empty States:** Display user-friendly messages when there's no data (e.g., "No feedback submitted yet.", "No models available to track.").
- **Error Handling:** Gracefully handle API errors (e.g., network issues, server errors) and display informative messages to the user. Implement form validation for all user inputs (e.g., ensuring prompts are not empty, feedback types are selected).
- **Accessibility (a11y):** Use semantic HTML5 elements, ensure sufficient color contrast, provide ARIA attributes where necessary, ensure keyboard navigability for all interactive components.
- **Data Consistency:** Handle potential inconsistencies in mock or real data, especially when displaying historical trends.
- **User Authentication:** (For a full implementation) Handle login, logout, and protected routes. For MVP, simulate logged-in state.
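The form validation and empty-state rules above can be kept in small pure functions, separate from the components, so they are easy to test. The following is a sketch: `FeedbackDraft`, `validateFeedbackDraft`, and `emptyStateMessage` are hypothetical names, the error wording is illustrative, and the two empty-state strings are the ones quoted in the list above.

```typescript
// Form values before submission (hypothetical shape based on the feedback fields).
interface FeedbackDraft {
  modelId: string;
  prompt: string;
  generatedCode: string;
  feedbackType: string[];
  comments?: string; // optional, never validated
}

type FieldErrors = Partial<Record<keyof FeedbackDraft, string>>;

// Returns one message per invalid field; an empty object means the draft is valid.
function validateFeedbackDraft(draft: FeedbackDraft): FieldErrors {
  const errors: FieldErrors = {};
  if (draft.modelId.trim() === "") {
    errors.modelId = "Select the AI model that produced the code.";
  }
  if (draft.prompt.trim() === "") {
    errors.prompt = "Enter the prompt you used.";
  }
  if (draft.generatedCode.trim() === "") {
    errors.generatedCode = "Paste the AI-generated code.";
  }
  if (draft.feedbackType.length === 0) {
    errors.feedbackType = "Choose at least one feedback tag.";
  }
  return errors;
}

// Returns the empty-state text for a list, or null when there is data to show.
function emptyStateMessage(
  kind: "feedback" | "models",
  count: number,
): string | null {
  if (count > 0) return null;
  return kind === "feedback"
    ? "No feedback submitted yet."
    : "No models available to track.";
}
```

A form component can disable 'Submit Feedback' while `Object.keys(validateFeedbackDraft(draft)).length > 0` and use the returned messages to drive the input border color change described under Micro-interactions.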
SAMPLE DATA:
1. **Models List:**
```json
[
{ "id": "claude-opus-jan2024", "name": "Claude Opus", "version": "Jan 2024", "trustScore": 85, "performanceMetrics": { "accuracy": 0.88, "instructionFollowing": 0.85, "regressionDetected": false } },
{ "id": "claude-opus-feb2024", "name": "Claude Opus", "version": "Feb 2024", "trustScore": 65, "performanceMetrics": { "accuracy": 0.70, "instructionFollowing": 0.60, "regressionDetected": true } },
{ "id": "gpt-4-turbo", "name": "GPT-4 Turbo", "version": "Latest", "trustScore": 88, "performanceMetrics": { "accuracy": 0.90, "instructionFollowing": 0.88, "regressionDetected": false } },
{ "id": "copilot-latest", "name": "GitHub Copilot", "version": "Latest", "trustScore": 78, "performanceMetrics": { "accuracy": 0.75, "instructionFollowing": 0.80, "regressionDetected": false } }
]
```
2. **Feedback Entries (Sample for 'claude-opus-feb2024'):**
```json
[
{ "id": "fbk_101", "userId": "dev_001", "modelId": "claude-opus-feb2024", "prompt": "Optimize the following Python function for speed.", "generatedCode": "def optimized_func(data):\n # ... complex logic ...\n return result", "feedbackType": ["Incorrect Fix"], "comments": "The function still contains inefficient loops and does not leverage built-in optimized methods.", "timestamp": "2024-04-10T09:00:00Z" },
{ "id": "fbk_102", "userId": "dev_002", "modelId": "claude-opus-feb2024", "prompt": "Add error handling for file I/O operations.", "generatedCode": "try:\n # ... file operations ...\nexcept Exception as e:\n print(f'Error: {e}')", "feedbackType": ["Ignored Instruction"], "comments": "The error handling is too generic and doesn't catch specific file-related exceptions like FileNotFoundError.", "timestamp": "2024-04-11T14:20:00Z" },
{ "id": "fbk_103", "userId": "dev_003", "modelId": "claude-opus-feb2024", "prompt": "Translate this Java code to C#.", "generatedCode": "// Incorrect C# translation with syntax errors", "feedbackType": ["Opposite of Request", "Incorrect Fix"], "comments": "The generated code is not valid C# and seems to have misunderstood the request entirely.", "timestamp": "2024-04-12T11:05:00Z" },
{ "id": "fbk_104", "userId": "dev_004", "modelId": "claude-opus-feb2024", "prompt": "Implement a basic binary search algorithm.", "generatedCode": "// Correct implementation of binary search", "feedbackType": ["Works Correctly"], "comments": "The algorithm is correctly implemented.", "timestamp": "2024-04-13T16:45:00Z" }
]
```
3. **Prompt Templates:**
```json
[
{
"category": "Bug Fixing",
"prompts": [
{ "id": "pf_bf_001", "title": "Identify Race Conditions", "template": "Analyze the following code for potential race conditions and suggest fixes. Code:\n```\n{{CODE_SNIPPET}}\n```", "description": "Helps find concurrency issues in multi-threaded applications." },
{ "id": "pf_bf_002", "title": "Memory Leak Detection", "template": "Review this code for potential memory leaks and provide solutions. Code:\n```\n{{CODE_SNIPPET}}\n```", "description": "Useful for identifying and resolving memory management problems." }
]
},
{
"category": "Code Refactoring",
"prompts": [
{ "id": "pf_cr_001", "title": "Simplify Complex Logic", "template": "Refactor the following complex logic into a more readable and maintainable function. Logic:\n```\n{{CODE_SNIPPET}}\n```", "description": "Improves code readability and maintainability." },
{ "id": "pf_cr_002", "title": "Apply Design Patterns", "template": "Apply the Strategy design pattern to the following class structure to improve flexibility. Code:\n```\n{{CODE_SNIPPET}}\n```", "description": "Enhances code structure using established design patterns." }
]
}
]
```
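The prompt templates above use a `{{CODE_SNIPPET}}` placeholder that has to be filled in before the prompt is copied. A minimal sketch of that substitution follows; `fillTemplate` is a hypothetical helper, and leaving unknown placeholders untouched is an assumed design choice that makes missing values visible to the user.

```typescript
// Replaces {{NAME}} placeholders with the matching entry from `values`.
// Placeholders without a matching key are left as-is.
function fillTemplate(
  template: string,
  values: Record<string, string>,
): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (placeholder: string, key: string): string => {
      const value = Object.prototype.hasOwnProperty.call(values, key)
        ? values[key]
        : undefined;
      return value ?? placeholder;
    },
  );
}

// Usage with a shortened version of the "Memory Leak Detection" template.
const filledPrompt = fillTemplate(
  "Review this code for potential memory leaks and provide solutions. Code:\n{{CODE_SNIPPET}}",
  { CODE_SNIPPET: "char *buf = malloc(64);" },
);
```

Passing a function to `String.prototype.replace` instead of a replacement string matters here: pasted code often contains `$` sequences such as `$1` or `$&`, which a string replacement would interpret as special patterns.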
DEPLOYMENT NOTES:
- **Build Tool:** Vite is configured for optimal build performance (`vite build`).
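- **Environment Variables:** Use `.env` files for configuration such as API endpoints (`VITE_API_URL`). Vite embeds every `VITE_`-prefixed variable in the client bundle, so never store secrets in them, and keep any `.env` file that holds sensitive values out of version control.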
- **Performance Optimizations:** Code splitting (handled by Vite), lazy loading components, image optimization (if any added later), efficient state management to prevent unnecessary re-renders.
- **Static Hosting:** Can be deployed on platforms like Vercel, Netlify, or GitHub Pages.
- **CORS:** If a separate backend is implemented, ensure CORS is configured correctly to allow requests from the frontend domain.
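One way to consume `VITE_API_URL` is to resolve it once, with a fallback for local development against a mock API. The sketch below takes the environment as a plain object so it can run outside Vite; `resolveApiUrl`, `Env`, and the fallback URL are assumptions, not part of the spec.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical fallback for a locally running mock API.
const DEFAULT_API_URL = "http://localhost:3000/api";

// Returns the configured API base URL without trailing slashes,
// falling back to DEFAULT_API_URL when the variable is unset or blank.
function resolveApiUrl(env: Env): string {
  const raw = env["VITE_API_URL"]?.trim();
  const url = raw ? raw : DEFAULT_API_URL;
  return url.replace(/\/+$/, "");
}
```

In the application this would be called as `resolveApiUrl({ VITE_API_URL: import.meta.env.VITE_API_URL })`, and the result passed as `baseURL` when creating the Axios instance.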