PROJECT OVERVIEW:
Develop a single-page SaaS application called 'Code License Guardian' designed to address the complexities of software licensing in the age of AI-driven code generation and reimplementation. The core problem this application solves is the potential erosion of copyleft licenses and intellectual property rights when AI tools are used to rewrite existing codebases. The application will automatically analyze code repositories, detect AI-assisted modifications, and provide clear, actionable insights into license compliance. Our value proposition is to provide developers, project managers, and legal teams with a robust tool that safeguards the integrity of open-source licenses, mitigates legal risks, and ensures fair attribution and adherence to open-source principles, even when AI tools are involved in code development.
TECH STACK:
- Frontend Framework: React (using Vite for fast development and build)
- Styling: Tailwind CSS for utility-first styling and rapid UI development
- State Management: Zustand for efficient and scalable global state management
- Routing: React Router DOM for navigation within the single-page application
- API Interaction: Axios for making HTTP requests to the backend (a mock API for the MVP)
- UI Components: Radix UI for accessible and unstyled primitive components, styled with Tailwind CSS
- Icons: Lucide React for a comprehensive set of open-source icons
- Form Handling: React Hook Form for efficient and performant form management
- Code Analysis: For the MVP, rely on mock data and focus on file structure and metadata rather than deep semantic analysis. If client-side display or parsing is feasible, use CodeMirror for syntax highlighting of flagged files and add a custom parser only where basic structural analysis is needed.
- Libraries for specific functionalities: 'js-yaml' for handling YAML configurations, 'diff' library for basic text diffing (if applicable for mock data).
CORE FEATURES:
1. **Repository Connection:**
* User Flow: User navigates to the 'Connections' page. Clicks 'Connect GitHub/GitLab'. Authenticates via OAuth. Selects a repository to analyze. The application fetches repository metadata (name, URL, recent commits, license file content if available).
* Details: Implement OAuth flow for GitHub and GitLab. Display a list of user's repositories. Allow selection of a single repository for analysis.
2. **Code Scan & AI Detection (Simulated for MVP):**
* User Flow: After selecting a repository, the user initiates a 'Scan'. The application simulates a code scan, focusing on recently modified files or specific directories. It flags files that might have undergone significant AI-assisted reimplementation based on metadata (e.g., commit messages suggesting AI use, unusually high diff percentages without clear human authorship, or specific file rename patterns).
* Details: For MVP, this will be a simulated process. The backend (or mock API) would provide flags. Frontend displays a "Scanning..." state and then shows a list of "Potentially AI-Modified" files. In a real implementation, this would involve AST parsing and similarity checks.
3. **License Analysis:**
* User Flow: For each "Potentially AI-Modified" file or the entire repository, the user can trigger a 'License Analysis'. The application compares the detected code's characteristics (simulated) and the repository's declared license against a database of known open-source licenses (MIT, Apache-2.0, LGPL, GPL, etc.) and common legal interpretations of AI reimplementation.
* Details: Maintain an internal database/configuration of popular licenses, their key terms (copyleft, attribution, modification clauses), and common legal arguments regarding AI reimplementation. Display the repository's current license and the "analyzed" license status.
4. **Compliance Reporting:**
* User Flow: A 'Report' is generated automatically after analysis. It summarizes findings: license detected, potential compliance issues (e.g., 'LGPL modification requires same license, but MIT was applied'), similarity scores (mocked), contributor information (mocked), and risk assessment.
* Details: The report should be clear, concise, and visually appealing. It should highlight specific files and the nature of the suspected non-compliance. Offer options to export the report (e.g., PDF, JSON).
5. **User Dashboard:**
* User Flow: The main landing page after login. Displays a summary of recent scans, overall compliance status of connected repositories, and quick links to detailed reports.
* Details: Provide an overview of the user's projects and their compliance health. Use clear visual indicators (e.g., green/yellow/red status icons).
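The License Analysis feature above could be prototyped as a lookup against a small license table. The following is a minimal sketch under stated assumptions: the table contents, the `copyleft` classification, and the names `KNOWN_LICENSES`, `LicenseFinding`, and `analyzeLicense` are all hypothetical and not a complete or authoritative legal model.

```typescript
// Hypothetical license table keyed by SPDX identifier. The copyleft
// classification is a simplification for the mock, not legal advice.
type LicenseInfo = { spdxId: string; copyleft: "none" | "weak" | "strong" };

const KNOWN_LICENSES = new Map<string, LicenseInfo>([
  ["MIT", { spdxId: "MIT", copyleft: "none" }],
  ["Apache-2.0", { spdxId: "Apache-2.0", copyleft: "none" }],
  ["LGPL-2.1-only", { spdxId: "LGPL-2.1-only", copyleft: "weak" }],
  ["GPL-3.0-only", { spdxId: "GPL-3.0-only", copyleft: "strong" }],
]);

type LicenseFinding =
  | { kind: "ok" }
  | { kind: "missing-license" }
  | { kind: "unsupported-license"; spdxId: string }
  | { kind: "license-mismatch"; declared: string; suspectedSource: string };

// declaredSpdxId: the license declared by the repository (null if absent).
// suspectedSourceSpdxId: the license of the project the flagged code may
// derive from, as reported by the (mocked) scan; null if unknown.
function analyzeLicense(
  declaredSpdxId: string | null,
  suspectedSourceSpdxId: string | null,
): LicenseFinding {
  if (declaredSpdxId === null) return { kind: "missing-license" };

  const declared = KNOWN_LICENSES.get(declaredSpdxId);
  if (!declared) return { kind: "unsupported-license", spdxId: declaredSpdxId };

  if (suspectedSourceSpdxId === null) return { kind: "ok" };
  const source = KNOWN_LICENSES.get(suspectedSourceSpdxId);

  // Flag copyleft-licensed source code that now appears under a different license.
  if (source && source.copyleft !== "none" && source.spdxId !== declared.spdxId) {
    return {
      kind: "license-mismatch",
      declared: declared.spdxId,
      suspectedSource: source.spdxId,
    };
  }
  return { kind: "ok" };
}
```

For example, `analyzeLicense("MIT", "LGPL-2.1-only")` yields a `license-mismatch` finding, which the report could turn into a high-severity issue. Returning a tagged union keeps the "no license" and "unsupported license" cases explicit instead of folding them into a boolean.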
UI/UX DESIGN:
- **Layout:** Single-page application. A persistent sidebar navigation for main sections (Dashboard, Connections, Scan History, Settings). The main content area displays the active section.
- **Color Palette:** Primary: Deep blue (`#1e3a8a` - blue-900). Secondary: Teal (`#06b6d4` - cyan-500). Accent: Orange (`#f97316` - orange-500) for CTAs and alerts. Background: Light gray (`#f3f4f6` - gray-100). Text: Dark gray (`#1f2937` - gray-800).
- **Typography:** Use a clean, modern sans-serif font like Inter or Roboto. Headings: Bold, larger sizes. Body text: Regular weight, readable size (16px).
- **Responsive Design:** Mobile-first approach. Sidebar collapses into a hamburger menu on smaller screens. Content adapts fluidly. Ensure usability across devices (desktop, tablet, mobile).
- **Key Components:** Sidebar Navigation, Repository List, Scan Status Indicator, Report Viewer, Modal Dialogs, Input Fields, Buttons.
- **Visual Style:** Clean, professional, data-focused. Use subtle shadows for cards and containers. Clear visual hierarchy.
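One way to keep the palette and typography consistent is to define them once as design tokens and spread them into the `theme.extend` section of the Tailwind config. This is a sketch only: the token names (`brand`, `primary`, `surface`, and so on) and the `statusColor` mapping for the green/yellow/red indicators are assumptions, not part of the spec.

```typescript
// Hex values are taken from the palette above; the key names are hypothetical.
const brandColors = {
  primary: "#1e3a8a",   // deep blue
  secondary: "#06b6d4", // teal
  accent: "#f97316",    // orange, for CTAs and alerts
  surface: "#f3f4f6",   // light gray background
  body: "#1f2937",      // dark gray text
} as const;

// Shaped like the `theme.extend` object of a Tailwind config file, so that
// utilities such as `bg-brand-primary` or `text-brand-body` become available.
const themeExtension = {
  colors: { brand: brandColors },
  fontFamily: { sans: ["Inter", "Roboto", "sans-serif"] },
};

type OverallCompliance = "compliant" | "non-compliant" | "requires_review";

// Assumed mapping from compliance status to the dashboard's status colors.
function statusColor(status: OverallCompliance): "green" | "yellow" | "red" {
  switch (status) {
    case "compliant":
      return "green";
    case "requires_review":
      return "yellow";
    case "non-compliant":
      return "red";
  }
}
```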
COMPONENT BREAKDOWN:
- `App.jsx`: Main application component, sets up routing.
- `Layout.jsx`: Main layout with Sidebar and content area.
- `Sidebar.jsx`: Navigation menu component. Props: `isOpen`, `onClose`.
- `Dashboard.jsx`: Displays overview of scans and compliance status. Needs data from state.
- `Connections.jsx`: Manages repository connections (GitHub/GitLab OAuth). Props: `user`.
- `RepositoryList.jsx`: Displays list of connected repositories. Props: `repositories`, `onSelectRepository`.
- `ScanButton.jsx`: Button to initiate a scan. Props: `repositoryId`, `onScanStart`.
- `ScanStatus.jsx`: Displays the current status of a scan (e.g., 'Scanning', 'Completed', 'Error'). Props: `status`, `progress`.
- `ReportViewer.jsx`: Displays the detailed compliance report. Props: `reportData`.
- `ReportSummary.jsx`: Component within `ReportViewer` showing a high-level summary. Props: `summaryData`.
- `ComplianceIssue.jsx`: Component highlighting a specific compliance issue. Props: `issue`.
- `CodeFileListItem.jsx`: Displays a file identified during the scan. Props: `fileName`, `status`, `onClick`.
- `Modal.jsx`: Generic modal component. Props: `isOpen`, `onClose`, `title`, `children`.
- `AuthButton.jsx`: Button for OAuth login. Props: `provider`, `onAuthSuccess`.
DATA MODEL:
- **UserState:** `{ userId: string, token: string | null, repositories: Repository[] }`
- **Repository:** `{ id: string, name: string, url: string, pushedAt: string, license: { key: string, name: string, spdxId: string } | null, lastScan: ScanResult | null }`
- **ScanResult:** `{ scanId: string, repositoryId: string, status: 'pending' | 'scanning' | 'completed' | 'error', startTime: string, endTime: string | null, issues: ComplianceIssue[], potentiallyAIModifiedFiles: string[], overallCompliance: 'compliant' | 'non-compliant' | 'requires_review', reportSummary: string }`
- **ComplianceIssue:** `{ file: string, type: string, description: string, suggestedAction: string, severity: 'low' | 'medium' | 'high' }`
- **State Management (Zustand):** Use separate stores for `userStore`, `repositoryStore`, `scanStore`.
- **Mock Data:** All data for MVP will be mock data fetched from a JSON file or a simple mock API endpoint.
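The shapes above translate directly into TypeScript types that the stores and components can share. The type definitions below mirror the data model; the `deriveOverallCompliance` helper and its rule (any high-severity issue means non-compliant, any other issue means review) are an assumption added for illustration, since the spec does not say how `overallCompliance` is computed.

```typescript
type Severity = "low" | "medium" | "high";
type ScanStatus = "pending" | "scanning" | "completed" | "error";
type OverallCompliance = "compliant" | "non-compliant" | "requires_review";

interface ComplianceIssue {
  file: string;
  type: string;
  description: string;
  suggestedAction: string;
  severity: Severity;
}

interface ScanResult {
  scanId: string;
  repositoryId: string;
  status: ScanStatus;
  startTime: string; // ISO 8601 timestamp
  endTime: string | null;
  issues: ComplianceIssue[];
  potentiallyAIModifiedFiles: string[];
  overallCompliance: OverallCompliance;
  reportSummary: string;
}

interface Repository {
  id: string;
  name: string;
  url: string;
  pushedAt: string;
  license: { key: string; name: string; spdxId: string } | null;
  lastScan: ScanResult | null;
}

interface UserState {
  userId: string;
  token: string | null;
  repositories: Repository[];
}

// Hypothetical rule for the mock: no issues -> compliant; at least one
// high-severity issue -> non-compliant; otherwise -> requires_review.
function deriveOverallCompliance(issues: ComplianceIssue[]): OverallCompliance {
  if (issues.length === 0) return "compliant";
  return issues.some((issue) => issue.severity === "high")
    ? "non-compliant"
    : "requires_review";
}
```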
ANIMATIONS & INTERACTIONS:
- **Page Transitions:** Subtle fade-in/fade-out transitions between major sections, using `<CSSTransition>` from `react-transition-group` (React Router does not ship a transition component) or a similar library.
- **Button Hovers:** Slight scale-up and background color change on interactive elements like buttons.
- **Loading States:** Skeleton loaders or spinners for data fetching and scan processes. `ScanStatus` component will visually represent progress.
- **Micro-interactions:** Smooth expanding/collapsing of report sections. Subtle bounce effect on successful actions (e.g., connection established).
- **Notifications:** Use a toast notification system (e.g., `react-toastify`) for success/error messages.
EDGE CASES:
- **No Repositories Connected:** Display a clear message and CTA on the Dashboard and Connections page prompting the user to connect a repository.
- **Scan Errors:** Handle API errors gracefully. Display informative error messages to the user. Implement retry mechanisms for transient network issues.
- **No License File:** If a repository lacks a `LICENSE` file, flag it as a potential issue and prompt the user to add one.
- **Unsupported License:** Identify and flag licenses not present in the internal database.
- **Empty Scan Results:** If a scan completes with no identified issues, display a clear "All Clear" message.
- **Authentication Errors:** Handle OAuth failures and expired tokens, prompting re-authentication.
- **Accessibility (a11y):** Ensure all interactive elements have proper ARIA attributes, keyboard navigation support, and sufficient color contrast. Use semantic HTML5 elements.
- **Rate Limiting:** (For future backend) Implement awareness of API rate limits for platforms like GitHub.
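For the "Scan Errors" case, the retry mechanism could be a small wrapper with exponential backoff. This is a sketch under assumptions: the helper names `backoffDelayMs` and `withRetry`, the default delays, and the attempt limit are illustrative choices, and the caller decides what counts as a transient error.

```typescript
// Delay before retrying after a failed attempt (0-based): baseMs * 2^attempt,
// capped at maxMs so repeated failures do not wait indefinitely.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task`, retrying only while `isTransient(error)` returns true and the
// attempt limit has not been reached. Rethrows the last error otherwise.
async function withRetry<T>(
  task: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (!isTransient(error) || isLastAttempt) break;
      await new Promise<void>((resolve) =>
        setTimeout(resolve, backoffDelayMs(attempt)),
      );
    }
  }
  throw lastError;
}
```

A scan request could then be wrapped as `withRetry(() => fetchScan(id), isNetworkError)`, where `fetchScan` and `isNetworkError` are placeholders for the app's own API call and error classifier. Authentication failures should not be classified as transient, so they surface immediately and trigger re-authentication.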
SAMPLE DATA:
1. **Repository Mock:**
```json
{
  "id": "repo-123",
  "name": "chardet-ai-reimplementation",
  "url": "https://github.com/user/chardet-ai-reimplementation",
  "pushedAt": "2024-03-10T10:00:00Z",
  "license": {"key": "mit", "name": "MIT License", "spdxId": "MIT"},
  "lastScan": null
}
```
2. **ScanResult Mock (Compliant):**
```json
{
  "scanId": "scan-abc",
  "repositoryId": "repo-123",
  "status": "completed",
  "startTime": "2024-03-10T11:00:00Z",
  "endTime": "2024-03-10T11:05:00Z",
  "issues": [],
  "potentiallyAIModifiedFiles": ["src/chardet.py"],
  "overallCompliance": "compliant",
  "reportSummary": "Repository is compliant with its declared MIT license. No significant compliance issues detected."
}
```
3. **ScanResult Mock (Non-Compliant - LGPL vs MIT):**
```json
{
  "scanId": "scan-def",
  "repositoryId": "repo-456",
  "status": "completed",
  "startTime": "2024-03-10T12:00:00Z",
  "endTime": "2024-03-10T12:15:00Z",
  "issues": [
    {
      "file": "src/core_logic.js",
      "type": "LicenseMismatch",
      "description": "Code reimplemented with AI appears to be derived from an LGPL licensed project, but the repository declares an MIT license. LGPL requires modifications to be distributed under the same license.",
      "suggestedAction": "Review AI reimplementation claims. Consider relicensing under LGPL or ensuring a clean-room implementation was legally performed.",
      "severity": "high"
    }
  ],
  "potentiallyAIModifiedFiles": ["src/core_logic.js", "src/utils.js"],
  "overallCompliance": "non-compliant",
  "reportSummary": "Potential license compliance issue detected. The MIT license may not be valid for AI-modified code derived from LGPL sources."
}
```
4. **ComplianceIssue Mock:** (See above in ScanResult Mock)
5. **Repository Mock (No License):**
```json
{
  "id": "repo-789",
  "name": "project-no-license",
  "url": "https://github.com/user/project-no-license",
  "pushedAt": "2024-02-15T08:00:00Z",
  "license": null,
  "lastScan": null
}
```
6. **AI Contribution Metadata (Example, not a formal data structure):** Imagine commit messages like "feat: AI-assisted rewrite of X module using Claude" or internal flags indicating high code similarity to training data with weak attribution.
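The metadata signals described above (AI mentions in commit messages, unusually large rewrites) could feed a simple flagging heuristic for the simulated scan. The sketch below is illustrative only: the `CommitMeta` shape, the keyword pattern, and the 0.8 rewrite threshold are assumptions, and a heuristic this crude will produce both false positives and false negatives.

```typescript
// Hypothetical per-file commit metadata supplied by the mock API.
interface CommitMeta {
  message: string;
  linesChanged: number; // lines touched in the file by this commit
  linesInFile: number;  // file length after the commit
}

// Illustrative keyword list; extend or replace as needed.
const AI_HINT_PATTERN = /\b(ai[- ](assisted|generated)|copilot|claude|chatgpt)\b/i;

function assessCommit(
  commit: CommitMeta,
  rewriteThreshold = 0.8,
): { flagged: boolean; reasons: string[] } {
  const reasons: string[] = [];

  if (AI_HINT_PATTERN.test(commit.message)) {
    reasons.push("commit message mentions an AI tool");
  }

  // Share of the file touched by this single commit (0 for empty files).
  const rewriteRatio =
    commit.linesInFile > 0 ? commit.linesChanged / commit.linesInFile : 0;
  if (rewriteRatio >= rewriteThreshold) {
    reasons.push("most of the file was rewritten in one commit");
  }

  return { flagged: reasons.length > 0, reasons };
}
```

Returning the reasons alongside the flag lets the report explain why a file appears under "Potentially AI-Modified" rather than presenting an unexplained verdict.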
DEPLOYMENT NOTES:
- **Build Tool:** Vite is recommended for its speed. Adjust `vite.config.js` as needed and produce production builds with `vite build` (the `build` script).
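- **Environment Variables:** Use `.env` files for configuration such as the GitHub OAuth client ID. Prefix variables with `VITE_` for client-side access. Vite embeds every `VITE_`-prefixed variable in the client bundle, so never store secrets there: the OAuth client secret and any private API keys belong on a backend or serverless function.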
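- **Hosting:** Deployable on static hosting platforms like Vercel, Netlify, or GitHub Pages. Because routing happens client-side, configure a fallback to `index.html` so that deep links resolve.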
- **Performance Optimizations:** Route-level code splitting with `React.lazy` and dynamic `import()`, lazy loading of heavy components, image optimization (if any), and memoization (`React.memo`, `useMemo`, `useCallback`) where profiling shows unnecessary re-renders.
- **Error Handling:** Implement a global error boundary in React to catch unexpected errors. Centralized error logging (e.g., Sentry for production).
- **CI/CD:** Set up a CI/CD pipeline (e.g., GitHub Actions) for automated testing and deployment.
- **Dependencies:** Keep dependencies updated. Regularly audit for security vulnerabilities using tools like `npm audit`.