You are an expert AI assistant and senior full-stack developer tasked with creating a single-page application (SPA) for a browser extension that identifies and blocks unwanted AI-generated content, like ads or affiliate links, within code repositories, specifically focusing on GitHub pull requests. The application aims to ensure the integrity and ethical use of AI coding assistants.
**PROJECT OVERVIEW:**
The core problem this application solves is the unauthorized insertion of promotional content or third-party advertisements into code, particularly within the context of AI-generated suggestions or modifications in tools like GitHub Copilot. Developers have expressed frustration and concern over AI assistants altering code or commit messages to include links to other products or services without explicit consent. This tool will act as a safeguard, providing developers with transparency and control over the content generated by AI tools in their workflow. The primary value proposition is restoring developer trust in AI coding assistants by ensuring their outputs remain focused on code generation and are free from unsolicited marketing.
**TECH STACK:**
- **Frontend Framework:** React (using Vite for fast development and build)
- **Styling:** Tailwind CSS for rapid UI development and a consistent design system.
- **State Management:** Zustand for efficient and simple global state management.
- **Browser Extension API:** Standard WebExtensions APIs for Chrome, Firefox, etc.
- **Routing (if needed for settings page):** React Router DOM (though likely minimal for a basic extension).
- **Utility Libraries:** Lodash (for utility functions), clsx (for conditional class name manipulation).
**CORE FEATURES:**
1. **AI Content Detection (Browser Extension):**
* **User Flow:** When a user navigates to a GitHub pull request page, the extension automatically scans the page's content, focusing on commit messages, code diffs, and comments where AI tools might have intervened. It looks for patterns indicative of promotional text, affiliate links, or calls to action for unrelated products/services.
* **Detection Logic:** The extension will use a combination of regex patterns and potentially a lightweight, on-device ML model (if feasible and performant for a browser extension) to identify suspicious text. Predefined patterns will target common marketing phrases, link shorteners associated with affiliate programs, and known product names that are not relevant to the code change.
* **UI Integration:** Detected content will be visually highlighted on the GitHub page (e.g., a colored border or background) with a small, non-intrusive icon. Hovering over the highlighted section will display a tooltip explaining the detected issue and its potential source (e.g., 'Potential Ad/Promotion by AI Tool').
2. **User Dashboard/Settings (SPA):**
* **User Flow:** Users can access a dedicated settings page (either via a browser extension popup or a separate SPA page opened from the extension icon) to manage their preferences. This page will display a summary of detected items across repositories they frequently visit.
* **Functionality:** Users can enable/disable the detection feature, whitelist specific domains or patterns they trust, and configure the sensitivity of the detection.
* **Reporting:** A mechanism to report false positives or new patterns to the central system for community learning.
3. **Community Reporting & Learning:**
* **User Flow:** If a user encounters content they believe is an incorrect detection (false positive) or a new type of unwanted AI content, they can report it through the settings page or a direct button near the highlighted content. Reports are sent to a central backend (initially mock data, later a simple API).
* **System Update:** Aggregated, anonymized reports will be used to refine the detection algorithms and patterns over time, improving accuracy.
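The regex-based part of the detection logic in feature 1 can be sketched as a pure function that scans text line by line and returns match ranges. This is a minimal sketch under stated assumptions, not a definitive implementation: it covers regex matching only (no ML model), and the names `DEFAULT_PATTERNS`, `isWhitelisted`, and `detectPromotions`, as well as the two example patterns, are hypothetical.

```javascript
// Hypothetical built-in patterns; a real list would be curated and updated over time.
const DEFAULT_PATTERNS = [
  {
    id: "sales-phrase",
    regex: "\\b(buy now|special offer|limited time offer)\\b",
    description: "Generic sales phrases",
  },
  {
    id: "discount",
    regex: "\\b\\d{1,3}% off\\b",
    description: "Percentage discounts",
  },
];

// Returns true when the line mentions any whitelisted entry (case-insensitive substring).
function isWhitelisted(line, whitelist) {
  const lower = line.toLowerCase();
  return whitelist.some((entry) => lower.includes(entry.toLowerCase()));
}

// Scans text line by line and returns one detection per pattern match.
// start/end are character offsets into the full text (end is exclusive).
function detectPromotions(text, { patterns = DEFAULT_PATTERNS, whitelist = [] } = {}) {
  const detections = [];
  let offset = 0; // index of the current line's first character within text
  for (const line of text.split("\n")) {
    if (!isWhitelisted(line, whitelist)) {
      for (const pattern of patterns) {
        const re = new RegExp(pattern.regex, "gi");
        for (const match of line.matchAll(re)) {
          detections.push({
            patternId: pattern.id,
            content: match[0],
            start: offset + match.index,
            end: offset + match.index + match[0].length,
          });
        }
      }
    }
    offset += line.length + 1; // +1 for the newline removed by split
  }
  return detections;
}
```

Keeping the function free of DOM and storage access means it could be called from the content script and from unit tests alike. Skipping a whole line when it contains a whitelisted entry is a design assumption; a stricter variant might whitelist only the matched link.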
**UI/UX DESIGN:**
- **Layout:** The SPA will be minimalist and functional. The primary view will be the settings/dashboard. A browser action popup will provide quick access to enable/disable and see recent detections.
- **Color Palette:** Dark theme preferred, reflecting a developer-centric environment. Primary colors: Dark charcoal (`#1a1a1a`), Accent: A vibrant but not jarring blue (`#4f77ff`) or green (`#34d399`) for active states and highlights. Secondary colors: Light gray (`#e5e7eb`) for text, muted tones for warnings.
- **Typography:** Clean, readable sans-serif font like Inter or Manrope.
- **Responsive Design:** Primarily designed for desktop browser use. The SPA settings page should be responsive enough for smaller screens if accessed via mobile browser, though the core functionality is the extension.
- **GitHub Integration:** UI elements (highlights, icons) should blend reasonably well with GitHub's existing UI without being overly distracting.
**COMPONENT BREAKDOWN:**
- `App.jsx`: Main application component, sets up routing and global providers.
- `SettingsPage.jsx`: Renders the main dashboard/settings UI. Contains child components for configuration and reporting.
- `DetectionToggle.jsx`: Simple toggle switch to enable/disable the extension's detection feature. Props: `isEnabled`, `onToggle`.
- `PatternList.jsx`: Displays the list of current detection patterns/rules. Props: `patterns`.
- `WhitelistManager.jsx`: Component for adding/removing trusted domains or patterns. Props: `whitelist`, `onAdd`, `onRemove`.
- `ReportingSection.jsx`: Contains buttons/forms for reporting false positives or new patterns. Props: `onSubmitReport`.
- `ExtensionPopup.jsx`: The small popup shown when clicking the browser extension icon. Displays quick status and enable/disable control. Props: `isEnabled`, `onToggle`, `recentDetections`.
- `DetectorEngine.js` (Utility/Logic file, not a component): Contains the core detection logic, regex patterns, and potentially ML model inference.
- `GitHubHighlighter.js` (Content Script Logic): Injects styles and highlights detected content directly onto GitHub pages.
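`GitHubHighlighter.js` has to turn match ranges into highlighted page content. One way to keep that logic testable is to compute text segments separately from the DOM mutation. The following is a minimal sketch; `splitIntoSegments` and the `{ start, end }` range shape are assumptions made for illustration.

```javascript
// Splits text into consecutive segments, marking those covered by a detection range.
// Ranges are { start, end } with end exclusive; overlapping ranges are merged first.
function splitIntoSegments(text, ranges) {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const merged = [];
  for (const range of sorted) {
    const last = merged[merged.length - 1];
    if (last && range.start <= last.end) {
      last.end = Math.max(last.end, range.end);
    } else {
      merged.push({ start: range.start, end: range.end });
    }
  }

  const segments = [];
  let cursor = 0;
  for (const range of merged) {
    if (range.start > cursor) {
      segments.push({ text: text.slice(cursor, range.start), flagged: false });
    }
    segments.push({ text: text.slice(range.start, range.end), flagged: true });
    cursor = range.end;
  }
  if (cursor < text.length) {
    segments.push({ text: text.slice(cursor), flagged: false });
  }
  return segments;
}
```

In the content script, each flagged segment could then be wrapped in an element created with `document.createElement` and filled via `textContent` rather than `innerHTML`, so that flagged text is never interpreted as markup.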
**DATA MODEL:**
- **Browser Extension Storage (`chrome.storage.sync` or `chrome.storage.local`):** `isEnabled` is a boolean; `detectionSensitivity` is one of `"low"`, `"medium"`, or `"high"`.
```json
{
  "isEnabled": true,
  "detectionSensitivity": "high",
  "customPatterns": [
    {
      "id": "pattern-1",
      "regex": "\\b(buy now|special offer)\\b",
      "description": "Generic sales phrases"
    }
  ],
  "whitelist": [
    "example.com/trusted-repo",
    "my-internal-tool"
  ]
}
```
- **Mock Backend/Reporting Data:** Used for community learning; initially this could be logged locally or kept in mock storage. `reportType` is either `"false_positive"` or `"new_unwanted_content"`.
```json
{
  "reports": [
    {
      "id": "report-123",
      "url": "https://github.com/user/repo/pull/123",
      "detectedContent": "Get 50% off at amazing-tool.com!",
      "detectedPatternId": "pattern-sales-phrase",
      "reportType": "false_positive",
      "timestamp": "2024-03-15T10:30:00Z"
    }
  ]
}
```
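Stored settings can be missing or partially written (for example, on first install), so reads should merge with defaults. The sketch below assumes the storage shape shown above and a promise-returning `get`, as offered by Manifest V3 `chrome.storage` and Firefox's `browser.storage`. The storage area is passed in so the same function works with `sync`, `local`, or a test double. `DEFAULT_SETTINGS`, `normalizeSettings`, and `loadSettings` are hypothetical names, and the `"medium"` default is an assumption.

```javascript
// Assumed defaults; the spec does not state which sensitivity is the default.
const DEFAULT_SETTINGS = {
  isEnabled: true,
  detectionSensitivity: "medium",
  customPatterns: [],
  whitelist: [],
};

const SENSITIVITY_LEVELS = ["low", "medium", "high"];

// Merges raw stored values with defaults and discards values of the wrong type.
function normalizeSettings(raw = {}) {
  return {
    isEnabled:
      typeof raw.isEnabled === "boolean" ? raw.isEnabled : DEFAULT_SETTINGS.isEnabled,
    detectionSensitivity: SENSITIVITY_LEVELS.includes(raw.detectionSensitivity)
      ? raw.detectionSensitivity
      : DEFAULT_SETTINGS.detectionSensitivity,
    customPatterns: Array.isArray(raw.customPatterns) ? raw.customPatterns : [],
    whitelist: Array.isArray(raw.whitelist)
      ? raw.whitelist.filter((entry) => typeof entry === "string")
      : [],
  };
}

// Reads settings from a storage area such as chrome.storage.sync.
async function loadSettings(storageArea) {
  const stored = await storageArea.get(Object.keys(DEFAULT_SETTINGS));
  return normalizeSettings(stored);
}
```

A Zustand store could call `loadSettings(chrome.storage.sync)` once at startup and use the result as its initial state.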
**ANIMATIONS & INTERACTIONS:**
- **Toggle Switches:** Smooth transition for enable/disable toggles.
- **Highlighting:** Subtle background flash or color change when new content is detected and highlighted on the GitHub page.
- **Tooltips:** Fade-in/fade-out effect for tooltips on hover.
- **Loading States:** Indicate when settings are being saved or data is being fetched (if applicable later).
- **Reporting Feedback:** Visual confirmation (e.g., a small checkmark animation) after a successful report submission.
**EDGE CASES:**
- **No Detections:** The settings page should gracefully handle having no detections or reports yet (displaying placeholder text).
- **Extension Disabled:** When `isEnabled` is false, the content script should skip scanning and remove any injected highlights. The popup and settings page should remain usable so the user can turn detection back on.
- **GitHub UI Changes:** The extension's content script needs to be robust against minor changes in GitHub's DOM structure. Use stable selectors where possible or implement fallback mechanisms.
- **Performance:** The content script must be highly performant to avoid lagging the user's browsing experience on GitHub. Detection should be efficient and run asynchronously.
- **False Positives/Negatives:** Implement clear ways for users to report these and have a mechanism for the system to learn. Provide user controls (sensitivity, whitelist) to mitigate.
- **Privacy:** Ensure all reported data is anonymized and includes only the information needed to improve detection. State the privacy policy clearly to users.
- **Accessibility (a11y):** Ensure all UI elements are keyboard navigable and have proper ARIA attributes. Use sufficient color contrast.
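The privacy requirement above can be made more concrete by building each report from an explicit list of allowed fields. The sketch below is one possible approach, not a complete anonymization scheme: `buildReport`, `REPORT_TYPES`, and the 500-character cap are assumptions, and the report `id` is omitted on the assumption that the backend assigns it.

```javascript
const REPORT_TYPES = ["false_positive", "new_unwanted_content"];
const MAX_CONTENT_LENGTH = 500; // assumed cap on how much page text leaves the browser

// Builds a report payload containing only the fields needed for pattern tuning.
// The URL is reduced to origin + path, dropping query strings and fragments.
function buildReport({ url, detectedContent, detectedPatternId = null, reportType }, now = new Date()) {
  if (!REPORT_TYPES.includes(reportType)) {
    throw new Error(`Unknown reportType: ${reportType}`);
  }
  const parsed = new URL(url);
  return {
    url: parsed.origin + parsed.pathname,
    detectedContent: String(detectedContent).slice(0, MAX_CONTENT_LENGTH),
    detectedPatternId,
    reportType,
    timestamp: now.toISOString(),
  };
}
```

This only trims the payload. Whether a repository URL itself counts as identifying information is a policy decision that the sketch leaves open.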
**SAMPLE DATA:**
1. **Mock Pull Request Text Snippet (Potentially flagged):**
```
// Copilot suggestion:
// Fix typo in function name
// Check out our new amazing-tool.com for productivity boosts! Limited time offer!
function correcttypo() {
// ... code ...
}
```
* *Analysis:* Flags 'amazing-tool.com' (potential new domain) and 'Limited time offer!' (sales phrase).
2. **Mock Commit Message:**
```
feat: Implement user authentication
// AI added tip: Boost your workflow with Raycast! Install here: [raycast.com/install]
```
* *Analysis:* Flags the Raycast promotional message and link.
3. **Mock Code Comment (Potentially flagged):**
```javascript
// TODO: Refactor this later.
// Note: For advanced debugging, consider using DevTool Pro - get 20% off with code DEVBETA20 at devtoolpro.example.com
const data = fetchData();
```
* *Analysis:* Flags the affiliate offer and link.
4. **Safe Content Example:**
```
// Copilot suggestion:
// Add validation for email input
function validateEmail(email) {
const re = /^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
return re.test(String(email).toLowerCase());
}
```
* *Analysis:* No promotional keywords or suspicious links. This should NOT be flagged.
5. **Whitelisted Content Example (User configured):**
```
// Copilot suggestion:
// Integrate with our internal Jira ticket system API
// See docs at internal.company.com/jira-api
function createJiraTicket(summary, description) {
// ... api call ...
}
```
* *Analysis:* If `internal.company.com` is whitelisted, this should NOT be flagged, even if it looks promotional.
6. **Complex Regex Pattern Example (for `customPatterns`):**
```json
{
"id": "affiliate-link-pattern",
"regex": "\\b(get|buy|shop|discount|promo|offer)\\b.*(here|now|today)\\b.*(http|https):\\/\\/.",
"description": "Generic phrases suggesting a purchase with a link"
}
```
7. **Mock Detection Result Object (Internal representation):** `type` is one of `"advertisement"`, `"promotion"`, `"affiliate_link"`, or `"unsolicited_tip"`.
```json
{
  "id": "detection-abc",
  "type": "advertisement",
  "content": "Limited time offer! Buy now at amazing-tool.com!",
  "sourceUrl": "https://github.com/user/repo/pull/123/commits/commitsha",
  "matchedPattern": "affiliate-link-pattern",
  "timestamp": "2024-03-15T10:25:00Z"
}
```
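Because `customPatterns` are stored as JSON strings and may be user-supplied, they need to be compiled defensively before use. The sketch below compiles the pattern from sample 6 and produces an object in the shape of sample 7. `compilePattern` and `toDetectionResult` are hypothetical helpers, and the case-insensitive `i` flag and the fixed `"advertisement"` type are assumptions, since the spec does not say how flags or types are chosen. The `id` field is left out on the assumption that the caller generates it.

```javascript
// Compiles a stored pattern; returns null when the regex source is invalid.
function compilePattern(pattern) {
  try {
    return { id: pattern.id, re: new RegExp(pattern.regex, "i") };
  } catch (err) {
    return null; // e.g. an unbalanced parenthesis typed in the settings page
  }
}

// Returns a detection result for the first pattern that matches, or null if none do.
function toDetectionResult(text, sourceUrl, patterns, now = new Date()) {
  for (const pattern of patterns) {
    const compiled = compilePattern(pattern);
    const match = compiled && compiled.re.exec(text);
    if (match) {
      return {
        type: "advertisement", // assumed default; classification is not specified
        content: match[0],
        sourceUrl,
        matchedPattern: compiled.id,
        timestamp: now.toISOString(),
      };
    }
  }
  return null;
}

// The pattern from sample 6, written as a JavaScript object.
const affiliatePattern = {
  id: "affiliate-link-pattern",
  regex: "\\b(get|buy|shop|discount|promo|offer)\\b.*(here|now|today)\\b.*(http|https):\\/\\/.",
  description: "Generic phrases suggesting a purchase with a link",
};
```

Note that this pattern contains two unbounded `.*` runs, so it may be worth capping the length of each scanned line before matching to keep worst-case matching time predictable.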
**DEPLOYMENT NOTES:**
- **Build Process:** Use Vite (`npm run build`) for optimized production builds for the SPA and the browser extension components (content scripts, background scripts, popup).
- **Environment Variables:** Use `.env` files for managing configurations like API endpoints (if a backend is introduced later) or feature flags. Do not put secrets in them: anything bundled into a browser extension ships to the client and can be read by users.
- **Packaging:** For distribution (e.g., Chrome Web Store, Firefox Add-ons), the build output will need to be packaged according to each platform's requirements. Manifest files (`manifest.json`) need correct configuration for content scripts, background scripts, permissions, and icons.
- **Content Script Injection:** Configure the `manifest.json` to inject the content script (`content.js` or similar) into the appropriate GitHub pages (`*://github.com/*`).
- **Performance Optimization:** Ensure all JavaScript is minified and bundled efficiently. Lazy load components where appropriate in the SPA. Optimize regex patterns for performance. Consider debouncing or throttling DOM scanning in the content script if performance issues arise.
- **Update Strategy:** Plan for how the detection patterns and potentially the ML model will be updated over time. This could involve pushing updates to the extension package or a mechanism to fetch updated patterns from a server.
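If updated patterns are fetched from a server, they have to be combined with the patterns a user has defined locally. The following sketch shows one possible merge policy. `mergePatterns` is a hypothetical helper, and the rule that a user's pattern wins on an `id` collision is an assumption, not something the spec prescribes.

```javascript
// Merges remotely fetched patterns with the user's custom patterns.
// Assumption: on an id collision the user's pattern wins, so local edits are never overwritten.
function mergePatterns(remotePatterns, customPatterns) {
  const byId = new Map();
  for (const pattern of remotePatterns) {
    byId.set(pattern.id, { ...pattern, source: "remote" });
  }
  for (const pattern of customPatterns) {
    byId.set(pattern.id, { ...pattern, source: "custom" });
  }
  return [...byId.values()];
}
```

Tagging each entry with its `source` would let `PatternList.jsx` show which rules came from the community list and which were added by the user.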