[{"content":"Duration: 2 weeks (Start date: 2025-04-21, End date: 2025-05-04)\nThe following sections build on the Requirements Specifications document.\nSprint Goal Testing and Quality Assurance (IR1, NFR1, NFR2)\nSince the code is stable, write down the ArchUnit tests Define quality metrics using PCA and pairwise analysis Visualization and Analysis (IR5, FR1.2, BR2, NFR4)\nVisualize code embeddings for quality analysis Generate insights for documentation through scatter plots and pair-wise heatmaps Documentation and Reporting (NFR2, IR4)\nCreate user documentation generated from code comments Complete project report Project Housekeeping (IR1, NFR2)\nPerform code refactoring and cleanup Prepare for project release Finalize website based on the Nerfies template Sprint Backlog For detailed sprint backlog items, see Sprint Backlog.\nSprint Retrospective What went well?\nSuccessfully completed all major project deliverables Finalized documentation and visualization analysis Implemented ArchUnit tests to validate architectural decisions Produced report with quality metrics analysis Project Status\nAll tasks from the product backlog have been completed (according to the definition of done) The Nerfies website provides an overview of the key decisions made during development ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint5/overview/","summary":"Fifth sprint (April 21-May 04, 2025) overview, focused on completing the project report, documentation, and final analysis.","title":"Sprint 5 Overview: Report Completion and Project Finalization"},{"content":"Duration: 1 week (Start date: 2025-04-14, End date: 2025-04-20)\nThe following sections build on the Requirements Specifications document.\nSprint Goal Performance Optimization (NFR1, FR2.2, FR2.3, NFR2)\nOptimize repository lookup and processing speed Implement caching mechanism for embeddings Ensure responsive search and chat experience Visualization and Analysis (IR5, FR1.2, BR2)\nVisualize 
code embeddings for quality analysis Provide metrics on search effectiveness Generate insights for documentation Scala.js Frontend Implementation (NFR2, IR1, NFR3)\nComplete the Scala.js interface components started in Sprint 3 Integrate frontend with backend services Enhance user experience with responsive design Documentation and Reporting (NFR2, IR4)\nCreate comprehensive user documentation Generate project report Document architecture and design decisions Sprint Backlog For detailed sprint backlog items, see Sprint Backlog.\nSprint Retrospective What went well? Most of the issues were resolved; only the writing of the report remains outstanding. During the next sprint, I will focus on completing the report, including the evaluation of the library through experiments. ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint4/overview/","summary":"Fourth sprint (April 14-20, 2025) overview, focused on implementing performance optimizations, visualization features, and comprehensive documentation.","title":"Sprint 4 Overview: Performance Optimization, Visualization, and Documentation"},{"content":"Duration: 1 week (Start date: 2025-04-07, End date: 2025-04-13)\nThe following sections build on the Requirements Specifications document.\nSprint Goal Natural Language Code Search (FR1.2, FR1.3, FR1.4, FR1.5, BR1, NFR2)\nEnable semantic search across indexed codebases Support language/extension filtering capabilities Display search results with proper code context Code Understanding Chat Interface (FR1.6, FR1.5, FR2.4, BR2, NFR2)\nProvide interactive chat interface for code questions Retrieve and use relevant code context for queries Generate context-aware responses with code references User Interface Implementation (NFR2, IR1, NFR3)\nCreate intuitive frontend for search and chat functionality Support Scala.js interface components Ensure responsive and accessible design Technical Debt \u0026amp; Testing (IR4, NFR1, FR1.2)\nImplement 
ArchUnit tests for layered architecture Create acceptance tests for repository loading Develop search relevance and chat accuracy testing Sprint Backlog For detailed sprint backlog items, see Sprint Backlog.\nSprint Retrospective What went well?\nMany tasks carried over from the previous sprint have been completed Acceptance tests involving all types of requirements were built. This includes tests for the requirements listed in the requirements document. Much of the technical debt has been resolved Implemented the search functionality with filtering capabilities Added comprehensive test suites for validating functional, business, and non-functional requirements using BDD-style tests What could be improved?\nNot all planned tasks were completed: The Scala.js interface components (S3.3.2) implementation was only started but not completed ArchUnit tests for layered architecture (S3.4.1) remain to be implemented Chat accuracy evaluation (S3.4.4) wasn\u0026rsquo;t completed Frontend development in Scala.js requires more attention in the next sprint Integration between the backend and frontend components needs further refinement What did I learn?\nBDD-style testing with ScalaTest FeatureSpec provides a clear way to validate requirements Modern build tools like Vite improve the frontend development experience by dynamically reloading changes Implementing comprehensive test suites early helps validate that requirements are being met correctly Python prototyping allowed for quick validation of UI concepts before full Scala.js implementation ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint3/overview/","summary":"Third sprint (April 7-13, 2025) overview, focused on implementing search features, chat interface, and responsive UI.","title":"Sprint 3 Overview: Search Features, Chat Interface, and UI Implementation"},{"content":"Duration: 1 week (Start date: 2025-03-31, End date: 2025-04-06)\nThe following sections build on the Requirements 
Specifications document.\nSprint Goal Repository URL Input Interface (FR1.1, NFR2, NFR3)\nGit Repository Fetching Service (FR2.1, IR1, NFR1, NFR2, IR3)\nCode Processing Pipeline (FR2.1, FR2.2, FR1.4)\nSearch Indexing System (FR2.2, FR2.3, IR2, NFR1)\nArchitectural Foundation (IR1, IR4)\nSprint Backlog For detailed sprint backlog items, see Sprint Backlog.\nSprint Retrospective What went well?\nGood progress. Most of the requirements that were planned were completed. Layered Architecture. It became clear how useful the layered architecture is. The different parts communicate using interfaces, and the dependencies are well defined. By allowing modules to depend only on modules strictly below them, the code becomes more flexible and easier to maintain. Example. The application layer is not allowed to depend on the infrastructure layer; all communication goes through the domain layer, which is an intermediary designated to satisfy business requirements. What could be improved?\nNFRs not addressed. For instance, while the repository loading functionality is complete, it is still necessary to assess qualities such as the robustness and performance of the code. Testing. From a testing perspective, both integration and mocked tests pass, with the former requiring local external services to be running. The latter are executable in CI. Requirements Documentation. It would be ideal to have the requirements documented using tests. However, since the implementation details are likely to change, I deemed it more valuable to document this later, when the code is more stable. Code Quality. The code is not as clean as it could be, and more refactoring is going to be necessary in future sprints. What did I learn?\nRequirements. Since the requirements were well defined, the implementation was relatively straightforward. Documenting progress. It is fairly useful to document progress, and to think about the process by documenting it. 
This can certainly influence the direction of future sprints. ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint2/overview/","summary":"Second sprint (March 31-April 6, 2025) overview, focused on design patterns and core functionality.","title":"Sprint 2 Overview: Design Patterns and Indexing"},{"content":"Duration: 1 week (Start date: 2025-03-24, End date: 2025-03-30)\nSprint Goal Sprint 1 focused on setting up the project infrastructure and development environment. This includes:\nSetting up the basic Scala project structure Implementing essential development tools and configurations Establishing code quality standards and automated checks Creating a reliable CI/CD pipeline Deliverables:\nBasic project structure with Scala 3.6.4 Automated code formatting and linting setup Git hooks for code quality checks Test infrastructure with ScalaTest Code coverage reporting with Scoverage Semantic release configuration Logging infrastructure Sprint Backlog For detailed sprint backlog items, see Sprint Backlog\nSprint Retrospective The following tools will help maintain code quality and automate tasks, freeing up time for more complex tasks.\nMain completed tasks:\nSuccessfully set up the project infrastructure using Scala 3 Added tools to ensure code quality: Scalafmt, Wartremover, Scalafix and Trunk Set up complete CI/CD pipeline with GitHub Actions and artifact publishing Configured automatic semantic versioning and release management Implemented git hooks with pre-commit checks Added Gemini bot for automated pull request reviews What went well?\nThis approach to setting up the requirements is likely to save time in future sprints, as it makes it easier to track requirements and choose tasks to focus on. The documentation process was fairly smooth. During the forthcoming sprints, it will become much easier to maintain the documentation. 
Learnt about semantic versioning and how it can be used to manage the release of new versions of the project. What could be improved?\nSome planned tasks remain incomplete, including Dependabot configuration. This will be removed from the product backlog. I prefer not to use it, to avoid build stability issues. Wartremover rules need to be fine-tuned, as they will likely lead to accumulating technical debt if I rely too much on excluded rules. Repository loading functionality is not yet implemented (see Sprint Backlog). This will be remedied in the next sprint. What did I learn?\nI need to pay special attention to the Wartremover rules, because relying too much on excluded rules will likely lead to accumulating technical debt. Used Wart.Any, Wart.Throw and Wart.Var. Early investment in development infrastructure may pay off in the long run. This will have to be proven in the forthcoming sprints. Setting up automated checks from the beginning helps maintain consistent code quality standards. Breaking down infrastructure tasks into smaller, more manageable pieces makes the development process more straightforward. 
","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint1/overview/","summary":"First sprint (March 24-30, 2025) dedicated to setting up the project\u0026rsquo;s foundational infrastructure, including Scala 3 environment, development tools, automated checks, testing framework, and CI/CD pipeline.","title":"Sprint 1 Overview: Project Infrastructure Setup"},{"content":"Tasks from Sprint 3:\nImplement Scala.js interface components (started but not completed) Implement ArchUnit tests for layered architecture Complete chat accuracy evaluation Sprint Goal: Enhance the application\u0026rsquo;s performance, provide visualization tools for code analysis, complete the Scala.js frontend, and create comprehensive documentation to finalize the project.\nKey Deliverables:\nPerformance optimization for repository lookup and search functionality Visualization tools for code embeddings and search metrics Complete Scala.js frontend with full backend integration Comprehensive user documentation and project report Task Board Link to the main product backlog: Product Backlog\nSBI ID Task Description User Story PBI ID Est. 
Points Status PERFORMANCE OPTIMIZATION (18 Points) S4.1.1 Optimize repository lookup speed Performance E2 6 ✓ (config) S4.1.2 Implement caching for embeddings Performance E2 7 ✓ S4.1.3 Ensure responsive search/chat experience Performance E2 5 ✓ (status-bar) VISUALIZATION \u0026amp; ANALYSIS (15 Points) S4.2.1 Visualize code embeddings for quality analysis Visualization E5 6 ✓ (scatter) S4.2.2 Provide metrics on search effectiveness Visualization E5 4 ✓ (pair-wise) S4.2.3 Generate insights for documentation Visualization E5 5 ✓ SCALA.JS FRONTEND (22 Points) S4.3.1 Complete Scala.js interface components UI Implementation E1 8 ✓ S4.3.2 Integrate frontend with backend services UI Implementation E1 7 ✓ S4.3.3 Enhance user experience with responsive design UI Implementation E1 7 ✓ DOCUMENTATION (20 Points) S4.4.1 Create comprehensive user documentation Documentation E6 6 ✓ S4.4.2 Generate project report Documentation E6 8 ✓ S4.4.3 Document architecture and design decisions Documentation E6 6 ✓ TECHNICAL DEBT (15 Points) S4.5.1 Implement ArchUnit tests for layered architecture Technical Debt F1 5 ✓ S4.5.2 Complete chat accuracy evaluation Technical Debt C2 5 ✓ (report) S4.5.3 Final code refactoring and cleanup Technical Debt E4 5 ✓ ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint4/sprint_backlog/","summary":"List of deliverables for Sprint 4 (April 14-20, 2025), focusing on performance optimization, visualization features, and documentation completion.","title":"Sprint 4 Backlog: Performance, Visualization, and Documentation"},{"content":"Sprint Goal: The purpose of the sprint is to complete the project report and analysis. Project housekeeping is also included with tasks such as refactoring, cleanup and release.\nThe sprint took two weeks instead of one. 
It covered one larger task, and since I preferred each sprint to have a single, well-defined focus, I decided not to split it into two sprints.\nKey Deliverables:\nComplete project report and documentation Project housekeeping tasks such as refactoring, cleanup and release Since the project is almost complete and the code is stable, the ArchUnit tests are finalized. The visualization analysis is finalized. Task Board Link to the main product backlog: Product Backlog\nSBI ID Task Description User Story PBI ID Est. Points Status TESTING \u0026amp; QUALITY ASSURANCE (15 Points) S5.1.1 Implement testing strategy (ArchUnit, acceptance test) Testing E4 5 ✓ (tests) S5.1.3 Define quality metrics (PCA, Pairwise) Testing E4 5 ✓ (report) VISUALIZATION \u0026amp; ANALYSIS (15 Points) S5.2.1 Visualize code embeddings for quality analysis Visualization E5 5 ✓ (scatter) S5.2.2 Provide metrics on search effectiveness Visualization E5 5 ✓ (pair-wise) S5.2.3 Generate insights for documentation Visualization E5 5 ✓ DOCUMENTATION (15 Points) S5.3.1 Create user documentation Documentation E6 5 ✓ S5.3.2 Generate project report Documentation E6 5 ✓ S5.3.3 Document architecture and design decisions Documentation E6 5 ✓ ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint5/sprint_backlog/","summary":"List of deliverables for Sprint 5 (April 21-May 04, 2025), focusing on testing, visualization features, and analysis.","title":"Sprint 5 Backlog: Report"},{"content":"Tasks from Sprint 2:\nAcceptance tests for repository loading Taking code notes upon completion of requirements The need for ArchUnit to verify the layered architecture Sprint Goal: Develop a fully functional search interface and code understanding chat assistant that leverages the indexed repository data to provide intelligent code insights and responses to natural language queries.\nKey Deliverables:\nNatural language code search interface with filtering capabilities Code understanding chat 
interface with context-aware responses Responsive UI implementation for both search and chat features Testing infrastructure for search relevance and chat accuracy Task Board Link to the main product backlog: Product Backlog\nSBI ID Task Description User Story PBI ID Est. Points Status SEARCH FUNCTIONALITY (20 Points) S3.1.1 Implement semantic search across codebase Code Search C1 8 ✓ S3.1.2 Create language/extension filtering capabilities Code Search C1 5 ✓ S3.1.3 Design results display with code context Code Search C1 7 ✓ CHAT INTERFACE (25 Points) S3.2.1 Develop chat interface for code questions Code Understanding C2 8 ✓ S3.2.2 Implement context retrieval for queries Code Understanding C2 10 ✓ S3.2.3 Create response generation with code context Code Understanding C2 7 ✓ UI IMPLEMENTATION (20 Points) S3.3.1 Design intuitive frontend for search and chat UI Implementation E1 7 ✓ S3.3.2 Implement Scala.js interface components UI Implementation E1 8 ✓ S3.3.3 Ensure responsive and accessible design UI Implementation E1 5 ✓ TECHNICAL DEBT \u0026amp; TESTING (15 Points) S3.4.1 Implement ArchUnit tests for layered architecture Technical Debt F1 3 ✓ S3.4.2 Create acceptance tests for repository loading Technical Debt F2 5 ✓ S3.4.3 Develop provider for results with score relevance Testing C1 4 ✓ S3.4.4 Implement chat accuracy evaluation Testing C2 3 ✓ (report) ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint3/sprint_backlog/","summary":"List of deliverables for Sprint 3 (April 7-13, 2025), focusing on natural language code search and chat interface implementation.","title":"Sprint 3 Backlog: Search and Chat Interface"},{"content":" Sprint Goal: Build and implement the foundation for a code search system that accepts Git repositories, processes their content, and creates searchable indexes. 
The requirements that are addressed are shown below.\nKey Deliverables:\nRepository URL input interface with validation Git repository fetching and processing service Code indexing system with vector DB integration Basic layered architectural foundation (Scala) Task Board Link to the main product backlog: Product Backlog\nSBI ID Task Description User Story PBI ID Est. Points Status ARCHITECTURE \u0026amp; SETUP S1.A1 Set up basic layered project structure (Scala) (Foundation) F1 10 ✓ S1.A2 Define core interfaces between initial layers (Foundation) F1 5 ✓ S1.A3 Implement design patterns discussed during the course (Foundation) F1 10 ✓ REPOSITORY INPUT (15 Points) S1.1.1 Create UI component for repository URL input Repository Input F2 5 ✓ S1.1.2 Implement URL validation with clear feedback Repository Input F2 3 ✓ S1.1.3 Create Git wrapper for repository fetching Repository Input F2 7 ✓ CODE PROCESSING (20 Points) S1.2.1 Implement file traversal \u0026amp; content extraction Code Processing F2 8 ✓ S1.2.2 Add language detection \u0026amp; basic file filtering Code Processing F2 4 ✓ S1.2.3 Create code chunking strategy for indexing Code Processing F3 8 ✓ SEARCH INDEXING (15 Points) S1.3.1 Implement code embedding generation (Langchain4J) Search Indexing F3 8 ✓ S1.3.2 Create vector database integration (Qdrant) using Langchain4j Search Indexing F3 7 ✓ ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint2/sprint_backlog/","summary":"List of deliverables for Sprint 2 (March 31-April 6, 2025), focusing on design patterns and code indexing.","title":"Sprint 2 Backlog: Design Patterns and Indexing"},{"content":"Sprint Goal The goal of the sprint is to set up the project infrastructure and CI/CD pipeline so we can start developing the application.\nUser Stories Development Setup\nAs a developer, I want a properly configured Scala project so I can efficiently develop the application As a developer, I want linting and formatting tools configured so code 
quality remains consistent CI/CD Pipeline\nAs a developer, I want GitHub Actions configured for CI/CD so code is automatically built and tested As a developer, I want documentation automatically generated and published so it stays current Basic Repository Loading\nAs a user, I want to load a local Git repository so I can inspect its contents As a user, I want to see basic repository information to confirm it loaded correctly Task Board Link to the main product backlog: Product Backlog\nSBI ID Task Description User Story Est. Points Status BUILD CONFIGURATION (10 Points) S1.B1 Initialize SBT project with Scala 3.6.4 Development Setup 2 ✓ S1.B2 Configure assembly plugin for JAR creation Development Setup 1 ✓ S1.B3 Set up test environment with ScalaTest Development Setup 1 ✓ S1.B4 Configure memory settings for tests Development Setup 1 ✓ S1.B5 Enable code coverage with Codecov Development Setup 2 ✓ S1.B6 Configure automatic documentation generation Development Setup 1 ✓ S1.B7 Set up project website Development Setup 1 ✓ S1.B8 Code quality badges CI/CD 1 ✓ CODE QUALITY TOOLS (10 Points) S1.Q1 Set up Scalafmt with formatting rules Code Quality 2 ✓ S1.Q2 Implement Wartremover for code analysis Code Quality 3 ✓ S1.Q3 Configure Scalafix and semantic DB Code Quality 3 ✓ S1.Q4 Set up Trunk for style checks Code Quality 1 ✓ S1.Q5 Set up Gemini bot for PR reviews Code Quality 1 ✓ GIT WORKFLOW (7 Points) S1.G1 Implement git hooks system CI/CD Pipeline 3 ✓ S1.G2 Set up semantic release system CI/CD Pipeline 4 ✓ PROJECT INFRASTRUCTURE (8 Points) S1.I1 Set up logging infrastructure Development Setup 2 ✓ S1.I2 Configure CI/CD pipeline CI/CD Pipeline 4 ✓ S1.I3 Define high-level architecture Development Setup 2 ✓ CORE DOMAIN MODEL (5 Points) S1.D1 Design repository data model Basic Repository 2 ✓ S1.D2 Design initial API contracts Basic Repository 2 ✓ S1.D3 Generate API documentation Basic Repository 1 ✓ BASIC GIT OPERATIONS (3 Points) S1.O1 Implement repository loading Basic 
Repository 1 ✓ S1.O2 Extract repository metadata Basic Repository 1 ✓ S1.O3 Create error handling Basic Repository 1 ✓ TESTING (3 Points) S1.T1 Write domain model unit tests Basic Repository 2 ✓ S1.T2 Create integration tests Basic Repository 1 ✓ DOCUMENTATION (5 Points) S1.P1 Document development process Documentation 2 ✓ S1.P2 Complete sprint retrospective Documentation 2 ✓ S1.P3 Plan next sprint Documentation 1 ✓ ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/sprint1/sprint_backlog/","summary":"Comprehensive list of tasks and deliverables for Sprint 1 (March 24-30, 2025), focusing on establishing project infrastructure, development environment, and CI/CD pipeline setup.","title":"Sprint 1 Backlog: Infrastructure Tasks"},{"content":" Project Tasks Based on the Requirements Specifications document.\nThe project was divided into 5 separate sprints, each with its focus described below:\nSprint 1: project setup Sprint 2: design patterns and indexing Sprint 3: search and chat interface Sprint 4: performance, visualization and documentation Sprint 5: final report and presentation ID Task Priority Related Requirements Status Sprint (click) Github I1 Build Configuration - Initialize SBT project with Scala 3.6.4\n- Configure assembly plugin for JAR creation\n- Set up test environment with ScalaTest\n- Configure code coverage \u0026amp; documentation HIGHEST IR1, IR4, NFR1 ✓ S1 I2 Code Quality Tools - Set up Scalafmt with formatting rules\n- Implement Wartremover for code analysis\n- Configure Scalafix and semantic DB\n- Set up Trunk and Gemini bot HIGH IR1, NFR2 ✓ S1 I3 Git Workflow - Implement git hooks system\n- Set up semantic release system MEDIUM NFR1, NFR2 ✓ S1 I4 Project Infrastructure - Set up logging infrastructure\n- Configure CI/CD pipeline\n- Define high-level architecture HIGHEST IR4, NFR1, NFR2 ✓ S1 I5 Core Domain Model - Design repository data model\n- Design initial API contracts\n- Generate API documentation HIGH IR1, IR4, FR2.1 ✓ 
S1, S5 I6 Basic Git Operations - Implement repository loading\n- Extract repository metadata\n- Create error handling HIGH FR1.1, FR2.1, NFR3 ✓ S1 Foundation F1 Layered Architecture Implementation - Implement modules from architecture diagram\n- Apply design patterns from lectures (dependency injection, layered architecture, strategy, factory, etc) HIGHEST IR4, IR1 ✓ S2 F2 Repository Input and Processing - Accept/validate Git repository URLs\n- Fetch repository contents\n- Support file type filtering HIGHEST FR1.1, FR2.1, NFR3, BR1, NFR1 ✓ S2 F3 Code Indexing System - Process code into searchable representations\n- Generate/store code embeddings\n- Integrate with Langchain4J for vector storage HIGHEST FR2.2, FR2.3, IR2, IR3, BR2, NFR1 ✓ S2 Core Value C1 Natural Language Code Search - Enable semantic search across codebase\n- Support language/extension filtering\n- Display relevant results with context HIGH FR1.2, FR1.4, FR1.3, FR1.5, BR1, NFR2 ✓ S3 C2 Code Understanding Chat Interface - Provide chat interface for code questions\n- Retrieve relevant code context for queries\n- Generate context-aware responses HIGH FR1.6, FR1.5, FR2.4, BR2, NFR2 ✓ S3 Additional E1 User Interface Implementation - Create intuitive frontend for all functionality\n- Support Scala.js and Python (Gradio) interfaces\n- Ensure responsive and accessible design MEDIUM NFR2, IR1, NFR3 ✓ S3 E2 Performance Optimization - Optimize repository lookup speed\n- Implement caching/reuse of embeddings\n- Ensure responsive search/chat experience MEDIUM NFR1, FR2.2, FR2.3, NFR2 ✓ S4 E3 Security Implementation - Sanitize user inputs\n- Secure Restful API\n- Secure data storage MEDIUM NFR3, FR2.3 ✓ S4 E4 Testing and Quality Assurance - Implement comprehensive testing strategy\n- Define quality metrics (see requirements document) MEDIUM IR1, NFR1, NFR2 ✓ S4, S5 E5 Visualization and Analysis - Visualize code embeddings for quality analysis\n- Provide metrics on search effectiveness\n- Generate insights for 
documentation MEDIUM IR5, FR1.2, BR2 ✓ S5 E6 Documentation and Reporting - Create user documentation\n- Generate project report\n- Document architecture and design decisions MEDIUM NFR2, IR4 ✓ S5 Traceability Matrix The following table shows evidence for the requirements in the Requirements Specifications document.\nRequirement Design element Implementation Evidence Done BR1: Search Productivity Project-wide BusinessRequirementsSuite.scala\nSUS Questionnaire\nEmbedding Diagrams ✓ BR2: Improve Code Understanding Project-wide createTextEmbeddingModel\ncreateCodeEmbeddingModel\nPython/Scala Frontend\nSUS Questionnaire\nEmbedding Diagrams ✓ FR1.1: Repository URL Input Interface GithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.2: Code Search Using Markdown QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.3: Search Results Display System Scala frontend\nPython frontend SUS Questionnaire ✓ FR1.4: Code Search using Code QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ (see related FR1.2) FR1.5: Code Context Visualization Scala frontend\nPython frontend\nRepositoryWithLanguages\nGithubWrapperService UserFunctionalRequirementsSuite\nSUS Questionnaire ✓ FR1.6: Model with Past Chat History Pipeline.scala\nRAGComponentFactory.scala UserFunctionalRequirementsSuite ✓ FR2.1: Repository Cloning GithubWrapperService.scala\nFetchingService.scala SystemFunctionalRequirementsSuite ✓ FR2.2: Vector Database Generation IngestorService.scala\nCacheService.scala\nQdrantEmbeddingStore.scala SystemFunctionalRequirementsSuite ✓ FR2.3: Vector Database Implementation QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR2.4: LLM Integration for Code QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala SystemFunctionalRequirementsSuite ✓ NFR1: Performance Optimization 
ChatService.scala\nCacheService.scala\nIngestorService.scala\nGithubWrapperService.scala NonFunctionalRequirementsSuite ✓ NFR2: System Usability Optimization GithubWrapperService.scala\nScala frontend\nPython frontend NonFunctionalRequirementsSuite\nSUS Questionnaire ✓ NFR3: User Interface Security Scala frontend\nPython frontend NonFunctionalRequirementsSuite ✓ NFR4: Embedding Visualization IngestorService.scala Final Report ✓ IR1: Scala Implementation (declarative programming) Project-wide Adherence to the Gemini style guide ✓ IR2: Qdrant Vector Database IngestorService.scala\nComponentFactory.scala application.conf ✓ IR3: Ollama Integration QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala application.conf ✓ IR4: Layered Architecture Project-wide ArchUnit tests ✓ Test Results The acceptance tests cannot be executed in the traditional CI/CD pipeline, so the results below come from a local run.\nFig. 1: Sample run of the unit tests, including the acceptance tests ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/product_backlog/","summary":"The product backlog from which tasks are derived for each sprint.","title":"Product Backlog"},{"content":" Summary Req ID Description BR1 Allow users to efficiently search and understand code within Git repositories. BR2 Improve developer productivity by facilitating code search/understanding workflows. FR1.1 As a user, I can specify a Git repository URL to inspect its code. FR1.2 As a user, I can search for code using keywords or natural language queries. FR1.3 As a user, I can view the search results with code snippets and links to the original files in the repository. FR1.4 As a user, I can filter search results by programming language. FR1.5 As a user, I can view the context around a code snippet in the search results. FR1.6 As a user, I can ask code-related questions via chat, and the chat history is preserved. 
FR2.1 As a developer, I need the system to fetch and clone Git repositories from provided URLs. FR2.2 As a developer, I need the system to index the code of the fetched repositories, to generate fast responses. FR2.3 As a developer, I need the system to use a vector database to store code embeddings for semantic search. FR2.4 As a developer, I need the system to integrate with an LLM to process natural language queries. NFR1 The system will index code for targeted repositories within 10 seconds on the specified hardware. NFR2 The system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users. NFR3 The system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks. NFR4 A 2D visualization tool will display code embeddings to help analyze and improve indexing and search. IR1 The system should be implemented in Scala, following functional programming principles. IR2 The system should use Qdrant as the vector database for code embeddings. IR3 The system should integrate with Ollama for LLM functionalities. IR4 The system should follow a layered architecture approach, ensuring better modularity. Requirements Specification Below is a more thorough description of the requirements:\nCode Search Productivity (BR1) - Choice: Enable developers to efficiently search and understand code within Git repositories. - Rationale: Developers spend significant time searching through codebases, and improving this process directly impacts productivity. - Validation Criteria: - At least 85\\% of test users report improved workflow efficiency in post-usage surveys (SUS). - Average query-to-result time under 10 seconds for the predetermined repositories. - Implementation Considerations: - Ensure integration with common development workflows. This can be done using Gradio interfaces. - Focus on search result quality and relevance (i.e. analyze the effectiveness of the generated embeddings). 
- Related Requirements: - FR1.2 (Search) - FR1.5 (Context) - FR1.6 (Chat) Improving Code Understanding (BR2) - Choice: Improve developer productivity by facilitating code search/understanding workflows. - Rationale: Understanding existing code is often more time-consuming than writing new code. - Validation Criteria: - Code explanations rated as \u0026#34;accurate and helpful\u0026#34; by at least 70\\% of test users (SUS). - Implementation Considerations: - Implement contextual code explanations (i.e. use separate models that understand code and natural language). - Provide relationship visualization between embeddings using a 2D visualization tool. - Prioritize speed and accuracy in responses by allowing the users to select any open source model. - Related Requirements: - FR1.6 (Chat) - FR2.4 (LLM integration) - NFR2 (Usability) - NFR4 (Visualization) Repository URL Input Interface (FR1.1) - Choice: As a user, I can specify a Git repository URL to inspect its code, so that I can access and analyze specific codebases I\u0026#39;m interested in. - Rationale: The system needs a secure and user-friendly way to fetch Git repository URLs. - Validation Criteria: - The acceptance tests parse valid GitHub URLs successfully. - Error feedback displayed within 2 seconds of validation failure. - URL validation completes within half a second for all inputs. - Implementation Considerations: - Implement URL validation and display clear feedback in case of parsing errors. - Related Requirements: - FR2.1 (Repository Cloning) - NFR3 (Security) Code Search Using Markdown (FR1.2) - Choice: As a user, I can search for code using keywords or natural language, so that I can quickly find relevant code sections without manually browsing through files. - Rationale: This allows users to search for code using natural language, making it easier to find relevant code sections. - Validation Criteria: - Search results return in under 2 seconds for the predetermined repositories. 
- Language filtering correctly categorizes at least 95\\% of code files. - Implementation Considerations: - Use embedding model to convert natural language queries to vectors. - Support filtering by language, content type and extension. - Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) - NFR1 (Performance) Search Results Display System (FR1.3) - Choice: As a user, I can view the search results with code snippets and links to the original files in the repository, so that I can efficiently evaluate search results and navigate to the full context when needed. - Validation Criteria: - 95\\% of users can correctly identify file locations from the display, as assessed via the SUS survey. - Code snippets maintain proper indentation and formatting (Python, Scala frontend). - Implementation Considerations: - Display code snippets with syntax highlighting (Python). - Show file path and location information. - Related Requirements: - FR1.2 (Search) - FR1.5 (Context) - NFR2 (Usability) Code Search using Code Embeddings (FR1.4) - Choice: As a user, I can filter search results by programming language, so that I can focus on code written in languages relevant to my current task. - Rationale: Developers often need to restrict searches to specific languages or file types. - Validation Criteria: - Language detection accuracy \u0026gt;95\\% across all common programming languages (see parser impl.). - Multiple simultaneous filters function correctly in 100\\% of cases, as assessed by the acceptance tests. - Implementation Considerations: - Detect and classify programming languages during indexing. - Create efficient language metadata for fast filtering. - Support multiple simultaneous language filters. - Include language identification in UI. 
- Related Requirements: - FR1.2 (Search) - FR1.3 (Search Results) - FR2.2 (Code Indexing) Code Context Visualization (FR1.5) - Choice: As a user, I can view the context around a code snippet in the search results, so that I can better understand how the code fits into the broader implementation. - Rationale: This allows users to view the context around a code snippet in the search results, making it easier to understand how the code fits into the broader implementation. - Validation Criteria: - Fetching a repository loads within 5 seconds for the predetermined repositories. - 90\\% of users report sufficient context for understanding code purpose (SUS) - Implementation Considerations: - Display the entire code being indexed in the search results. - When answering questions, display relevant snippets from the codebase. - Allow the user to switch between the full text and retrieved snippets. - Related Requirements: - FR1.3 (Search Results) - NFR2 (Usability) Model with Past Chat History (FR1.6) - Choice: As a user, I can ask code-related questions via chat, and the chat history is preserved, so that I can have a continuous conversation with the system. - Rationale: This allows users to have a continuous conversation with the system, making it easier to understand how the code fits into the broader implementation. - Validation Criteria: - Context-aware responses remain relevant for at least 2 consecutive related questions. - Chat history can be cleared by regenerating the index. - Implementation Considerations: - Maintain chat history within session scope. - Structure LLM prompts to include chat history and retrieved code. - Related Requirements: - FR2.4 (LLM Integration) - FR2.3 (Vector Database) - NFR2 (Usability) Repository Cloning and Management (FR2.1) - Choice: As a developer, I need the system to fetch and clone Git repositories from provided URLs, so that I can work with up-to-date code without performing these operations manually. 
- Rationale: This allows developers to work with up-to-date code without performing these operations manually. - Validation Criteria: - Predetermined repositories clone successfully within 30 seconds. - UI remains responsive (no blocking) during 100\\% of cloning operations. - Invalid repository URLs are handled gracefully. - Implementation Considerations: - Use Uithub for extracting the repository code. - Implement caching mechanism for previously cloned repositories. - Add repository verification to ensure valid Git URLs. - Related Requirements: - FR2.2 (Code Indexing) - NFR1 (Performance) - NFR3 (Security) Vector Database Generation for RAG (FR2.2) - Choice: As a developer, I need the system to index the code of the fetched repositories, so that I can perform fast and accurate searches across the entire codebase. - Validation Criteria: - The clusters generated by the embeddings are well defined, suggesting successful embedding generation. - Metadata correctly captures language and file type. - Implementation Considerations: - Use Qdrant for caching code embeddings. - Utilize metadata to enhance search results relevance. - Related Requirements: - FR2.2 (Code Indexing) - NFR1 (Performance) Semantic Search (FR2.3) - Choice: As a developer, I need the system to use a vector database to store code embeddings, so that I can perform semantic searches that understand code context beyond simple keyword matching. - Validation Criteria: - Queries complete in less than 300ms for the predetermined repositories. - Vector similarity scores correctly correlate with semantic relevance as determined by the cluster analysis. - Implementation Considerations: - Configure Qdrant collection schema for code embeddings. - Use metadata for filtering specific file types. 
- Related Requirements: - FR1.2 (Search) - FR2.2 (Code Indexing) - NFR1 (Performance) - IR2 (Qdrant) LLM Integration for Natural Language Queries (FR2.4) - Choice: As a developer, I need the system to integrate with an LLM to process natural language queries, so that I can interact with the codebase using plain English rather than specialized query syntax. - Validation Criteria: - Ollama integration successfully handles queries within tests without errors. - Implementation Considerations: - Implement prompt engineering techniques to guide responses (i.e. conditional RAG, search by file type). - Design context management for large repositories (limit the number of tokens being processed). - Related Requirements: - FR1.6 (Chat Functionality) - NFR1 (Performance) - NFR3 (Security) System Performance Optimization (NFR1) - Choice: The system will index code for targeted repositories within 10 seconds on the specified hardware. - Success Criteria: - Search queries return results in under 40 seconds for the predetermined repositories. - Embedding generation completes in under 30 seconds for the predetermined repositories. - Chat responses for simple search (without code context) arrive within 20 seconds for all tests. - Implementation Considerations: - Embeddings are stored in the Qdrant vector database. - The embeddings for retrieving chunks are generated using Ollama. - Related Requirements: - FR2.1 (Repository Cloning) - FR2.2 (Code Indexing) - FR1.2 (Search) - FR1.6 (Chat) System Usability Testing (NFR2) - Choice: The system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users. - Rationale: This ensures that the system is usable, as evaluated by a group of users. 
- Validation Criteria: - First-time users find the interface easy to use without assistance in \u0026gt;70\\% of cases (SUS) - 80\\% of users rate UI intuitiveness as \u0026#34;good\u0026#34; or \u0026#34;excellent\u0026#34; (SUS) - Implementation Considerations: - Implement clean, intuitive UI. Rely on established UX design patterns by using Gradio components. - Related Requirements: - FR1.3 (Search Results) - FR1.5 (Context) - FR1.6 (Chat) User Interface Security (NFR3) - Choice: The system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks. - Validation Criteria: - 100\\% of malformed/malicious URLs rejected before processing - Implementation Considerations: - Validate repository URLs against known valid patterns. - Test against standard GitHub URLs. - Related Requirements: - FR1.1 (Repository Input) - FR1.2 (Search) - FR1.6 (Chat) Embedding Visualization Requirement (NFR4) - Choice: A 2D visualization tool will display code embeddings to help analyze and improve indexing and search. - Validation Criteria: - Visualization correctly clusters similar code types. - Report analysis identifies strategies for improving search quality. - Implementation Considerations: - Implement dimension reduction techniques (t-SNE, UMAP) for 2D visualization. - Use the visualization to make informed decisions about search quality. - Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) Scala Implementation Requirement (IR1) - Choice: The system should be implemented in Scala, following functional programming principles. - Success Criteria: - Scala tools are used to ensure consistency. - Functional programming patterns are used to ensure consistent code quality. - Implementation Considerations: - Use appropriate abstraction mechanisms (strategies, factories, memoization, etc.) 
- Implement error handling using functional approaches (Try, Option) - Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) - FR2.4 (LLM Integration) Qdrant Vector Database Requirement (IR2) - Choice: The system should use Qdrant as the vector database for code embeddings. - Rationale: Qdrant provides efficient vector search capabilities with filtering options needed for code search. - Validation Criteria: - Qdrant client wrapper handles all required vector operations. - Collection schemas designed for code embeddings and text embeddings. - Implementation Considerations: - Implement the AIServices wrapper around the Qdrant module. Configure distance metrics. - Related Requirements: - FR2.3 (Vector Database) - NFR1 (Performance) Ollama Integration Requirement (IR3) - Choice: The system should integrate with Ollama for LLM functionalities. - Rationale: Ollama provides locally-hosted LLM models, yielding good privacy and reduced latency. - Success Criteria: - The application successfully communicates with Ollama. - Implementation Considerations: - Implement the AIServices wrapper around the Ollama module. - Use prompt templates optimized for code understanding. - Related Requirements: - FR1.6 (Chat) - FR2.4 (LLM Integration) - NFR1 (Performance) Layered Architecture (IR4) - Choice: The system should follow a layered architecture approach, ensuring better modularity. - Success Criteria: - All components separated into Presentation, Application, Domain, and Infrastructure layers. - The separation is assessed via ArchUnit tests. 
","permalink":"https://rfvasile.github.io/git-inspector/process/docs/requirements/","summary":"I define all 5 requirement types: Business, Functional (user and system), Non-Functional, Implementation.","title":"Requirements Engineering"},{"content":" User Interfaces Components in the Scala frontend package:\nComponents:\nLinkViewer: Component for viewing and indexing content from GitHub repositories ChatInterface: Component for chatting with the LLM about indexed repositories, handling message display and submission. IndexSelector: Component for choosing which repository index to query, with options to refresh or remove indices. StatusBar: Simple component displaying status messages to users. TabContainer: Component for switching between the Chat and Link Viewer tabs. Services:\nContentService: Service that communicates with the backend API to fetch content, generate indices, and handle chat interactions. HttpClient: Low-level service handling HTTP requests and Server-Sent Events for streaming chat responses. Models:\nModels.scala: Contains data models used throughout the application like ChatMessage, IndexOption, and various request/response models. Utilities:\nIDGenerator: Utility for generating unique IDs for chat messages and other elements. Main.scala: Entry point that initializes the application, sets up event listeners, and creates the UI components. Python Interface:\nmain.py: Implements an alternative frontend using Gradio with functionality for chatting with repositories, viewing content, and managing indices. style.css: Provides styling for the Gradio interface. Gradio Interface Fig. 1: Gradio interface for the chatbot Fig. 2: Gradio interface for the code indexer Laminar Interface Fig. 3: Laminar interface for the chatbot Fig. 4: Laminar interface for the code indexer ","permalink":"https://rfvasile.github.io/git-inspector/process/docs/user-interfaces/","summary":"This document contains images with the UIs. 
They allow inspection of the UI without installing the app.","title":"User Interfaces"},{"content":" Project Tasks Based on the Requirements Specifications document.\nThe project was divided into 5 separate sprints, each with its focus described below:\nSprint 1: project setup Sprint 2: design patterns and indexing Sprint 3: search and chat interface Sprint 4: performance, visualization and documentation Sprint 5: final report and presentation ID Task Priority Related Requirements Status Sprint (click) Github I1 Build Configuration - Initialize SBT project with Scala 3.6.4\n- Configure assembly plugin for JAR creation\n- Set up test environment with ScalaTest\n- Configure code coverage \u0026amp; documentation HIGHEST IR1, IR4, NFR1 ✓ S1 I2 Code Quality Tools - Set up Scalafmt with formatting rules\n- Implement Wartremover for code analysis\n- Configure Scalafix and semantic DB\n- Set up Trunk and Gemini bot HIGH IR1, NFR2 ✓ S1 I3 Git Workflow - Implement git hooks system\n- Set up semantic release system MEDIUM NFR1, NFR2 ✓ S1 I4 Project Infrastructure - Set up logging infrastructure\n- Configure CI/CD pipeline\n- Define high-level architecture HIGHEST IR4, NFR1, NFR2 ✓ S1 I5 Core Domain Model - Design repository data model\n- Design initial API contracts\n- Generate API documentation HIGH IR1, IR4, FR2.1 ✓ S1, S5 I6 Basic Git Operations - Implement repository loading\n- Extract repository metadata\n- Create error handling HIGH FR1.1, FR2.1, NFR3 ✓ S1 Foundation F1 Layered Architecture Implementation - Implement modules from architecture diagram\n- Apply design patterns from lectures (dependency injection, layered architecture, strategy, factory, etc) HIGHEST IR4, IR1 ✓ S2 F2 Repository Input and Processing - Accept/validate Git repository URLs\n- Fetch repository contents\n- Support file type filtering HIGHEST FR1.1, FR2.1, NFR3, BR1, NFR1 ✓ S2 F3 Code Indexing System - Process code into searchable representations\n- Generate/store code embeddings\n- Integrate 
with Langchain4J for vector storage HIGHEST FR2.2, FR2.3, IR2, IR3, BR2, NFR1 ✓ S2 Core Value C1 Natural Language Code Search - Enable semantic search across codebase\n- Support language/extension filtering\n- Display relevant results with context HIGH FR1.2, FR1.4, FR1.3, FR1.5, BR1, NFR2 ✓ S3 C2 Code Understanding Chat Interface - Provide chat interface for code questions\n- Retrieve relevant code context for queries\n- Generate context-aware responses HIGH FR1.6, FR1.5, FR2.4, BR2, NFR2 ✓ S3 Additional E1 User Interface Implementation - Create intuitive frontend for all functionality\n- Support Scala.js and Python (Gradio) interfaces\n- Ensure responsive and accessible design MEDIUM NFR2, IR1, NFR3 ✓ S3 E2 Performance Optimization - Optimize repository lookup speed\n- Implement caching/reuse of embeddings\n- Ensure responsive search/chat experience MEDIUM NFR1, FR2.2, FR2.3, NFR2 ✓ S4 E3 Security Implementation - Sanitize user inputs\n- Secure Restful API\n- Secure data storage MEDIUM NFR3, FR2.3 ✓ S4 E4 Testing and Quality Assurance - Implement comprehensive testing strategy\n- Define quality metrics (see requirements document) MEDIUM IR1, NFR1, NFR2 ✓ S4, S5 E5 Visualization and Analysis - Visualize code embeddings for quality analysis\n- Provide metrics on search effectiveness\n- Generate insights for documentation MEDIUM IR5, FR1.2, BR2 ✓ S5 E6 Documentation and Reporting - Create user documentation\n- Generate project report\n- Document architecture and design decisions MEDIUM NFR2, IR4 ✓ S5 Traceability Matrix The following table shows evidence for the requirements in the Requirements Specifications document.\nRequirement Design element Implementation Evidence Done BR1: Search Productivity Project-wide BusinessRequirementsSuite.scala\nSUS Questionnaire\nEmbedding Diagrams ✓ BR2: Improve Code Understanding Project-wide createTextEmbeddingModel\ncreateCodeEmbeddingModel\nPython/Scala Frontend\nSUS Questionnaire\nEmbedding Diagrams ✓ FR1.1: Repository URL 
Input Interface GithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.2: Code Search Using Markdown QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.3: Search Results Display System Scala frontend\nPython frontend SUS Questionnaire ✓ FR1.4: Code Search using Code QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ (see related FR1.2) FR1.5: Code Context Visualization Scala frontend\nPython frontend\nRepositoryWithLanguages\nGithubWrapperService UserFunctionalRequirementsSuite\nSUS Questionnaire ✓ FR1.6: Model with Past Chat History Pipeline.scala\nRAGComponentFactory.scala UserFunctionalRequirementsSuite ✓ FR2.1: Repository Cloning GithubWrapperService.scala\nFetchingService.scala SystemFunctionalRequirementsSuite ✓ FR2.2: Vector Database Generation IngestorService.scala\nCacheService.scala\nQdrantEmbeddingStore.scala SystemFunctionalRequirementsSuite ✓ FR2.3: Vector Database Implementation QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR2.4: LLM Integration for Code QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala SystemFunctionalRequirementsSuite ✓ NFR1: Performance Optimization ChatService.scala\nCacheService.scala\nIngestorService.scala\nGithubWrapperService.scala NonFunctionalRequirementsSuite ✓ NFR2: System Usability Optimization GithubWrapperService.scala\nScala frontend\nPython frontend NonFunctionalRequirementsSuite\nSUS Questionnaire ✓ NFR3: User Interface Security Scala frontend\nPython frontend NonFunctionalRequirementsSuite ✓ NFR4: Embedding Visualization IngestorService.scala Final Report ✓ IR1: Scala Implementation (declarative programming) Project-wide Adherence to the Gemini style guide ✓ IR2: Qdrant Vector Database IngestorService.scala\nComponentFactory.scala application.conf ✓ IR3: Ollama Integration QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala application.conf ✓ 
IR4: Layered Architecture Project-wide ArchUnit tests ✓ Test Results The acceptance tests are not executable using the traditional CI/CD pipeline, so the results below come from a local run.\nFig. 1: Sample run of the unit tests, including the acceptance tests ","permalink":"https://rfvasile.github.io/git-inspector/process/static/product-backlog/","summary":"\u003cstyle\u003e\n/* Override the global style for this page only */\n.custom_table_style th:last-child {\n  width: auto !important;\n}\n\u003c/style\u003e\n\u003c!-- trunk-ignore-all(markdownlint/MD041) --\u003e\n\u003ch1 id=\"project-tasks\"\u003eProject Tasks\u003c/h1\u003e\n\u003c!-- markdownlint-disable MD041 MD033 MD056 --\u003e\n\u003cp\u003e\u003cem\u003eBased on the \u003ca href=\"/git-inspector/process/static/requirement-specifications/\"\u003eRequirements Specifications\u003c/a\u003e document.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe project was divided into 5 separate sprints, each with its focus described below:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\n\n\n\n\n\u003ca href=\"https://rfvasile.github.io/git-inspector/process/docs/sprint1/sprint_backlog#task-board\" \u003eSprint 1\u003c/a\u003e: project setup\u003c/li\u003e\n\u003cli\u003e\n\n\n\n\n\n\u003ca href=\"https://rfvasile.github.io/git-inspector/process/docs/sprint2/sprint_backlog#task-board\" \u003eSprint 2\u003c/a\u003e: design patterns and indexing\u003c/li\u003e\n\u003cli\u003e\n\n\n\n\n\n\u003ca href=\"https://rfvasile.github.io/git-inspector/process/docs/sprint3/sprint_backlog#task-board\" \u003eSprint 3\u003c/a\u003e: search and chat interface\u003c/li\u003e\n\u003cli\u003e\n\n\n\n\n\n\u003ca href=\"https://rfvasile.github.io/git-inspector/process/docs/sprint4/sprint_backlog#task-board\" \u003eSprint 4\u003c/a\u003e: performance, visualization and documentation\u003c/li\u003e\n\u003cli\u003e\n\n\n\n\n\n\u003ca href=\"https://rfvasile.github.io/git-inspector/process/docs/sprint5/sprint_backlog#task-board\" \u003eSprint 
5\u003c/a\u003e: final report and presentation\u003c/li\u003e\n\u003c/ul\u003e\n\u003cdiv style=\"display: table;\"\u003e\n    \n\u003ctable class=\"custom_table_style\"\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003eID\u003c/th\u003e\n          \u003cth\u003eTask\u003c/th\u003e\n          \u003cth\u003ePriority\u003c/th\u003e\n          \u003cth\u003eRelated Requirements\u003c/th\u003e\n          \u003cth\u003eStatus\u003c/th\u003e\n          \u003cth\u003eSprint (click)\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eGithub\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI1\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eBuild Configuration\u003c/strong\u003e \u003cbr\u003e- Initialize SBT project with Scala 3.6.4\u003cbr\u003e- Configure assembly plugin for JAR creation\u003cbr\u003e- Set up test environment with ScalaTest\u003cbr\u003e- Configure code coverage \u0026amp; documentation\u003c/td\u003e\n          \u003ctd\u003eHIGHEST\u003c/td\u003e\n          \u003ctd\u003eIR1, IR4, NFR1\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI2\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eCode Quality Tools\u003c/strong\u003e \u003cbr\u003e- Set up Scalafmt with formatting rules\u003cbr\u003e- Implement Wartremover for code analysis\u003cbr\u003e- Configure Scalafix and semantic DB\u003cbr\u003e- Set up Trunk and Gemini bot\u003c/td\u003e\n          
\u003ctd\u003eHIGH\u003c/td\u003e\n          \u003ctd\u003eIR1, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI3\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eGit Workflow\u003c/strong\u003e \u003cbr\u003e- Implement git hooks system\u003cbr\u003e- Set up semantic release system\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eNFR1, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI4\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eProject Infrastructure\u003c/strong\u003e \u003cbr\u003e- Set up logging infrastructure\u003cbr\u003e- Configure CI/CD pipeline\u003cbr\u003e- Define high-level architecture\u003c/td\u003e\n          \u003ctd\u003eHIGHEST\u003c/td\u003e\n          \u003ctd\u003eIR4, NFR1, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI5\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eCore Domain Model\u003c/strong\u003e \u003cbr\u003e- Design repository data model\u003cbr\u003e- Design initial API contracts\u003cbr\u003e- Generate API documentation\u003c/td\u003e\n          \u003ctd\u003eHIGH\u003c/td\u003e\n          \u003ctd\u003eIR1, IR4, FR2.1\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca 
href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e, \u003ca href=\"/git-inspector//process/docs/sprint5/sprint_backlog#task-board\"\u003eS5\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eI6\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eBasic Git Operations\u003c/strong\u003e \u003cbr\u003e- Implement repository loading\u003cbr\u003e- Extract repository metadata\u003cbr\u003e- Create error handling\u003c/td\u003e\n          \u003ctd\u003eHIGH\u003c/td\u003e\n          \u003ctd\u003eFR1.1, FR2.1, NFR3\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint1/sprint_backlog#task-board\"\u003eS1\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eFoundation\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eF1\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eLayered Architecture Implementation\u003c/strong\u003e \u003cbr\u003e- Implement modules from architecture diagram\u003cbr\u003e- Apply design patterns from lectures (dependency injection, layered architecture, strategy, factory, etc)\u003c/td\u003e\n          \u003ctd\u003eHIGHEST\u003c/td\u003e\n          \u003ctd\u003eIR4, IR1\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint2/sprint_backlog#task-board\"\u003eS2\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eF2\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eRepository Input and Processing\u003c/strong\u003e 
\u003cbr\u003e- Accept/validate Git repository URLs\u003cbr\u003e- Fetch repository contents\u003cbr\u003e- Support file type filtering\u003c/td\u003e\n          \u003ctd\u003eHIGHEST\u003c/td\u003e\n          \u003ctd\u003eFR1.1, FR2.1, NFR3, BR1, NFR1\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint2/sprint_backlog#task-board\"\u003eS2\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eF3\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eCode Indexing System\u003c/strong\u003e \u003cbr\u003e- Process code into searchable representations\u003cbr\u003e- Generate/store code embeddings\u003cbr\u003e- Integrate with Langchain4J for vector storage\u003c/td\u003e\n          \u003ctd\u003eHIGHEST\u003c/td\u003e\n          \u003ctd\u003eFR2.2, FR2.3, IR2, IR3, BR2, NFR1\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint2/sprint_backlog#task-board\"\u003eS2\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eCore Value\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eC1\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eNatural Language Code Search\u003c/strong\u003e \u003cbr\u003e- Enable semantic search across codebase\u003cbr\u003e- Support language/extension filtering\u003cbr\u003e- Display relevant results with context\u003c/td\u003e\n          \u003ctd\u003eHIGH\u003c/td\u003e\n          \u003ctd\u003eFR1.2, FR1.4, FR1.3, FR1.5, BR1, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          
\u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint3/sprint_backlog#task-board\"\u003eS3\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eC2\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eCode Understanding Chat Interface\u003c/strong\u003e \u003cbr\u003e- Provide chat interface for code questions\u003cbr\u003e- Retrieve relevant code context for queries\u003cbr\u003e- Generate context-aware responses\u003c/td\u003e\n          \u003ctd\u003eHIGH\u003c/td\u003e\n          \u003ctd\u003eFR1.6, FR1.5, FR2.4, BR2, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint3/sprint_backlog#task-board\"\u003eS3\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eAdditional\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n          \u003ctd\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE1\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eUser Interface Implementation\u003c/strong\u003e \u003cbr\u003e- Create intuitive frontend for all functionality\u003cbr\u003e- Support Scala.js and Python (Gradio) interfaces\u003cbr\u003e- Ensure responsive and accessible design\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eNFR2, IR1, NFR3\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint3/sprint_backlog#task-board\"\u003eS3\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE2\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003ePerformance Optimization\u003c/strong\u003e \u003cbr\u003e- Optimize 
repository lookup speed\u003cbr\u003e- Implement caching/reuse of embeddings\u003cbr\u003e- Ensure responsive search/chat experience\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eNFR1, FR2.2, FR2.3, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint4/sprint_backlog#task-board\"\u003eS4\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE3\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eSecurity Implementation\u003c/strong\u003e \u003cbr\u003e- Sanitize user inputs\u003cbr\u003e- Secure Restful API\u003cbr\u003e- Secure data storage\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eNFR3, FR2.3\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint4/sprint_backlog#task-board\"\u003eS4\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE4\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eTesting and Quality Assurance\u003c/strong\u003e \u003cbr\u003e- Implement comprehensive testing strategy\u003cbr\u003e- Define quality metrics (see requirements document)\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eIR1, NFR1, NFR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint4/sprint_backlog#task-board\"\u003eS4\u003c/a\u003e, \u003ca href=\"/git-inspector//process/docs/sprint5/sprint_backlog#task-board\"\u003eS5\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE5\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eVisualization and Analysis\u003c/strong\u003e \u003cbr\u003e- Visualize code embeddings for quality analysis\u003cbr\u003e- Provide 
metrics on search effectiveness\u003cbr\u003e- Generate insights for documentation\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eIR5, FR1.2, BR2\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint5/sprint_backlog#task-board\"\u003eS5\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eE6\u003c/td\u003e\n          \u003ctd\u003e\u003cstrong\u003eDocumentation and Reporting\u003c/strong\u003e \u003cbr\u003e- Create user documentation\u003cbr\u003e- Generate project report\u003cbr\u003e- Document architecture and design decisions\u003c/td\u003e\n          \u003ctd\u003eMEDIUM\u003c/td\u003e\n          \u003ctd\u003eNFR2, IR4\u003c/td\u003e\n          \u003ctd\u003e✓\u003c/td\u003e\n          \u003ctd\u003e\u003ca href=\"/git-inspector//process/docs/sprint5/sprint_backlog#task-board\"\u003eS5\u003c/a\u003e\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\n\u003cstyle\u003e\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight){\n   \n  background-color: transparent;\n  border-radius: 6px;\n  border: 1px solid black;\n  outline: 2px solid black;\n  overflow-x: auto;\n  table-layout: fixed;\n  word-break: break-all;\n  font-size: 12px;\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight){\n  outline: 2px solid rgb(54, 156, 95);\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) thead{\n  background-color: #545d7b8a;\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) thead{\n  background-color: rgb(62, 62, 62);\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td,\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) tr,\n.post-content 
table:not(.lntable .highlighttable,.highlight table,.gist .highlight) th{\n  border-bottom: unset;\n  border: 1px solid black;\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:hover,\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:focus{\n  background-color: rgba(67, 166, 86, 0.8);\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:hover,\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:focus{\n  background-color: rgb(0, 0, 0, 0.7);\n}\n\u003c/style\u003e\n\n\u003c/div\u003e \n\n\u003ch1 id=\"traceability-matrix\"\u003eTraceability Matrix\u003c/h1\u003e\n\u003cp\u003eThe following table shows evidence for the requirements in the \u003ca href=\"/git-inspector/process/static/requirement-specifications/\"\u003eRequirements Specifications\u003c/a\u003e document.\u003c/p\u003e","title":""},{"content":"Code Search Productivity (BR1) - Choice: Enable developers to efficiently search and understand code within Git repositories. - Rationale: Developers spend significant time searching through codebases, and improving this process directly impacts productivity. - Validation Criteria: - At least 85\\% of test users report improved workflow efficiency in post-usage surveys (SUS). - Average query-to-result time under 10 seconds for the predetermined repositories. - Implementation Considerations: - Ensure integration with common development workflows. This can be done using Gradio interfaces. - Focus on search result quality and relevance (i.e. analyze the effectiveness of the generated embeddings). - Related Requirements: - FR1.2 (Search) - FR1.5 (Context) - FR1.6 (Chat) Improving Code Understanding (BR2) - Choice: Improve developer productivity by facilitating code search/understanding workflows. - Rationale: Understanding existing code is often more time-consuming than writing new code. 
- Validation Criteria: - Code explanations rated as \u0026#34;accurate and helpful\u0026#34; by at least 70\\% of test users (SUS). - Implementation Considerations: - Implement contextual code explanations (i.e. use separate models that understand code and natural language). - Provide relationship visualization between embeddings by using a 2D visualization tool. - Prioritize speed and accuracy in responses by allowing the users to select any open source model. - Related Requirements: - FR1.6 (Chat) - FR2.4 (LLM integration) - NFR2 (Usability) - NFR4 (Visualization) Repository URL Input Interface (FR1.1) - Choice: As a user, I can specify a Git repository URL to inspect its code, so that I can access and analyze specific codebases I\u0026#39;m interested in. - Rationale: The system needs a secure and user-friendly way to fetch Git repository URLs. - Validation Criteria: - The acceptance tests parse valid GitHub URLs successfully. - Error feedback displayed within 2 seconds of validation failure. - URL validation completes within half a second for all inputs. - Implementation Considerations: - Implement URL validation and display clear feedback in case of parsing errors. - Related Requirements: - FR2.1 (Repository Cloning) - NFR3 (Security) Code Search Using Markdown (FR1.2) - Choice: As a user, I can search for code using keywords or natural language, so that I can quickly find relevant code sections without manually browsing through files. - Rationale: This allows users to search for code using natural language, making it easier to find relevant code sections. - Validation Criteria: - Search results return in under 2 seconds for the predetermined repositories. - Language filtering correctly categorizes at least 95\\% of code files. - Implementation Considerations: - Use embedding model to convert natural language queries to vectors. - Support filtering by language, content type and extension. 
- Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) - NFR1 (Performance) Search Results Display System (FR1.3) - Choice: As a user, I can view the search results with code snippets and links to the original files in the repository, so that I can efficiently evaluate search results and navigate to the full context when needed. - Validation Criteria: - 95\\% of users can correctly identify file locations from the display assessed via the SUS survey. - Code snippets maintain proper indentation and formatting (Python, Scala frontend). - Implementation Considerations: - Display code snippets with syntax highlighting (Python). - Show file path and location information. - Related Requirements: - FR1.2 (Search) - FR1.5 (Context) - NFR2 (Usability) Code Search using Code Embeddings (FR1.4) - Choice: As a user, I can filter search results by programming language, so that I can focus on code written in languages relevant to my current task. - Rationale: Developers often need to restrict searches to specific languages or file types. - Validation Criteria: - Language detection accuracy \u0026gt;95\\% across all common programming languages (see parser impl.). - Multiple simultaneous filters function correctly in 100\\% of cases, as assessed by the acceptance tests. - Implementation Considerations: - Detect and classify programming languages during indexing. - Create efficient language metadata for fast filtering. - Support multiple simultaneous language filters. - Include language identification in UI. - Related Requirements: - FR1.2 (Search) - FR1.3 (Search Results) - FR2.2 (Code Indexing) Code Context Visualization (FR1.5) - Choice: As a user, I can view the context around a code snippet in the search results, so that I can better understand how the code fits into the broader implementation. 
- Rationale: This allows users to view the context around a code snippet in the search results, making it easier to understand how the code fits into the broader implementation. - Validation Criteria: - Fetching a repository loads within 5 seconds for the predetermined repositories. - 90\\% of users report sufficient context for understanding code purpose (SUS) - Implementation Considerations: - Display the entire code being indexed in the search results. - When answering questions, display relevant snippets from the codebase. - Allow the user to switch between the full text and retrieved snippets. - Related Requirements: - FR1.3 (Search Results) - NFR2 (Usability) Model with Past Chat History (FR1.6) - Choice: As a user, I can ask code-related questions via chat, and the chat history is preserved, so that I can have a continuous conversation with the system. - Rationale: This allows users to have a continuous conversation with the system, making it easier to understand how the code fits into the broader implementation. - Validation Criteria: - Context-aware responses remain relevant for at least 2 consecutive related questions. - Chat history can be cleared by regenerating the index. - Implementation Considerations: - Maintain chat history within session scope. - Structure LLM prompts to include chat history and retrieved code. - Related Requirements: - FR2.4 (LLM Integration) - FR2.3 (Vector Database) - NFR2 (Usability) Repository Cloning and Management (FR2.1) - Choice: As a developer, I need the system to fetch and clone Git repositories from provided URLs, so that I can work with up-to-date code without performing these operations manually. - Rationale: This allows developers to work with up-to-date code without performing these operations manually. - Validation Criteria: - Predetermined repositories clone successfully within 30 seconds. - UI remains responsive (no blocking) during 100\\% of cloning operations. - Invalid repository URLs are handled gracefully. 
- Implementation Considerations: - Use Uithub for extracting the repository code. - Implement caching mechanism for previously cloned repositories. - Add repository verification to ensure valid Git URLs. - Related Requirements: - FR2.2 (Code Indexing) - NFR1 (Performance) - NFR3 (Security) Vector Database Generation for RAG (FR2.2) - Choice: As a developer, I need the system to index the code of the fetched repositories, so that I can perform fast and accurate searches across the entire codebase. - Validation Criteria: - The clusters generated by the embeddings are well defined, suggesting successful embedding generation. - Metadata correctly captures language and file type. - Implementation Considerations: - Use Qdrant for caching code embeddings. - Utilize metadata to enhance search results relevance. - Related Requirements: - FR2.2 (Code Indexing) - NFR1 (Performance) Semantic Search (FR2.3) - Choice: As a developer, I need the system to use a vector database to store code embeddings, so that I can perform semantic searches that understand code context beyond simple keyword matching. - Validation Criteria: - Queries complete in less than 300ms for the predetermined repositories. - Vector similarity scores correctly correlate with semantic relevance as determined by the cluster analysis. - Implementation Considerations: - Configure Qdrant collection schema for code embeddings. - Use metadata for filtering specific file types. - Related Requirements: - FR1.2 (Search) - FR2.2 (Code Indexing) - NFR1 (Performance) - IR2 (Qdrant) LLM Integration for Natural Language Queries (FR2.4) - Choice: As a developer, I need the system to integrate with an LLM to process natural language queries, so that I can interact with the codebase using plain English rather than specialized query syntax. - Validation Criteria: - Ollama integration successfully handles queries within tests without errors. 
- Implementation Considerations: - Implement prompt engineering techniques to guide responses (i.e. conditional RAG, search by file type). - Design context management for large repositories (limit the amount of tokens being processed). - Related Requirements: - FR1.6 (Chat Functionality) - NFR1 (Performance) - NFR3 (Security) System Performance Optimization (NFR1) - Choice: The system will index code for targeted repositories within 10 seconds on the specified hardware. - Success Criteria: - Search queries return results in under 40 seconds for the predetermined repositories. - Embedding generation completes in under 30 seconds for the predetermined repositories. - Chat responses for simple search (without code context) arrive within 20 seconds for all tests. - Implementation Considerations: - Embeddings are stored in the Qdrant vector database. - The embeddings for retrieving chunks are generated using Ollama. - Related Requirements: - FR2.1 (Repository Cloning) - FR2.2 (Code Indexing) - FR1.2 (Search) - FR1.6 (Chat) System Usability Testing (NFR2) - Choice: The system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users. - Rationale: This ensures that the system is usable, as evaluated by a group of users. - Validation Criteria: - First-time users find the interface easy to use without assistance in \u0026gt;70\\% of cases (SUS) - 80\\% of users rate UI intuitiveness as \u0026#34;good\u0026#34; or \u0026#34;excellent\u0026#34; (SUS) - Implementation Considerations: - Implement clean, intuitive UI. Rely on established UX design patterns by using Gradio components. - Related Requirements: - FR1.3 (Search Results) - FR1.5 (Context) - FR1.6 (Chat) User Interface Security (NFR3) - Choice: The system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks. 
- Validation Criteria: - 100\\% of malformed/malicious URLs rejected before processing - Implementation Considerations: - Validate repository URLs against known valid patterns. - Test against standard GitHub URLs. - Related Requirements: - FR1.1 (Repository Input) - FR1.2 (Search) - FR1.6 (Chat) Embedding Visualization Requirement (NFR4) - Choice: A 2D visualization tool will display code embeddings to help analyze and improve indexing and search. - Validation Criteria: - Visualization correctly clusters similar code types. - Report analysis identifies strategies for improving search quality. - Implementation Considerations: - Implement dimension reduction techniques (t-SNE, UMAP) for 2D visualization. - Used to make informed decisions about the search quality. - Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) Scala Implementation Requirement (IR1) - Choice: The system should be implemented in Scala, following functional programming principles. - Success Criteria: - Scala tools are used to ensure consistency. - Functional programming patterns are used to ensure consistent code quality. - Implementation Considerations: - Use appropriate abstraction mechanisms (strategies, factories, memoization, etc.) - Implement error handling using functional approaches (Try, Option) - Related Requirements: - FR2.2 (Code Indexing) - FR2.3 (Vector Database) - FR2.4 (LLM Integration) Qdrant Vector Database Requirement (IR2) - Choice: The system should use Qdrant as the vector database for code embeddings. - Rationale: Qdrant provides efficient vector search capabilities with filtering options needed for code search. - Validation Criteria: - Qdrant client wrapper handles all required vector operations. - Collection schemas designed for code embeddings and text embeddings. - Implementation Considerations: - Implement the AIServices wrapper around the Qdrant module. Configure distance metrics. 
- Related Requirements: - FR2.3 (Vector Database) - NFR1 (Performance) Ollama Integration Requirement (IR3) - Choice: The system should integrate with Ollama for LLM functionalities. - Rationale: Ollama provides locally-hosted LLM models, yielding good privacy and reduced latency. - Success Criteria: - The application successfully communicates with Ollama. - Implementation Considerations: - Implement the AIServices wrapper around the Ollama module. - Use prompt templates optimized for code understanding. - Related Requirements: - FR1.6 (Chat) - FR2.4 (LLM Integration) - NFR1 (Performance) Layered Architecture (IR4) - Choice: The system should follow a layered architecture approach, ensuring better modularity. - Success Criteria: - All components separated into Presentation, Application, Domain, and Infrastructure layers. - The separation is assessed via ArchUnit tests. ","permalink":"https://rfvasile.github.io/git-inspector/process/static/requirement-specifications/","summary":"\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eCode Search Productivity (BR1)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: Enable developers to efficiently search and understand code within Git repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: Developers spend significant time searching through codebases, and improving this process directly impacts productivity.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation 
Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e At least 85\\% of test users report improved workflow efficiency in post-usage surveys (SUS).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Average query-to-result time under 10 seconds for the predetermined repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Ensure integration with common development workflows. This can be done using Gradio interfaces.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Focus on search result quality and relevance (i.e. 
analyze the effectiveness of the generated embeddings).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.5 (Context)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eImproving Code Understanding (BR2)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: Improve developer productivity by facilitating code search/understanding workflows.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: Understanding existing code is often more time-consuming than writing new code.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Code explanations rated as \u0026#34;accurate and helpful\u0026#34; by at least 70\\% of test users 
(SUS).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement contextual code explanations (i.e. use separate models that understand code and natural language).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Provide relationship visualization between embeddings by using a 2D visualization tool.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Prioritize speed and accuracy in responses by allowing the users to select any open source model.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.4 (LLM integration)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR2 (Usability)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR4 (Visualization)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan 
class=\"line\"\u003e\u003cspan class=\"cl\"\u003eRepository URL Input Interface (FR1.1)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can specify a Git repository URL to inspect its code, so that I can access and analyze specific codebases I\u0026#39;m interested in.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: The system needs a secure and user-friendly way to fetch Git repository URLs.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e The acceptance tests parse valid GitHub URLs successfully.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Error feedback displayed within 2 seconds of validation failure.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e URL validation completes within half a second for all inputs.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement URL validation and display clear feedback in case of parsing errors.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  
\u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.1 (Repository Cloning)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR3 (Security)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eCode Search Using Markdown (FR1.2)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can search for code using keywords or natural language, so that I can quickly find relevant code sections without manually browsing through files.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: This allows users to search for code using natural language, making it easier to find relevant code sections.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Search results return in under 2 seconds for the predetermined repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Language filtering correctly categorizes at least 95\\% of code 
files.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Use embedding model to convert natural language queries to vectors.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Support filtering by language, content type and extension.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.3 (Vector Database)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eSearch Results Display System (FR1.3)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can view the search results with code snippets and links to the original files in the repository, so that I can efficiently evaluate search results and 
navigate to the full context when needed.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e 95\\% of users can correctly identify file locations from the display assessed via the SUS survey.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Code snippets maintain proper indentation and formatting (Python, Scala frontend).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Display code snippets with syntax highlighting (Python).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Show file path and location information.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.5 (Context)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR2 (Usability)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv 
class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eCode Search using Code Embeddings (FR1.4)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can filter search results by programming language, so that I can focus on code written in languages relevant to my current task.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: Developers often need to restrict searches to specific languages or file types.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Language detection accuracy \u0026gt;95\\% across all common programming languages (see parser impl.).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Multiple simultaneous filters function correctly in 100\\% of cases, as assessed by the acceptance tests.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Detect and classify programming languages during indexing.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan 
class=\"k\"\u003e-\u003c/span\u003e Create efficient language metadata for fast filtering.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Support multiple simultaneous language filters.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Include language identification in UI.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.3 (Search Results)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eCode Context Visualization (FR1.5)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can view the context around a code snippet in the search results, so that I can better understand how the code fits into the broader implementation.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan 
class=\"k\"\u003e-\u003c/span\u003e Rationale: This allows users to view the context around a code snippet in the search results, making it easier to understand how the code fits into the broader implementation.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Fetching a repository loads within 5 seconds for the predetermined repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e 90\\% of users report sufficient context for understanding code purpose (SUS)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Display the entire code being indexed in the search results.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e When answering questions, display relevant snippets from the codebase.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Allow the user to switch between the full text and retrieved snippets.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.3 (Search 
Results)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR2 (Usability)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eModel with Past Chat History (FR1.6)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a user, I can ask code-related questions via chat, and the chat history is preserved, so that I can have a continuous conversation with the system.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: This allows users to have a continuous conversation with the system, making it easier to understand how the code fits into the broader implementation.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Context-aware responses remain relevant for at least 2 consecutive related questions.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Chat history can be cleared by regenerating the index.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation 
Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Maintain chat history within session scope.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Structure LLM prompts to include chat history and retrieved code.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.4 (LLM Integration)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.3 (Vector Database)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR2 (Usability)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eRepository Cloning and Management (FR2.1)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a developer, I need the system to fetch and clone Git repositories from provided URLs, so that I can work with up-to-date code without performing these operations manually.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: This allows 
developers to work with up-to-date code without performing these operations manually.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Predetermined repositories clone successfully within 30 seconds.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e UI remains responsive (no blocking) during 100\\% of cloning operations.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Invalid repository URLs are handled gracefully.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Use Uithub for extracting the repository code.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement caching mechanism for previously cloned repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Add repository verification to ensure valid Git URLs.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code 
Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR3 (Security)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eVector Database Generation for RAG (FR2.2)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a developer, I need the system to index the code of the fetched repositories, so that I can perform fast and accurate searches across the entire codebase.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e The clusters generated by the embeddings are well defined, suggesting successful embedding generation.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Metadata correctly captures language and file type.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e 
Use Qdrant for caching code embeddings.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Utilize metadata to enhance search results relevance.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eSemantic Search (FR2.3)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a developer, I need the system to use a vector database to store code embeddings, so that I can perform semantic searches that understand code context beyond simple keyword matching.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Queries complete in less than 300ms for the predetermined repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan 
class=\"k\"\u003e-\u003c/span\u003e Vector similarity scores correctly correlate with semantic relevance as determined by the cluster analysis.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Configure Qdrant collection schema for code embeddings.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Use metadata for filtering specific file types.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e IR2 (Qdrant)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eLLM Integration for Natural Language Queries (FR2.4)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan 
class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: As a developer, I need the system to integrate with an LLM to process natural language queries, so that I can interact with the codebase using plain English rather than specialized query syntax.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Ollama integration successfully handles queries within tests without errors.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement prompt engineering techniques to guide responses (e.g. 
conditional RAG, search by file type).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Design context management for large repositories (limit the amount of tokens being processed).\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat Functionality)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR3 (Security)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eSystem Performance Optimization (NFR1)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system will index code for targeted repositories within 10 seconds on the specified hardware.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Success Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Search queries return results in under 40 seconds for the predetermined 
repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Embedding generation completes in under 30 seconds for the predetermined repositories.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Chat responses for simple searches (without code context) arrive within 20 seconds in all tests.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Embeddings are stored in the Qdrant vector database.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e The embeddings for retrieving chunks are generated using Ollama.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.1 (Repository Cloning)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv 
class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eSystem Usability Testing (NFR2)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: This ensures that the system is usable, as evaluated by a group of users.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e First-time users find the interface easy to use without assistance in \u0026gt;70\\% of cases (SUS)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e 80\\% of users rate UI intuitiveness as \u0026#34;good\u0026#34; or \u0026#34;excellent\u0026#34; (SUS)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement clean, intuitive UI. 
Rely on established UX design patterns by using Gradio components.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.3 (Search Results)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.5 (Context)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eUser Interface Security (NFR3)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e 100\\% of malformed/malicious URLs rejected before processing\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan 
class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validate repository URLs against known valid patterns.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Test against standard GitHub URLs.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.1 (Repository Input)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.2 (Search)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eEmbedding Visualization Requirement (NFR4)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: A 2D visualization tool will display code embeddings to help analyze and improve indexing and search.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Visualization correctly 
clusters similar code types.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Report analysis identifies strategies for improving search quality.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement dimension reduction techniques (t-SNE, UMAP) for 2D visualization.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Used to make informed decisions about the search quality.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.3 (Vector Database)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eScala Implementation Requirement (IR1)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should be implemented in Scala, following functional programming 
principles.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Success Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Scala tools are used to ensure consistency.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Functional programming patterns are used to ensure consistent code quality.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Use appropriate abstraction mechanisms (strategies, factories, memoization, etc.)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement error handling using functional approaches (Try, Option)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.2 (Code Indexing)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.3 (Vector Database)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.4 (LLM Integration)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv 
class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eQdrant Vector Database Requirement (IR2)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should use Qdrant as the vector database for code embeddings.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: Qdrant provides efficient vector search capabilities with filtering options needed for code search.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Validation Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Qdrant client wrapper handles all required vector operations.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Collection schemas designed for code embeddings and text embeddings.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement the AIServices wrapper around the Qdrant module. 
Configure distance metrics.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.3 (Vector Database)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eOllama Integration Requirement (IR3)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should integrate with Ollama for LLM functionalities.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Rationale: Ollama provides locally-hosted LLM models, yielding good privacy and reduced latency.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Success Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e The application successfully communicates with Ollama.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implementation Considerations:\n\u003c/span\u003e\u003c/span\u003e\u003cspan 
class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Implement the AIServices wrapper around the Ollama module.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e Use prompt templates optimized for code understanding.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Related Requirements:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR1.6 (Chat)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e FR2.4 (LLM Integration)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e NFR1 (Performance)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-md\" data-lang=\"md\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003eLayered Architecture (IR4)\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003e-\u003c/span\u003e Choice: The system should follow a layered architecture approach, ensuring better modularity.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e  \u003cspan class=\"k\"\u003e-\u003c/span\u003e Success Criteria:\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e All components separated into 
Presentation, Application, Domain, and Infrastructure layers.\n\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e    \u003cspan class=\"k\"\u003e-\u003c/span\u003e The separation is assessed via ArchUnit tests.\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e","title":""},{"content":" Req ID Description BR1 Allow users to efficiently search and understand code within Git repositories. BR2 Improve developer productivity by facilitating code search/understanding workflows. FR1.1 As a user, I can specify a Git repository URL to inspect its code. FR1.2 As a user, I can search for code using keywords or natural language queries. FR1.3 As a user, I can view the search results with code snippets and links to the original files in the repository. FR1.4 As a user, I can filter search results by programming language. FR1.5 As a user, I can view the context around a code snippet in the search results. FR1.6 As a user, I can ask code-related questions via chat, and the chat history is preserved. FR2.1 As a developer, I need the system to fetch and clone Git repositories from provided URLs. FR2.2 As a developer, I need the system to index the code of the fetched repositories, to generate fast responses. FR2.3 As a developer, I need the system to use a vector database to store code embeddings for semantic search. FR2.4 As a developer, I need the system to integrate with an LLM to process natural language queries. NFR1 The system will index code for targeted repositories within 10 seconds on the specified hardware. NFR2 The system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users. NFR3 The system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks. NFR4 A 2D visualization tool will display code embeddings to help analyze and improve indexing and search. 
IR1 The system should be implemented in Scala, following functional programming principles. IR2 The system should use Qdrant as the vector database for code embeddings. IR3 The system should integrate with Ollama for LLM functionalities. IR4 The system should follow a layered architecture approach, ensuring better modularity. ","permalink":"https://rfvasile.github.io/git-inspector/process/static/requirements/","summary":"\u003ctable class=\"custom_table_style\"\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth style=\"text-align: left\"\u003eReq ID\u003c/th\u003e\n          \u003cth style=\"text-align: left\"\u003eDescription\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eBR1\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAllow users to efficiently search and understand code within Git repositories.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eBR2\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eImprove developer productivity by facilitating code search/understanding workflows.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.1\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can specify a Git repository URL to inspect its code.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.2\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can search for code using keywords or natural language queries.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.3\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can view the search results with code snippets and links to 
the original files in the repository.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.4\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can filter search results by programming language.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.5\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can view the context around a code snippet in the search results.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR1.6\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a user, I can ask code-related questions via chat, and the chat history is preserved.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR2.1\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a developer, I need the system to fetch and clone Git repositories from provided URLs.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR2.2\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a developer, I need the system to index the code of the fetched repositories, to generate fast responses.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR2.3\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a developer, I need the system to use a vector database to store code embeddings for semantic search.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eFR2.4\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eAs a developer, I need the system to integrate with an LLM to process natural language queries.\u003c/td\u003e\n      
\u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eNFR1\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system will index code for targeted repositories within 10 seconds on the specified hardware.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eNFR2\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should achieve a System Usability Scale (SUS) score of 70+ based on at least 5 target users.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eNFR3\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should sanitize user search query inputs to prevent Cross-Site Scripting (XSS) attacks.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eNFR4\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eA 2D visualization tool will display code embeddings to help analyze and improve indexing and search.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eIR1\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should be implemented in Scala, following functional programming principles.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eIR2\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should use Qdrant as the vector database for code embeddings.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: left\"\u003eIR3\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should integrate with Ollama for LLM functionalities.\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd style=\"text-align: 
left\"\u003eIR4\u003c/td\u003e\n          \u003ctd style=\"text-align: left\"\u003eThe system should follow a layered architecture approach, ensuring better modularity.\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\n\u003cstyle\u003e\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight){\n   \n  background-color: transparent;\n  border-radius: 6px;\n  border: 1px solid black;\n  outline: 2px solid black;\n  overflow-x: auto;\n  table-layout: fixed;\n  word-break: break-all;\n  font-size: 12px;\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight){\n  outline: 2px solid rgb(54, 156, 95);\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) thead{\n  background-color: #545d7b8a;\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) thead{\n  background-color: rgb(62, 62, 62);\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td,\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) tr,\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) th{\n  border-bottom: unset;\n  border: 1px solid black;\n}\n\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:hover,\n.post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:focus{\n  background-color: rgba(67, 166, 86, 0.8);\n}\n\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:hover,\n.dark .post-content table:not(.lntable .highlighttable,.highlight table,.gist .highlight) td:focus{\n  background-color: rgb(0, 0, 0, 0.7);\n}\n\u003c/style\u003e","title":""},{"content":" Requirement Design element Implementation Evidence Done BR1: Search Productivity Project-wide BusinessRequirementsSuite.scala\nSUS Questionnaire\nEmbedding Diagrams 
✓ BR2: Improve Code Understanding Project-wide createTextEmbeddingModel\ncreateCodeEmbeddingModel\nPython/Scala Frontend\nSUS Questionnaire\nEmbedding Diagrams ✓ FR1.1: Repository URL Input Interface GithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.2: Code Search Using Markdown QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR1.3: Search Results Display System Scala frontend\nPython frontend SUS Questionnaire ✓ FR1.4: Code Search using Code QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ (see related FR1.2) FR1.5: Code Context Visualization Scala frontend\nPython frontend\nRepositoryWithLanguages\nGithubWrapperService UserFunctionalRequirementsSuite\nSUS Questionnaire ✓ FR1.6: Model with Past Chat History Pipeline.scala\nRAGComponentFactory.scala UserFunctionalRequirementsSuite ✓ FR2.1: Repository Cloning GithubWrapperService.scala\nFetchingService.scala SystemFunctionalRequirementsSuite ✓ FR2.2: Vector Database Generation IngestorService.scala\nCacheService.scala\nQdrantEmbeddingStore.scala SystemFunctionalRequirementsSuite ✓ FR2.3: Vector Database Implementation QdrantEmbeddingStore.scala\nGithubWrapperService.scala UserFunctionalRequirementsSuite ✓ FR2.4: LLM Integration for Code QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala SystemFunctionalRequirementsSuite ✓ NFR1: Performance Optimization ChatService.scala\nCacheService.scala\nIngestorService.scala\nGithubWrapperService.scala NonFunctionalRequirementsSuite ✓ NFR2: System Usability Optimization GithubWrapperService.scala\nScala frontend\nPython frontend NonFunctionalRequirementsSuite\nSUS Questionnaire ✓ NFR3: User Interface Security Scala frontend\nPython frontend NonFunctionalRequirementsSuite ✓ NFR4: Embedding Visualization IngestorService.scala Final Report ✓ IR1: Scala Implementation (declarative programming) Project-wide Adherence to the Gemini style guide ✓ IR2: Qdrant Vector 
Database IngestorService.scala\nComponentFactory.scala application.conf ✓ IR3: Ollama Integration QueryRoutingStrategy.scala\nQueryFilterService.scala\nChatService.scala application.conf ✓ IR4: Layered Architecture Project-wide ArchUnit tests ✓ ","permalink":"https://rfvasile.github.io/git-inspector/process/static/traceability-matrix/","summary":"\u003c!-- markdownlint-disable MD033 --\u003e\n\u003ctable style=\"display: table;\"\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth\u003eRequirement\u003c/th\u003e\n      \u003cth\u003eDesign element\u003c/th\u003e\n      \u003cth\u003eImplementation Evidence\u003c/th\u003e\n      \u003cth\u003eDone\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eBR1: Search Productivity\u003c/td\u003e\n      \u003ctd\u003eProject-wide\u003c/td\u003e\n      \u003ctd\u003eBusinessRequirementsSuite.scala\u003cbr\u003eSUS Questionnaire\u003cbr\u003eEmbedding Diagrams\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eBR2: Improve Code Understanding\u003c/td\u003e\n      \u003ctd\u003eProject-wide\u003c/td\u003e\n      \u003ctd\u003ecreateTextEmbeddingModel\u003cbr\u003ecreateCodeEmbeddingModel\u003cbr\u003ePython/Scala Frontend\u003cbr\u003eSUS Questionnaire\u003cbr\u003eEmbedding Diagrams\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.1: Repository URL Input Interface\u003c/td\u003e\n      \u003ctd\u003eGithubWrapperService.scala\u003c/td\u003e\n      \u003ctd\u003eUserFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.2: Code Search Using Markdown\u003c/td\u003e\n      \u003ctd\u003eQdrantEmbeddingStore.scala\u003cbr\u003eGithubWrapperService.scala\u003c/td\u003e\n      
\u003ctd\u003eUserFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.3: Search Results Display System\u003c/td\u003e\n      \u003ctd\u003eScala frontend\u003cbr\u003ePython frontend\u003c/td\u003e\n      \u003ctd\u003eSUS Questionnaire\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.4: Code Search using Code\u003c/td\u003e\n      \u003ctd\u003eQdrantEmbeddingStore.scala\u003cbr\u003eGithubWrapperService.scala\u003c/td\u003e\n      \u003ctd\u003eUserFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓ (see related FR1.2)\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.5: Code Context Visualization\u003c/td\u003e\n      \u003ctd\u003eScala frontend\u003cbr\u003ePython frontend\u003cbr\u003eRepositoryWithLanguages\u003cbr\u003eGithubWrapperService\u003c/td\u003e\n      \u003ctd\u003eUserFunctionalRequirementsSuite\u003cbr\u003eSUS Questionnaire\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR1.6: Model with Past Chat History\u003c/td\u003e\n      \u003ctd\u003ePipeline.scala\u003cbr\u003eRAGComponentFactory.scala\u003c/td\u003e\n      \u003ctd\u003eUserFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR2.1: Repository Cloning\u003c/td\u003e\n      \u003ctd\u003eGithubWrapperService.scala\u003cbr\u003eFetchingService.scala\u003c/td\u003e\n      \u003ctd\u003eSystemFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR2.2: Vector Database Generation\u003c/td\u003e\n      \u003ctd\u003eIngestorService.scala\u003cbr\u003eCacheService.scala\u003cbr\u003eQdrantEmbeddingStore.scala\u003c/td\u003e\n      
\u003ctd\u003eSystemFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR2.3: Vector Database Implementation\u003c/td\u003e\n      \u003ctd\u003eQdrantEmbeddingStore.scala\u003cbr\u003eGithubWrapperService.scala\u003c/td\u003e\n      \u003ctd\u003eUserFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eFR2.4: LLM Integration for Code\u003c/td\u003e\n      \u003ctd\u003eQueryRoutingStrategy.scala\u003cbr\u003eQueryFilterService.scala\u003cbr\u003eChatService.scala\u003c/td\u003e\n      \u003ctd\u003eSystemFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eNFR1: Performance Optimization\u003c/td\u003e\n      \u003ctd\u003eChatService.scala\u003cbr\u003eCacheService.scala\u003cbr\u003eIngestorService.scala\u003cbr\u003eGithubWrapperService.scala\u003c/td\u003e\n      \u003ctd\u003eNonFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eNFR2: System Usability Optimization\u003c/td\u003e\n      \u003ctd\u003eGithubWrapperService.scala\u003cbr\u003eScala frontend\u003cbr\u003ePython frontend\u003c/td\u003e\n      \u003ctd\u003eNonFunctionalRequirementsSuite\u003cbr\u003eSUS Questionnaire\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eNFR3: User Interface Security\u003c/td\u003e\n      \u003ctd\u003eScala frontend\u003cbr\u003ePython frontend\u003c/td\u003e\n      \u003ctd\u003eNonFunctionalRequirementsSuite\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eNFR4: Embedding Visualization\u003c/td\u003e\n      \u003ctd\u003eIngestorService.scala\u003c/td\u003e\n      
\u003ctd\u003eFinal Report\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eIR1: Scala Implementation (declarative programming)\u003c/td\u003e\n      \u003ctd\u003eProject-wide\u003c/td\u003e\n      \u003ctd\u003eAdherence to the Gemini style guide\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eIR2: Qdrant Vector Database\u003c/td\u003e\n      \u003ctd\u003eIngestorService.scala\u003cbr\u003eComponentFactory.scala\u003c/td\u003e\n      \u003ctd\u003eapplication.conf\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eIR3: Ollama Integration\u003c/td\u003e\n      \u003ctd\u003eQueryRoutingStrategy.scala\u003cbr\u003eQueryFilterService.scala\u003cbr\u003eChatService.scala\u003c/td\u003e\n      \u003ctd\u003eapplication.conf\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eIR4: Layered Architecture\u003c/td\u003e\n      \u003ctd\u003eProject-wide\u003c/td\u003e\n      \u003ctd\u003eArchUnit tests\u003c/td\u003e\n      \u003ctd\u003e✓\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e","title":""}]