The Frontend module

This module is a Vite-powered React single-page application that provides the user interface for the system. It allows users to upload code files, automatically extracts comments from those files, queries the backend classification API, and presents the predicted categories inline with the source code for inspection.

Purpose and capabilities

The application supports Java, Python, and Pharo source files, automatically inferring the language from the file extension (.java, .py, .pharo/.cls). After upload, it parses the file, extracts individual comments, and maintains them in local React state along with the full file content and detected language. Users can then trigger classification, which sends each comment to the backend /predict endpoint, receives the predicted labels, and displays them next to the corresponding lines in the code viewer.
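The extension-based language inference described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name detectLanguage and the language identifiers ("java", "python", "pharo") are assumptions; only the extensions themselves come from the description.

```javascript
// Hypothetical mapping from file extension to language identifier.
// The extensions are taken from the text; the identifier strings are assumed.
const EXTENSION_TO_LANGUAGE = new Map([
  ["java", "java"],
  ["py", "python"],
  ["pharo", "pharo"],
  ["cls", "pharo"],
]);

// Returns the inferred language, or null when the extension is unsupported.
function detectLanguage(fileName) {
  const dot = fileName.lastIndexOf(".");
  // No dot, a leading dot only, or a trailing dot means no usable extension.
  if (dot <= 0 || dot === fileName.length - 1) return null;
  const extension = fileName.slice(dot + 1).toLowerCase();
  return EXTENSION_TO_LANGUAGE.get(extension) ?? null;
}
```

A null result is a natural place to reject the upload and report an unsupported file type.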

In our pipeline, models are trained on a composite feature, "comment | class_name", which gives them richer semantic context. The class name acts as privileged information in the sense of Learning Using Privileged Information (LUPI): it disambiguates short or ambiguous comments, stabilizes learning on noisy datasets, and improves generalization. At inference time, however, the frontend supplies only the raw "comment" string. This keeps the user experience simple and realistic: class names may be unavailable in refactoring or snippet-based scenarios, whereas the comment text is always available. This balances training efficiency with practical usability.

Architecture and configuration

The project is configured as a standard Vite React app, with main.jsx mounting the root App component into the index.html container under React’s StrictMode. The package.json file defines the project metadata, development scripts (dev, build, preview, lint), and dependencies, including React, ReactDOM, and prism-react-renderer for syntax highlighting in the code viewer. The Vite configuration (vite.config.js) is minimal, enabling the React plugin and relying on Vite’s default optimizations for development and production builds.

Core application logic

The App.jsx component orchestrates the application’s main workflow using React hooks for state and effects. It tracks the uploaded file, its textual content, the selected language, the list of extracted comments, the currently chosen model type, the set of available models per language, the classification results, and a status object used by the status bar. On initial load, it queries the backend /models endpoint (through a base URL read from VITE_API_URL or defaulting to http://localhost:8080) to discover which language–model-type combinations are available and uses that information to populate the model selection UI and initial defaults.
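The startup lookup could look roughly like the sketch below. The function names are hypothetical, and the shape of the /models response (an object mapping each language to a list of model types) is an assumption, since the actual format is not specified here. In the Vite app the env argument would be import.meta.env; it is passed as a parameter so the sketch also runs outside Vite.

```javascript
// Resolve the backend base URL from an env-like object, falling back to
// the default mentioned in the text.
function resolveApiBase(env) {
  const configured = env && env.VITE_API_URL;
  // Strip trailing slashes so `${base}/models` never contains "//".
  return (configured || "http://localhost:8080").replace(/\/+$/, "");
}

// Fetch the available models from the backend.
// ASSUMED response shape: { "java": ["model-a", "model-b"], "python": ["model-a"] }
async function fetchAvailableModels(baseUrl, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/models`);
  if (!response.ok) {
    throw new Error(`GET /models failed with status ${response.status}`);
  }
  return response.json();
}

// Pick the first model type offered for a language, or null if there is none.
function pickDefaultModelType(availableModels, language) {
  const types = (availableModels && availableModels[language]) || [];
  return types.length > 0 ? types[0] : null;
}
```

Inside App.jsx, a call such as fetchAvailableModels(resolveApiBase(import.meta.env)) would typically run once from an effect with an empty dependency list.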

File handling and comment extraction

When a user selects a file, the application validates the extension to ensure the language is supported, reads the file content via FileReader, and stores it in state. It then calls a utility function from utils/comment-extractor to extract comments for the detected language, producing a list of comment objects that include at least the text and line number. The app clears any existing classifications, updates the status message with the number of comments found, and, if possible, selects an appropriate default model type based on the available models for that language.
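To illustrate the kind of output the extractor produces, here is a deliberately simplified sketch. It is not the code in utils/comment-extractor: it handles only single-line comments for Java and Python, ignores block comments and Pharo's double-quoted comments, and would misread a marker that appears inside a string literal. The function name and the language identifiers are assumptions.

```javascript
// Single-line comment markers per language (simplified; see caveats above).
const LINE_COMMENT_MARKERS = new Map([
  ["java", "//"],
  ["python", "#"],
]);

// Returns a list of { text, line } objects with 1-based line numbers.
function extractLineComments(source, language) {
  const marker = LINE_COMMENT_MARKERS.get(language);
  if (!marker) return []; // language not covered by this sketch
  const comments = [];
  source.split(/\r?\n/).forEach((lineText, index) => {
    const start = lineText.indexOf(marker);
    if (start === -1) return; // no comment on this line
    const text = lineText.slice(start + marker.length).trim();
    if (text.length > 0) comments.push({ text, line: index + 1 });
  });
  return comments;
}
```

The length of the returned list is what the status message would report as the number of comments found.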

Classification workflow and caching

The classification process is initiated by the user through the UploadPanel controls. The app iterates over the list of extracted comments and, for each one, first consults a custom hook (use-classification-cache) to see if an identical (comment, language, model type) combination has already been classified. Cached results are immediately reused to avoid redundant backend calls. For uncached comments, the app formats the input using formatCommentForAPI (which can combine comment text with additional context from the surrounding file when needed) and sends a JSON payload to the backend /predict endpoint containing the text, language, and model type.
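The cache-then-request flow can be sketched in plain JavaScript as below. This is an illustration under stated assumptions rather than the actual hook: the real cache lives in a React hook (use-classification-cache), the payload field names (text, language, model_type) are guesses at the API contract, sendPredict is a hypothetical function that POSTs the payload to /predict, and the formatCommentForAPI step is omitted.

```javascript
// Build a cache key from the (comment, language, model type) triple.
// Serializing an array avoids collisions that a plain string join could cause.
function cacheKey(comment, language, modelType) {
  return JSON.stringify([comment, language, modelType]);
}

// A minimal in-memory cache keyed by that triple.
function createClassificationCache() {
  const store = new Map();
  return {
    get: (comment, language, modelType) =>
      store.get(cacheKey(comment, language, modelType)),
    set: (comment, language, modelType, predictions) => {
      store.set(cacheKey(comment, language, modelType), predictions);
    },
  };
}

// JSON body for POST /predict. Field names are ASSUMED, not confirmed.
function buildPredictPayload(text, language, modelType) {
  return { text, language, model_type: modelType };
}

// Classify one comment, consulting the cache before calling the backend.
async function classifyWithCache(cache, comment, language, modelType, sendPredict) {
  const cached = cache.get(comment.text, language, modelType);
  if (cached !== undefined) return cached; // reuse the earlier result
  const payload = buildPredictPayload(comment.text, language, modelType);
  const predictions = await sendPredict(payload);
  cache.set(comment.text, language, modelType, predictions);
  return predictions;
}
```

Including the model type in the key matters: the same comment may receive different labels from different models, so results must not be shared across model types.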

Handling API responses and UI updates

For each response, the application normalizes the prediction format (for example, by flattening nested arrays), stores the predictions in the cache, and merges them into the classifications state keyed by line number. Throughout this loop, it tracks how many comments have been processed successfully and maintains an isClassifying flag that controls the UI (e.g., disabling buttons or showing a loading state). When the process completes, it sets a success status summarizing how many comments were classified out of the total; when a request fails, it logs the error and surfaces an error status.
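The normalization and merge steps might be sketched as follows. The function names and the example labels are hypothetical, and the raw response shapes handled here (nested arrays, a single value, or nothing) are assumptions about what the backend may return.

```javascript
// Flatten arbitrarily nested prediction arrays into a flat list of labels.
// A single non-array value is wrapped so callers always receive an array.
function normalizePredictions(raw) {
  if (raw === null || raw === undefined) return [];
  return Array.isArray(raw) ? raw.flat(Infinity) : [raw];
}

// Return a NEW classifications object with the predictions for one line set.
// The previous object is left untouched, as React state updates expect.
function mergeClassification(classifications, line, rawPredictions) {
  return { ...classifications, [line]: normalizePredictions(rawPredictions) };
}
```

With a state setter, this would typically be applied through a functional update such as setClassifications(prev => mergeClassification(prev, comment.line, predictions)), so that results arriving in quick succession do not overwrite one another.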

User interface structure and folder organization

The user interface is composed from several dedicated components located under src/components:

Header renders the application’s top bar, title, or branding.
UploadPanel hosts file selection, language and model selectors, and action buttons for classifying or clearing the current session.
CodeViewer shows the source file content with syntax highlighting (via prism-react-renderer) and overlays classification results on the corresponding comment lines.
StatusBar displays contextual feedback such as errors, progress messages, and success notifications.

Supporting logic is organized into src/hooks (such as the classification cache hook) and src/utils (including comment extraction and API formatting helpers), promoting reuse and separating concerns between presentation, state management, and low-level utilities. Styling is provided via App.css, index.css, and other CSS modules under src, while the public directory and index.html follow the standard Vite structure for static assets and the root HTML shell.