Docker Deployment for API Inference
This document provides an overview of the Docker configuration used to containerize the API.
Dockerfile Execution Flow
FROM python:3.12-slim
# Create a non-root user to run the application and set permissions
RUN useradd -m -u 1000 turinguser
RUN mkdir -p /app/models && chown -R turinguser:turinguser /app /app/models
USER turinguser
# Set environment variables
# PATH to include local user binaries and project root
ENV PATH="/home/turinguser/.local/bin:$PATH"
ENV PROJ_ROOT=/app
# Set the working directory in the container
WORKDIR /app
# Copy essential files to install dependencies
COPY --chown=turinguser requirements.txt .
# Install Python dependencies
RUN pip install --default-timeout=1000 --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
RUN pip3 install -v -r requirements.txt --upgrade --default-timeout=1000 --no-cache-dir --break-system-packages
# Copy remaining project files
COPY --chown=turinguser turing ./turing
COPY --chown=turinguser reports ./reports
# Expose port 7860 for the FastAPI application
EXPOSE 7860
# Default command to run the FastAPI application on port 7860
CMD ["uvicorn", "turing.api.app:app", "--host", "0.0.0.0", "--port", "7860"]
The Dockerfile is designed to create a secure and optimized environment for running the machine learning API. Here is a detailed breakdown of the build process:
- Base Image: The build starts from the official python:3.12-slim image.
- Security & User Permissions: To follow security best practices, a non-root user named turinguser (UID 1000) is created. This user is granted ownership of the /app and /app/models directories, and the USER instruction switches to it, so the remaining build steps and the application itself do not run with root privileges.
- Environment Configuration: PATH is updated to include the user's local binary directory (/home/turinguser/.local/bin), and the PROJ_ROOT variable is set to /app for consistent internal path referencing.
- Optimized Dependency Installation:
  - The process first installs the CPU-only version of torch from the PyTorch CPU wheel index. This significantly reduces the final image size compared to the standard GPU-enabled builds.
  - The remaining project requirements are then installed with a high timeout and the --no-cache-dir flag, which keeps pip's download cache out of the image.
- Application Deployment: The core source code (turing folder) and the reports directory are copied into the container with the correct ownership for turinguser.
- Port and Startup: EXPOSE documents port 7860 as the port the application listens on; the port only becomes reachable from the host once it is published with the -p flag or the Compose port mapping. The default command launches the FastAPI application with Uvicorn, bound to all network interfaces (0.0.0.0) on that port.
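The non-root setup can be checked on a built image. The helper below is a minimal sketch, not part of the project: the function name check_image is hypothetical, and the default image name turing-api assumes the image was tagged as in the manual build command. It works because the Dockerfile uses CMD, so arguments after the image name replace the default command.

```shell
# check_image [IMAGE]: confirm that the container's default user is UID 1000.
# "check_image" is a hypothetical helper; "turing-api" is an assumed image tag.
check_image() {
  image="${1:-turing-api}"
  # Replacing the default command with `id -u` prints the UID the container runs as.
  uid="$(docker run --rm "$image" id -u)" || return 1
  if [ "$uid" = "1000" ]; then
    echo "OK: $image runs as UID $uid (non-root)"
  else
    echo "WARNING: $image runs as UID $uid" >&2
    return 1
  fi
}
```

Calling check_image with no arguments inspects turing-api; pass another tag to inspect a differently named image.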
Docker Image Optimization (.dockerignore)
To maintain a lightweight and efficient production image, the .dockerignore file is used to exclude all source code and project files that are not strictly related to the API application. By filtering out training scripts, data validation modules, test suites, and development utilities, the container remains focused solely on the inference logic required to serve the model.
Docker Compose Configuration
The docker-compose.yml file is used to streamline the local development environment by automating the container's lifecycle and managing its external dependencies. This configuration serves as an orchestration layer that ensures the API runs with the same settings every time, regardless of the host machine.
The setup performs several critical functions:
- Build Automation: It automatically triggers the build process of the Dockerfile located in the current directory and assigns the name turing_app to the resulting container.
- Network Mapping: It publishes container port 7860 on host port 7860, making the FastAPI service accessible at localhost:7860.
- Environment Injection: It passes sensitive credentials, such as MLflow and DagsHub tokens, from the host environment into the containerized application using specific tracking and user variables.
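The project's actual docker-compose.yml is not reproduced here. As a rough illustration only, a file performing the three functions above might look like the sketch below. The service name api and the mapping from the .env names to the container variable names are assumptions, inferred from the variable names used elsewhere in this document.

```yaml
# Hypothetical reconstruction; the project's real docker-compose.yml may differ.
services:
  api:                          # service name is an assumption
    build: .                    # build from the Dockerfile in the current directory
    container_name: turing_app
    ports:
      - "7860:7860"             # host port : container port
    environment:
      # Assumed mapping: values are interpolated from the host environment or .env
      MLFLOW_TRACKING_USERNAME: ${MLFLOW_USER}
      MLFLOW_TRACKING_PASSWORD: ${MLFLOW_PWD}
      DAGSHUB_USER_TOKEN: ${DAGSHUB_TOKEN}
```

Compose reads a .env file in the project directory automatically when resolving ${...} references, so the credentials never need to be written into the Compose file itself.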
How to Run Locally
1. Environment Variables
The application requires the following environment variables to communicate with MLflow and DagsHub. You can define them in a .env file in the project root directory:
- MLFLOW_USER
- MLFLOW_PWD
- DAGSHUB_TOKEN
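As a minimal sketch, a .env file with these three variables could be created from the project root as follows. The values are placeholders to be replaced with real credentials.

```shell
# Create .env with placeholder values; the guard avoids overwriting an existing file.
if [ ! -f .env ]; then
  cat > .env <<'EOF'
MLFLOW_USER=your_user
MLFLOW_PWD=your_pwd
DAGSHUB_TOKEN=your_token
EOF
  # The file holds credentials, so restrict access to the current user.
  chmod 600 .env
fi
```

Because the file contains secrets, it is usually worth keeping it out of version control and out of the Docker build context.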
2. Using Docker Compose (Recommended)
The easiest way to start the API locally is with the provided docker-compose.yml file. From the project root, run docker compose up (add --build to force a rebuild of the image).
The API will be reachable at: http://localhost:7860
3. Using Docker CLI
Alternatively, you can build and run the container manually:
# Build the image
docker build -t turing-api .
# Run the container with environment variables
docker run -p 7860:7860 \
-e MLFLOW_TRACKING_USERNAME=your_user \
-e MLFLOW_TRACKING_PASSWORD=your_pwd \
-e DAGSHUB_USER_TOKEN=your_token \
turing-api
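To avoid typing credentials on the command line, the manual run can reuse the .env file. The sketch below rests on an assumption: it maps the .env names (MLFLOW_USER, MLFLOW_PWD, DAGSHUB_TOKEN) onto the container variable names used in the command above, a mapping the source does not state explicitly. The function name run_api and the --rm flag are additions for illustration.

```shell
# run_api: start the container with credentials read from ./.env.
# Hypothetical helper; the name mapping below is an assumption.
# The body is a subshell, so the exported values do not leak into the calling shell.
run_api() (
  set -a          # auto-export every variable assigned while sourcing .env
  . ./.env
  set +a
  export MLFLOW_TRACKING_USERNAME="$MLFLOW_USER"
  export MLFLOW_TRACKING_PASSWORD="$MLFLOW_PWD"
  export DAGSHUB_USER_TOKEN="$DAGSHUB_TOKEN"
  # `-e NAME` without a value forwards NAME from the current environment,
  # which keeps the secret values out of the visible command line.
  docker run --rm -p 7860:7860 \
    -e MLFLOW_TRACKING_USERNAME \
    -e MLFLOW_TRACKING_PASSWORD \
    -e DAGSHUB_USER_TOKEN \
    turing-api
)
```

Sourcing .env as shell code works for plain KEY=value lines; values containing spaces or shell metacharacters would need quoting in the file.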
API Endpoints
Once the container is running, you can access the interactive documentation (Swagger UI) at: http://localhost:7860/docs
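The model-serving application may take a few seconds to start, so the documentation page is not always available immediately. A small polling helper, sketched below under the assumption that curl is installed on the host, waits until the endpoint answers. The function name wait_for_api and the default of 30 attempts are illustrative choices.

```shell
# wait_for_api [URL] [ATTEMPTS]: poll once per second until the URL answers
# successfully or the attempts run out. Hypothetical helper.
wait_for_api() {
  url="${1:-http://localhost:7860/docs}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "API is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "API did not respond at $url after $attempts attempts" >&2
  return 1
}
```

Running wait_for_api with no arguments checks the Swagger UI address given above; a different URL and attempt count can be passed as the first and second arguments.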