Docker Deployment for API Inference

This document provides an overview of the Docker configuration used to containerize the API.

Dockerfile Execution Flow

FROM python:3.12-slim

# Create a non-root user to run the application and set permissions
RUN useradd -m -u 1000 turinguser
RUN mkdir -p /app/models && chown -R turinguser:turinguser /app /app/models
USER turinguser

# Add the user's local bin directory to PATH and set the project root
ENV PATH="/home/turinguser/.local/bin:$PATH"
ENV PROJ_ROOT=/app

# Set the working directory in the container
WORKDIR /app

# Copy essential files to install dependencies
COPY --chown=turinguser requirements.txt .

# Install Python dependencies
RUN pip install --default-timeout=1000 --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
RUN pip3 install -v -r requirements.txt --upgrade --default-timeout=1000 --no-cache-dir --break-system-packages

# Copy remaining project files
COPY --chown=turinguser turing ./turing
COPY --chown=turinguser reports ./reports

# Expose port 7860 for the FastAPI application
EXPOSE 7860

# Default command to run the FastAPI application on port 7860
CMD ["uvicorn", "turing.api.app:app", "--host", "0.0.0.0", "--port", "7860"]

The Dockerfile builds a non-root, CPU-only image for serving the machine learning API. The build proceeds as follows:

Base Image: The build starts from the official python:3.12-slim image.
Security & User Permissions: A non-root user named turinguser (UID 1000) is created and given ownership of the /app and /app/models directories. The USER instruction then switches to this account, so all later build steps and the application itself run without root privileges.
Environment Configuration: PATH is extended with /home/turinguser/.local/bin, the directory where pip places console scripts such as uvicorn when installing as a non-root user. The PROJ_ROOT variable is set to /app for consistent internal path referencing.
Optimized Dependency Installation:
- The process first installs the CPU-only build of torch from the PyTorch CPU wheel index. This significantly reduces the final image size compared to the default GPU-enabled wheels.
- The remaining project requirements are then installed with a 1000-second timeout, which tolerates slow downloads, and the --no-cache-dir flag, which keeps pip's download cache out of the image.
Application Deployment: The core source code (turing folder) and the reports directory are copied into the container with turinguser as owner. Because this step comes after dependency installation, changes to the source code do not invalidate the cached dependency layers.
Port and Startup: EXPOSE 7860 documents the port the application listens on. It does not publish the port by itself; publishing happens at run time with -p 7860:7860 or the ports entry in Docker Compose. The default command launches the FastAPI application with Uvicorn, bound to all network interfaces (0.0.0.0) on port 7860.
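
As an illustration of how PROJ_ROOT can give the application consistent paths, here is a minimal Python sketch. Whether the turing package actually reads PROJ_ROOT this way is an assumption; the function names project_root and models_dir are hypothetical and do not come from the project.

```python
import os
from pathlib import Path


def project_root() -> Path:
    """Return the project root, preferring the PROJ_ROOT environment variable.

    Hypothetical helper: falls back to the current working directory when
    PROJ_ROOT is unset, for example when running outside the container.
    """
    return Path(os.environ.get("PROJ_ROOT", os.getcwd()))


def models_dir() -> Path:
    """Return the models directory, i.e. /app/models when PROJ_ROOT=/app."""
    return project_root() / "models"
```

Inside the container, where PROJ_ROOT=/app, models_dir() resolves to the /app/models directory that the Dockerfile creates for turinguser.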

Docker Image Optimization (.dockerignore)

To keep the production image small, the .dockerignore file excludes source code and project files that the API does not need from the build context. With training scripts, data validation modules, test suites, and development utilities filtered out, the image contains only the inference logic required to serve the model.

Docker Compose Configuration

services:
  api:
    build: .
    container_name: turing_app

    ports:
      - "7860:7860"

    environment:
      - MLFLOW_TRACKING_USERNAME=${MLFLOW_USER}
      - MLFLOW_TRACKING_PASSWORD=${MLFLOW_PWD}
      - DAGSHUB_USER_TOKEN=${DAGSHUB_TOKEN}

    command: uvicorn turing.api.app:app --host 0.0.0.0 --port 7860 --reload

The docker-compose.yml file streamlines local development: it builds the image and starts the container with a fixed set of ports, environment variables, and startup command, so the API runs with the same settings on any host machine.

The setup performs several critical functions:

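Build Automation: It builds the image from the Dockerfile in the current directory (build: .) and names the resulting container turing_app.
Network Mapping: It publishes container port 7860 on host port 7860, making the FastAPI service accessible at localhost:7860.
Environment Injection: It passes the MLflow and DagsHub credentials into the container as environment variables, substituting the host-side values (MLFLOW_USER, MLFLOW_PWD, DAGSHUB_TOKEN) into the variable names set inside the container (MLFLOW_TRACKING_USERNAME, MLFLOW_TRACKING_PASSWORD, DAGSHUB_USER_TOKEN). The credentials are supplied at run time and are not baked into the image.
Command Override: It replaces the image's default command with one that adds Uvicorn's --reload flag for development. Because the file mounts no volumes, the reloader only sees changes made inside the container.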
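
When a host-side variable is unset, Compose typically substitutes an empty string and prints a warning, so the container can start with blank credentials. A minimal Python sketch of a check that could run inside the container is shown below. The helper name missing_credentials is hypothetical and not part of the project; only the three variable names come from the Compose file.

```python
import os

# Variable names set inside the container by docker-compose.yml.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "DAGSHUB_USER_TOKEN",
)


def missing_credentials(environ=None):
    """Return the required variables that are unset or empty.

    Hypothetical helper: `environ` defaults to os.environ and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list means all three credentials are present; a non-empty list names the variables to fix on the host before restarting the container.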

How to Run Locally

1. Environment Variables

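The application requires the following environment variables to communicate with MLflow and DagsHub. Define them in your shell or in a .env file in the project root (next to docker-compose.yml), which Docker Compose reads automatically for variable substitution: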

MLFLOW_USER
MLFLOW_PWD
DAGSHUB_TOKEN

2. Using Docker Compose (Recommended)

The easiest way to start the API locally is using the provided docker-compose.yml file.

# Build and run the container (with Compose V2, the equivalent is: docker compose up --build)
docker-compose up --build

The API will be reachable at: http://localhost:7860

3. Using Docker CLI

Alternatively, you can build and run the container manually. This runs the image's default command, without the --reload flag that the Compose file adds:

# Build the image
docker build -t turing-api .

# Run the container with environment variables
docker run -p 7860:7860 \
  -e MLFLOW_TRACKING_USERNAME=your_user \
  -e MLFLOW_TRACKING_PASSWORD=your_pwd \
  -e DAGSHUB_USER_TOKEN=your_token \
  turing-api

API Endpoints

Once the container is running, you can access the interactive documentation (Swagger UI) at: http://localhost:7860/docs
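
To confirm from a script that the container is serving requests, a small smoke test can fetch the documentation page. The sketch below uses only the Python standard library. The function names docs_url and api_is_up are hypothetical, and the default base URL assumes the port mapping described above.

```python
import http.client
import urllib.request


def docs_url(base_url: str = "http://localhost:7860") -> str:
    """Build the Swagger UI URL from the API base URL."""
    return base_url.rstrip("/") + "/docs"


def api_is_up(base_url: str = "http://localhost:7860", timeout: float = 3.0) -> bool:
    """Return True if the /docs page answers with HTTP 200.

    Connection errors, timeouts, and HTTP error responses all count as
    "not up" and yield False instead of raising.
    """
    try:
        with urllib.request.urlopen(docs_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        return False
```

Calling api_is_up() after starting the container returns True once Uvicorn is accepting connections. Because the server may take a few seconds to start, a caller may want to retry a few times before treating False as a failure.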