Getting Started

  1. Fork this Repository: Click the “Fork” button at the top right of this repository to create your own copy.
  2. Clone Your Repository:
    git clone https://github.com/your-username/your-forked-repo.git
    cd your-forked-repo
    
    
  3. GitHub account
  4. Basic knowledge of Python and machine learning
  5. Git command-line tool (optional)

This repository contains code to train a Random Forest classifier on the Iris dataset and deploy it using Docker. The main purpose is to showcase how to encapsulate machine learning models and their dependencies within a Docker container for easier deployment and reproducibility.

Prerequisites

Before running the Docker container, ensure you have the following installed:

  • Docker: Install Docker according to your operating system from here.

Project Structure

The project structure is organized as follows:

├── Dockerfile ├── README.md └── src

Project Structure:

    mlops_labs/
    └── Labs/
        └── Docker_Container_Labs/
            └── Docker_lab1/
                ├── README.md
                ├── src/
                │   ├── __init__.py
                │   └── main.py
                ├── .gitignore
                └── train_model
  • Dockerfile: Contains instructions to build the Docker image for the project.
  • README.md: This file explaining the project and how to use it.
  • src/main.py: Python script to train a Random Forest classifier on the Iris dataset and save the trained model.

Dockerfile

The Dockerfile specifies the steps to create a Docker image. Here’s a breakdown of each section:

# Use an official Python runtime as a parent image
FROM python:3.9

# Set the working directory in the container
WORKDIR /app

# Copy the model training script into the container
COPY src/main.py .

# Install Scikit-Learn and joblib
RUN pip install -r requirements.txt

# Run the script when the container launches
CMD ["python", "main.py"]
  • FROM python:3.9: Specifies the base image to use. We’re using the official Python image for version 3.9. WORKDIR /app: Sets the working directory inside the container to /app.
  • COPY src/main.py .: Copies the main.py script from the src directory on the host machine to the /app directory inside the container.
  • RUN pip install -r requirements.txt: Installs the dependencies specified in the requirements.txt file. Note: Currently, there’s no requirements.txt file provided in the project, but it should contain the dependencies required by the script.
  • CMD ["python", "main.py"]: Specifies the command to run when the container is launched. In this case, it runs the main.py script.

Usage

To build the Docker image and run the container, follow these steps:

  • Clone this repository to your local machine.
  • Navigate to the project directory in your terminal.
  • Build the Docker image using the following command: docker build -t iris-classification .
  • This command builds the Docker image with the tag iris-classification.
  • Once the image is built successfully, you can run a container from it: docker run iris-classification
  • This command starts a container from the iris-classification image, which executes the main.py script inside the container.

Conclusion

This project demonstrates how to containerize a machine learning model using Docker, allowing for easy deployment across different environments. By encapsulating the model and its dependencies within a Docker container, we ensure consistency and reproducibility in model deployment.

For further customization or integration into existing workflows, feel free to modify the Dockerfile or expand upon the provided codebase.

For any issues or inquiries, please open an issue in this repository.