
Use Gunicorn and Nginx to Serve an ML Model

Serve your ML model with Gunicorn and Nginx: a step-by-step guide covering setup, deployment, and performance optimization.


Crafting the WSGI Entry Point for Your ML Model

 
  • Develop a small web application that exposes your ML model through a WSGI interface using a framework such as Flask. (FastAPI is ASGI-based, so it would instead run under Gunicorn's Uvicorn worker class.)
  • This file acts as the entry point for Gunicorn. It instantiates the application and assigns routes to serve predictions.
  • For example, consider the following Python code that loads the ML model and defines a prediction endpoint:

# Import libraries
from flask import Flask, request, jsonify
import joblib  # A library for model persistence

# Initialize the Flask application
app = Flask(__name__)

# Load your pre-trained ML model once at startup
model = joblib.load("model.joblib")  # Replace with your actual model path

# Define a route for predictions
@app.route("/predict", methods=["POST"])
def predict():
    input_data = request.json  # Extract JSON data from the request
    prediction = model.predict([input_data["features"]])  # Run the model prediction
    return jsonify({"prediction": prediction.tolist()})  # Return the result as JSON

# Running this file directly starts Flask's development server;
# in production, Gunicorn imports the "app" callable instead
if __name__ == "__main__":
    app.run(debug=True)
  • This file (e.g., app.py) is the WSGI entry point for Gunicorn. A hardened variant of the prediction route is sketched below.
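In production you will usually want the endpoint to fail gracefully on malformed input rather than crash a worker. The following is a minimal sketch of that pattern, reusing the app and model objects from above; the route name predict_safe and the error messages are illustrative choices, not part of any standard API:

# Hypothetical hardened prediction route: validate input and return
# a clear 400 error instead of raising an unhandled exception
@app.route("/predict_safe", methods=["POST"])
def predict_safe():
    payload = request.get_json(silent=True)  # Returns None on invalid JSON
    if payload is None or "features" not in payload:
        return jsonify({"error": "expected a JSON body with a 'features' list"}), 400
    try:
        prediction = model.predict([payload["features"]])
    except Exception as exc:  # e.g., wrong feature count or dtype
        return jsonify({"error": str(exc)}), 400
    return jsonify({"prediction": prediction.tolist()})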

 

Configuring Gunicorn as the WSGI HTTP Server

 
  • Gunicorn is a robust WSGI HTTP server capable of handling multiple requests and managing worker processes.
  • Launch Gunicorn by pointing it at the WSGI entry point; in our example, the application callable is the app object defined in app.py.
  • Run Gunicorn with desired parameters to set the number of worker processes and binding address:

# Example command to start Gunicorn with 4 worker processes
gunicorn --workers 4 --bind 127.0.0.1:8000 app:app

# Explanation:
# --workers 4             Number of worker processes handling incoming requests
# --bind 127.0.0.1:8000   Binds the server to localhost on port 8000
# app:app                 The WSGI callable named "app" inside the "app" module (app.py)
  • Tweaking Gunicorn parameters such as the worker class (e.g., "gevent" for asynchronous handling) can optimize performance based on your model's load; a sample configuration file follows.
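Gunicorn can also read its settings from a Python configuration file, which keeps tuning decisions in version control. The sketch below is illustrative rather than prescriptive: the worker-count heuristic and timeout values are assumptions to validate against your own load tests, and the gevent worker class requires the gevent package to be installed:

# gunicorn.conf.py -- start with: gunicorn --config gunicorn.conf.py app:app
import multiprocessing

bind = "127.0.0.1:8000"

# A common starting heuristic; tune against real load tests
workers = multiprocessing.cpu_count() * 2 + 1

# Asynchronous workers help when requests spend time waiting on I/O;
# CPU-bound inference often does better with the default sync workers
worker_class = "gevent"

# Give slow predictions time to finish before the worker is killed (seconds)
timeout = 120

# Recycle workers periodically to limit memory growth from large models
max_requests = 1000
max_requests_jitter = 50

Note that gevent workers only pay off when request handling is I/O-bound; benchmark before switching away from the default sync workers.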

 

Using Nginx as a Reverse Proxy to Forward Requests

 
  • Nginx is a high-performance web server often placed in front of application servers to handle static content, load balancing, and reverse proxying.
  • Configure Nginx to forward incoming requests on standard web ports (80 for HTTP or 443 for HTTPS) to Gunicorn, which runs on a local port (e.g., 8000).
  • Create a configuration file (e.g., ml_model_nginx.conf) with the following directives:

# Example Nginx configuration:

server {
    listen 80;  # Listen on port 80
    server_name your_ml_model_domain.com;  # Replace with your domain

    location / {
        proxy_pass http://127.0.0.1:8000;  # Forward requests to Gunicorn on localhost:8000
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Optionally, serve static files here if your application has any
}
  • Place this configuration in the appropriate Nginx directory (depending on your operating system, e.g., /etc/nginx/sites-available/ on many Linux distributions) and create a symbolic link to sites-enabled.
  • Reload or restart Nginx to apply the changes; typical commands are shown below.
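On Debian/Ubuntu-style layouts (an assumption; file locations vary by distribution), enabling the site and reloading Nginx typically looks like this:

# Enable the site by linking it into sites-enabled (Debian/Ubuntu layout)
sudo ln -s /etc/nginx/sites-available/ml_model_nginx.conf /etc/nginx/sites-enabled/

# Validate the configuration before applying it
sudo nginx -t

# Reload Nginx without dropping active connections
sudo systemctl reload nginx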

 

Optimizing Performance and Security

 
  • Connection Timeouts and Buffering: Adjust Nginx parameters such as proxy_read_timeout and client_body_timeout to handle long-running ML predictions if necessary.
  • Worker Process Scaling: Monitor Gunicorn worker status in production. Increase worker counts or use asynchronous worker classes for high load situations.
  • SSL Termination: Utilize Nginx’s SSL capabilities by configuring HTTPS. This involves setting up SSL certificates and updating the server block to listen on port 443.
  • Security Headers: Implement Nginx settings that add HTTP security headers (e.g., Content-Security-Policy, X-Frame-Options) to mitigate common web vulnerabilities. An illustrative server block combining these recommendations follows this list.
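As one way these recommendations can fit together, here is a sketch of an HTTPS server block. The certificate paths, timeout values, and header policies are placeholder assumptions; substitute your real certificates (for example, from Let's Encrypt) and tune the timeouts to your model's latency profile:

server {
    listen 443 ssl;  # SSL termination at Nginx
    server_name your_ml_model_domain.com;

    # Placeholder certificate paths; point these at your actual files
    ssl_certificate     /etc/ssl/certs/your_ml_model_domain.crt;
    ssl_certificate_key /etc/ssl/private/your_ml_model_domain.key;

    # Security headers to mitigate common web vulnerabilities
    add_header X-Frame-Options "DENY" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header Content-Security-Policy "default-src 'self'" always;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Give long-running predictions time to complete (illustrative values)
        proxy_read_timeout 120s;
        client_body_timeout 60s;
    }
}

A companion server block listening on port 80 usually just redirects to HTTPS.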

 

Testing and Troubleshooting Your Setup

 
  • Verify that Gunicorn is serving the ML model by sending POST requests to http://127.0.0.1:8000/predict and checking that the JSON response comes back as expected.
  • Test Nginx’s reverse proxy functionality by accessing http://your_ml_model_domain.com/predict. Tools like curl or Postman work well for exercising API endpoints; example requests follow this list.
  • Monitor logs for both Gunicorn and Nginx in case of errors:
    • Gunicorn Logs: Typically output to STDOUT or a specified log file. They provide insights into worker crashes or request handling issues.
    • Nginx Logs: Located in directories such as /var/log/nginx/access.log and /var/log/nginx/error.log. Analyze these logs for proxy errors and connection issues.
  • Consider configuring log rotation to manage logs in long-running production environments.
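For example, the following curl requests exercise both layers of the stack. The feature vector shown is a placeholder assumption; shape it to match whatever your model was trained on:

# Hit Gunicorn directly, bypassing Nginx
curl -X POST http://127.0.0.1:8000/predict \
     -H "Content-Type: application/json" \
     -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

# Hit the same endpoint through the Nginx reverse proxy
curl -X POST http://your_ml_model_domain.com/predict \
     -H "Content-Type: application/json" \
     -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Identical responses from both requests confirm that the proxy chain is wired correctly.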

 

