Serve Machine Learning Model with FastAPI

Follow our step-by-step guide to serve your ML model with FastAPI. Deploy efficient, scalable machine learning apps in minutes.


Loading Your Pre-trained Model and Creating Prediction Logic

 

This section explains how to load a pre-trained machine learning model (for example, a model saved using joblib or pickle) and how to embed prediction logic within your FastAPI backend. Here, we assume that you have a model file called model.pkl that has been previously trained for tasks like classification or regression.
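If you do not already have such a file, here is a minimal sketch of how one might be produced. It assumes scikit-learn is installed; the dataset and labels are purely illustrative:

```python
# Sketch: train a toy classifier and save it as model.pkl
# (assumes scikit-learn; the data below is illustrative only)
import pickle

from sklearn.linear_model import LogisticRegression

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # two features per sample
y = [0, 0, 1, 1]  # label follows the first feature

model = LogisticRegression()
model.fit(X, y)

# Serialize the fitted model so the API can load it later
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
```

The same pattern works with joblib (`joblib.dump(model, "model.pkl")`), which is often preferred for models containing large NumPy arrays.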

  • Create a Python module (for example, model.py) that will handle model loading and prediction functions.
  • Within this module, import necessary libraries like pickle or joblib (depending on how your model is saved) and load the model once when the module is first imported.
  • Define a function, such as predict(input_data), which processes input data and returns predictions.

# Import the library used to deserialize the saved model
import pickle

# Load the pre-trained model once, at module import time, for efficiency
with open("model.pkl", "rb") as model_file:
    model = pickle.load(model_file)

# Accept input data and return the model's prediction
def predict(input_data):
    # Process input_data here if necessary (e.g., convert to the required type/format)
    # scikit-learn style models expect a 2D array: one row per sample
    prediction = model.predict([input_data])
    return prediction

Creating the FastAPI Application and Defining Endpoints

 

In this part, you integrate the prediction logic into a FastAPI application so that the model can be served as an HTTP endpoint. FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.6+ that uses type hints for data validation and automatic documentation generation.

  • Create the main FastAPI file (for example, app.py).
  • Import FastAPI along with the model and prediction function from your module.
  • Define a POST endpoint (such as /predict) that accepts data in JSON format and returns model predictions.
  • Use Pydantic models to define and validate the shape of incoming JSON data.

# Import the necessary modules from FastAPI and Pydantic
from fastapi import FastAPI
from pydantic import BaseModel
from model import predict  # Import the predict function from our model module

# Define the schema for the prediction request using Pydantic
class PredictionRequest(BaseModel):
    # For example, if the model takes two features as input:
    feature1: float
    feature2: float
    # Extend this with additional fields as required by your model

# Initialize the FastAPI application
app = FastAPI()

# Create a POST endpoint that receives data and returns a prediction
@app.post("/predict")
async def prediction_endpoint(request: PredictionRequest):
    # Convert the request data to the format expected by predict(): a list of features
    input_data = [request.feature1, request.feature2]
    prediction = predict(input_data)
    return {"prediction": prediction[0]}  # Return the first (and only) prediction
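The Pydantic model here is more than documentation: FastAPI uses it to parse, coerce, and validate the incoming JSON, returning a 422 error for malformed payloads. A standalone sketch of that behavior, assuming only pydantic is installed:

```python
# Sketch: how the request schema validates and coerces incoming data
# (standalone; mirrors the PredictionRequest class used by the endpoint)
from pydantic import BaseModel, ValidationError

class PredictionRequest(BaseModel):
    feature1: float
    feature2: float

# Numeric strings and integers are coerced to floats automatically
req = PredictionRequest(feature1=1, feature2="2.5")
print(req.feature2)  # 2.5

# Non-numeric input raises ValidationError
# (FastAPI converts this into a 422 Unprocessable Entity response)
try:
    PredictionRequest(feature1="not a number", feature2=0.0)
except ValidationError:
    print("rejected")
```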

Handling Data Preprocessing Within the Endpoint

 

Often, the raw data received by your endpoint may require transformation or scaling to be compatible with your machine learning model. You can encapsulate this processing inside the endpoint logic or within a dedicated preprocessing function.

  • A preprocessing function can be implemented in the same way as the prediction function.
  • This function should transform the incoming data into the correct input format for your machine learning model (for instance, scaling inputs using a saved scaler).
  • Call this function within your endpoint function before sending the data to the model prediction logic.

# Optional: define a preprocessing function if needed
def preprocess(input_data):
    # Insert transformation logic here, such as normalization or encoding
    processed_data = input_data  # Replace this with the actual transformation
    return processed_data

# Updated endpoint using preprocessing
@app.post("/predict")
async def prediction_endpoint(request: PredictionRequest):
    # Format the received data
    raw_data = [request.feature1, request.feature2]
    processed_data = preprocess(raw_data)
    prediction = predict(processed_data)
    return {"prediction": prediction[0]}
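As a concrete case, if the model was trained on standardized inputs, the same scaler must be applied at serving time. A sketch assuming scikit-learn's StandardScaler; the file name scaler.pkl and the fitting data are illustrative:

```python
# Sketch: persist a fitted scaler at training time, reuse it in preprocess()
import pickle

from sklearn.preprocessing import StandardScaler

# Training time: fit the scaler on the training features and save it
scaler = StandardScaler()
scaler.fit([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
with open("scaler.pkl", "wb") as f:
    pickle.dump(scaler, f)

# Serving time: load the scaler once and apply it inside preprocess()
with open("scaler.pkl", "rb") as f:
    scaler = pickle.load(f)

def preprocess(input_data):
    # transform expects a 2D array (one row per sample), so wrap and unwrap
    return scaler.transform([input_data])[0]

print(preprocess([2.0, 20.0]))  # the training mean maps to [0.0, 0.0]
```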

Enabling Automatic API Documentation and Testing

 

FastAPI automatically generates interactive API documentation using tools like Swagger UI and ReDoc. This feature is invaluable for testing endpoints and understanding API usage without additional setup.

  • Run your FastAPI application using a development server like Uvicorn.
  • Open a browser and navigate to http://127.0.0.1:8000/docs for Swagger UI or http://127.0.0.1:8000/redoc for ReDoc.
  • These pages allow you to test your /predict endpoint by providing input values and seeing the predicted outputs.

# Run the FastAPI application with Uvicorn
# (run this command in a terminal, not inside the script)
uvicorn app:app --reload

Ensuring Efficient Model Serving and Scalability

 

While the above steps produce a functional machine learning API, there are further considerations for performance and scalability in production environments.

  • Model Caching: Loading the model once during app startup is efficient and reduces load times for predictions.
  • Asynchronous Processing: FastAPI supports asynchronous endpoints. For heavy computation or longer inference times, consider offloading prediction tasks to background workers.
  • Deployment Options: Use containerization (with Docker) or orchestration (with Kubernetes) for scalable, distributed model serving.
  • Monitoring: Incorporate logging and monitoring to track API performance and diagnose issues during high-traffic periods.
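As a starting point for containerized deployment, a minimal Dockerfile might look like the following sketch. The file names are illustrative, and requirements.txt is assumed to list fastapi, uvicorn, and your model's dependencies:

```dockerfile
# Minimal sketch: containerize the FastAPI app (file names are illustrative)
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```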

By following this approach, you can effectively serve a machine learning model with FastAPI while ensuring the system is maintainable, scalable, and easy to understand.
