Deploy Hugging Face Transformer Model via Flask

Loading the Model & Initializing the Tokenizer

Overview: We start by loading a pre-trained Hugging Face Transformer model and its tokenizer through a pipeline from the transformers library. A pipeline wraps tokenization, the model call, and post-processing behind a single API for tasks such as text generation, question answering, and sentiment analysis.
Implementation: Choose a task-specific pipeline (for example, text-generation or sentiment-analysis). For demonstration purposes, we use the text-generation pipeline with distilgpt2, a lightweight model that keeps inference fast and resource usage low.


# Import the pipeline helper from transformers
from transformers import pipeline

# Create a pipeline for text generation; this downloads the model and tokenizer
# on first use and loads them automatically
model_pipeline = pipeline("text-generation", model="distilgpt2")

Setting Up Flask App & Routing

Overview: Flask is a lightweight Python web framework that allows us to build web APIs quickly. We will set up a basic Flask application with a single POST endpoint. This endpoint will accept JSON input and return the model's generated text.
Endpoint Details: The endpoint (e.g., /generate) will receive a JSON payload containing input text under a key like prompt, process it using our model, and return the generated response.


from flask import Flask, request, jsonify

app = Flask(__name__)

# Define a POST endpoint for generation
@app.route('/generate', methods=['POST'])
def generate_text():
    # Extract JSON data from the request
    data = request.get_json()
    # Validate that the expected key exists
    if not data or 'prompt' not in data:
        return jsonify({"error": "Missing 'prompt' in request"}), 400
    prompt = data['prompt']

    # Perform inference using the model pipeline loaded earlier
    results = model_pipeline(prompt)

    # Return the results in JSON format
    return jsonify(results)

# Run the Flask app if the script is executed directly
if __name__ == '__main__':
    # debug=True enables the reloader and debugger; use it for development only
    app.run(debug=True)
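
The endpoint only checks that a prompt key is present. A slightly stricter check, sketched below, also rejects payloads that are not JSON objects and prompts that are empty or not strings. The helper name validate_payload and the MAX_PROMPT_CHARS limit are hypothetical choices for illustration; they are not part of Flask or of the code above.

```python
from typing import Any, Optional, Tuple

# Hypothetical limit for illustration; tune it for your model and hardware.
MAX_PROMPT_CHARS = 1000


def validate_payload(data: Any) -> Tuple[Optional[str], Optional[str]]:
    """Return (prompt, None) when the payload is usable, else (None, error message)."""
    # A JSON body can be any JSON type, so confirm it is an object first
    if not isinstance(data, dict) or 'prompt' not in data:
        return None, "Missing 'prompt' in request"
    prompt = data['prompt']
    if not isinstance(prompt, str) or not prompt.strip():
        return None, "'prompt' must be a non-empty string"
    if len(prompt) > MAX_PROMPT_CHARS:
        return None, f"'prompt' must be at most {MAX_PROMPT_CHARS} characters"
    return prompt, None
```

Inside generate_text, one could call prompt, error = validate_payload(request.get_json()) and return jsonify({"error": error}), 400 whenever error is not None.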

Implementing the Inference Logic

Overview: The core functionality lives in the endpoint itself: it calls the model pipeline with the provided prompt and returns the result as JSON. The inference logic is kept deliberately simple so that it can be extended later.
Details: In this code, the variable results stores a list of dictionaries. For the text-generation pipeline, each dictionary contains a generated_text key holding the prompt followed by the model's continuation. Other tasks and models return different keys, so adjust the response structure to match your requirements.
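
Because the pipeline returns a list of dictionaries, a small wrapper can pass a generation setting and hand back only the generated strings. In the sketch below, run_inference and fake_pipeline are hypothetical names. max_new_tokens is a generation argument that the text-generation pipeline accepts, and the default of 50 is an arbitrary choice. The stand-in function mimics only the output shape, so the wrapper can be tried without downloading a model.

```python
from typing import Callable, Dict, List


def run_inference(pipe: Callable[..., List[Dict[str, str]]], prompt: str,
                  max_new_tokens: int = 50) -> List[str]:
    """Call a text-generation pipeline and return only the generated strings."""
    results = pipe(prompt, max_new_tokens=max_new_tokens)
    # Each result is expected to be a dict with a 'generated_text' key
    return [item['generated_text'] for item in results]


def fake_pipeline(prompt, max_new_tokens=50):
    """Stand-in with the same output shape as the real pipeline, for offline use."""
    return [{'generated_text': f"{prompt} ... (up to {max_new_tokens} new tokens)"}]


texts = run_inference(fake_pipeline, "Once upon a time", max_new_tokens=20)
print(texts)
```

In the Flask app, passing model_pipeline instead of fake_pipeline would run the real model, and jsonify(texts) would return a plain list of strings.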

Testing & Running the Application

Running: Execute the Flask app by running the Python script. The endpoint will be available at http://localhost:5000/generate by default.
Testing: Use tools such as cURL or Postman to send POST requests. Ensure that the request header has Content-Type: application/json and that the payload includes the text prompt.


# Example cURL command to test the endpoint:
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Once upon a time"}' http://localhost:5000/generate
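
The same request can also be sent from Python using only the standard library. This is a minimal sketch that assumes the app is running at the default address given above. The names GENERATE_URL, build_request, and send_prompt are hypothetical, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Default URL from the text above; adjust it if the app runs elsewhere.
GENERATE_URL = "http://localhost:5000/generate"


def build_request(prompt: str, url: str = GENERATE_URL) -> urllib.request.Request:
    """Build a POST request carrying the prompt as a JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prompt(prompt: str, url: str = GENERATE_URL, timeout: float = 60.0):
    """Send the prompt to the running Flask app and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, send_prompt("Once upon a time") should return the same list of dictionaries that the cURL command prints.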
