Hugging Face AI & Digitalocean: Explore Automations For Business


AI is becoming an essential tool for improving business services and processes. With Hugging Face’s pre-trained AI models and DigitalOcean’s straightforward cloud deployment, businesses can quickly implement solutions like chatbots, automated workflows, and advanced data analysis.

This guide explores how to deploy Hugging Face models on DigitalOcean so we can begin to imagine how businesses might apply them to real-world challenges.


Building AI Apps on Hugging Face: The business case

Your business can start building AI-powered applications now.

Here are a few examples:

An AI powered chat tailored to your business services

One such example is Babylon Health’s AI chatbot, which uses natural language processing (NLP) to provide medical advice and symptom diagnosis. 

It asks questions to identify issues, suggests treatments, and guides users on how urgently to seek help and where to find it.

Using AI technology to automate banking functions

Erica is a mobile app which uses AI to automate certain banking functions. It lets customers check their account balances, manage cheques, and get bespoke financial recommendations from within the app.

It gives personalised assistance based on different people’s needs, using deep learning which enables it to interpret user queries.

Generating CSS code with AI to streamline website design

Recently, developers have integrated Hugging Face models into web apps to generate HTML and CSS code using AI. This approach is particularly valuable for automating web design tasks: the AI creates functional layouts and styles, reducing the manual effort required compared with a page builder like Elementor or Divi.

TailwindCSS is commonly incorporated in these use cases to ensure seamless styling of generated content. Companies such as AWS have partnered with Hugging Face to streamline such integrations by providing pre-configured environments to deploy and manage these models at scale.

Now, let’s explore Hugging Face AI models

In this article, we’ll cover:

  • What is Hugging Face, and what does it do?
  • What is DigitalOcean, and what is it used for?
  • How to set up, access, and interact with Hugging Face models

We’ll give a step-by-step guide with examples.

Let’s get started.

What is Hugging Face?

Hugging Face is a leading open-source AI platform specialising in natural language processing (NLP) and machine learning models.

Known for its easy-to-use Transformers library, Hugging Face provides pre-trained models for tasks like text classification, translation, summarisation, and even computer vision (image and video).

Hugging Face AI Community

What is Hugging Face Used For?

These models allow developers and researchers to implement powerful AI solutions with minimal effort, fostering innovation across industries.

By democratising access to state-of-the-art machine learning, Hugging Face has become a cornerstone in the AI and developer community.

Deploying a Hugging Face model to a cloud computing platform

For this, we’re going to use DigitalOcean’s 1-click model deployment service to make life easier.

Before we get into that, let’s cover DigitalOcean and what it’s used for.

First up, what is DigitalOcean?

Put simply, DigitalOcean is a cloud computing platform designed to simplify the deployment, management, and scaling of applications.

DigitalOcean homepage

What does DigitalOcean do?

Popular among developers and small-to-medium businesses, DigitalOcean provides an intuitive interface, affordable pricing, and a range of services like virtual machines, Kubernetes, databases, and storage solutions.

What is DigitalOcean used for?

DigitalOcean is renowned for its focus on developer-friendly tools and simple pricing, making it a competitive alternative to Amazon AWS and Microsoft Azure cloud hosting.

In fact, DigitalOcean provides the underlying infrastructure for Makilo Managed Services.

Best Hugging Face models on DigitalOcean: Available 1-click models

Production-ready inference endpoints your business can use out of the box to build AI-powered applications.

  1. LLaMA: Best for foundational NLP tasks and custom fine-tuning across a range of general purposes.
  2. Qwen: Prioritises conversational and task-specific AI with multilingual support.
  3. Mistral: Designed for high efficiency and scalability, with mixture-of-experts variants (Mixtral) for additional capacity.
  4. Gemma: Google’s family of lightweight open models, offering strong general-purpose performance for their size.
  5. Nous-Hermes: Combines LLaMA and Mixtral strengths, excelling at instruction-following and dialogue systems in high-complexity environments.

Why use DigitalOcean 1-click models over managed services like ChatGPT?

  1. Predictable Costs: Unlike usage-based pricing of services like ChatGPT API, DigitalOcean offers fixed infrastructure costs, allowing you to scale with better budget control.
  2. Optimised Resource Allocation: Smaller models only require a single GPU, making them significantly more cost-efficient compared to large models, which need up to 8 GPUs.
  3. Data Privacy: Your data remains private and is not used to train or improve shared models, offering complete control over sensitive information.
  4. Custom Fine-Tuning: Host your models to apply custom fine-tuning for tailored performance, which is often limited or expensive with hosted APIs.
  5. Scalability: You can deploy only the resources you need, adjusting capacity as your requirements grow or change.

DigitalOcean Droplet: step-by-step guide

In this section, we’ll discuss:

  • Setting up your DigitalOcean Droplet
  • Accessing your Hugging Face models
  • Interacting with your AI model

(Remember, these GPU Droplets are charged per hour, not per month. Don’t forget to destroy them after you’re done using them.)

Setting up your droplet for Hugging Face models

  1. Create a free account at digitalocean.com
  2. Go to GPU Droplets and select 1-click Models. We’ll be using Qwen2-7B-Instruct, a medium-sized instruction-tuned model designed for conversational AI and assistant-style tasks.
    Create a DigitalOcean GPU Droplet
  3. Scroll down and select the single-GPU Droplet, which costs ~$3 per hour. You’ll note that the larger models require 8x GPUs, so the cost jumps roughly 8-fold.
    DigitalOcean GPU plans
  4. After a few minutes, your Droplet will be ready.
    DigitalOcean HUGS on DO demo

How to access your AI model

Next we need to fetch the access URL endpoint and token so we can interact with our model.

  1. Click the Web Console button in the top-right to launch the console interface for the droplet. Read through the Message of the day to find your endpoint and token.
    Digitalocean CLI access details
  2. To interact with the model, we’ll use the Inference API along with Postman, a tool for building and testing API requests.

How to interact with your AI model

Now let’s play!

  1. Open Postman
  2. Add a blank HTTP API request
  3. Change the request method from GET to POST
  4. Paste in your endpoint URL http://xxx.xxx.xx.xx (see previous section) and append the /v1/chat/completions path
  5. Select the Authorization tab, set Auth Type to Bearer Token then paste in your access token (see previous section)
    Postman authorisation
  6. Select the Body tab, set it to raw and paste in the request body:
    {
      "messages": [
        {
          "role": "user",
          "content": "What is the capital of France?"
        }
      ],
      "max_tokens": 500,
      "stream": false
    }

  7. Hit Send. This may take a few seconds. You’ll see the assistant’s message content in the response body.
    Postman query for chat completions inference API - simple
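If you’d rather script the same request than click through Postman, here’s a minimal Python sketch using only the standard library. The function names are our own, and the endpoint URL and token are placeholders you’d swap for the values from your Droplet’s message of the day:

```python
import json
import urllib.request

def build_chat_payload(prompt, max_tokens=500, stream=False):
    """Build the JSON body expected by the /v1/chat/completions endpoint."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": stream,
    }

def ask_model(endpoint, token, prompt):
    """POST a chat completion request and return the assistant's reply text."""
    req = urllib.request.Request(
        endpoint.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # access token from the web console
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response: the reply lives in choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

You’d then call something like `ask_model("http://YOUR_DROPLET_IP", "YOUR_ACCESS_TOKEN", "What is the capital of France?")` with your own values.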

There you have it! Your very own AI chat assistant.

Hugging Face for your business

Now that we’ve got a model to use, it’s time to make it work harder for your business.

Here are some examples of Hugging Face in action:

Ask it something more complex (this may take a couple of minutes)

Postman query for chat completions inference API - complex

The response time can be tuned by tweaking the max_tokens value in the request body. This is the maximum number of tokens that can be generated in the chat completion.

Notice how lowering it from 500 to 64 dramatically speeds up the response time at the expense of the response message complexity.

Postman query for chat completions inference API - fewer tokens
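In the request body, that’s a one-line change. This example payload (the prompt is our own) mirrors the earlier Postman body with the lower cap:

```python
# Same request body shape as before, but with a lower max_tokens cap.
# Fewer tokens to generate means a faster, though terser, reply.
fast_payload = {
    "messages": [
        {"role": "user", "content": "Explain cloud computing"}  # example prompt
    ],
    "max_tokens": 64,  # lowered from 500
    "stream": False,
}
```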

Now what if we want to provide an image along with our prompt?

Postman query for chat completions inference API - image

Doh! The model is quick to tell us that it’s unable to analyse images. If we want to do that, we’ll need the Qwen/Qwen2-VL-7B-Instruct model (note the VL – a Vision-Language model that handles visual data such as images and videos alongside text): https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct.

DigitalOcean does not currently support this as a 1-click model. However, there is nothing stopping us from downloading the model and trying it out locally – so watch this space.
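For reference, OpenAI-compatible servers that do host vision-language models like Qwen2-VL typically accept a message whose content is a list mixing text and image parts. A sketch of what that request body might look like (the image URL is a placeholder, and the exact schema depends on the serving stack, so treat this as an assumption to verify against your server’s docs):

```python
# Hypothetical chat payload for a vision-language model (e.g. Qwen2-VL)
# served behind an OpenAI-compatible API. Not supported by the 1-click
# Qwen2-7B-Instruct Droplet used above.
vision_payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},  # placeholder
                },
            ],
        }
    ],
    "max_tokens": 500,
    "stream": False,
}
```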

Finally, remember these GPU Droplets are charged per hour (not per month), so don’t forget to destroy them when you’re finished.

Conclusion: Hugging Face AI Models & DigitalOcean

Using Hugging Face AI models on DigitalOcean offers businesses a simple, cost-effective way to integrate AI into their operations.

Whether it’s streamlining workflows, enhancing customer interactions, or analysing data, these tools make AI accessible to businesses of all sizes. By deploying scalable and efficient AI solutions, companies can improve efficiency and stay competitive in an increasingly digital world.

If you’d like to know more or have a challenge you’d like to discuss, then contact us or email tom@makilo.co.uk directly.
