Leveraging Hugging Face Models on Azure Machine Learning

Russell Kidson
Mar 17, 2023
Updated • Mar 17, 2023

Microsoft's recent Azure Open Source Day showcased a new reference application built with cloud-native tools and services, with an emphasis on Microsoft's own open source tools. The application helps owners find lost pets by using machine learning to quickly compare photographs of missing animals with images from animal shelters, rescues, and community sites.

The application is a good illustration of how open source tools can support the development of complex sites and services. Those tools include infrastructure-as-code tools, application frameworks, and other tools that add functionality to your code.

At the heart of the application is an open source machine learning model, one of many in the large library of models and data sets developed by the Hugging Face community. The platform's range of tools and services is what allows its community to build such a large library. Hugging Face's models are a practical choice because they can be imported for inferencing within your own code, run on your own servers, or accessed via a cloud API.


Why choose Hugging Face?

Another reason to consider working with Hugging Face in Azure is the flexibility it gives you to apply AI to a wide variety of business problems. Microsoft's Cognitive Services APIs cover many common AI scenarios with well-defined APIs, but they represent a single company's view of which machine learning services are appropriate for enterprises. As a result, they are something of a generalist solution, designed for broad use rather than specific applications. If your code needs to support an edge case, adapting the APIs can be labor-intensive.

You can, of course, build custom models in Azure's Machine Learning studio, using tools such as PyTorch and TensorFlow to construct and train models from the ground up. However, this approach requires considerable expertise in data science and machine learning. A "from scratch" approach has other drawbacks too. Azure offers a growing selection of virtual machine options for machine learning training, but training can require substantial computational resources and be expensive to run, particularly for large models that need a lot of data. We cannot all match OpenAI and build cloud-based supercomputers for training, especially on a tight budget.

Hugging Face's Transformers model framework, with over 40,000 models, helps with the customization problem by offering models that the community has built and trained for a broader range of scenarios than Microsoft offers alone. Transformers are not limited to text: they have been trained to work with natural language, audio, and computer vision. Hugging Face calls these functions "tasks," and the catalog includes over 2,000 models for image classification and almost 18,000 for text classification.


Hugging Face in the context of Microsoft Azure

Microsoft has recently announced support for Hugging Face models on Azure, offering a set of endpoints that you can use in your code, with models imported from the Hugging Face Hub and its pipeline API. These models are built and tested by the Hugging Face community, and the endpoint approach makes them ready to use for inference.

The models themselves are free; you pay only for the Azure compute resources needed to run inference. That cost can be significant, particularly when working with large amounts of data, so it is worth comparing pricing with Azure's own Cognitive Services.
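The comparison suggested above comes down to simple arithmetic: an endpoint bills for compute by the hour whether or not it is busy, while a hosted API typically bills per call. The sketch below is a minimal illustration under stated assumptions. The function names are made up for this example, and every price passed in is a placeholder you would replace with current figures from the Azure pricing pages.

```python
HOURS_PER_MONTH = 730  # common approximation for an always-on service


def monthly_endpoint_cost(hourly_rate, instances, hours=HOURS_PER_MONTH):
    """Cost of keeping `instances` compute instances running for `hours`."""
    return hourly_rate * instances * hours


def monthly_api_cost(price_per_1000_calls, calls_per_month):
    """Cost of a pay-per-call API at a flat price per 1,000 calls."""
    return price_per_1000_calls * calls_per_month / 1000


def break_even_calls(hourly_rate, instances, price_per_1000_calls,
                     hours=HOURS_PER_MONTH):
    """Monthly call volume at which both options cost the same."""
    endpoint_cost = monthly_endpoint_cost(hourly_rate, instances, hours)
    return endpoint_cost / price_per_1000_calls * 1000


if __name__ == "__main__":
    # Placeholder prices, NOT real Azure rates.
    print(break_even_calls(hourly_rate=1.0, instances=1,
                           price_per_1000_calls=1.0))
```

Below the break-even volume the per-call API is cheaper; above it, the dedicated endpoint starts to pay for itself, provided a single instance can actually handle that load.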

Constructing endpoints for your code

Creating an endpoint is relatively straightforward. Begin by selecting Hugging Face Azure ML from the Azure Marketplace to add the service to your account. Add the endpoint to a resource group, then specify a name and region. Next, choose a model from the Hugging Face Hub and specify its model ID and any associated tasks. You must also select an Azure compute instance for the service and a VNet to keep the service secure. Once these steps are complete, you can create the endpoint, which generates the URLs and keys you need.

The service supports endpoints that autoscale according to the number of requests per minute. By default you get a single instance, but you can use the sliders in the configuration screen to set the minimum and maximum number of instances. Scaling is based on the average number of requests over a five-minute period, which smooths out spikes in demand that might otherwise result in unnecessary costs.
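To see why averaging over five minutes smooths out spikes, consider the toy model below. It is only a sketch of the general idea, not Azure's actual scaling logic: the class name, the per-instance capacity of 100 requests per minute, and the instance limits are all assumptions chosen for illustration.

```python
import math
from collections import deque


class RequestRateScaler:
    """Toy autoscaler driven by a rolling average of requests per minute."""

    def __init__(self, min_instances=1, max_instances=4,
                 target_rpm_per_instance=100, window_minutes=5):
        self.min_instances = min_instances
        self.max_instances = max_instances
        # Hypothetical capacity of one instance, in requests per minute.
        self.target_rpm_per_instance = target_rpm_per_instance
        # Holds only the most recent `window_minutes` one-minute counts.
        self.samples = deque(maxlen=window_minutes)

    def record_minute(self, request_count):
        """Record the number of requests seen in the latest minute."""
        self.samples.append(request_count)

    def desired_instances(self):
        """Instance count implied by the windowed average, within limits."""
        if not self.samples:
            return self.min_instances
        average_rpm = sum(self.samples) / len(self.samples)
        needed = math.ceil(average_rpm / self.target_rpm_per_instance)
        return max(self.min_instances, min(self.max_instances, needed))


if __name__ == "__main__":
    scaler = RequestRateScaler()
    for count in [50, 50, 900, 50, 50]:  # one-minute spike in quiet traffic
        scaler.record_minute(count)
    print(scaler.desired_instances())
```

In this example a one-minute burst of 900 requests is averaged with four quiet minutes, so the scaler asks for three instances rather than jumping straight to the maximum that the burst alone would suggest.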

At present, documentation for the Azure integration is limited, but you can get a sense of it from Hugging Face's AWS endpoint documentation. The Endpoint API is based on the existing Inference API, so you can use that to work out how to structure payloads.

The service provides a convenient playground URL where you can test your inferencing model. It includes sample Python and JavaScript code, as well as the option to use curl from the command line. Data is sent as JSON, and responses come back the same way. You can use standard libraries to build and parse the JSON, which lets you embed REST calls to the API in your code. If you are using Python, you can paste the sample code into a Jupyter notebook and work with colleagues on developing a more complete application.
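As a rough idea of what such a REST call looks like using only Python's standard library, here is a minimal sketch. It is not the playground's generated sample. The URL and key are placeholders, the {"inputs": ...} payload follows the Inference API convention, and the response is assumed to be a flat list of label and score pairs, which may differ for your model and task.

```python
import json
import urllib.request

# Placeholders: substitute the URL and key generated for your own endpoint.
ENDPOINT_URL = "https://example.invalid/score"
API_KEY = "YOUR-ENDPOINT-KEY"


def build_request(text, url=ENDPOINT_URL, key=API_KEY):
    """Build (but do not send) a JSON POST request for a text task."""
    # Assumed payload shape, borrowed from the Inference API convention.
    body = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def top_label(response_body):
    """Return the best label from an assumed list of {"label", "score"}."""
    predictions = json.loads(response_body)
    best = max(predictions, key=lambda item: item["score"])
    return best["label"]


def classify(text):
    """Send the request and return the top label (needs a live endpoint)."""
    with urllib.request.urlopen(build_request(text), timeout=30) as response:
        return top_label(response.read())
```

Keeping request construction and response parsing in separate functions means both can be checked in a notebook without calling the endpoint, and only `classify` incurs compute charges.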

Using Azure Machine Learning to customize Hugging Face models

Now in preview, Hugging Face's foundation models can be used in Azure Machine Learning with the same tools you use to build and train your own custom models. This is an efficient way of working with the models: you use familiar technologies and tools, and Azure Machine Learning handles fine-tuning and deploying Hugging Face models in your applications. You can find models in the Azure Machine Learning registry, ready to run.

This feature is a quick way to add more pretrained model endpoints to your code. You can also fine-tune models with your own data, using Azure storage for both training and test data and Azure Machine Learning's pipelines to manage the process. Treating Hugging Face models as a foundation for your own makes sense: they have been proven in a range of cases, even if none of those cases is an exact match for your needs. For instance, a model trained to recognize flaws in metal work may already have some of the features needed for handling plastic or glass, and additional training on those materials reduces the risk of errors.

Inter-organizational collaboration is the way forward

As the open source machine learning community continues to grow, it is crucial that companies such as Microsoft embrace it. Although companies like Microsoft have experience and expertise, they lack the scale and specialization of the wider community. By collaborating with communities like Hugging Face, developers can enjoy an expanded range of options and greater flexibility, benefitting all parties involved. Ultimately, this approach leads to a more vibrant and dynamic machine learning landscape, enabling developers to achieve their goals with greater ease and efficiency.




