Creating a self-generating AI with OctoAI

Emre Çitak
Jun 14, 2023

When OctoML launched in 2019, its primary focus was on optimizing machine learning (ML) models. The company gained recognition for its ability to fine-tune models and package them into deployable containers for different hardware setups.

This approach was well-received by ML engineers, but OctoML realized that there was a need for a more comprehensive solution to address the challenges faced by businesses when deploying ML-based applications.

OctoAI is here

Today, OctoML announces the launch of OctoAI, its latest offering that takes the company's services to the next level. While not a complete pivot, OctoAI represents a shift in emphasis from optimizing models to helping businesses leverage existing open-source models and customize them according to their specific needs.

The platform aims to simplify the process of building and deploying ML-based applications without the burden of managing the underlying infrastructure.

OctoAI simplifies the process of building and deploying machine learning (ML)-based applications by abstracting away the complexities of ML infrastructure

Simplifying AI compute with OctoAI

The core idea behind OctoAI is to provide a self-optimizing compute service for AI. By leveraging OctoAI, businesses can build ML-based applications and put them into production without worrying about the complexities of ML infrastructure.

The platform offers a managed compute service that automatically selects hardware based on the user's priority, whether that is low latency or cost efficiency. OctoAI also optimizes models automatically, which lowers costs and improves performance.
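To make the idea of priority-based hardware selection concrete, here is a minimal sketch in Python. It is not OctoAI's API or algorithm: the `HardwareOption` type, the `pick_hardware` function, the option names, and all latency and price figures are hypothetical placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareOption:
    name: str             # hypothetical label, not a real OctoAI instance type
    latency_ms: float     # placeholder per-request latency
    cost_per_hour: float  # placeholder hourly price


def pick_hardware(options, priority):
    """Return the option that best matches the stated priority."""
    if priority == "latency":
        return min(options, key=lambda o: o.latency_ms)
    if priority == "cost":
        return min(options, key=lambda o: o.cost_per_hour)
    raise ValueError(f"unknown priority: {priority!r}")


# Placeholder values for illustration only; they are not measurements.
OPTIONS = [
    HardwareOption("gpu-large", latency_ms=40.0, cost_per_hour=3.0),
    HardwareOption("gpu-small", latency_ms=90.0, cost_per_hour=1.0),
    HardwareOption("cpu-only", latency_ms=400.0, cost_per_hour=0.25),
]

print(pick_hardware(OPTIONS, "latency").name)
print(pick_hardware(OPTIONS, "cost").name)
```

A real managed service would presumably weigh many more factors, such as model size, batch behavior, and hardware availability. The sketch only shows the basic shape of the trade-off: the same list of options yields a different choice depending on what the user says matters most.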

Multiple models, singular goal

While OctoAI provides automated optimizations, it also gives users the flexibility to set their own parameters and choose the hardware that best suits their requirements. However, OctoML expects that most users will find value in allowing OctoAI to manage these tasks, enabling them to focus on their core ML applications.

Users have the flexibility to set their own parameters and choose hardware if they prefer more control over their ML applications

In addition, OctoML offers accelerated versions of popular foundation models such as Dolly 2, Whisper, FILM, FLAN-UL2, and Stable Diffusion. These pre-accelerated models are ready to use out of the box, saving businesses time and effort in implementing them.

The program is in early access now, but you can sign up for it via the link here.

Notably, OctoML's accelerated version of Stable Diffusion is three times faster and five times cheaper to run than the original model.

