Elotl
  • Home
  • Platform
    • Luna
    • Nova
  • Resources
    • Blog
    • Youtube
    • Podcast
    • Meetup
  • Usecases
    • GenAI
  • Company
    • Team
    • Careers
    • Contact
    • News
  • Free Trial
    • Luna Free Trial
    • Nova Free Trial

Blog

Right Place, Right Size: Using an Autoscaler-Aware Multi-Cluster Kubernetes Fleet Manager for ML/AI Workloads

7/11/2024

 

Introduction

Are you tired of juggling multiple Kubernetes clusters, desperately trying to match your ML/AI workloads to the right resources? A smart K8s fleet manager like Elotl Nova, a policy-driven multi-cluster orchestrator, simplifies the use of multiple clusters by presenting a single K8s endpoint for workload submission and by choosing a target cluster for each workload based on placement policies and the available capacity of candidate clusters. Nova is autoscaler-aware: it detects whether workload clusters are running the K8s cluster autoscaler or the Elotl Luna intelligent cluster autoscaler.

In this blog, we examine how Nova policies, combined with its autoscaler-awareness, can be used to achieve a variety of "right place, right size" outcomes for several common ML/AI GPU workload scenarios. When Nova and Luna team up, you can:
  1. Reduce the latency of critical ML/AI workloads by scheduling on available GPU compute.
  2. Reduce your bill by directing experimental jobs to sunk-cost clusters.
  3. Reduce your costs via policies that select GPUs with the desired price/performance.
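
To make the three outcomes above concrete, here is a minimal sketch of capacity-aware, autoscaler-aware cluster selection. It is not Nova's actual policy engine or API: the Cluster fields, the choose_cluster function, and the prices are all hypothetical names and values chosen for illustration. The sketch prefers clusters with idle GPUs (lower latency), falls back to clusters whose autoscaler could add capacity, can favor already-paid-for clusters for experimental jobs, and breaks ties by GPU price.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    """Hypothetical view of a workload cluster (not a Nova data type)."""
    name: str
    free_gpus: int           # GPUs idle right now
    has_autoscaler: bool     # an autoscaler could add GPU nodes on demand
    gpu_hourly_cost: float   # illustrative price per GPU-hour
    sunk_cost: bool = False  # capacity already paid for (e.g. reserved or on-prem)


def choose_cluster(
    clusters: List[Cluster],
    gpus_needed: int,
    prefer_sunk_cost: bool = False,
) -> Optional[Cluster]:
    """Pick a target cluster, or None if no cluster can host the job."""
    # Clusters that can start the job immediately on idle GPUs.
    ready = [c for c in clusters if c.free_gpus >= gpus_needed]
    # Clusters that lack idle GPUs but could scale up to fit the job.
    scalable = [c for c in clusters if c.has_autoscaler and c not in ready]

    for pool in (ready, scalable):
        if prefer_sunk_cost:
            # Experimental jobs: use already-paid-for capacity when possible.
            sunk = [c for c in pool if c.sunk_cost]
            if sunk:
                pool = sunk
        if pool:
            # Among the remaining candidates, take the cheapest GPU price.
            return min(pool, key=lambda c: c.gpu_hourly_cost)
    return None


if __name__ == "__main__":
    fleet = [
        Cluster("on-prem", free_gpus=2, has_autoscaler=False,
                gpu_hourly_cost=0.0, sunk_cost=True),
        Cluster("cloud-a", free_gpus=0, has_autoscaler=True,
                gpu_hourly_cost=2.5),
        Cluster("cloud-b", free_gpus=8, has_autoscaler=False,
                gpu_hourly_cost=4.0),
    ]
    print(choose_cluster(fleet, gpus_needed=4).name)
    print(choose_cluster(fleet, gpus_needed=16).name)
```

In this sketch a 4-GPU job lands on the only cluster with enough idle GPUs, while a 16-GPU job is sent to the cluster whose autoscaler can grow to fit it. A real fleet manager would weigh many more signals than these four fields.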



A Guide to Disaster Recovery for FerretDB with Elotl Nova on Kubernetes

2/12/2024

 
Originally published on blog.ferretdb.io
Running a database without a disaster recovery process can interrupt business continuity, costing a modern business both revenue and reputation.

Cloud environments provide a vast set of choices in storage, networking, compute, load-balancing and other resources to build out DR solutions for your applications. However, these building blocks need to be architected and orchestrated to build a resilient end-to-end solution. Ensuring continuous operation of the databases backing your production apps is critical to avoid losing your customers' trust.

Successful disaster recovery requires:
  • Reliable components to automate backup and recovery
  • A watertight way to identify problems
  • A list of steps to revive the database
  • Regular testing of the recovery process

This blog post shows how to automate these four aspects of disaster recovery using FerretDB, Percona PostgreSQL and Nova. Nova automates parts of the recovery process, reducing mistakes and getting your data back online faster.



​© 2025 Elotl, Inc.