Version: v2.9.0

Cluster

The Cluster page displays a comprehensive list of clusters associated with the respective tenant. It provides detailed insights into each cluster, including its status and YCU consumption.

Users can access Clusters by selecting the ‘Clusters’ option in the Yeedu navigation panel.

Yeedu enables users to both create and manage clusters.

cluster-list

The clusters landing page displays the following information:

  • ID: Unique identifier associated with the cluster.
  • Status: Current status of the cluster.
  • Cluster Name: Name of the cluster.
  • Cloud Provider: Cloud provider on which the cluster is created.
  • Cluster Type: Type of cluster, which can be Yeedu, Standalone, or Cluster.
  • Runtime Version: Spark and Scala runtime version for the cluster.
  • Active Memory and Cores: Memory and cores allocated to the cluster.
  • Total YCU: YCU (Yeedu Compute Unit), used to calculate total compute cost in Yeedu.
  • Created By: Username of the user who created the cluster.
  • Modified By: Username of the user who last modified the cluster.
  • Action Tab: Provides options to stop a running cluster or destroy the cluster.
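The Total YCU column feeds into cost calculation. As a purely illustrative sketch (the per-YCU rate and the linear formula below are assumptions for intuition, not Yeedu's published billing model):

```python
# Hypothetical cost estimate: total YCU consumed times a per-YCU rate.
# RATE_PER_YCU is a made-up value for illustration only.
RATE_PER_YCU = 0.5  # currency units per YCU (placeholder)

def estimate_cost(total_ycu: float, rate_per_ycu: float = RATE_PER_YCU) -> float:
    return total_ycu * rate_per_ycu

print(estimate_cost(120))  # prints 60.0
```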

Manage

Create Cluster

On the right side of the Cluster dashboard, above the Action tab, users can find the '+ Cluster' button, which is used to create a new cluster. Clicking it opens a pop-up window for creating the cluster.

cluster-create-screen

  • Cluster Name: Enter the name of the cluster, following these constraints:

    • Maximum length: 64 characters
    • Lowercase letters only
    • Permitted symbols: '-', '_', '@', and '.'
  • Description: Describe the cluster in this field.

  • Cloud Environment: Select the cloud environment for deploying the cluster. Yeedu supports Azure, AWS, and GCP cloud platforms.

  • Runtime Version: Choose the Spark infrastructure version based on requirements.

  • Cluster Type: Select from three types: Yeedu, Standalone, and Cluster. Once selected, the cluster type cannot be modified later.
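As a sketch, the cluster-name rules listed above can be expressed as a regular expression. This is an illustration of the stated UI rules, not Yeedu's actual server-side validation:

```python
import re

# Sketch of the naming rules above: up to 64 characters, lowercase
# letters, and the symbols '-', '_', '@', '.'. (Whether digits are
# allowed is not stated in the rules, so they are excluded here.)
NAME_PATTERN = re.compile(r"^[a-z@._-]{1,64}$")

def is_valid_cluster_name(name: str) -> bool:
    return bool(NAME_PATTERN.fullmatch(name))

print(is_valid_cluster_name("spark-cluster_dev"))  # prints True
print(is_valid_cluster_name("Spark-Cluster"))      # prints False (uppercase)
```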

In Yeedu Mode, a fully managed and optimized Spark infrastructure is provided, leveraging cost-optimized resources to deliver optimal performance and scalability for Spark workloads. This mode runs Spark in local mode, ensuring a seamless and hassle-free experience in managing and executing Spark jobs.

  • Min Instances: The minimum number of instances required when creating the cluster. The default minimum is 1.
  • Max Instances: The maximum number of instances the cluster can scale up to, with a limit of 30 instances.
  • No. of Workers per Node: The number of Spark workers to be deployed within the same instance.
  • No. of Parallel Executions: The maximum number of parallel jobs that can be executed on a specific instance.

arch-yeedu-mode

  • Compute: Recommended machine types for the cluster's virtual machines.
  • Instances: Configure details such as minimum instances, maximum instances, number of parallel executions, and number of workers.
  • Auto Shutdown: Enable auto shutdown of the cluster and instances after a specified number of minutes of inactivity.
  • Labels: Add labels to the cluster for organizational purposes.
  • Dependencies: Select any required dependency repositories for the cluster.
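As an illustration of how these instance settings interact (the multiplication below is an intuition aid, not Yeedu's actual scheduler logic), the maximum number of jobs the cluster can run at once is bounded by the instance count times the parallel executions allowed per instance:

```python
# Illustrative capacity estimate from the instance settings above.
# The formula is an assumption for intuition, not an official one.
def max_concurrent_jobs(max_instances: int, parallel_per_instance: int) -> int:
    return max_instances * parallel_per_instance

# e.g. the 30-instance limit with 4 parallel executions per instance
print(max_concurrent_jobs(30, 4))  # prints 120
```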

cluster-spark-config-screen cluster-adv-config.png

  • Spark Config: In the Spark Config section, users can add additional configurations such as packages, JARs, archives, repositories, Py-files, environment variables, and default configurations.
  • Advanced Options: Here, users can provide additional details like init scripts, Hive Metastore settings, and others.
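The Spark Config fields correspond to standard Spark submission options. As a hedged sketch (the property names are standard Spark configuration keys, but how Yeedu applies them internally is an assumption, and the values are placeholders):

```python
# Illustrative mapping of the Spark Config fields above to standard
# Spark properties; all values are placeholders.
spark_config = {
    "spark.jars.packages": "org.apache.spark:spark-avro_2.12:3.5.0",  # packages
    "spark.jars": "local:///opt/libs/custom.jar",                     # JARs
    "spark.archives": "deps.zip#deps",                                # archives
    "spark.jars.repositories": "https://repo1.maven.org/maven2",      # repositories
    "spark.submit.pyFiles": "utils.py",                               # Py-files
}

# Rendered as spark-submit style --conf flags:
flags = [f"--conf {k}={v}" for k, v in spark_config.items()]
print("\n".join(flags))
```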

On the right side of the page, a cluster details section lets users review the cluster's configuration as they fill it in.

At the bottom of the page, users have two buttons:

  • Reset: Resets all configurations selected for the cluster and clears all fields.
  • Create: Submits the details to create the new cluster; ensure all required fields (*) are filled before clicking.

View / Modify Cluster

The cluster dashboard comprises five windows: Events, Configuration, Dependencies, JDBC, and Access Management.

cluster-on-click-screen

Events

The Events window has 3 sections:

  • The first section displays basic configuration details about the cluster, such as ID, Cloud Provider, Cluster Type, Runtime Version, Instance Size, Min Instances, Max Instances, Idle Timeout (ms), Auto Shutdown, Max Parallel Executions, and Spark Workers per Node.
  • Job Statistics: This section displays the count of jobs associated with the cluster along with their status: Submitted, Running, Done, Error, Terminated, Stopping, and Stopped, as well as the Total Job Count.
  • Events: The events section displays all events associated with the cluster, along with the option to view logs for specific events. The cluster log has 2 tabs:
    • Stdout: Standard output of the cluster.
    • Stderr: Standard error of the cluster.
    On the right side of the window, the user has 3 buttons:
    • Refresh: Allows refreshing the log window.
    • Copy: Enables copying the logs.
    • Download: Allows downloading the logs in '.log' format.
  • JDBC: For detailed information on JDBC, see Setting Up Thrift Server for Yeedu.

Configuration

The Configuration window enables the user to edit or make changes to the already created cluster.

note

To edit the cluster details, the cluster must be in a stopped state.

configuration-edit

Users can make changes to all fields except the cluster type. Once a cluster is created, its type cannot be updated; to change it, users must destroy the cluster and create a new one. All other fields remain editable, and users can make changes as per their requirements.

Dependencies

In this window, users can manage the dependency repository associated with the cluster.

On the right side of the screen, users can upload new files by clicking the ‘+ File’ button, and delete files by clicking the 3-dot button in the Action tab.

dependencies

JDBC

The JDBC tab in the cluster provides essential details for connecting to the cluster's Thrift server via JDBC. These details include the JDBC driver, connection URL, username, and authentication token. Additionally, the tab displays a list of all workspaces linked to the cluster, simplifying workspace management and access. However, Thrift SQL is not supported due to the cluster’s current runtime version.

jdbc
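The connection URL shown in this tab follows the standard HiveServer2 JDBC format used by Spark Thrift servers. As a sketch (host, port, and database below are placeholders; copy the real URL, username, and token from the JDBC tab):

```python
# Build a standard HiveServer2-style JDBC URL for a Spark Thrift server.
# The host, port, and database here are placeholder values only.
def thrift_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    return f"jdbc:hive2://{host}:{port}/{database}"

print(thrift_jdbc_url("cluster.example.com"))
# prints jdbc:hive2://cluster.example.com:10000/default
```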

Access Management

In the Access Management tab of a cluster, users can manage which workspaces have access to that cluster. Users can add or remove workspaces either by searching or by selecting the checkbox next to a workspace name. If the ‘Can Attach to’ checkbox is selected, the cluster is attached to the workspace.

access-management

Auto Shutdown

  • If no jobs are submitted within the selected Auto Shutdown time, the cluster will automatically stop.
  • The minimum auto shutdown time is 10 minutes.
  • This feature helps in cost efficiency by shutting down unused clusters.
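The rule above can be sketched as a small check (the 10-minute floor comes from this section; the function itself is illustrative, not Yeedu's implementation):

```python
# Illustrative auto-shutdown check: a cluster stops once it has been
# idle (no jobs submitted) for the configured timeout. The minimum
# configurable timeout is 10 minutes.
MIN_AUTO_SHUTDOWN_MINUTES = 10

def should_shut_down(idle_minutes: int, timeout_minutes: int) -> bool:
    if timeout_minutes < MIN_AUTO_SHUTDOWN_MINUTES:
        raise ValueError("auto shutdown time must be at least 10 minutes")
    return idle_minutes >= timeout_minutes

print(should_shut_down(idle_minutes=15, timeout_minutes=10))  # prints True
```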

Warm Start/Stop Mechanism

The Warm Start/Stop feature is implemented to reduce cluster startup time. Instead of taking 6-7 minutes, the optimized process ensures that the cluster is ready within one minute. This improves resource allocation efficiency and minimizes delays.

  • Go to the Cluster Dashboard.
  • Click on the + Cluster button (found above the Action tab on the right side of the screen).
  • A pop-up window will appear, prompting you to enter details for the new cluster.
  • Fill in the necessary fields, such as:
    • Cluster Name
    • Cloud Environment
    • Runtime Version
    • Compute Type
    • Instance Configuration
  • Click on Create to start the process.
  • The initial backend bootstrap takes 6 to 7 minutes to complete.
  • Once the cluster has been created, subsequent starts make it available for use within one minute.

Starting and Stopping a Cluster

You can start or stop the cluster anytime depending on job execution.

  • To start the cluster:

    • Navigate to the cluster in the dashboard.
    • Click the Start button next to the cluster name.

cluster-warmstart.png

  • To stop the cluster:

    • Click the Stop button.
    • This stops all running jobs and frees up resources.

cluster-warm-stop.png

Viewing Cluster Details

  • Once the cluster is running, you can view details such as:

    • Cloud Provider (AWS, Azure, etc.)
    • Compute Type (CPU, Memory, GPUs, etc.)
    • Runtime Version (Spark, Hadoop, Python versions)
    • Disk Volume and Storage Details
    • Instance Size and Execution Slots

cluster-running-jobs.png

Managing Cluster Jobs

  • The Cluster Dashboard provides real-time insights into:

    • Total jobs submitted.
    • Jobs currently running.
    • Errors encountered.
    • Completed and stopped jobs.
  • Users can click on View Logs to check job details and troubleshoot issues.

Graviton Instances

Graviton (ARM64 or aarch64) instances are ARM-based compute types optimized for:

  • Better price-performance ratio
  • High memory and disk throughput
  • Workloads like Python pipelines, Spark jobs, or GPU-free machine learning
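When targeting aarch64 compute types, it can be useful to confirm at runtime that a job is actually executing on an ARM instance. A small stdlib check (generic Python, not a Yeedu API):

```python
import platform
from typing import Optional

# Returns True when running on an ARM64 machine (reported as
# 'aarch64' on Linux or 'arm64' on macOS).
def is_arm64(machine: Optional[str] = None) -> bool:
    machine = machine or platform.machine()
    return machine.lower() in {"aarch64", "arm64"}

print(is_arm64())  # True on Graviton/aarch64 instances, False on x86_64
```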

How to Use aarch64 in Cluster Creation

Go to:
Clusters → + Create Cluster

cluster-warm-stop.png

Yeedu Cluster Turbo Acceleration

What Is Turbo Acceleration?

Turbo Acceleration is a performance-boosting option for Spark clusters in Yeedu. When enabled:

  • Spark jobs and SQL queries execute faster
  • Session responsiveness improves in Notebooks
  • Backend is optimized for better parallelism and reduced latency
  • Comes at no extra cost

How to Enable Turbo Mode

Go to:
Clusters → + Create Cluster → select Turbo (radio button)

cluster-warm-stop.png

In the Acceleration section:

  • No Acceleration: Default Spark mode (no special optimizations)
  • Turbo: Enables optimized Spark execution engine
  • CUDA: Reserved for GPU-accelerated environments

Use Cases for Turbo

Use Turbo for scenarios requiring fast execution and responsiveness—such as interactive notebooks, complex SparkSQL queries, exploratory data analysis, and performance-optimized pipelines.

note

The recommended resolution for Yeedu UI is 1920 x 1080.