Version: v2.7.1


Yeedu Architecture Overview

[Figure: yeedu_clouds]

Yeedu is a Software-as-a-Service (SaaS) platform that provides organizations with cost-optimized infrastructure for Apache Spark workloads across the three major cloud providers: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

With Yeedu, organizations can easily deploy, manage, and optimize their Apache Spark infrastructure, regardless of which cloud provider they choose. This allows them to focus on their core business functions while Yeedu takes care of the technical aspects of their Spark workloads.

Yeedu's architecture comprises three core components: the Yeedu Control Plane, Yeedu Compute, and the UI and CLI.

Yeedu Control Plane

The Yeedu Control Plane serves as the platform's backbone, managing all backend services essential for its operation:

  • Web Application: A centralized hub where users create, manage, and oversee jobs and notebooks. It also supports cluster creation based on various configurations.

  • Monitoring: Yeedu monitoring tracks the health of computes and workloads and raises alerts on any discrepancies.

  • Dependency Management: Yeedu provides a dependency repository that makes dependencies available to running workloads.

  • Cluster Management: The Yeedu Control Plane handles the configuration of all clusters, which can span multiple cloud providers (GCP, AWS, Azure).

  • Notebooks and Jobs: Notebooks and jobs can be combined within a workspace in Yeedu.

  • Tenant Management: Tenants in Yeedu provide logical isolation of computes and resources. Users are assigned roles within each tenant, dictating their access and actions.
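The tenant model described above can be illustrated with a minimal sketch. It is not Yeedu's implementation or API: the `Tenant` class, the role names, and the action names are all hypothetical, chosen only to show how per-tenant roles can gate actions and how tenants can keep computes logically isolated.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, Set


class Role(Enum):
    # Hypothetical role names; the source does not list Yeedu's actual roles.
    ADMIN = "admin"
    USER = "user"


# Hypothetical mapping from role to the actions it permits.
PERMISSIONS: Dict[Role, Set[str]] = {
    Role.ADMIN: {"create_cluster", "run_job", "manage_users"},
    Role.USER: {"run_job"},
}


@dataclass
class Tenant:
    """A logical boundary around a set of clusters and its members."""
    name: str
    clusters: Set[str] = field(default_factory=set)
    members: Dict[str, Role] = field(default_factory=dict)

    def can(self, user: str, action: str) -> bool:
        # A user without a role in this tenant can do nothing in it.
        role = self.members.get(user)
        return role is not None and action in PERMISSIONS[role]


def visible_clusters(tenants: Iterable[Tenant], user: str) -> Set[str]:
    """Return clusters from only those tenants where the user holds a role."""
    return {c for t in tenants if user in t.members for c in t.clusters}
```

In this sketch a user's role is looked up per tenant, so the same user can be an administrator in one tenant and a regular user, or entirely absent, in another.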

Yeedu Compute

Yeedu Compute harnesses the capabilities of the Control Plane to deliver necessary computational resources:

  • Cluster Creation: Users can initialize compute clusters via the Control Plane. These clusters execute all jobs and notebooks, subject to workspace access permissions.

  • Cluster Types: Yeedu supports multiple cluster types, including YEEDU, STANDALONE, and CLUSTER types for Spark infrastructure. YEEDU presents a cost-optimized approach for workload execution.

    • Yeedu Mode: In YEEDU mode, Yeedu provides a fully managed, cost-optimized Spark infrastructure that delivers performance and scalability for Spark workloads.

    • Standalone Mode: In Standalone mode, Yeedu uses Docker containers to deploy the Spark master and worker nodes within a single virtual machine. This simplifies deployment, because all components reside in a single environment.

    • Cluster Mode: In Cluster mode, Yeedu deploys the Spark master and worker nodes on separate, dedicated virtual machines. This allows for a more scalable and robust deployment, with the ability to horizontally scale the worker nodes based on workload demands.
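The difference between the Standalone and Cluster layouts can be sketched in a few lines. This is an illustration only, not Yeedu code: the `vm_count` function is hypothetical, and it assumes one worker per virtual machine in Cluster mode, which the description above does not state. The layout of YEEDU mode is not described here, so the sketch returns `None` for it rather than guessing.

```python
from enum import Enum
from typing import Optional


class ClusterType(Enum):
    # The three cluster type names mentioned in this document.
    YEEDU = "YEEDU"
    STANDALONE = "STANDALONE"
    CLUSTER = "CLUSTER"


def vm_count(cluster_type: ClusterType, workers: int) -> Optional[int]:
    """Estimate how many virtual machines a cluster occupies (hypothetical helper)."""
    if workers < 1:
        raise ValueError("a Spark cluster needs at least one worker")
    if cluster_type is ClusterType.STANDALONE:
        # Master and all workers run as Docker containers inside one VM.
        return 1
    if cluster_type is ClusterType.CLUSTER:
        # One VM for the master plus, by assumption, one VM per worker.
        return 1 + workers
    # YEEDU mode is fully managed; its VM layout is not described in this document.
    return None


if __name__ == "__main__":
    for kind in ClusterType:
        print(kind.value, vm_count(kind, workers=4))
```

Under these assumptions, adding workers in Standalone mode leaves the VM count unchanged, while in Cluster mode each added worker adds a VM, which is what makes horizontal scaling possible.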

UI and CLI

The Yeedu platform offers both a User Interface (UI) and a Command Line Interface (CLI), delivering:

  • Access Control: Users with appropriate permissions can execute operations as per their assigned roles through the web application.

  • Authentication Mechanisms: Users can authenticate into Yeedu using LDAP, SSO, or Azure Active Directory (AAD), depending on the configuration within the Yeedu Control Plane.

Note

All Yeedu components are deployed within the customer's Virtual Private Cloud (VPC), ensuring heightened security and data governance.

This architectural framework ensures Yeedu's scalability, security, and adaptability to diverse computational demands across various environments.