Version: v2.4.4

Introduction

What's Yeedu?

Yeedu is a platform that orchestrates the lifecycle of Apache Spark as a computing resource on AWS, GCP, and Azure. Yeedu is the only platform that provides Apache Spark as a unified computing resource across multiple cloud environments from a single deployment.

Why Yeedu?

Yeedu is developed by Yeedu LLP, a dedicated data engineering company with extensive experience building modern data platforms for large enterprises. Drawing on years of experience implementing Apache Spark, we built a platform designed to address key challenges and shortcomings in the industry.

Stability

Many offerings provide Apache Spark as a computing resource using Kubernetes. However, Kubernetes wasn't designed to handle heterogeneous analytics workloads effectively.

Managed Kubernetes services such as EKS, AKS, and GKE run Kubernetes in a nested VM setup: a virtual machine inside another virtual machine. For every instruction, the guest hypervisor spends CPU cycles translating it for the parent hypervisor, which in turn translates it into the host machine's instructions.

Yeedu dynamically orchestrates the lifecycle of virtual machines or bare metal to deliver Apache Spark as a computing resource. We've built a scheduler from the ground up, focusing on cloud optimization.

Cost

Companies are overcharging for their Apache Spark offerings: a single deployment can cost 3 to 9 times as much as the compute it actually uses.
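
To make the ratio above concrete, here is a minimal sketch of how such a multiple could be computed from a bill. The function name `platform_markup` and the dollar figures are hypothetical and chosen only for illustration; they are not Yeedu pricing or measurements.

```python
def platform_markup(total_bill: float, compute_cost: float) -> float:
    """Return the total bill as a multiple of the raw compute cost."""
    if compute_cost <= 0:
        raise ValueError("compute_cost must be positive")
    return total_bill / compute_cost


# Hypothetical monthly figures, for illustration only.
compute_cost = 1000.0  # what the underlying VMs cost
total_bill = 4500.0    # what the Spark offering charged in total

markup = platform_markup(total_bill, compute_cost)
print(f"Total bill is {markup:.1f}x the compute cost")
```

With these assumed numbers the bill is 4.5 times the compute cost, which falls inside the 3 to 9 times range described above.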

Multicloud & Portability

Because you can move from one cloud provider to another without overhauling your foundational code, you have the flexibility to run workloads wherever they are most cost-effective. You might benefit from a discount from a specific cloud vendor today, but that discount may not be available tomorrow.
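
The reasoning above can be sketched as a small price comparison. This is an illustrative sketch only: the function `cheapest_provider`, the hourly prices, and the discount values are all hypothetical, and it does not represent how Yeedu itself selects a cloud.

```python
def cheapest_provider(hourly_prices, discounts):
    """Return (provider, effective_price) for the lowest discounted hourly price.

    hourly_prices: provider name -> list price per hour
    discounts: provider name -> fractional discount (0.20 means 20% off)
    """
    effective = {
        provider: price * (1.0 - discounts.get(provider, 0.0))
        for provider, price in hourly_prices.items()
    }
    best = min(effective, key=effective.get)
    return best, effective[best]


# Hypothetical list prices for a comparable VM on each cloud.
prices = {"aws": 1.00, "gcp": 0.95, "azure": 0.98}

# Today: a hypothetical 20% discount on one vendor makes it the cheapest.
today, _ = cheapest_provider(prices, {"aws": 0.20})

# Tomorrow: the discount is gone, and a different vendor is cheapest.
tomorrow, _ = cheapest_provider(prices, {})

print(today, tomorrow)
```

If the workload code is portable, following the cheapest option is a scheduling decision rather than a migration project.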

Resource Optimization

  • Yeedu offers multiple schedulers. One of them is optimized for I/O-bound jobs and provides up to a 5x efficiency improvement over schedulers such as Kubernetes, YARN, or Mesos.
  • A multi-cloud approach lets organizations place computing resources near their data, minimizing the need to transfer data between the data centers of different cloud providers.

Multi-Version Support

Yeedu offers multiple versions of the Apache Spark runtime with Long Term Support (LTS). Migrating from one version of Apache Spark to another is not always straightforward and can take several months, so LTS runtimes let you stay on a supported version until you are ready to migrate.