Databricks Migration Guide
Databricks offers proprietary versions of Apache Spark, Delta Lake, and other components, with certain features that are not available in the open-source versions.
Unsupported dbutils Commands
Yeedu does not support Databricks-specific commands, including the commonly used dbutils commands. Avoid these commands in Yeedu environments.
List of Unsupported dbutils Commands and Alternatives
Unsupported Command | Databricks Documentation | Yeedu Alternative |
---|---|---|
dbutils.widgets | dbutils.widgets | Yeedu does not have direct support for interactive widgets. |
dbutils.taskValues | dbutils.taskValues | Yeedu does not support task value sharing. |
dbutils.secrets | dbutils.secrets | In Yeedu, secrets can be managed at the cluster level as Yeedu config secrets. |
dbutils.fs | dbutils.fs | Yeedu does not support file system commands through dbutils. |
dbutils.library (for DB 11.0 and above) | dbutils.library (for DB 11.0 and above) | Yeedu supports runtime package installation using pip install <package> in the notebook. |
dbutils.notebook.run | dbutils.notebook.run | Yeedu does not support notebook execution as part of workflows. |
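
Since widgets are not available, notebook parameters need another source. The following is a minimal sketch, assuming parameters are exported as environment variables before the notebook or job starts. The `get_param` helper and the `PARAM_` prefix are hypothetical conventions chosen for illustration, not part of any Yeedu API.

```python
import os


def get_param(name: str, default: str = "") -> str:
    """Stand-in for dbutils.widgets.get: read a parameter from the environment."""
    # Hypothetical convention: each parameter is exported as PARAM_<NAME>.
    return os.environ.get(f"PARAM_{name.upper()}", default)


# Before: run_date = dbutils.widgets.get("run_date")
os.environ["PARAM_RUN_DATE"] = "2024-01-31"  # normally set outside the notebook
run_date = get_param("run_date", default="1970-01-01")
print(run_date)
```

Keeping the lookup behind one small function means only that function changes if parameters later come from a different source, such as a configuration file.
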
Unsupported SQL Features (Delta Tables)
The following SQL features, commonly used in Databricks with Delta tables, are not supported in Yeedu. However, alternative methods can be employed to achieve similar outcomes.
List of Unsupported SQL Features and Alternatives
Unsupported Feature | Databricks Command | Yeedu Alternative |
---|---|---|
DELETE FROM with Subqueries | DELETE FROM <table> WHERE id IN (SELECT id FROM <table>) | Use MINUS to create a new table that contains only the rows to keep, which gives a similar result. |
UPDATE with Subqueries | UPDATE <table> SET <column> = <value> WHERE <condition> | Use MERGE INTO for conditional updates. |
TRUNCATE TABLE | TRUNCATE TABLE <table> | Use DELETE FROM <table> to remove all records. |
Database Selection Syntax | USE DATABASE <database_name> | Use USE <database_name> in Yeedu. |
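
The rewrites for DELETE FROM with a subquery and for TRUNCATE TABLE can be checked on a small dataset. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine so that it runs anywhere; it is not Yeedu or Spark. SQLite spells the set difference `EXCEPT`, which Spark SQL also accepts as a synonym for `MINUS`. The table names `orders`, `cancelled`, and `orders_kept` are made up for illustration.

```python
import sqlite3

# SQLite is only a stand-in engine; it uses EXCEPT where the table above says MINUS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, status TEXT);
    INSERT INTO orders VALUES (1, 'open'), (2, 'cancelled'), (3, 'open');
    CREATE TABLE cancelled (id INTEGER);
    INSERT INTO cancelled VALUES (2);
""")

# Instead of: DELETE FROM orders WHERE id IN (SELECT id FROM cancelled)
# build a new table holding every row except the ones that would be deleted.
conn.execute("""
    CREATE TABLE orders_kept AS
    SELECT * FROM orders
    EXCEPT
    SELECT * FROM orders WHERE id IN (SELECT id FROM cancelled)
""")
kept = sorted(conn.execute("SELECT id, status FROM orders_kept").fetchall())

# Instead of: TRUNCATE TABLE cancelled
conn.execute("DELETE FROM cancelled")
remaining = conn.execute("SELECT COUNT(*) FROM cancelled").fetchone()[0]

print(kept, remaining)
```

Because the set difference works on distinct rows, exact duplicate rows in the source table collapse to a single row in the new table. If duplicates matter, this rewrite is not a drop-in replacement for DELETE.
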
Unsupported Features
Databricks Workflows
Yeedu does not currently support creating workflows with dependent tasks in the way Databricks does. Workflows can instead be orchestrated with Apache Airflow or Prefect. Yeedu provides operators for both tools that let users execute notebook code and JARs and manage complex workflows through them.
Magic Commands
Magic commands are specific to Databricks (e.g., %run, %fs, %sql) and are not supported in Yeedu. Use the standard notebook execution methods available in Yeedu for similar operations.
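
Before migrating, it can help to locate every place where a notebook relies on these constructs. The following is a rough sketch of a scanner that flags the magic commands and dbutils modules named in this guide. The `find_unsupported` function and its patterns are illustrative assumptions, not a Yeedu tool, and the pattern list is not exhaustive.

```python
import re

# Patterns based on the constructs this guide lists as unsupported.
# Assumptions: task values may appear as dbutils.jobs.taskValues, and exported
# .py notebooks may prefix magic commands with "# MAGIC".
UNSUPPORTED = {
    "dbutils call": re.compile(
        r"\bdbutils\.(?:jobs\.)?(widgets|taskValues|secrets|fs|library|notebook)\b"
    ),
    "magic command": re.compile(r"^\s*(?:#\s*MAGIC\s+)?%(run|fs|sql)\b"),
}


def find_unsupported(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, kind, text) for each line using an unsupported construct."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in UNSUPPORTED.items():
            if pattern.search(line):
                hits.append((lineno, kind, line.strip()))
    return hits


notebook = """%sql SELECT 1
df = spark.read.parquet(path)
token = dbutils.secrets.get("scope", "key")
"""
for hit in find_unsupported(notebook):
    print(hit)
```

The scanner only reports line numbers and leaves the rewrite to the author, since the right replacement depends on the command, as the tables above show.
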
Multi Language Single Notebook
Yeedu notebooks support only one programming language per notebook. Users choose either Scala or Python for each notebook.