Version: v2.7.1

Setting up Yeedu with Prefect

This guide provides step-by-step instructions for integrating Yeedu with Prefect to create and manage data workflows.

Installation

To use the Yeedu Operator in your Prefect environment, install it using the following command:

pip3 install prefect-yeedu-operator

Overview

The YeeduOperator enables you to efficiently define and orchestrate Yeedu jobs and notebooks. It simplifies the process by:

  • Submitting jobs and notebooks: The operator schedules Yeedu jobs and notebooks from within your Prefect workflows.
  • Monitoring: The operator reports the status of submitted Yeedu jobs and notebooks in real time.
  • Handling completion: After execution, the operator captures the final status of each Yeedu job or notebook.
  • Managing logs: The logs of Yeedu jobs and notebooks are accessible within Prefect.

1. Prerequisites

Before using the YeeduOperator, complete the following prerequisites:

1.1 Create Prefect blocks

Blocks can be created in two ways: through the Prefect UI or with Python.

Using Prefect UI

  1. Accessing the Blocks Section

    • Navigate to the Prefect UI and click on Configurations in the left panel.
    • From the dropdown menu, select the Blocks section.
  2. Creating a JSON block

    • Click on the plus symbol + to create a new block.

    • Search for JSON and click on Create. Fill in the required details:

      • Block Name: Provide a unique name. Example: yeedu_connection_details
      • Value: Provide the following JSON:
      {
        "username": "YSU0000",
        "YEEDU_VERIFY_SSL": "false",
        "YEEDU_SSL_CERT_FILE": "Provide the path to the certificate file if YEEDU_VERIFY_SSL is true"
      }
  3. Creating a Secret block

    • Click on the plus symbol + to create a new block.
    • Search for Secret and click on Create. Fill in the required details:
      • Block Name: Provide a unique name. Example: block_login_password
      • Value: Provide the Yeedu login password or token.

    Note: The value to store depends on the Yeedu authentication type:

    • LDAP or Azure AD: create a block that stores the password.
    • Azure AD SSO: create a block that stores the token.

Using Python

  1. Create and Save a JSON Block

    Add the following lines of code to a Python file (for example, create_block.py).

    • value - The connection details. Modify the JSON values as required.

    • name - The name under which the block is saved.

      from prefect.blocks.system import JSON

      json_block = JSON(value={
          "username": "ysu0000",
          "YEEDU_VERIFY_SSL": "false",
          "YEEDU_SSL_CERT_FILE": "Provide the path to the certificate file if YEEDU_VERIFY_SSL is true",
      })
      json_block.save(name="connection-block")

  2. Create and Save a Secret Block

    Add the following lines of code to a Python file.

    • value - Replace <Password> with the Yeedu login password or token.

    • name - The name under which the block is saved.

      from prefect.blocks.system import Secret

      secret_block = Secret(value="<Password>")
      secret_block.save(name="secret-block")
  3. Update a saved block value

    To overwrite an existing block value, pass overwrite=True:

      json_block.save(name="connection-block", overwrite=True)
  4. Run the Python file

      python3 create_block.py

For detailed guidance on managing blocks, see the Prefect Blocks documentation.

Note: Before creating blocks using Python code, ensure that your environment is authenticated to Prefect Cloud.
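The JSON value has one conditional requirement: a certificate path is needed only when YEEDU_VERIFY_SSL is true. That rule can be checked before the block is saved. The following is a minimal sketch, not part of prefect-yeedu-operator: build_connection_value is a hypothetical helper, and it assumes that the operator expects the lowercase string flags shown in the examples above and that the certificate key may be left empty when verification is off.

```python
from typing import Optional


def build_connection_value(username: str, verify_ssl: bool,
                           cert_file: Optional[str] = None) -> dict:
    """Build the JSON block value (hypothetical helper, not part of the Yeedu operator)."""
    if not username:
        raise ValueError("username must not be empty")
    if verify_ssl and not cert_file:
        raise ValueError("YEEDU_SSL_CERT_FILE is required when YEEDU_VERIFY_SSL is true")
    return {
        "username": username,
        # The examples in this guide store the flag as a lowercase string.
        "YEEDU_VERIFY_SSL": "true" if verify_ssl else "false",
        # Assumption: an empty string is acceptable when verification is disabled.
        "YEEDU_SSL_CERT_FILE": cert_file or "",
    }
```

The returned dictionary can be passed as the value argument of the JSON block, for example JSON(value=build_connection_value("ysu0000", False)), and saved as in step 1.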

1.2 Authenticate to Prefect Cloud via CLI

See the API Keys documentation to generate an API key from the Prefect UI.

prefect cloud login -k <API_KEY>

2. Flow Creation

To schedule a job or notebook using the YeeduOperator in Prefect, use the flow code provided below.

from prefect import flow
from operators.yeedu import YeeduOperator

@flow(retries=0, retry_delay_seconds=5, log_prints=True)
def flow_name():
    job_url = '<YEEDU_JOB_URL>'
    connection_block_name = '<NAME_OF_CONNECTION_BLOCK_VARIABLE>'
    login_password_block_name = '<NAME_OF_PASSWORD_BLOCK_VARIABLE>'

    # Uncomment the following line to use a token block for SSO sign-in
    # token_block_name = '<NAME_OF_TOKEN_BLOCK>'

    # Initialize and execute the Yeedu operator
    operator = YeeduOperator(
        job_url=job_url,
        connection_block_name=connection_block_name,
        login_password_block_name=login_password_block_name,
        # Uncomment the following line to use a token block
        # token_block_name=token_block_name,
    )

    operator.execute()
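The note in section 1.1 ties the credential block to the Yeedu authentication type: a password block for LDAP or Azure AD, and a token block for Azure AD SSO. A minimal sketch of that choice follows. The helper operator_kwargs and the auth_type labels are hypothetical names introduced for illustration; only the keyword names job_url, connection_block_name, login_password_block_name, and token_block_name come from the flow code above. Whether the operator accepts token_block_name without login_password_block_name is an assumption here.

```python
def operator_kwargs(job_url: str, connection_block_name: str,
                    auth_type: str, credential_block_name: str) -> dict:
    """Return keyword arguments for YeeduOperator (hypothetical helper)."""
    kwargs = {
        "job_url": job_url,
        "connection_block_name": connection_block_name,
    }
    # Assumed labels for the authentication types named in section 1.1.
    if auth_type in ("LDAP", "AZURE_AD"):
        kwargs["login_password_block_name"] = credential_block_name
    elif auth_type == "AZURE_AD_SSO":
        # Assumption: the token block replaces the password block for SSO.
        kwargs["token_block_name"] = credential_block_name
    else:
        raise ValueError(f"unsupported authentication type: {auth_type!r}")
    return kwargs
```

Inside the flow, the result could be unpacked into the operator, for example YeeduOperator(**operator_kwargs(job_url, "connection-block", "LDAP", "secret-block")).execute().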

3. Execute the Flow - Creating Deployment

3.1 Deployment with Local storage

This deployment mode uses the local machine as the Prefect worker, with the flow code stored locally.

  • Add the following lines to your flow code:

    if __name__ == "__main__":
        flow_name.serve(name="<NAME_OF_DEPLOYMENT>")
  • Run the Python file:

    python3 file_name.py

3.2 Deployment with Remote Storage

This deployment mode uses Prefect work pools, with the flow code stored in a remote GitHub repository.

Use the following code snippet with these parameters:

  • url - The URL of the GitHub repository.
  • branch - The name of the branch that contains the flow code.
  • access_token - The name of the Secret block that stores the GitHub access token.
  • name - The name of the deployment.
  • work_pool_name - The name of the work pool.
  • entrypoint - The path to the flow code file stored in GitHub, followed by a colon and the name of the flow function.

from prefect import flow
from prefect.runner.storage import GitRepository
from prefect.blocks.system import Secret

flow.from_source(
    source=GitRepository(
        url="<YOUR_GITHUB_REPO_URL>",
        branch="<YOUR_BRANCH_NAME>",
        credentials={
            "access_token": Secret.load("<NAME_OF_THE_GITHUB_TOKEN_BLOCK_VARIABLE>")
        },
    ),
    entrypoint="<YOUR_ENTRYPOINT_FILE>:<YOUR_FLOW_FUNCTION>",
).deploy(
    name="<NAME_OF_THE_DEPLOYMENT>",
    work_pool_name="<NAME_OF_WORK_POOL>",
)
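The entrypoint value joins the flow file path and the flow function name with a colon. A minimal sketch of building and checking that string follows. The helper make_entrypoint is hypothetical, flows/yeedu_flow.py is a made-up path used only for illustration, and the check assumes the path is given relative to the repository root.

```python
from pathlib import PurePosixPath


def make_entrypoint(flow_file: str, flow_function: str) -> str:
    """Join a repository-relative file path and a flow function name (hypothetical helper)."""
    path = PurePosixPath(flow_file)
    # Assumption: the path is relative to the repository root and points at a .py file.
    if path.is_absolute() or path.suffix != ".py":
        raise ValueError("flow_file must be a repository-relative .py path")
    if not flow_function.isidentifier():
        raise ValueError("flow_function must be a valid Python function name")
    return f"{path}:{flow_function}"


print(make_entrypoint("flows/yeedu_flow.py", "flow_name"))  # prints flows/yeedu_flow.py:flow_name
```

Building the string in one place keeps a typo in the file path or function name from surfacing only when the deployment is created.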