Setting up Yeedu with Prefect
This guide provides step-by-step instructions for integrating Yeedu with Prefect to create and manage data workflows.
Installation
To use the Yeedu Operator in your Prefect environment, install it using the following command:
pip3 install prefect-yeedu-operator
Overview
The YeeduOperator enables you to define and orchestrate Yeedu jobs and notebooks from Prefect. It simplifies the process by:
- Submitting Jobs and Notebooks: The operator schedules jobs and notebooks in Yeedu and integrates them into your Prefect workflows.
- Monitoring: The operator reports the status of your submitted Yeedu jobs and notebooks while they run.
- Handling Completion: After execution, the operator captures the final status of each Yeedu job or notebook.
- Managing Logs: All relevant logs associated with Yeedu jobs and notebooks are accessible within Prefect.
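The submit, monitor, and completion steps listed above follow a common polling pattern. The sketch below illustrates that pattern only; it is not the operator's actual implementation. The function `wait_for_completion`, the status strings, and the `get_status` callable are hypothetical stand-ins for whatever the Yeedu API actually returns.

```python
import time
from typing import Callable

# Hypothetical terminal states; the real Yeedu status names may differ.
TERMINAL_STATES = {"DONE", "ERROR", "STOPPED"}


def wait_for_completion(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 100,
) -> str:
    """Poll get_status() until it returns a terminal state, then return it."""
    for _ in range(max_polls):
        status = get_status()
        print(f"current status: {status}")  # surfaces in Prefect logs when log_prints=True
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"no terminal state after {max_polls} polls")


# Usage with a fake status source that finishes on the third poll.
fake_statuses = iter(["SUBMITTED", "RUNNING", "DONE"])
final_status = wait_for_completion(lambda: next(fake_statuses), poll_seconds=0)
```

Bounding the number of polls keeps a stuck job from blocking a flow run indefinitely.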
1. Prerequisites
Before using the YeeduOperator, make sure you meet the following requirements:
- Access to Yeedu
- Prefect blocks: Create blocks (via the Prefect UI or Python) to store Yeedu credentials.
- Prefect Cloud authentication: Ensure you are authenticated to Prefect Cloud from the CLI before submitting flows.
- Yeedu Operator installed: Install prefect-yeedu-operator in your Prefect environment.
1.1 Create Prefect blocks
Blocks can be created in two ways: through the Prefect UI or with Python.
Using Prefect UI
- Accessing the Blocks Section
  - Navigate to the Prefect UI and click on Configurations in the left panel.
  - From the dropdown menu, select the Blocks section.
- Creating a JSON block
  - Click on the plus symbol (+) to create a new connection.
  - Search for JSON and click on Create.
  - Fill in the required details:
    - Block Name: Provide a unique name. Example: yeedu_connection_details
    - Value: Provide the JSON below.
      {
        "username": "YSU0000",
        "YEEDU_VERIFY_SSL": "false",
        "YEEDU_SSL_CERT_FILE": "Provide the path to the certificate file if YEEDU_VERIFY_SSL is true"
      }
- Creating a Secret block
  - Click on the plus symbol (+) to create a new secret.
  - Search for Secret and click on Create.
  - Fill in the required details:
    - Block Name: Provide a unique name. Example: block_login_password
    - Value: Provide the Yeedu login password or token.
  Note: The value to store depends on the Yeedu authentication type:
  - LDAP or Azure AD: create a block for storing the password.
  - Azure AD SSO: create a block for storing the token.
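Before pasting the connection details into a JSON block, it can help to check them locally. The following is a minimal sketch using only the keys shown in this guide; `check_connection_details` is a hypothetical helper, and the rule that the certificate path may be left empty when YEEDU_VERIFY_SSL is "false" is an assumption drawn from the placeholder text, not confirmed operator behavior.

```python
import json

# Keys taken from the JSON block example in this guide.
REQUIRED_KEYS = ("username", "YEEDU_VERIFY_SSL")


def check_connection_details(raw_json: str) -> dict:
    """Parse the JSON block value and check the keys used in this guide."""
    details = json.loads(raw_json)
    missing = [key for key in REQUIRED_KEYS if key not in details]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    verify = str(details["YEEDU_VERIFY_SSL"]).lower()
    if verify not in ("true", "false"):
        raise ValueError("YEEDU_VERIFY_SSL must be 'true' or 'false'")
    # Assumption: the certificate path only matters when verification is on.
    if verify == "true" and not details.get("YEEDU_SSL_CERT_FILE"):
        raise ValueError("YEEDU_SSL_CERT_FILE is required when YEEDU_VERIFY_SSL is true")
    return details


details = check_connection_details(
    '{"username": "YSU0000", "YEEDU_VERIFY_SSL": "false", "YEEDU_SSL_CERT_FILE": ""}'
)
```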
Using Python
- Create and Save a JSON Block
  Add the following lines of code to a Python file.
  - value - Modify the JSON value details as per your requirement.
  - name - The block name.

  from prefect.blocks.system import JSON

  # Connection details read by the Yeedu operator
  json_block = JSON(value={
      "username": "ysu0000",
      "YEEDU_VERIFY_SSL": "false",
      "YEEDU_SSL_CERT_FILE": "Provide the path to the certificate file if YEEDU_VERIFY_SSL is true",
  })
  json_block.save(name="connection-block")
- Create and Save a Secret Block
  Add the following lines of code to a Python file.
  - value - Replace <Password> with the required value.
  - name - The block name.

  from prefect.blocks.system import Secret

  # Stores the Yeedu login password (or token) as a Prefect Secret
  secret_block = Secret(value="<Password>")
  secret_block.save(name="secret-block")
- Update saved block value
  To overwrite the existing block value, pass overwrite=True:

  json_block.save(name="connection-block", overwrite=True)

- Run the Python file

  python3 create_block.py
For detailed guidance on managing blocks, see the Blocks documentation.
Note: Before creating blocks using Python code, ensure that your environment is authenticated to Prefect Cloud (see section 1.2).
1.2 Authenticate to Prefect Cloud via CLI
See the API Keys documentation to generate an API key from the Prefect UI, then log in with it:
prefect cloud login -k <API_KEY>
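For scripted setups, the login command above can be assembled in Python so that the API key is read from the environment rather than typed into a shell history. This is only a sketch: `build_login_command`, `login_from_env`, and the environment variable name `PREFECT_CLOUD_KEY` are hypothetical names chosen for illustration.

```python
import os
import subprocess


def build_login_command(api_key: str) -> list:
    """Return the argument list for `prefect cloud login -k <API_KEY>`."""
    if not api_key:
        raise ValueError("API key is empty")
    return ["prefect", "cloud", "login", "-k", api_key]


def login_from_env(var_name: str = "PREFECT_CLOUD_KEY") -> None:
    """Run the login command with a key read from the environment."""
    # var_name is a hypothetical variable name chosen for this sketch.
    subprocess.run(build_login_command(os.environ[var_name]), check=True)


# Building the command does not run it; call login_from_env() to log in.
command = build_login_command("example-key")
```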
2. Flow Creation
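To schedule a job or notebook using the YeeduOperator in Prefect, use the flow code provided below.
- job_url - The URL of the Yeedu notebook or job.
- connection_block_name - The name of the JSON block that stores the connection details.
- login_password_block_name - The name of the Secret block that stores the password.
- token_block_name - The name of the Secret block that stores the token (for SSO sign-in).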
from prefect import flow
from operators.yeedu import YeeduOperator


@flow(retries=0, retry_delay_seconds=5, log_prints=True)
def flow_name():
    job_url = '<YEEDU_JOB_URL>'
    connection_block_name = '<NAME_OF_CONNECTION_BLOCK_VARIABLE>'
    login_password_block_name = '<NAME_OF_PASSWORD_BLOCK_VARIABLE>'
    # Uncomment the following line to use a token block for SSO sign-in
    # token_block_name = '<NAME_OF_TOKEN_BLOCK>'

    # Initialize and execute the Yeedu operator
    operator = YeeduOperator(
        job_url=job_url,
        connection_block_name=connection_block_name,
        login_password_block_name=login_password_block_name,
        # Uncomment the following line to use a token block
        # token_block_name=token_block_name,
    )
    operator.execute()
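The choice between a password block and a token block depends on the Yeedu authentication type described in section 1.1. One way to make that choice explicit is sketched below. The helper `operator_kwargs` and the labels "LDAP", "AAD", and "AAD_SSO" are hypothetical, and the sketch assumes that only one credential block is passed at a time, which this guide does not confirm.

```python
def operator_kwargs(
    job_url: str,
    connection_block_name: str,
    auth_type: str,
    credential_block_name: str,
) -> dict:
    """Build YeeduOperator keyword arguments for the given authentication type."""
    kwargs = {
        "job_url": job_url,
        "connection_block_name": connection_block_name,
    }
    # Hypothetical labels for the authentication types named in this guide.
    if auth_type in ("LDAP", "AAD"):
        kwargs["login_password_block_name"] = credential_block_name
    elif auth_type == "AAD_SSO":
        kwargs["token_block_name"] = credential_block_name
    else:
        raise ValueError(f"unknown auth type: {auth_type}")
    return kwargs


sso_kwargs = operator_kwargs(
    "<YEEDU_JOB_URL>", "connection-block", "AAD_SSO", "token-block"
)
# Inside the flow, this would be used as: YeeduOperator(**sso_kwargs).execute()
```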
3. Execute the Flow - Creating Deployment
3.1 Deployment with Local storage
This deployment mode uses the local machine as the Prefect worker, with the Prefect flow code stored locally.
- Add the following lines to your flow code:

  if __name__ == "__main__":
      flow_name.serve(name="<NAME_OF_DEPLOYMENT>")

- Run the Python file:

  python3 file_name.py
3.2 Deployment with Remote Storage
This deployment mode uses Prefect work pools, with the Prefect flow code stored in a remote GitHub repository.
Use the following code snippet.
- url - The URL of the GitHub repository.
- branch - The name of the branch that contains the flow code.
- access_token - The name of the Secret block that stores the GitHub access token.
- name - The name of the deployment.
- work_pool_name - The name of the work pool.
- entrypoint - The path to the flow code file stored in GitHub, followed by the name of the flow function, in the form <file>:<function>.
from prefect import flow
from prefect.runner.storage import GitRepository
from prefect.blocks.system import Secret

if __name__ == "__main__":
    # Pull the flow code from GitHub and register a deployment on a work pool
    flow.from_source(
        source=GitRepository(
            url="<YOUR_GITHUB_REPO_URL>",
            branch="<YOUR_BRANCH_NAME>",
            credentials={
                "access_token": Secret.load("<NAME_OF_THE_GITHUB_TOKEN_BLOCK_VARIABLE>")
            },
        ),
        entrypoint="<YOUR_ENTRYPOINT_FILE>:<YOUR_FLOW_FUNCTION>",
    ).deploy(
        name="<NAME_OF_THE_DEPLOYMENT>",
        work_pool_name="<NAME_OF_WORK_POOL>",
    )
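The entrypoint value is easy to get wrong, since it combines a repository-relative file path and a function name separated by a colon. A small sketch of how it could be assembled and sanity-checked follows; `make_entrypoint` and the path `flows/yeedu_flow.py` are hypothetical, used only for illustration.

```python
from pathlib import PurePosixPath


def make_entrypoint(flow_file: str, flow_function: str) -> str:
    """Join a repository-relative file path and a flow function name."""
    path = PurePosixPath(flow_file)
    if path.is_absolute():
        raise ValueError("flow_file must be relative to the repository root")
    if path.suffix != ".py":
        raise ValueError("flow_file must be a Python file")
    if not flow_function.isidentifier():
        raise ValueError("flow_function must be a valid Python identifier")
    return f"{path}:{flow_function}"


# Hypothetical layout: the flow from section 2 saved under flows/ in the repo.
entrypoint = make_entrypoint("flows/yeedu_flow.py", "flow_name")
```

The resulting string can be passed as the entrypoint argument of flow.from_source.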