Trillo Workbench Lessons

Write a workflow (pipeline for data processing)

In this lesson, we demonstrate a workflow implementation that reads data from BigQuery and writes it as NDJSON (newline-delimited JSON) to a bucket. It uses BucketOp to write the file to the bucket.
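For readers unfamiliar with the output format, NDJSON stores one JSON object per line. The following is a minimal sketch in plain Java of turning rows into NDJSON text. It does not use any Trillo Workbench API: the class name NdjsonSketch and its methods are hypothetical, rows are assumed to be flat maps of strings, numbers, booleans, and nulls, and in the lesson itself the actual write to the bucket is done by BucketOp.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Trillo Workbench.
final class NdjsonSketch {

    // Escapes a string so it can be placed inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Serializes one flat row as a single-line JSON object.
    // Assumption: values are String, Number, Boolean, or null.
    static String toJsonLine(Map<String, Object> row) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null || v instanceof Number || v instanceof Boolean) {
                sb.append(v); // null is appended as the literal text "null"
            } else {
                sb.append('"').append(escape(v.toString())).append('"');
            }
        }
        return sb.append('}').toString();
    }

    // Joins rows into NDJSON: one JSON object per line, each ending in '\n'.
    static String toNdjson(List<Map<String, Object>> rows) {
        StringBuilder sb = new StringBuilder();
        for (Map<String, Object> row : rows) {
            sb.append(toJsonLine(row)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "Acme");
        rows.add(row);
        System.out.print(toNdjson(rows));
    }
}
```

Because every record sits on its own line, a file in this format can be written and read one chunk at a time, which suits the step-by-step processing a workflow performs.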

Code: /lessons/Write_a_workflow/WriteWorkflow.java

In Trillo Workbench, any function can run as a background process or as a workflow. When it runs in the background, a unique task ID is assigned to it, and all its logs are stored in the database with taskId as one of the fields.

A workflow is generally a long-running process that handles a large amount of data, such as a large file, a large BigQuery dataset, or thousands of invoices received as PDFs at an email address during a week. Workflow code is therefore written as steps, each of which processes one chunk of data per iteration. Some of the steps may run concurrently using Java threads. A workflow may farm out work to multiple sub-workflows running on a cluster of machines. The workflow loop periodically checks whether the workflow has been canceled by another process or by a user. Trillo Workbench handles this complexity transparently. It uses “Op” subclasses for concurrent processing.
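The general shape of such a loop can be sketched in plain Java. This is only an illustration of chunked processing with a cancellation check and a small thread pool, not Trillo Workbench code: the class ChunkedWorkflowSketch, the processChunk method, and the AtomicBoolean cancellation flag are all hypothetical stand-ins for what the platform and its “Op” subclasses provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration, not part of Trillo Workbench.
final class ChunkedWorkflowSketch {

    // Stand-in for real work on records [from, to); returns the count handled.
    static int processChunk(int from, int to) {
        return to - from;
    }

    // Processes `total` records in chunks of `chunkSize`, running up to
    // `threads` chunks concurrently. The `canceled` flag is checked before
    // each batch; it is an assumption standing in for the platform's own
    // cancellation signal. Returns the number of records processed.
    static int run(int total, int chunkSize, int threads, AtomicBoolean canceled) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int processed = 0;
        try {
            int start = 0;
            while (start < total) {
                if (canceled.get()) {
                    break; // stop cleanly between batches
                }
                List<Future<Integer>> batch = new ArrayList<>();
                for (int i = 0; i < threads && start < total; i++) {
                    final int from = start;
                    final int to = Math.min(start + chunkSize, total);
                    Callable<Integer> task = () -> processChunk(from, to);
                    batch.add(pool.submit(task));
                    start = to;
                }
                // Wait for the whole batch before checking for cancellation again.
                for (Future<Integer> f : batch) {
                    processed += f.get();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("workflow interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return processed;
    }

    public static void main(String[] args) {
        AtomicBoolean canceled = new AtomicBoolean(false);
        System.out.println("processed: " + run(10, 3, 2, canceled));
    }
}
```

Checking the flag only between batches keeps each chunk atomic: a cancel request never interrupts a chunk halfway, at the cost of waiting for the current batch to finish.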

Steps

  1. Create a new function called Workflow.java.

  2. ...


Last updated 1 year ago