Wednesday 11 September 2019
Talend Component crib sheet
Here’s a handy reference table for the common Talend components.
Use it to check which connection types (Row.Main, Row.Iterate, Trigger) each component accepts as input and produces as output.
| Input | Component | Component Description | Output |
| --- | --- | --- | --- |
| Row.Iterate, Trigger | tPostgresqlConnection | | Trigger |
| Row.Main | tMap | maps input columns to output columns; derives columns; performs transformations | Row.Main |
| Row.Main | tLogRow | data viewer for debugging; prints input data in the Run window | Row.Main, Trigger |
| Row.Main | tPostgresqlOutput | inserts input data into the database | Row.Main, Trigger |
| Row.Iterate, Trigger | tFileList | outputs an iterative list of files read from a location; the input can be a trigger or another Row.Iterate | Row.Iterate, Trigger |
| Row.Iterate, Trigger | tFileInputDelimited | reads file data into the job so that its field attributes / schema can be transformed further | Row.Main, Row.Iterate, Trigger |
| Row.Iterate, Trigger | tFileInputJSON | reads data from a location; presents this data / schema to the job for further transformation | Row.Main, Row.Iterate, Trigger |
| Row.Main | tFileOutputDelimited | writes data to a file in a specified location | Row.Main, Trigger |
| Row.Iterate, Row.Main, Trigger | tJava | runs arbitrary Java code to help with bespoke transformations; can set context variable values or print messages to the Run window to help debug the job | Row.Main, Row.Iterate, Trigger |
| Row.Iterate, Row.Main | tExtractJSONField | splits key:value pair data out of a field / string | Row.Main, Row.Iterate, Trigger |
| Row.Iterate | tIterateToFlow | converts a Row.Iterate to a Row.Main | Row.Main, Trigger |
| Row.Main | tFlowToIterate | converts a Row.Main to a Row.Iterate | Row.Iterate, Trigger |
| Row.Iterate, Trigger | tS3Connection | creates a connection to AWS S3 | Trigger |
| Row.Iterate, Trigger | tS3List | lists the objects in an S3 bucket; outputs this list for iteration | Row.Iterate, Trigger |
| Row.Iterate, Trigger | tS3Get | downloads objects from S3 to a location | Trigger |
| Row.Iterate, Trigger | tHashOutput | loads a Row.Main data flow into the local cache | Row.Iterate, Row.Main, Trigger |
| Row.Main | tHashInput | retrieves data from the local cache for further transformation within the job | |
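To make the difference between Row.Iterate and Row.Main a little more concrete, here is a rough, standalone Java sketch of the idea. It is not Talend's generated code. The `globalMap` field is a plain map standing in for the job-wide map Talend provides, `Context.lastFile` is a hypothetical context variable, and `tJavaBody` and `iterateToFlow` are illustrative names only. The key `tFileList_1_CURRENT_FILEPATH` follows the naming tFileList uses for its current file, assuming the component is called `tFileList_1`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TalendFlowSketch {

    // Assumption: a plain map standing in for Talend's job-wide globalMap.
    public static final Map<String, Object> globalMap = new HashMap<>();

    // Assumption: a stand-in context group with one hypothetical variable.
    public static class Context {
        public String lastFile;
    }

    public static final Context context = new Context();

    // Roughly what might be typed into a tJava component: read the file
    // published by the current iteration, set a context variable, and
    // print a message to help with debugging.
    public static String tJavaBody() {
        String path = (String) globalMap.get("tFileList_1_CURRENT_FILEPATH");
        context.lastFile = path;
        String message = "Processing file: " + path;
        System.out.println(message);
        return message;
    }

    // Row.Iterate to Row.Main, in the spirit of tIterateToFlow: each
    // iteration runs once per file and contributes one row to the flow.
    public static List<Map<String, Object>> iterateToFlow(List<String> filePaths) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String path : filePaths) {
            // What a file-listing component would publish for this iteration.
            globalMap.put("tFileList_1_CURRENT_FILEPATH", path);
            tJavaBody();
            Map<String, Object> row = new HashMap<>();
            row.put("filepath", path);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows =
                iterateToFlow(Arrays.asList("/tmp/in/a.csv", "/tmp/in/b.csv"));
        System.out.println(rows.size() + " rows in the main flow");
    }
}
```

The loop is the iterate side: the downstream code runs once per item. The returned list is the main side: a set of rows with a schema (here a single `filepath` column) that a component such as tMap or tLogRow could consume.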