Scenario
The starting point is the table of names below.| name |
| andrew |
| andrew |
| andrew |
| berty |
| berty |
| emily |
| emily |
| emily |
| edward |
| edward |
| edward |
| edward |
| hugo |
We want to add a new column, which is the equivalent to the tsql
row_number() over (partition by name order by 1) as rownumber
| name | rownumber |
| andrew | 1 |
| andrew | 2 |
| andrew | 3 |
| berty | 1 |
| berty | 2 |
| emily | 1 |
| emily | 2 |
| emily | 3 |
| edward | 1 |
| edward | 2 |
| edward | 3 |
| edward | 4 |
| hugo | 1 |
Talend TOS
In Talend Open Studio, create a job with the following components:tFixedFlowInput > tMap > tLogRow
“>” is a “Row > Main” flow.

tFixedFlowInput
I configure this component to- schema is just 1 column “name” of "String” datatype

- use the inline content below
andrew
andrew
andrew
berty
berty
emily
emily
emily
edward
edward
edward
edward
hugo
tMap
In this component I declare an output “out1” with columns- name, with String type
- rownumber, with Integer type

Note that I use an expression to populate the “rownumber” column.
Numeric.sequence(row1.name,1,1)
This expression can be explained as:
- Numeric.sequence : this defines a integer sequence
- row1.name : this is the column that is “partition by” and “order by” when compared to the tsql windows function
- 1 : the first “1” is the starting point for the sequence
- 1 : the second “1” is the sequence increment
tLogRow
Finally, the results are returned back to the execution console for review, using the tLogRow component.
The Results
Below is a screenshot of the final execution results.
Thanks, you just made my day ! I was struggling with the same.
ReplyDelete