Fai Ho Fu's Blog: March 2019

Thursday, 28 March 2019

How AWS Lambda works

The diagram below (taken from AWS.training) shows an AWS X-Ray trace of a Lambda cold start.

In this example, an object is added to an S3 bucket, which asynchronously triggers the Lambda function invocation.

Asynchronous push events are queued before Lambda invokes the function.
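As a rough illustration of the S3-triggered invocation described above, here is a minimal Python handler sketch. The function name `handle_s3_event`, the bucket name and the object key are made up for the example; the nested `Records` / `s3` / `bucket` / `object` layout follows the standard S3 notification event shape.

```python
import urllib.parse


def handle_s3_event(event, context):
    """Return a (bucket, key) pair for every S3 record in the event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded (a space is sent as '+').
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical, trimmed-down event for local experimentation.
SAMPLE_EVENT = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/my+file.txt"},
            },
        }
    ]
}

print(handle_s3_event(SAMPLE_EVENT, None))
```

Because the invocation is asynchronous, S3 does not wait for or use the return value; returning the list here just makes the sketch easy to try locally.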

The diagram below is for a Lambda warm start.
Note the shorter duration.
Cold start: 878ms
Warm start: 199ms

The important fact is the lower billed duration of the warm start.
Cold start: 558ms (297 + 261)
Warm start: 82ms

Wednesday, 27 March 2019

Talend TAC Job Conductor Trigger details from database

Below is a simple query to retrieve “Job Conductor – Task – Trigger” details from the database tables associated with Talend Administration Center (TAC).

select
et.id as executiontask_id
--,et.dtype
,et."label" as task_name
,et.idquartzjob
,t.trigger_name
,t.job_group
,tt.dtype
,tt."label" as trigger_label
,tt.description as trigger_desc
,tt.active as trigger_active
,to_timestamp(t.next_fire_time/1000) as next_fire_time
,to_timestamp(t.prev_fire_time/1000) as prev_fire_time
,to_timestamp(t.start_time/1000) as start_time
,t.start_time / 1000 as start_time_seconds
,t.start_time as start_time_unix_milliseconds
,t.trigger_state
,ct.cron_expression
,tt.listminutes
,tt.listhours
,tt.listdaysofweek
,tt.listdaysofmonth
,tt.listyears
,et.jobid
,et.idremotejob
,et.jobname
,et.generatedprojectname
,et.generatedjobname
,et.generatedjobversion
,et.generatedsvnrevision
,et.artifactgroupid
,et.artifactid
,et.runasuser
,et.status
,et.errorstatus
,et.jobversion
,et.context
,et.branch
,et.active
,et.lastscriptgenerationdate
,et.lastdeploymentdate
,et.lastrundate
,et.lastendedrundate
,et.lasttriggeringdate
,et.jobscriptarchivefilename
from executiontask et
left join qrtz_triggers t on t.job_name = et.idquartzjob::varchar
left join qrtz_cron_triggers ct on ct.trigger_name = t.trigger_name
left join talendtrigger tt on ct.trigger_name = tt.idquartztrigger::varchar
;
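The to_timestamp(... / 1000) calls in the query above convert the Unix-millisecond fire times into timestamps. If the raw values are pulled into a script instead, the same conversion could be sketched in Python as below. The helper name `quartz_millis_to_datetime` is made up for this example, and it always returns UTC, whereas PostgreSQL displays the result in the session time zone.

```python
from datetime import datetime, timezone


def quartz_millis_to_datetime(millis):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# 1553731200000 ms is midnight UTC on 28 March 2019.
print(quartz_millis_to_datetime(1553731200000).isoformat())
```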

NOTE

Within the qrtz_cron_triggers table, the cron expression could look similar to

0 1,2,3 4,5,6 ? 4 2,3,4,5,6 2020

Note that a space separates the different parts of the cron expression:

seconds: 0 (always zero, as the cron scheduler within Talend can’t schedule to the second)

minutes: 1,2,3
hours: 4,5,6
days of month: ? (no specific value; the days are determined by the days-of-week field instead)
month: 4 (April, the 4th month of the year)
days of week (1 = Sunday): 2,3,4,5,6 (Monday, Tuesday, Wednesday, Thursday, Friday)
year: 2020
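The field breakdown above can be reproduced with a few lines of Python. This is only a sketch: the names `parse_quartz_cron` and `day_of_week_names` are made up, and the day-name helper handles plain comma-separated numbers only (no ranges, `*` or `?`).

```python
QUARTZ_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)
DAY_NAMES = (
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
)


def parse_quartz_cron(expression):
    """Split a space-separated Quartz cron expression into named fields."""
    parts = expression.split()
    if len(parts) not in (6, 7):  # the year field is optional
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(QUARTZ_FIELDS, parts))


def day_of_week_names(field):
    """Translate a comma-separated day-of-week list (1 = Sunday) into names."""
    return [DAY_NAMES[int(value) - 1] for value in field.split(",")]


fields = parse_quartz_cron("0 1,2,3 4,5,6 ? 4 2,3,4,5,6 2020")
print(fields)
print(day_of_week_names(fields["day_of_week"]))
```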

EXECUTION PLAN

When Job Tasks are set up as a sequence, have inter-relationships (parent/child), or are arranged in some form of hierarchy within the EXECUTION PLAN CONSOLE, these details can be viewed using the following query:

select
etp."label" as parent_task_name
,et."label" as child_task_name
,case when e.executionplanpart_parent_id is null then 'Header' else '' end as Header
,e.id as executionplan_id
,e.executionplanpart_parent_id
,e.executionplan_executionplan_id
,e.executiontask_task_id
,e.status
,e.startdate
,e.enddate
--,e.*
--,et.*
FROM public.executionplanpart e
left join executiontask et on et.id = e.executiontask_task_id
left join executiontask etp on etp.id = e.executionplan_executionplan_id
order by e.executionplan_executionplan_id , e.id
;

Wednesday, 20 March 2019

Talend Job Statistics error, Connection refused, connecting to socket

https://community.talend.com/t5/Administering-and-Monitoring/resolved-Sockets-range-to-get-statistics/td-p/99764

When executing a Talend job via TAC (Talend Administration Center / the Talend Server), the following error message can be returned if the JobServer / Execution service is a remote server,
i.e. the JobServer or Execution service isn't on the same server as TAC.

[statistics] connecting to socket on port 10226
[statistics] connection refused

The statistics port (10226 in this example) is chosen at random by the server from the range 10000 to 11000.
The connection is refused if this port is blocked by a firewall (or a security group in the case of AWS instances).
The details below help correct this.

Changing Statistics port settings

TOS (Talend Open Studio)

Go to Window -> Preferences -> Talend -> Run/Debug, where you can change the stats port range.

TAC

For Talend Administration Center, edit the following file:
TOMCAT_FOLDER/webapps/org.talend.administrator/WEB-INF/classes/configuration.properties
Look for these lines (or add them onto the end of the file):
# The range where find a free port on the Administrator machine, where the job will send the statistics informations during its execution
scheduler.conf.statisticsRangePorts=10000-11000

Then change the scheduler.conf.statisticsRangePorts parameter to the desired range, like this:
scheduler.conf.statisticsRangePorts=10000-10010
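When opening firewall or security group rules to match this setting, it can help to read the range back out of the file and check which ports it covers. The Python sketch below does that on the text of a properties file; the helper names `parse_port_range` and `read_statistics_range` are made up for the example, and only the property key comes from the file above.

```python
STATS_KEY = "scheduler.conf.statisticsRangePorts"


def parse_port_range(value):
    """Turn a 'low-high' string into an inclusive range of port numbers."""
    low, high = (int(part) for part in value.strip().split("-"))
    if low > high:
        raise ValueError("range start is greater than range end")
    return range(low, high + 1)


def read_statistics_range(properties_text):
    """Return the statistics port range found in the properties text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not key=value
        key, _, value = line.partition("=")
        if key.strip() == STATS_KEY:
            return parse_port_range(value)
    return None


sample = "# statistics ports\nscheduler.conf.statisticsRangePorts=10000-10010\n"
ports = read_statistics_range(sample)
print(ports[0], ports[-1], 10226 in ports)
```

With the narrowed range, a port such as 10226 is no longer a candidate, so only 10000 to 10010 need to be reachable on the TAC machine.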


Tuesday, 19 March 2019

AWS Webinar: Testing and Deployment Best Practices for AWS Lambda-Based Applications

[This page contains the Q&A from the AWS webinar on “Testing and Deployment Best Practices for AWS Lambda-Based Applications”.

The details below are useful, but always use this information in combination with your own research.

Answers to the questions were provided by Eric Johnson of AWS during the webinar.

]

AWS SAM: AWS Serverless Application Model
The AWS Serverless Application Model (AWS SAM) is a model to define serverless applications. AWS SAM is natively supported by AWS CloudFormation and defines simplified syntax for expressing serverless resources. The specification currently covers APIs, Lambda functions and Amazon DynamoDB tables.
https://docs.aws.amazon.com/lambda/latest/dg/serverless_app.html

Q: What is your recommendation on creating and maintaining connection pools to RDS / PostgreSQL databases that are needed in a Lambda function?
A: If you are using Aurora Serverless look at the Data API. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html

Q: Can you use Cloud9 to debug SAM local?
A: Yes. However, Cloud9 also offers a debugging UI that is based on SAM Local; either works.

Q: I don't really understand the context object of a lambda function. What is it used for?
A: This object provides methods and properties that provide information about the invocation, function, and execution environment.
https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-context.html
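To make the context object more concrete, here is a small Python sketch that reads a few of its properties. The linked page covers Node.js; the attribute names below are the Python runtime's equivalents. `describe_invocation` and `FakeContext` are made-up names, and `FakeContext` only stands in for the real object so the handler can be tried outside Lambda.

```python
def describe_invocation(event, context):
    """Return a few facts about the current invocation from the context object."""
    return {
        "function_name": context.function_name,
        "memory_limit_in_mb": context.memory_limit_in_mb,
        "aws_request_id": context.aws_request_id,
        # Time left before Lambda would time this invocation out.
        "remaining_ms": context.get_remaining_time_in_millis(),
    }


class FakeContext:
    """Hypothetical stand-in for the Lambda context, for local runs only."""

    function_name = "example-function"
    memory_limit_in_mb = 128
    aws_request_id = "00000000-0000-0000-0000-000000000000"

    def get_remaining_time_in_millis(self):
        return 3000


info = describe_invocation({}, FakeContext())
print(info)
```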

Q: How does debugging work with SAM?
A: You can set breakpoints and have a call stack available. It connects to the debug port provided by SAM Local. https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-debugging.html

Q: Is there any way to autogenerate a SAM template, or do you need to write it yourself?
A: Running "sam init" with the SAM CLI will create a base template for a project.

Q: How do you do canary and blue/green deployments with serverless?
A: You can do traffic shifting between versions of your serverless functions, which is what Chris is explaining right now. https://aws.amazon.com/blogs/compute/implementing-canary-deployments-of-aws-lambda-functions-with-alias-traffic-shifting/
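As a sketch of what alias traffic shifting looks like in code, the Python function below sends a fraction of an alias's traffic to a second version through the Lambda `update_alias` call and its `RoutingConfig` parameter. The function name, alias and version numbers are made up, and `RecordingClient` is only a stand-in so the example runs without AWS credentials; in practice you would pass a boto3 Lambda client instead.

```python
def shift_alias_traffic(lambda_client, function_name, alias,
                        stable_version, canary_version, canary_weight):
    """Route canary_weight (0.0 to 1.0) of the alias traffic to canary_version."""
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0.0 and 1.0")
    return lambda_client.update_alias(
        FunctionName=function_name,
        Name=alias,
        FunctionVersion=stable_version,  # receives the remaining traffic
        RoutingConfig={"AdditionalVersionWeights": {canary_version: canary_weight}},
    )


class RecordingClient:
    """Hypothetical stand-in for a boto3 Lambda client; it just echoes the request."""

    def update_alias(self, **kwargs):
        return kwargs


request = shift_alias_traffic(RecordingClient(), "example-function", "live", "1", "2", 0.1)
print(request)
```

Here 10% of invocations through the "live" alias would go to version 2 and the rest to version 1; raising the weight step by step and finally pointing the alias at version 2 completes the rollout.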

Q: If a Lambda function is contained in a VPC, and the VPC differs per stage, how can we define the VPC to be used for a certain environment (if using tags to deploy)?
A: You can specify the VPC through template parameters.

Q: How to daisy-chain lambdas together?
A: Step Functions allows you to chain loosely coupled Lambda functions together instead of creating dependencies between them. It is more elegant and requires less glue code.
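To illustrate the daisy-chaining answer, the Python sketch below builds a minimal Amazon States Language definition that runs Lambda functions one after another. The helper name `chain_lambdas`, the state names and the ARNs (including the account ID) are placeholders invented for the example.

```python
import json


def chain_lambdas(function_arns):
    """Build a state machine definition that runs the given Lambdas in sequence."""
    if not function_arns:
        raise ValueError("at least one function ARN is required")
    names = ["Step%d" % (index + 1) for index in range(len(function_arns))]
    states = {}
    for index, (name, arn) in enumerate(zip(names, function_arns)):
        state = {"Type": "Task", "Resource": arn}
        if index + 1 < len(names):
            state["Next"] = names[index + 1]  # hand over to the next function
        else:
            state["End"] = True  # the last state finishes the execution
        states[name] = state
    return {"StartAt": names[0], "States": states}


definition = chain_lambdas([
    "arn:aws:lambda:us-east-1:123456789012:function:first",   # placeholder ARN
    "arn:aws:lambda:us-east-1:123456789012:function:second",  # placeholder ARN
])
print(json.dumps(definition, indent=2))
```

Each function's output becomes the next state's input, so the functions themselves never need to call one another.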

Q: If running a serverless .NET Core app on a Lambda function which exposes numerous endpoints, does a cold start relate to the Lambda function as a whole or to each endpoint?
A: Cold start is specific to a Lambda function and not to the individual endpoints supported by the same Lambda function. For example: if one endpoint starts the Lambda execution environment, another call to an endpoint on the same Lambda function will be warm and ready to go.

Q: Is there any way to use OpenAPI to generate a SAM template?
A: API Gateway resources can use OpenAPI documents to define the endpoints. You can also export your API Gateway to an OpenAPI doc from within the console.

Q: What is the best way to trigger N number of lambda functions at once? Should SQS be used?
A: Lambda functions can be triggered from SQS but it is not guaranteed they will be triggered at the same time. You can also broadcast to multiple Lambda functions via SNS.

Q: Is serverless a good idea to use for a project where my users start using the app every day between 8:30 and 9 a.m. and shut down at 5:30 p.m.?
A: As far as timing goes, serverless would work well for this because you will only be charged for the invocations. There will be no resources sitting idle.

Q: Is SAM also a sub-component of the AWS CLI, or will it be?
A: SAM is not a sub-component of the AWS CLI, but it does depend upon it. For example, sam package is a wrapper for aws cloudformation package.

Q: Will there be a delay in executing the Lambda when users first start using it in the morning, considering the cold start?
A: The first time a Lambda function is invoked there will be a slight cold start delay. The length of this delay is dependent on multiple factors like memory, function size, language, and whether it is in a VPC. Outside of a VPC the cold start is often in the milliseconds.

Q: Do Lambda layers work locally in Node.js? I've updated aws-cli, sam local, and pip, but this doesn't seem to work locally.
A: Yes, it does work. You do need the latest version of the AWS CLI and SAM. Updating the AWS CLI requires the latest Python as well for the latest features. If you are on a Mac, I have found that using Homebrew is the best way to install.

Q: Is the AWS SAM CLI suitable for use in automation?
A: It depends on the automation, but I would generally say it is best for development. Automation should be achieved through tools like CodePipeline, CodeBuild, CodeDeploy, etc.

Q: Are there some example projects using AWS SAM?
A: The Serverless Application Repository has many examples to look at. https://aws.amazon.com/serverless/serverlessrepo/

Q: Where do the various environment configs live, in their own template files?
A: Configs can live in several places: Parameter Store, build variables, template variables. If using a different account for each environment, I encourage the use of Parameter Store.

Q: It seems that when I use layers in SAM templates, I need to push from a SAM template first to create a new version of the layer, then manually change the version number in the template to reference the new layer version. Is there a way to reference the code that you are zipping up and sending to S3 to become a layer version in the Lambda template, and then reference that same template for zipping into S3 for your layer link?
A: For iterating purposes, if the layer is in the same template you can reference the logical name of the layer. It will grab the latest version each time.

Monday, 11 March 2019

AWS training links

Below is a short list of links to AWS training material that I’ve found extremely useful.

Comment	Link
AWS Training / Learning library home page. Free registration required. Hundreds of modules to learn about the different AWS services. An incredible resource which is amazingly free.	https://www.aws.training/ https://aws.amazon.com/events/awsome-day-2019/
AWS Cloud Practitioner Essentials course. A great way to quickly learn the basics of the core services, through watching a series of structured videos with multiple choice questions at the end of each topic to reinforce and consolidate your learning.	https://www.aws.training/learningobject/curriculum?id=27076
Qwiklabs. A great resource to learn cloud skills. The courses provide sandbox environments to learn new skills. There are also free courses to help learn some of the basics of the core services. Qwiklabs covers both AWS and GCP.	https://www.qwiklabs.com/catalog?keywords=&cloud%5B%5D=AWS&format%5B%5D=any&level%5B%5D=any&duration%5B%5D=any&price%5B%5D=free&modality%5B%5D=any&language%5B%5D=any
AWS Monthly webinars. These are a great resource to see what’s new, or glean in-depth insight into a particular service. These sessions change frequently as AWS services evolve, so keep popping back to the page to see what’s new.	https://aws.amazon.com/about-aws/events/monthlywebinarseries/ ARCHIVE https://aws.amazon.com/about-aws/events/monthlywebinarseries/archive/
AWS Documentation. The trusty default go to when you need to reference something within AWS.	https://docs.aws.amazon.com/index.html#lang/en_us
AWS YouTube channel	https://www.youtube.com/user/AmazonWebServices

AWS FREE TIER: when creating a new account, does the free-tier time window start as soon as the account is created, or as soon as a particular service is started?

The 12 months start from account creation.

https://aws.amazon.com/free/?all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc

Please see the FAQ - https://aws.amazon.com/free/free-tier-faqs/

The article linked below discusses tracking free tier usage.

You can also use CloudWatch to create billing alarms.

https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/tracking-free-tier-usage.html
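As a sketch of the billing alarm idea, the Python function below creates a CloudWatch alarm on the estimated charges metric. It assumes billing alerts are enabled on the account, that the client targets us-east-1 (where billing metrics are published), and that an SNS topic already exists. The function name, alarm name and topic ARN are made up, and `RecordingClient` only stands in for a boto3 CloudWatch client so the example runs without credentials.

```python
def create_billing_alarm(cloudwatch_client, threshold_usd, sns_topic_arn):
    """Create an alarm that fires when estimated charges exceed threshold_usd."""
    return cloudwatch_client.put_metric_alarm(
        AlarmName="estimated-charges-over-%s-usd" % threshold_usd,
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        Statistic="Maximum",
        Period=21600,  # six hours, in seconds
        EvaluationPeriods=1,
        Threshold=float(threshold_usd),
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[sns_topic_arn],
    )


class RecordingClient:
    """Hypothetical stand-in for a boto3 CloudWatch client; it echoes the request."""

    def put_metric_alarm(self, **kwargs):
        return kwargs


alarm = create_billing_alarm(
    RecordingClient(),
    10,
    "arn:aws:sns:us-east-1:123456789012:billing-alerts",  # placeholder ARN
)
print(alarm["AlarmName"], alarm["Threshold"])
```

A low threshold is a simple safety net while staying within the free tier.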

Thursday, 7 March 2019

[NOTES] Building a Highly Efficient Analytics Team

Notes from the PragmaticWorks webinar
https://pragmaticworks.com/Training/Details/Building-a-Highly-Efficient-Analytics-Team

How to build a better analytics team

Analytics teams help differentiate different businesses.

How businesses use their data to:

optimise business process

understand what is a “normal benchmark”
highlight opportunities to improve (based on existing benchmarks)

highlight new opportunities that weren’t obvious previously (can’t see the wood for the trees)

How do we define “world class”?

Attitude
know that there’s always room to learn
Aptitude
grow your skillset to deliver measurable benefit
Adaptability
Acceleration

How to assess your current team?

Build a team based on potential, not on current skill set

Getting from today to world-class

rate your team on the “4 A’s”
Clean up low hanging fruit
Hand off “chores”
Create capacity for improvement
Set aggressive 90-day goals
Get creative with team activities (hack-a-thons etc)
Create specific plans and targets
Put your plan into action and socialize results!

Gaining the Necessary Skills

What skills are needed?

Data Cleansing
Data Modelling
Data Analysis Expressions (DAX)
Data Visualization Best Practice
Good storytelling, guiding the user through the main themes/conclusions that the data presents
Power BI Administration
Data Governance

How do you gain these skills?

Free webinars
Blogs
Books
YouTube
On-Site Training
Web-based On-Demand Learning
Pluralsight, Udemy

Pushing your team

Set goals

By the end of the month, you should know how to…
By the end of the Quarter, you should know how to implement …

Find out what motivates the team

Competition between co-workers
Buy lunch for your team when they reach a goal

Define Learning Paths

What things you want your team to know, and in what order

Other considerations

Specialist vs Jack-of-all-Trades
Find ways to jump-start your team’s experience early (hack-a-thons)
Value of Certifications
Be prepared to explain why certifications are valuable, what was learnt, and how that matters
