serverless-glue

Serverless plugin to deploy Glue Jobs

License: MIT

Serverless Glue

This is a plugin for the Serverless Framework that makes it possible to deploy AWS Glue jobs.

Install

  1. Run npm install --save-dev serverless-glue
  2. Add serverless-glue to the plugins section of serverless.yml:
    plugins:
        - serverless-glue

How it works

Before Serverless runs the deploy, the plugin creates CloudFormation resources from your configuration and adds them to the Serverless template.

So any Glue job deployed with this plugin is part of your stack too.
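
To make this concrete, a job entry such as the super-glue-job example in the configuration section would correspond roughly to an AWS::Glue::Job resource like the one below. This is an illustrative sketch only: the logical ID SuperGlueJob, the chosen set of properties, and the script location (built from the default glueJobs/ prefix) are assumptions, and the resources the plugin actually generates may differ.

```yaml
# Hypothetical sketch of a generated resource; logical ID and paths are assumed.
Resources:
  SuperGlueJob:                      # assumed logical ID
    Type: AWS::Glue::Job
    Properties:
      Name: super-glue-job
      Role: arn:aws:iam::000000000:role/someRole
      GlueVersion: "2.0"
      Command:
        Name: glueetl                # Spark job; Python shell jobs use "pythonshell"
        PythonVersion: "3"
        ScriptLocation: s3://someBucket/glueJobs/test-job.py
      ExecutionProperty:
        MaxConcurrentRuns: 3
      WorkerType: Standard
      NumberOfWorkers: 1
```

Because the resource lives in the same template as the rest of the service, it is created, updated, and removed together with the stack.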

How to configure your Glue jobs

Configure your Glue jobs in the custom section like this:

custom:
  Glue:
    bucketDeploy: someBucket # Required
    s3Prefix: some/s3/key/location/ # optional, default = 'glueJobs/'
    tempDirBucket: someBucket # optional, default = '{serverless.serviceName}-{provider.stage}-gluejobstemp' 
    tempDirS3Prefix: some/s3/key/location/ # optional, default = ''. The job name will be appended to the prefix name
    jobs:
      - job:
          name: super-glue-job # Required
          script: src/glueJobs/test-job.py # Required. Uploaded to the s3Prefix location under its file name (the part after the last '/')
          tempDir: true # Optional, true | false
          type: spark # Required, spark | pythonshell
          glueVersion: python3-2.0 # Required, python3-1.0 | python3-2.0 | python2-1.0 | python2-0.9 | scala2-1.0 | scala2-0.9 | scala2-2.0
          role: arn:aws:iam::000000000:role/someRole # Required
          MaxConcurrentRuns: 3 # Optional
          WorkerType: Standard # Optional, Standard | G.1X | G.2X
          NumberOfWorkers: 1 # Optional
    triggers:
      - trigger:
          name: some-trigger-name # Required
          schedule: 30 12 * * ? * # Optional, CRON expression. The trigger will be created with On-Demand type if the schedule is not provided.
          jobs: # Required. One or more jobs to trigger
            - job:
                name: super-glue-job # Required
                args: # optional
                  --arg1: value1
                  --arg2: value2
                timeout: 30 # optional
            - job:
                name: another-glue-job
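
For comparison, a configuration that sets only the fields marked Required above and leaves everything else to the documented defaults might look like the following sketch. The job name, script path, bucket, and role ARN are placeholders, and the pythonshell and python3-1.0 pairing is assumed to be a combination the plugin accepts.

```yaml
custom:
  Glue:
    bucketDeploy: someBucket         # scripts land under the default 'glueJobs/' prefix
    jobs:
      - job:
          name: minimal-glue-job     # placeholder name
          script: src/glueJobs/minimal-job.py
          type: pythonshell
          glueVersion: python3-1.0   # assumed valid together with pythonshell
          role: arn:aws:iam::000000000:role/someRole
```

No triggers section is given here, so the job would only be deployed, not scheduled.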

You can define multiple jobs:

custom:
  Glue:
    bucketDeploy: someBucket
    jobs:
      - job:
          ...
      - job:
          ...

And multiple triggers:

custom:
  Glue:
    triggers:
      - trigger:
          ...
      - trigger:
          ...

Glue configuration parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| bucketDeploy | String | S3 bucket name | true |
| s3Prefix | String | S3 prefix name | false |
| tempDirBucket | String | S3 bucket name for the Glue temporary directory. If not provided, the bucket name is generated with the pattern {serverless.serviceName}-{provider.stage}-gluejobstemp | false |
| tempDirS3Prefix | String | S3 prefix name for the Glue temporary directory | false |
| jobs | Array | Array of Glue jobs to deploy | true |
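
As an illustration of the temporary-directory defaults described above, the fragment below omits tempDirBucket so that the documented naming pattern applies. The service name my-service, the stage dev, and the tmp/ prefix are placeholders, and the S3 location shown in the last comment is an assumption inferred from the descriptions in this README, not verified plugin output.

```yaml
service: my-service                  # placeholder service name
provider:
  name: aws
  stage: dev                         # placeholder stage

custom:
  Glue:
    bucketDeploy: someBucket
    # tempDirBucket omitted: the bucket name should follow the documented
    # pattern, which here gives my-service-dev-gluejobstemp
    tempDirS3Prefix: tmp/            # the job name is appended to this prefix
    jobs:
      - job:
          name: super-glue-job
          script: src/glueJobs/test-job.py
          tempDir: true              # request a temporary directory for this job
          type: spark
          glueVersion: python3-2.0
          role: arn:aws:iam::000000000:role/someRole
          # assumed temp location: s3://my-service-dev-gluejobstemp/tmp/super-glue-job
```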

Job configuration parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| name | String | Name of the job | true |
| script | String | Path to the script in the project | true |
| tempDir | Boolean | Whether the job requires a temporary folder. If true, the plugin creates a bucket for it | false |
| type | String | Job type. Allowed values: spark or pythonshell | true |
| glueVersion | String | Language and Glue version to use, in the form [language][version]-[glue version]. Allowed values: python3-1.0, python3-2.0, python2-1.0, python2-0.9, scala2-1.0, scala2-0.9, scala2-2.0 | true |
| role | String | ARN of the role used to run the job | true |
| MaxConcurrentRuns | Double | Maximum number of concurrent runs of the job | false |
| WorkerType | String | The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X. | false |
| NumberOfWorkers | Integer | Number of workers | false |

Triggers configuration parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| name | String | Name of the trigger | true |
| schedule | String | CRON expression | false |
| jobs | Array | An array of jobs to trigger | true |

Only On-Demand and Scheduled triggers are supported.
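
The two supported trigger types differ only in whether schedule is present. The fragment below sketches one of each under custom.Glue; the trigger names are placeholders, and the job definitions they refer to are assumed to exist elsewhere in the same configuration.

```yaml
triggers:
  - trigger:
      name: daily-trigger            # placeholder; Scheduled, because 'schedule' is set
      schedule: 30 12 * * ? *        # 12:30 UTC every day
      jobs:
        - job:
            name: super-glue-job
  - trigger:
      name: manual-trigger           # placeholder; On-Demand, because no 'schedule' is given
      jobs:
        - job:
            name: super-glue-job
```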

Trigger job configuration parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| name | String | The name of the Glue job to trigger | true |
| timeout | Integer | Job execution timeout | false |
| args | Map | Job arguments | false |
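
To show where these per-job settings are likely to end up, here is a sketch of an AWS::Glue::Trigger resource matching the some-trigger-name example from the configuration section. The logical ID SomeTrigger, the cron(...) wrapping of the schedule, and the exact property mapping are assumptions about what the plugin generates, not confirmed behavior.

```yaml
# Hypothetical sketch of a generated trigger; logical ID and mapping are assumed.
Resources:
  SomeTrigger:                       # assumed logical ID
    Type: AWS::Glue::Trigger
    Properties:
      Name: some-trigger-name
      Type: SCHEDULED                # ON_DEMAND when no schedule is given
      Schedule: cron(30 12 * * ? *)  # Glue expects the cron(...) form
      Actions:
        - JobName: super-glue-job
          Arguments:                 # from 'args'
            --arg1: value1
            --arg2: value2
          Timeout: 30                # from 'timeout'; CloudFormation documents this in minutes
        - JobName: another-glue-job
```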

And now?

Just run serverless deploy.
