zoukankan      html  css  js  c++  java
  • [SAA] 32. Data Engineering

    AWS Batch Overview

    • Run batch jobs as Docker images
    • Dynamic provisioning of the instances (EC2 & Spot Instances) - in VPC
    • Optimal quantity and type based on volume and requirements
    • No need to manage clusters, fully serverless
    • You just pay for the underlying EC2 instance
    • Example: batch process of images, running thousands of concurrent jobs
    • Schedule Batch Jobs using CloudWatch Events
    • Orchestrate Batch Jobs using AWS Step Functions

     

    Lambda vs Batch

    Lambda

    • Time limit: 15 mins
    • Limted runtime
    • Limited temporary disk space
    • Serverless

    Batch

    • No time limit
    • Any runtinme as long as it's package as a Docker image
    • Rely on EBS / instance store for disk space
    • Relies on EC2 (can be managed by AWS)

    Compute Environments

    Managed Compute Environment

    • AWS Batch managed the capacity and instance types within the environment
    • You can choose On-Demand or Spot Instance
    • You can set a maximum price for Spot instance
    • Launched within your own VPC
      • If you launch within your own private subnet, make sure it has access to the ECS service
      • Either using a NAT Gateway / instance or using VPC Endpoint for ECS

    Unmanaged Compute Environment

    • You control and manage instance configuration, provisioning and scaling

     

    Kinesis

    CloudWatch cannot send to Kinesis Data Firehose or Kinesis Data Streams

    Near real-time: Kinesis Data Firehose

    Kinesis agent can directly configured to send data to Kinesis Data Firehose

    Firehose can connect to S3

    Kinesis Data Firehose is near real-time

    Using Lambda to send to ElasticSearch

    Athena

    • Quicksight for visiulization dashboard
    • CloudTrail can stream logs to CloudWatch

    • EMR can choose to use Spot Fleet to control the cost
    • Athena: data must stay in S3
    • Redshift Spectrum for serverless queries on S3

     

     

     

     

  • 相关阅读:
    ElasticSearch 2 (1)
    Vagrant (2) —— 基本安装与配置(下)
    Vagrant (1) —— 基本安装与配置(上)
    Vagrant (3) —— 复制/备份Vagrant Box
    vue中$forceUpdate的使用
    vue+ElementUi 选择框选中之后翻页进行状态保持及默认选中
    loonflow 工单系统
    一些后端知识
    前端学习计划
    async/await函数的执行顺序的理解
  • 原文地址:https://www.cnblogs.com/Answer1215/p/15323895.html
Copyright © 2011-2022 走看看