Open the S3 bucket and inspect the source data. It is in CSV format, with the following contents:
Date,Salesperson,Lead Name,Segment,Region,Target Close,Forecasted Monthly Revenue,Opportunity Stage,Weighted Revenue,Is Closed,ActiveItem,IsLatest
1/2/2011,Gerri Hinds,US_SMB_1317,SMB,US,2/2/2011,103699,Lead,10370,FALSE,NO,0
1/3/2011,David King,EMEA_Enterprise_405,Enterprise,EMEA,4/9/2011,393841,Lead,39384,FALSE,NO,0
1/6/2011,James Swanger,US_Enterprise_1466,Enterprise,US,5/4/2011,326384,Lead,32638,FALSE,NO,0
1/11/2011,Gerri Hinds,US_SMB_2291,SMB,US,2/14/2011,76316,Lead,7632,FALSE,NO,0
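As a rough illustration of the source schema, the sample rows above can be loaded with Python's standard `csv` module. This is only a sketch: the function name `parse_rows` and the output field names are made up for illustration, and the month/day/year reading of the `Date` column is inferred from values such as `2/14/2011`.

```python
import csv
import io
from datetime import datetime

# The header and the four sample rows shown above, copied verbatim.
SAMPLE = """\
Date,Salesperson,Lead Name,Segment,Region,Target Close,Forecasted Monthly Revenue,Opportunity Stage,Weighted Revenue,Is Closed,ActiveItem,IsLatest
1/2/2011,Gerri Hinds,US_SMB_1317,SMB,US,2/2/2011,103699,Lead,10370,FALSE,NO,0
1/3/2011,David King,EMEA_Enterprise_405,Enterprise,EMEA,4/9/2011,393841,Lead,39384,FALSE,NO,0
1/6/2011,James Swanger,US_Enterprise_1466,Enterprise,US,5/4/2011,326384,Lead,32638,FALSE,NO,0
1/11/2011,Gerri Hinds,US_SMB_2291,SMB,US,2/14/2011,76316,Lead,7632,FALSE,NO,0
"""


def parse_rows(text):
    """Parse the CSV text into a list of dicts with typed values.

    The output keys are illustrative names, not part of the workshop.
    """
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Assumption: dates are month/day/year.
            "date": datetime.strptime(raw["Date"], "%m/%d/%Y").date(),
            "salesperson": raw["Salesperson"],
            "segment": raw["Segment"],
            "region": raw["Region"],
            "forecasted_monthly_revenue": int(raw["Forecasted Monthly Revenue"]),
            "weighted_revenue": int(raw["Weighted Revenue"]),
            "is_closed": raw["Is Closed"].upper() == "TRUE",
        })
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["date"].isoformat(), row["region"], row["forecasted_monthly_revenue"])
```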
Open the bucket where the data is saved once ETL processing completes. The data is partitioned by year, month, and day, and stored in Parquet format.
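To make the partitioning concrete, here is a minimal sketch of how a source row's `Date` value could map to a partition prefix. It assumes Hive-style `key=value` folder names without zero padding, which is a common convention but is not confirmed by this walkthrough. The actual layout in the bucket may differ, and `partition_prefix` is a hypothetical helper.

```python
from datetime import datetime


def partition_prefix(date_text):
    """Return a year/month/day prefix for a month/day/year date string.

    Assumption: Hive-style key=value folders with unpadded numbers.
    """
    d = datetime.strptime(date_text, "%m/%d/%Y")
    return f"year={d.year}/month={d.month}/day={d.day}/"


# The first sample row is dated 1/2/2011.
print(partition_prefix("1/2/2011"))  # year=2011/month=1/day=2/
```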
Open the Athena service and select the StepFunctionsWorkshopWorkgroup workgroup. For Data source choose AwsDataCatalog, and for database choose default. Then run the query:
SELECT * FROM "default"."sales_marketing_revenue" limit 10;
If the ETL succeeded, the query returns results.
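The same check can also be scripted instead of using the console. The sketch below assumes the workgroup already has a query result location configured and that `athena` is a boto3 Athena client. `build_query_request` and `run_query` are illustrative helper names, not part of the workshop.

```python
import time

QUERY = 'SELECT * FROM "default"."sales_marketing_revenue" limit 10;'


def build_query_request(query=QUERY):
    """Build the arguments for Athena's start_query_execution call."""
    return {
        "QueryString": query,
        "QueryExecutionContext": {
            "Catalog": "AwsDataCatalog",
            "Database": "default",
        },
        "WorkGroup": "StepFunctionsWorkshopWorkgroup",
    }


def run_query(athena, request, poll_seconds=1.0):
    """Start the query, wait for it to finish, and return rows as lists."""
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {status['State']}: {reason}")

    # Only the first page of results is read, which is enough for LIMIT 10.
    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

With boto3 installed and credentials configured, `run_query(boto3.client("athena"), build_query_request())` would return the rows. For a `SELECT`, the first row is typically the column header.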
To clean up, first delete the State Machine, then delete the stack that CloudFormation created.
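For reference, the same cleanup order could be scripted with boto3. This is a sketch under assumptions: `sfn` and `cloudformation` are boto3 clients for Step Functions and CloudFormation, and the ARN and stack name are placeholders you would replace with the values from your own account.

```python
def clean_up(sfn, cloudformation, state_machine_arn, stack_name):
    """Delete the state machine first, then the CloudFormation stack."""
    sfn.delete_state_machine(stateMachineArn=state_machine_arn)
    cloudformation.delete_stack(StackName=stack_name)
```

A call might look like `clean_up(boto3.client("stepfunctions"), boto3.client("cloudformation"), "<your-state-machine-arn>", "<your-stack-name>")`. Both deletions are asynchronous, so the resources may take a short while to disappear.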