Verify That the Data Cleansing Succeeded

Check S3

Open the S3 bucket and look at the source data, which is in CSV format.

Its contents look like this:

Date,Salesperson,Lead Name,Segment,Region,Target Close,Forecasted Monthly Revenue,Opportunity Stage,Weighted Revenue,Is Closed,ActiveItem,IsLatest
1/2/2011,Gerri Hinds,US_SMB_1317,SMB,US,2/2/2011,103699,Lead,10370,FALSE,NO,0
1/3/2011,David King,EMEA_Enterprise_405,Enterprise,EMEA,4/9/2011,393841,Lead,39384,FALSE,NO,0
1/6/2011,James Swanger,US_Enterprise_1466,Enterprise,US,5/4/2011,326384,Lead,32638,FALSE,NO,0
1/11/2011,Gerri Hinds,US_SMB_2291,SMB,US,2/14/2011,76316,Lead,7632,FALSE,NO,0

Next, open the bucket where the ETL job saves its output. The data there is partitioned by year, month, and day and stored in Parquet format.
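
As a rough illustration of that layout, the sketch below maps a source Date value to a partition prefix. It assumes the job partitions on the Date column and uses Hive-style `year=`/`month=`/`day=` key names without zero padding. Neither detail is confirmed here, and `partition_prefix` is a hypothetical helper, so compare its output with the actual object keys in the bucket.

```python
from datetime import datetime


def partition_prefix(date_text: str) -> str:
    """Map a source Date value such as '1/2/2011' to a partition prefix.

    Assumption: Hive-style key=value folder names, no zero padding.
    """
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return f"year={parsed.year}/month={parsed.month}/day={parsed.day}"


# Dates taken from the sample CSV rows.
for value in ["1/2/2011", "1/11/2011"]:
    print(value, "->", partition_prefix(value))
```

If the real prefixes differ (for example, zero-padded months), only the format string in the return statement needs to change.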

Run an Athena Query

Open the Athena service and select the StepFunctionsWorkshopWorkgroup workgroup.

For Data source choose AwsDataCatalog, and for Database choose default. Then run this query:

    SELECT * FROM "default"."sales_marketing_revenue" limit 10;

If the ETL succeeded, the query returns rows from the cleansed table.
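
The same check can be scripted instead of run in the console. The sketch below is one possible approach, not part of the workshop itself: `run_query` is a hypothetical helper that takes any Athena client object exposing `start_query_execution`, `get_query_execution`, and `get_query_results` (the boto3 Athena client does). It also assumes the workgroup already has a query result location configured.

```python
import time

QUERY = 'SELECT * FROM "default"."sales_marketing_revenue" limit 10;'


def run_query(athena, query=QUERY,
              workgroup="StepFunctionsWorkshopWorkgroup", poll_seconds=1.0):
    """Run a query through an Athena client and return rows as lists of strings."""
    started = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Catalog": "AwsDataCatalog", "Database": "default"},
        WorkGroup=workgroup,
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    # Fetch the first page of results; for SELECT queries the first row
    # holds the column names.
    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue", "") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

With boto3 installed and credentials configured, `run_query(boto3.client("athena"))` should return the header row followed by up to ten data rows. A header with no data rows would suggest the ETL wrote nothing.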

Clean Up Resources

Delete the State Machine first, then delete the stack that CloudFormation created.
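
For a scripted cleanup, the order above could be expressed as the following sketch. `clean_up` is a hypothetical helper, and the state machine ARN and stack name are parameters because they depend on your own deployment. The client arguments are expected to behave like the boto3 `stepfunctions` and `cloudformation` clients.

```python
def clean_up(sfn, cloudformation, state_machine_arn, stack_name):
    """Delete the state machine first, then the CloudFormation stack."""
    # Step 1: remove the state machine.
    sfn.delete_state_machine(stateMachineArn=state_machine_arn)

    # Step 2: remove the stack and wait until CloudFormation reports
    # that the deletion has completed.
    cloudformation.delete_stack(StackName=stack_name)
    waiter = cloudformation.get_waiter("stack_delete_complete")
    waiter.wait(StackName=stack_name)
```

With boto3, the two clients would come from `boto3.client("stepfunctions")` and `boto3.client("cloudformation")`. If the stack deletion fails, check the stack events in the CloudFormation console for the resource that blocked it.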