[AWS] Create a Glue Catalog Table using AWS CDK

AWS CDK is a framework for defining and managing cloud resources in code, built on top of AWS CloudFormation. This post focuses on how to create a Glue Catalog Table using AWS CDK.
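
Because CDK synthesizes CloudFormation templates, it can help to see roughly what a Glue table definition boils down to. The sketch below is plain Python, not CDK code: it builds an `AWS::Glue::Table` resource as a dictionary for a Parquet-backed external table. The helper name `glue_table_resource`, the database and table names, the columns, and the S3 path are all hypothetical, chosen only for illustration.

```python
import json


def glue_table_resource(database, table, columns, s3_location):
    """Build an AWS::Glue::Table CloudFormation resource for a Parquet table.

    `columns` is a list of (name, hive_type) pairs. All inputs are
    illustrative placeholders, not values from the original post.
    """
    return {
        "Type": "AWS::Glue::Table",
        "Properties": {
            # The Data Catalog lives in the account that owns the stack.
            "CatalogId": {"Ref": "AWS::AccountId"},
            "DatabaseName": database,
            "TableInput": {
                "Name": table,
                "TableType": "EXTERNAL_TABLE",
                "StorageDescriptor": {
                    "Columns": [{"Name": n, "Type": t} for n, t in columns],
                    "Location": s3_location,
                    "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
                    "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
                    "SerdeInfo": {
                        "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
                    },
                },
            },
        },
    }


# Hypothetical example values.
resource = glue_table_resource(
    database="analytics",
    table="events",
    columns=[("event_id", "string"), ("created_at", "timestamp")],
    s3_location="s3://example-bucket/events/",
)
print(json.dumps(resource, indent=2))
```

A CDK app would normally express the same information through its Glue constructs and let synthesis produce a resource of this general shape.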

[Airflow] Using the KubernetesPodOperator on Cloud Composer

Cloud Composer is a fully managed version of the open source workflow tool Apache Airflow on Google Cloud Platform (GCP). One way to run a Docker container from Cloud Composer is to use the KubernetesPodOperator, which launches Kubernetes pods in a Kubernetes cluster.

This post will cover these topics:

  • Build container images with Google Cloud Build
  • Create Kubernetes Secrets
  • Use the KubernetesPodOperator
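
As a small illustration of the Kubernetes Secrets topic, here is a minimal sketch that builds a Secret manifest in plain Python. It relies on the fact that values under a Secret's `data` field are base64-encoded. The helper name `secret_manifest`, the secret name, the namespace, and the credential value are hypothetical.

```python
import base64
import json


def secret_manifest(name, namespace, values):
    """Return a Kubernetes Secret manifest as a dict.

    `values` maps keys to plain-text strings; each value is
    base64-encoded, as the Secret `data` field requires.
    """
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical secret name and value, for illustration only.
manifest = secret_manifest("db-credentials", "default", {"password": "s3cr3t"})
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.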

[Airflow] Scheduling

The Airflow scheduler monitors all tasks and DAGs, triggers tasks, and provides tools to check their status. However, scheduling these tasks can be tricky.
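
One commonly cited source of confusion (an assumption about what "tricky" covers here) is that Airflow triggers the run for a given interval only after that interval has ended. The sketch below is a simplified model in plain Python, not Airflow code. It assumes a fixed `timedelta` interval and that every past interval gets a run; the function name `trigger_times` is hypothetical.

```python
from datetime import datetime, timedelta


def trigger_times(start_date, interval, now):
    """List (logical_date, triggered_at) pairs for runs fired by `now`.

    Simplified model: the run covering [logical_date, logical_date + interval)
    is triggered once that interval is over.
    """
    runs = []
    logical_date = start_date
    while logical_date + interval <= now:
        runs.append((logical_date, logical_date + interval))
        logical_date += interval
    return runs


# A daily schedule starting 2020-01-01, observed at noon on 2020-01-03.
runs = trigger_times(
    start_date=datetime(2020, 1, 1),
    interval=timedelta(days=1),
    now=datetime(2020, 1, 3, 12),
)
for logical_date, triggered_at in runs:
    print(f"run for {logical_date:%Y-%m-%d} triggered at {triggered_at:%Y-%m-%d %H:%M}")
```

In this model the first run, labelled 2020-01-01, does not start until 2020-01-02, which is why a freshly deployed daily DAG can appear to be a day late.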

[Data] Explore the Hotel Review Data

In this post, I explore a hotel review dataset from Kaggle using pandas and visualize the data with matplotlib.

[Spark] Run Spark Job on Amazon EMR

Amazon Elastic MapReduce (EMR) is a managed cluster platform on Amazon Web Services (AWS) for big data processing and analysis. It provides a simpler way to run big data frameworks such as Apache Hadoop and Apache Spark.

This post will focus on running Apache Spark on EMR, and will cover:

  • Create a cluster on Amazon EMR
  • Submit a Spark job
  • Load data from and store data to S3
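
To make the job submission step a little more concrete, here is a minimal sketch of an EMR step definition that runs `spark-submit` through `command-runner.jar`. It only builds the dictionary. A dict of this shape is what the boto3 EMR client's `add_job_flow_steps` accepts in its `Steps` list. The helper name `spark_step`, the step name, and the S3 paths are hypothetical.

```python
import json


def spark_step(name, script_uri, *script_args):
    """Build an EMR step that submits a Spark application in cluster mode.

    `script_uri` is the S3 location of the application; `script_args`
    are passed through to the application unchanged.
    """
    return {
        "Name": name,
        # Keep the cluster running if this step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                *script_args,
            ],
        },
    }


# Hypothetical bucket and paths: read input from S3, write results back to S3.
step = spark_step(
    "word-count",
    "s3://example-bucket/jobs/word_count.py",
    "s3://example-bucket/input/",
    "s3://example-bucket/output/",
)
print(json.dumps(step, indent=2))
```

Passing the input and output locations as arguments keeps the S3 paths out of the job code, so the same script can be reused across datasets.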

[Tensorflow] Beginner's Notes (3) Variable

With a basic understanding of TensorFlow's architecture and Sessions in place, the next step is to see how TensorFlow uses Variables to maintain state.

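[Tensorflow] Beginner's Notes (2) Session

Next, let's look at Sessions.

A Session is the construct TensorFlow uses to execute operations. You can call session.run() to evaluate part of a graph that has already been built. As before, the notes and sample code on Sessions here come from the official documentation.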

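[Tensorflow] Beginner's Notes (1) Introduction to TensorFlow

This series records my journey learning Google's TensorFlow, along with a few small takeaways, mainly based on the official documentation.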

Let's start with an introduction.

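[Python] A Roundup of NLTK Tools

I've noticed my memory is getting worse and worse: I forget every tool right after using it and have to look everything up from scratch each time. So here are notes on the ones I use most often.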

Flask + mod_wsgi + Apache on Windows

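A few days ago I was handed a last-minute task: write an API. I finished it quickly and everything went smoothly. Until yesterday, when my mentor told me: this service is going to be hosted on Windows Server, you know…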