© 2026 duyet.net | Sr. Data Engineer | 2026-03-20

Duyệt, Sr. Data Engineer


[Slide] Build simple data pipeline for ETL and data aggregation on AWS

Note: This post is over 8 years old. The information may be outdated.

The goal of this document is to develop a simple data pipeline for ETL and data aggregation.
I recently gave a short talk on building a data pipeline on AWS for ETL and data aggregation, and I would like to share the slides here.

If you cannot view the slides, please download them from the following link: https://talk.duyet.net/data-pipeline-aws/design-datapipeline-aws.pdf

Nov 12, 2018 | 1 min read | Data Engineering

Data Engineering Talk
