© 2026 duyet.net | Sr. Data Engineer

Duyệt, Sr. Data Engineer

Data Engineering

57 articles exploring Data Engineering and architecture

57 posts
9 years

2023

Apache OpenDAL in Rust to Access Any Kind of Data Services
Sep 09
DuckDB
Sep 03
Airflow: controlling parallelism and concurrency (draw)
Jul 16
Fossil Data Platform Rewritten in Rust 🦀 [Featured]
Jun 18
Running Spark in GitHub Actions [Featured]
May 07
GPT vs Traditional NLP Models
Apr 01
Ask ChatGPT about 20 important concepts of Apache Spark
Feb 26
Rust Data Engineering: Processing Dataframes with Polars
Feb 19
Data Engineering Tools written in Rust [Featured]
Jan 22
Why ClickHouse Should Be the Go-To Choice for Your Next Data Platform? [Featured]
Jan 10

2022

Airflow Dataset (Data-aware scheduling)
Sep 27
Spark on Kubernetes at Fossil 🤔 [Featured]
Mar 09
Manage Redshift/Postgres Privileges GitOps Style
Feb 24

2021

Rust and Data Engineering? 🤔 [Featured]
Nov 27
Spark on Kubernetes - better handling for node shutdown [Featured]
Nov 22
Uptime with GitHub Actions
Sep 20
From Docker to Podman on MacOS
Sep 05
Good reasons to use ClickHouse
Aug 29
zx
Aug 28
Bitbucket Pipelines Notes
Aug 27
Postgres Full Text Search
Jul 04
Spark on Kubernetes Performance Tuning
Apr 10

2020

Airflow 2.0 - Taskflow API [Featured]
Dec 26
Why You Should Deploy Apache Spark on Kubernetes
Oct 24
Scheduling Python script in Airflow
Jun 24
Spark History Server on Kubernetes [Featured]
May 29
3 ways to run Spark on Kubernetes [Featured]
May 24
Airflow DAG Serialization
May 01
Data Studio: Connecting BigQuery and Google Sheets to help with hefty data analysis
May 01

2019

Evaluating Information Retrieval Systems (continued)
Oct 09
Good Books (Engineering)
Sep 17
Evaluating Information Retrieval Systems [Featured]
Aug 31
Information Retrieval - Vector Space Model [Featured]
Aug 30
Airflow - some notes
Aug 27
Installing Apache Airflow with Docker Compose
Aug 26
Sending Slack Alerts from Airflow
Aug 20
Airflow - "context" dictionary
Aug 09
Guess.js
Aug 09

2018

[Slide] Build simple data pipeline for ETL and data aggregation on AWS
Nov 12
Deploy Deep Learning model as a web service API
Jul 21
Using PyTorch with a Free GPU on Google Colab
Jun 03
Propel - Machine learning for Javascript
Mar 01
Duckling - parsing text into structured data
Feb 19

2017

Colaboratory - Google's custom version of Jupyter Notebook
Nov 07
Python - Car recognition with OpenCV
Sep 20
Text Classification
Aug 11
natural - NLTK for Javascript
Aug 06
Installing the pre-built Apache Spark standalone release
May 31
NLP - Truyện Kiều Word2vec
Apr 16
Learning R cheatsheet
Feb 05
Rancher - Managing Docker Containers with a UI
Jan 23

2016

vnTokenizer on PySpark
Dec 14
R on Jupyter Notebook (Ubuntu 14.04 / 14.10 / 16.04)
Nov 22
Spark: Convert Text (CSV) to Parquet to optimize Spark SQL and HDFS
Sep 21
Running Apache Spark with Jupyter Notebook
Sep 20
PySpark - Missing Python libraries on Workers
Sep 08

2015

Understanding advertising systems and online advertising
May 17