This is Wei Lee. I'm a
- Pythonista
- PyCon Taiwan organizer
- commitizen-tools maintainer
- Apache Airflow committer
- #apache-airflow Mentor and Memebot Bot @ OpenSource4You
- Traveler
- Member of 台湾野クル▲
- Anime Lover
- Reader
- Ukulele Player
- Locker
I enjoy automating tedious tasks and writing high-quality code. I also enjoy participating in open-source communities and contributing to open-source projects. Traveling is another passion of mine, and I often use PyCon as an opportunity to explore new places. I have attended PyCon Taiwan 🇹🇼, PyCon US 🇺🇸, PyCon JP 🇯🇵, PyCon CA 🇨🇦, Remote Python Pizza, EuroPython (remotely) 🇪🇺, and PyCon APAC 🇵🇭.
I share my technical notes, book digests, and occasional thoughts here. If you're interested in cooking, anime, and traveling, I chat about those things on Those things no one cares about.
Skill
- Programming Language: Python
- Data Engineering: Snowflake, Redis, SQLite, PostgreSQL, MySQL, Redshift
- MLOps: Apache Airflow, DVC, dbt, Great Expectations
- Backend Development: FastAPI, Flask, Django
- DevOps tools and others: GitHub Actions, Docker, Kubernetes, Jenkins, Git, AWS Services
Work Experience
[Aug 2024 - Current] Senior Software Engineer, Astronomer
- apache-airflow
- ruff
- Implement most of the AIR3XX rules to facilitate the migration from Airflow 2 to Airflow 3
[Feb 2023 - July 2024] Software Engineer, Astronomer
- apache-airflow
- Allow Airflow tasks to execute directly from the trigger
- Add REST API endpoint to manipulate queued dataset events
- Upgrade apache-airflow-providers-weaviate to 2.0.0 for weaviate-client >= 4.4.0 support
- Add Azure managed identities support to apache-airflow-providers-microsoft-azure
- Add default_deferrable configuration for easily turning on the deferrable mode of operators
- astronomer-providers
- Contribute existing operators/sensors back to apache-airflow and deprecate this project to reduce maintenance efforts
- Automated the deployment of integration tests and testing against the release of the airflow provider (#987, #1107, #1139, #1110)
- ask-astro
- Set up local dev tools and fixed various existing bugs
[Apr 2017 - Feb 2023] Machine Learning Engineer, Rakuten USA
- Productionize machine learning projects
- Implemented SQS and gRPC services for grouping emails with similar structures and extracting user-sensitive data to increase the amount of training data without violating customer privacy regulations.
- Designed and implemented a two-stage labeling system that automatically communicates between Amazon Mechanical Turk and in-house experts to generate high-quality labeled data and enhance merchandise taxonomy to increase customer conversion rate.
- Migrated and automated the deployment process of AWS Lambda procedures that process customer lifetime value, reducing the effort of maintenance and deployment.
- Build and maintain data pipelines on Apache Airflow
- Implemented a pipeline that processes data larger than 10 GB to infer personalized preferences to help increase customer satisfaction.
- Migrated a legacy Airflow 1.x server on AWS EC2 to Airflow 2.0.2 on AWS MWAA, saving developers the effort of dealing with legacy dependency issues, and created a development Airflow environment for running experiments without affecting the production pipeline.
- Refactored data writing mechanism and reduced the data write time and AWS S3 cost.
- Built alerts and dashboards with DataDog, Prometheus, and Kibana to monitor pipeline metrics, minimizing troubleshooting effort.
- Standardize and maintain software engineering practices
- Created and maintained project templates with automated code quality checks, testing, containerization, project versioning, releasing, and deployment, along with a standard workflow for existing projects to update their tools. This reduced project creation time and communication overhead during code review, and gave developers an easy way to introduce new standards.
- Implemented a life-cycle configuration management tool and a workflow for creating Amazon Sagemaker notebook instances, which saves data scientists' time in handling engineering problems.
- Improved container build time and reduced execution time by 70% for Jenkins CI/CD pipelines.
- Maintain the core package that's used among most existing projects
- Optimized SQL in a data pipeline and reduced the execution time from infeasible to within half a day.
- Cooperate with overseas teams in the US, Ukraine, and India
[Jan 2019 - March 2019] Project Manager, DLT Lab
- Containerized and fixed legacy projects in The Mosquito Man
- Introduced code review culture to a newly formed team
- Set up a Drone CI/CD server and created CI pipelines for two ongoing projects
[May 2018 - Nov 2018] Chief Teaching Assistant, X-Village
- Managed the executive team with 16 members
- Organized two months of full-time courses and a one-semester 3-credit course
- Reviewed the teaching proposal of the Python course, "Programming Design Foundation"
- Designed exercises for "Data Structure," the first section of "Computer Science Foundations."
- Lectured on "Web Programming, Database/Cloud Computing," the fourth section of "Computer Science Foundations"
X-Village is an experimental education program designed to equip students who do not major in computer science with computational thinking skills and to foster future collaboration between computer science and other disciplines.
I was the program executor and the leader of the teaching assistant team. I also designed a half-day exercise for Data Structure and lectured on a four-hour web backend course for Web Programming, Database/Cloud Computing.
[July 2015 - July 2016] Substitute Military Service, K-12 Education Administration, Ministry of Education
- Maintained legacy systems implemented in multiple languages, including C#, VBScript, PHP, etc.
- Developed automation programs for generating reports, which saved 80% of human labor time
- Delivered a human resource management system using Django
Community Involvement
[Nov 2023 - Current] Volunteer, PyCon Taiwan
- Maintain pycontw-blog
[Nov 2022 - Sep 2023] Marketing Team Lead, PyCon Taiwan 2023
- Migrated PyCon Taiwan Blog to pycontw-blog / https://conf.python.tw
[Nov 2021 - Sep 2022] Vice-Chairperson, PyCon APAC 2022
- Coordinated three squads, including planning, sponsorship, and social media
- Hosted the first Ask Me Anything event to promote the Call for Proposals
[Oct 2020 - Nov 2021] Chairperson, PyCon Taiwan 2021
- Coordinated 9 teams and hosted the first online PyCon TW with 550 participants
[Dec 2019 - Sep 2020] Program Chair, PyCon Taiwan 2020
- Coordinated around 20 team members and introduced community tracks and a speaker-dispatch program to increase the interaction between local communities.
[Jul 2019 - Nov 2019] Program Committee Member, PyCon Taiwan 2019
- Contacted keynote speakers and financial aid applicants
- Contributed to the post-event report generator
Talk and Tutorial
- Hold on! You have a data team in PyCon Taiwan!
- 2025/07/ 🇨🇿 - EuroPython 2025
- ๆ่ไนๆ
- 2025/06/11 🇹🇼 - ๅทฅ็จๅธซ็ๆๅฐ็ด้ - slide
- Airflow 3.0 The First Glance
- 2025/03/28 🇹🇼 ้ป้ๆตๆฒ้ฅ ้ ญ็ - slide
- 踏入開源的第一步 (The First Step into Open Source)
- 2025/03/16 💻 NetDB - Tech Day, Invited Talk - slide
- Unleash the Chaos: Developing a Linter for Un-Pythonic Code!
- 2025/03/02 🇵🇭 PyCon APAC 2025 - slide
- 2024/09/21 🇹🇼 PyCon TW 2024 - slide, 🎬 recording
- Unlocking Python's Core Magic
- 2024/09/28 🇯🇵 PyCon JP 2024 - slide, 🎬 recording
- What If...? Running Airflow Tasks without the workers
- 2024/09/11 🇺🇸 Airflow Summit 2024 - slide, 🎬 recording
- Starts Airflow task execution directly from the triggerer
- 2024/05/08 💻 Airflow Town Hall - slide
- Intro to Airflow - From Zero to Hero
- 2024/02/17 💻 源來適你 (OpenSource4You) - slide
- Atomic Commits: An Easy & Proven Way to Manage & Automate Release Process
- 2023/07/29 🇹🇼 COSCUP 2023 - slide, 🎬 recording
- Python Table Manners
- 2020/11/07 🇹🇼 Taichung.py - slide
- 2020/10/16 🇹🇼 Hualien.py - slide
- 2020/08/31 🇹🇼 Kaohsiung.py - slide
- 2020/07/24 💻 EuroPython 2020 - slide, 🎬 recording
- 2019/11/17 🇨🇦 PyCon CA 2019 - slide
- 2019/10/24 🇹🇼 Taipei.py
- commitizen-tools: What can we gain from crafting a git message convention?
- 2020/06/18 🇹🇼 Taipei.py - slide
- 2020/04/25 💻 Remote Python Pizza 2020 - slide
- How to get more than PyCon in a PyCon
- 2019/09/16 🇯🇵 PyCon JP 2019 - Peer Reviewed Lightning Talk - slide
- X-Village - 用不到兩個月準備兩個月的課程 (Preparing a two-month course in less than two months)
- 2019/03/24 🇹🇼 SITCON 2019 - slide, 🎬 recording
- Intro to Python Data Science Tools
- CRUD in Flask
- 2018/08/16 🇹🇼 X-Village - Web Course - slide
- 資管講座 (a talk at an Industrial and Information Management camp)
- 2017/01/22 2018成大工資管營 (2018 NCKU Industrial and Information Management Camp) - slide
- Bot Development
- 2016/12/08 🇹🇼 NCKU CSIE - Introduction to Knowledge Discovery and Data Engineering - slide
- Keras Demo
- 2016/11/03 🇹🇼 ๆทฑๅบฆไนๅค - slide
For more slides, please check my Speaker Deck.
Podcast / Show
Development Sprint
- Apache Airflow
- DurianPy
- PyCon APAC 2025
- PyCon TW 2024
- commitizen-tools
- PyCon US 2024
- COSCUP 2024
- PyCon TW 2023
- PyCon TW 2022
- PyCon TW 2021
- PyCon TW 2020
- PyCon CA 2019
Award
- Honorable Mention, 2013 Railway Application Section Problem Solving Competition
Publication
- Wei Lee, Chien-Wei Chang, Po-An Yang, Chi-Hsuan Huang, Ming-Kuang Wu, Chu-Cheng Hsieh, Kun-Ta Chuang, "Effective Quality Assurance for Data Labels through Crowdsourcing and Domain Expert Collaboration," 21st International Conference on Extending Database Technology, Demo Track (EDBT-2018)
- I-Lin Wang, Wei Lee, Chiao-Yu Liao, "Effective Heuristics for Scheduling Hump and Pullback Engines in Railroad Yard Operational Plans," Proceedings of the 10th Annual Conference of the Operations Research Society at Taiwan (ORSTW 2013)
Education
[2016-2018]
Master, Computer Science and Information Engineering
National Cheng Kung University, Tainan
GPA: 4.16/4.3
[2011-2015]
Bachelor, Industrial and Information Management
Double Major: Computer Science and Information Engineering
National Cheng Kung University, Tainan
GPA: 3.77/4.0 (CSIE GA: 3.87/4.0)
Tutorial and Study Note
Slide
- Git Tutorial
- example: Git-Tutorial-Sample