Skills

  • Programming Language: Python
  • Data Engineering: Snowflake, Redis, MySQL, PostgreSQL, Redshift
  • MLOps: Apache Airflow, DVC, dbt, Great Expectations
  • Backend Development: Flask, Django, FastAPI
  • DevOps tools and others: Docker, Kubernetes, Jenkins, GitHub Actions, Git, AWS Services

Work Experience

[Feb 2023 - Present] Software Engineer, Astronomer

  • apache-airflow
    • Fixed a circular import error prior to releasing new Airflow providers (#31379)
    • Fixed an Amazon provider bug caused by the new Airflow providers release (#31482)
    • Integrated the AsyncSensors logic into their Sensor counterparts to lessen the maintenance burden (#30014, #30227, #30231, #30235, #30250)
  • astronomer-providers
    • Reduced async operator overhead by adding checks before sending the task to the triggerer (issue 1102)
    • Automated the deployment of integration tests and testing against the release of the Airflow provider (#987, #1107, #1139, #1110)

[Apr 2017 - Feb 2023] Machine Learning Engineer, Rakuten USA

  • Productionize machine learning projects
    • Implemented SQS and gRPC services for grouping emails with similar structures and extracting user-sensitive data to increase the amount of training data without violating customer privacy regulations.
  • Designed and implemented a two-stage labeling system that automatically communicates between Amazon Mechanical Turk and in-house experts to generate high-quality labeled data and enhance merchandise taxonomy to increase customer conversion rate.
  • Migrated and automated the deployment process of AWS Lambda procedures that process customer lifetime value, reducing the effort of maintenance and deployment.
  • Build and maintain data pipelines on Apache Airflow
    • Implemented a pipeline that processes data larger than 10 GB to infer personalized preferences to help increase customer satisfaction.
    • Migrated a legacy Airflow 1.x server on AWS EC2 to Airflow 2.0.2 on AWS MWAA, saving developers the effort of dealing with legacy dependency issues, and created an Airflow development environment for running experiments without affecting the production pipeline.
    • Refactored the data-writing mechanism, reducing data write time and AWS S3 cost.
    • Built alerts and dashboards with DataDog, Prometheus, and Kibana to monitor pipeline metrics and minimize troubleshooting effort.
  • Standardize and maintain software engineering practices
    • Created and maintained project templates with automatic code quality checks, testing, containerization, project versioning, releasing, and deployment, along with a standard workflow for updating the tools in existing projects. This reduced project creation time and the communication overhead during code review, and gave developers an easy way to introduce new standards.
    • Implemented a life-cycle configuration management tool and a workflow for creating Amazon SageMaker notebook instances, saving data scientists' time on handling engineering problems.
    • Improved container build time and reduced execution time by 70% for Jenkins CI/CD pipelines.
    • Maintained a core package used by most existing projects
  • Optimized SQL in a data pipeline and reduced the execution time from infeasible to within half a day.
  • Collaborated with overseas teams in the US, Ukraine, and India

[Jan 2019 - Mar 2019] Project Manager, DLT Lab

  • Containerized and fixed legacy projects in The Mosquito Man
  • Introduced code review culture to a newly formed team
  • Set up a Drone CI/CD server and created CI pipelines for two ongoing projects

[May 2018 - Nov 2018] Chief Teaching Assistant, X-Village

  • Managed the executive team with 16 members
  • Organized two months of full-time courses and a one-semester, 3-credit course
  • Reviewed the teaching proposal of the Python course, "Programming Design Foundation"
  • Designed exercises for "Data Structure," the first section of "Computer Science Foundations"
  • Lectured "Web Programming, Database/Cloud Computing," the fourth section of "Computer Science Foundations"

X-Village is an experimental education program that aims to equip students who are not majoring in computer science with computational thinking and to enhance future cooperation between computer science and other fields.

I was the executor of the program and the leader of the teaching assistant team. I also designed a half-day exercise for Data Structure and gave a four-hour web backend lecture for Web Programming, Database/Cloud Computing.

[Jul 2015 - Jul 2016] Substitute Military Service, K-12 Education Administration, Ministry of Education

  • Maintained legacy systems implemented in multiple languages, including C#, VBScript, PHP, etc.
  • Developed automation programs for generating reports, which saved 80% of manual labor time
  • Delivered a human resource management system using Django

Community Involvement

[Nov 2021 - Present] Vice-Chairperson, PyCon APAC 2022

  • Coordinated 3 squads, including planning, sponsorship, and social media
  • Hosted the first Ask Me Anything event for promoting Call for Proposals

[Oct 2020 - Nov 2021] Chairperson, PyCon Taiwan 2021

  • Coordinated 9 teams and hosted the first online PyCon TW with 550 participants

[Dec 2019 - Sep 2020] Program Chair, PyCon Taiwan 2020

  • Coordinated around 20 team members and introduced community tracks and a speaker-dispatch program to increase the interaction between local communities.

[Jul 2019 - Nov 2019] Program Committee Member, PyCon Taiwan 2019

Talk and Tutorial

For more slides, please check my Speaker Deck.

Award

  • Honorable Mention, 2013 Railway Application Section Problem Solving Competition

Publications

  1. Wei Lee, Chien-Wei Chang, Po-An Yang, Chi-Hsuan Huang, Ming-Kuang Wu, Chu-Cheng Hsieh, Kun-Ta Chuang, "Effective Quality Assurance for Data Labels through Crowdsourcing and Domain Expert Collaboration," 21st International Conference on Extending Database Technology, Demo Track (EDBT-2018)
  2. I-Lin Wang, Wei Lee, Chiao-Yu Liao, "Effective Heuristics for Scheduling Hump and Pullback Engines in Railroad Yard Operational Plans," Proceedings of the 10th Annual Conference of the Operations Research Society at Taiwan (ORSTW 2013)

Education

[2016-2018]
Master, Computer Science and Information Engineering
National Cheng Kung University, Tainan
GPA: 4.16/4.3

[2011-2015]
Bachelor, Industrial and Information Management
Double Major: Computer Science and Information Engineering
National Cheng Kung University, Tainan
GPA: 3.77/4.0 (CSIE GPA: 3.87/4.0)

Additional Experience

Open Source Contributions

Web Service

Chat Bot

Utility

Tutorial and Study Note

Slide

Books

MOOCs
