Location
Corsair - Ho Chi Minh City
Position
Data Analytics Engineer | Applied ML
Description
- Collaborated with Marketing, Finance, Sales, Customer Service, and Operations teams to gather requirements and design AI-driven data solutions
- Configured and deployed Dagster on Docker/Linux servers, managing orchestration environments and ensuring reliable scheduling for data pipelines
- Designed and built data pipelines using Dagster, DBT, and Snowflake. Implemented Kimball dimensional models to support enterprise analytics
- Developed backend services for AI and data applications with Python (FastAPI), LangChain, and LangGraph
- Automated monitoring with email alerts and dashboards (Dagster, DBT, Snowflake, Power BI, Grafana)
- Highlighted Project: Sentiment Analysis
- Built an end-to-end pipeline to process unstructured data sources (Zendesk tickets, Amazon reviews, Reddit comments) into structured datasets using Dagster, DBT, and Snowflake
- Developed an LLM-powered chatbot (LangGraph + FastAPI + Postgres) for customer feedback insights
- Designed prompt-engineered workflows for classification, summarization, and sentiment analysis
- Highlighted Project: Corsair Marketing Intelligence Solution
- Designed automated pipelines integrating Amazon Vendor Central, Keepa, Rainforest, and Stackline APIs to capture product details, reviews, promotions, and dynamic pricing data
- Implemented CDC-enabled ETL workflows with Elementary-Data for data quality monitoring, and explored interactive analytics using Snowflake + Streamlit
- Created feature engineering workflows to train ML models (XGBoost) for forecasting Corsair and competitor market share
- Visualized forecasting outputs in Power BI dashboards to support decision-making
- Highlighted Project: Customer Lifetime Value Prediction
- Integrated Shopify API sales data with Oracle ERP to create a unified customer dataset
- Developed a CLV model using BG/NBD (purchase frequency) and Gamma-Gamma (monetary value) to probabilistically forecast future revenue
- Improved segmentation by distinguishing high-value repeat buyers from potential churners, enabling targeted retention strategies
Skills
- Coding languages: Python, SQL, JavaScript
- Database: PostgreSQL, Redis, Oracle
- Data Warehouse: Snowflake, Google BigQuery
- API: REST, GraphQL, Webhook
- ETL Tools: Orchestration (Dagster), Integration (Airbyte), Transformations (DBT)
- AI techniques: LLMs (OpenAI, Meta Llama), LangChain, RAG, Prompt engineering
- Framework: FastAPI, Streamlit
- BI Tools: Power BI
- Others: Docker, Linux, Git, Postman, AWS, Jira, Excel
- Soft skills: Teamwork (Agile/Scrum)