Location

Corsair - Ho Chi Minh City

Position

Data Analytics Engineer | Applied ML

Description

  • Collaborated with Marketing, Finance, Sales, Customer Service, and Operations teams to gather requirements and design AI-driven data solutions
  • Configured and deployed Dagster on Docker/Linux servers, managing orchestration environments and ensuring reliable scheduling for data pipelines
  • Designed and built data pipelines using Dagster, DBT, and Snowflake. Implemented Kimball dimensional models to support enterprise analytics
  • Developed backend services with Python (FastAPI), LangChain, and LangGraph for AI and data applications
  • Automated monitoring with email alerts and dashboards (Dagster, DBT, Snowflake, Power BI, Grafana)
  • Highlighted Project: Sentiment Analysis
    • Built an end-to-end pipeline to process unstructured data sources (Zendesk tickets, Amazon reviews, Reddit comments) into structured datasets using Dagster, DBT, and Snowflake
    • Developed an LLM-powered chatbot (LangGraph + FastAPI + Postgres) for customer feedback insights
    • Designed prompt-engineered workflows for classification, summarization, and sentiment analysis
  • Highlighted Project: Corsair Marketing Intelligence Solution
    • Designed automated pipelines integrating Amazon Vendor Central, Keepa, Rainforest, and Stackline APIs to capture product details, reviews, promotions, and dynamic pricing data
    • Implemented CDC-enabled ETL workflows with Elementary-Data for data quality monitoring, and explored interactive analytics using Snowflake + Streamlit
    • Created feature engineering workflows to train ML models (XGBoost) for forecasting Corsair and competitor market shares
    • Visualized forecasting outputs in Power BI dashboards to support decision-making
  • Highlighted Project: Customer Lifetime Value Prediction
    • Integrated Shopify API sales data with Oracle ERP to create a unified customer dataset
    • Developed a CLV model combining BG/NBD (purchase frequency) and Gamma-Gamma (monetary value) models to probabilistically forecast future revenue
    • Improved segmentation by distinguishing high-value repeat buyers from potential churners, enabling targeted retention strategies

Skills

  • Coding languages: Python, SQL, JavaScript
  • Database: PostgreSQL, Redis, Oracle
  • Data Warehouse: Snowflake, Google BigQuery
  • API: REST, GraphQL, Webhook
  • ETL Tools: Orchestration (Dagster), Integration (Airbyte), Transformations (DBT)
  • AI techniques: LLMs (OpenAI, Meta Llama), LangChain, RAG, Prompt engineering
  • Framework: FastAPI, Streamlit
  • BI Tools: Power BI
  • Others: Docker, Linux, Git, Postman, AWS, Jira, Excel
  • Soft skills: Teamwork (Agile/Scrum)