• Developed a multi-task deep learning model (decaNLP) for email question answering on Yahoo! Mail data that is more scalable and adaptive than the traditional editorial rule-based extraction baseline
• Built a reusable distributed data processing pipeline with Spark, Hadoop, Hive, and Pig
• Trained the model on 100 million email records and achieved a 90% F1 score