Flight Cancellation Prediction using ML

Project Overview

  • Length: 1 month of preparation and 7 hours of Hackathon
  • Content: MBA students teamed up with Data Scientists from GoodData. We were given public domestic flight data and 7 hours at the Hackathon to produce whatever insights we chose.
  • Data set: 47 million rows, 4GB, CSV file
  • Tools: We used R to build a prediction model, Excel to deliver insights from historical data, and Google Slides for the presentation.

Our approach

  • Team meeting – MBA students: Determined our analysis focus (flight delays? cancellations? others?) and developed hypotheses.
  • Team meetings – MBA students and Data Scientists: Discussed our analysis focus and agreed on the tools and the amount of data we would use during the Hackathon.
  • Hackathon:
    • I presented my idea of building a prediction model of flight cancellation.
    • My team agreed, and GoodData members helped to clean the data and build a multiple regression prediction model.
    • I came up with creative recommendations for how flight search engines like Skyscanner could monetize this model.
    • We presented our insights in terms of “Descriptive”, “Predictive” and “Prescriptive” Analysis.
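
As a rough illustration of what a multiple regression prediction model involves, here is a minimal sketch. It is not the model built at the Hackathon: the team used R, while this sketch uses plain Python, and the predictor names (departure hour, distance), the function names, and the toy rows are all hypothetical. It fits y = b0 + b1*x1 + b2*x2 by ordinary least squares, where y would be a cancellation indicator or rate.

```python
# Illustrative sketch only. The predictors and rows are hypothetical,
# not taken from the Hackathon data set.

def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(rows):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2.

    rows: iterable of (x1, x2, y) tuples.
    Returns [b0, b1, b2] by solving the normal equations (X'X) b = X'y.
    """
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for x1, x2, y in rows:
        features = (1.0, float(x1), float(x2))  # leading 1.0 is the intercept
        for i in range(3):
            xty[i] += features[i] * y
            for j in range(3):
                xtx[i][j] += features[i] * features[j]
    return solve_linear_system(xtx, xty)


def predict(coefficients, x1, x2):
    """Predicted outcome for one flight given fitted coefficients."""
    b0, b1, b2 = coefficients
    return b0 + b1 * x1 + b2 * x2


# Hypothetical toy rows: (departure_hour, distance_in_100_miles, outcome)
toy_rows = [(0, 0, 0.10), (1, 0, 0.12), (0, 1, 0.11), (2, 3, 0.17)]
coefficients = fit_multiple_regression(toy_rows)
```

With a binary cancelled/not-cancelled outcome, a linear fit like this can produce predictions outside the 0 to 1 range, which is why logistic regression is a common alternative for this kind of problem.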

Snapshot of the data set


Final presentation