2018 Columbia Data Science Hackathon
The Columbia Data Science Society, in collaboration with the Tow Center for Digital Journalism, proudly hosted the fourth annual Columbia Data Science Hackathon. We were excited by what you could do in collaboration with other students and mentors using novel datasets provided by our corporate sponsors. We hope you enjoyed the hackathon as much as we did, and we hope to see you again next year!
Tow Center for Digital Journalism
Columbia Tech Ventures
Almost every day, the White House publishes 4-6 stories under the “West Wing Reads” banner. This dataset brings those stories together and includes each story's title, publication, and date of publication, as well as the entities mentioned within the stories and other metadata.
The core of the dataset is a list of all inventions disclosed to Columbia Tech Ventures by inventors dating back to the 1980s. Many of these inventions were discovered in the course of grant-supported research described in academic publications.
Qu Capital provided two time series datasets: tick-level data for bitcoin on a major cryptocurrency exchange and a parsed corpus of Reddit comments from select subreddits.
Joseph D. Jamail Lecture Hall
Saturday, September 29, 2018
Data Never Sleeps
Team Members: Kedi Cui, Zhe Liu, Yang Song, Xiangtian Deng
Constructed a bitcoin trading algorithm that predicts market price with an ensemble of methods: XGBoost, ARIMA, LSTM, and NLP.
Black and White
Team Members: Quan Yuan, Xiaowo Sun, Jie Li, Xiaofan Zhang
Combined machine learning, NLP, and time series analysis for feature engineering, predictive modelling, and the design of an arbitrage strategy.
Team Members: Jinhao Zhang, Mingfeng Li, Yinan Ling, Nan You
Identified high-value inventions from patent data using feature selection and a neural network.
Knowledge Trumps All
Team Members: Thompson Bliss, Jacob Klein, Alex Kim, Patrick Lewis
Created a web app that quantifies the differences in writing style between news publishers and predicts whether an article will be promoted by the West Wing.