Machine Learning for Interviews
Nov
15
7:00 PM

Our next workshop in the fall interview series is Machine Learning for Interviews! We'll cover the machine learning fundamentals you need to ace your next interview for software, data science, and quant roles. The workshop will take place on Zoom.

CDSS Columbia Data Science Society is inviting you to a scheduled Zoom meeting.

Topic: Machine Learning for Interviews
Time: Nov 10, 2021 08:00 PM Eastern Time (US and Canada)

PRESENTATION SLIDES:

https://docs.google.com/presentation/d/1okARJneTGdiPbdZ-c71t3Z3fqW0TAJh_/edit#slide=id.p1

Coding for Interviews
Oct
14
7:00 PM

Looking to learn more about coding for technical interviews in data science roles? CDSS is hosting another interview series in which we'll cover the foundational topics for data science roles so that you can ace your next interview. The first installment in this series is going to be an overview of topics in coding and advice about how to approach problems in technical coding interviews. Open to all skill levels!

At the end, we'll have a Q&A session about what it's like to work at Google, Facebook, the United States Census Bureau, and Wolfram Alpha as an intern.

ZOOM LINK TO JOIN: https://columbiauniversity.zoom.us/j/99905990417

CDSS E-Board Applications
Oct
1
to Oct 13

APPLICATION DEADLINE EXTENDED TO OCTOBER 13th.

Do you want to be part of an incredible, fun and rewarding team? Columbia Data Science Society (CDSS) is recruiting new Executive Board members!
We welcome students from all backgrounds and experience levels within Columbia’s undergraduate and graduate programs. We are looking for candidates who have a passion for data science and a willingness to organize and promote data science-related events, from coding workshops to our annual hackathon. If you are interested, please fill out the form linked below:
https://forms.gle/eT3gRxcx7TEAz2T3A

We will be reviewing applications and conducting interviews on a rolling basis until October 13th at 11:59 pm.

If you have any questions, feel free to email cdss_executives@columbia.edu. We are excited to meet and get to know you!

Data Science in Government, Politics, and Nonprofits
Apr
6
7:00 PM

Join CDSS and our panelists for a discussion on working at the intersection of data science and politics, government, and nonprofits. Through Q&A, attendees will be able to learn about panelists’ experiences, as well as how they can discover opportunities and pursue a career in these fields themselves.

Panelists:
- Ariana Soto serves as Coding it Forward’s Director of Strategic Initiatives and has been with the team since 2018. As Director of Strategic Initiatives, she leads a wide range of student-facing programs, including the Civic Digital Fellowship, Civic Innovation Corps, and First Act Fund. She also oversees Coding it Forward’s communications and recruiting efforts, including a bi-weekly jobs and internships newsletter that reaches thousands of subscribers. Ariana graduated from Harvard College in 2020 with a B.A. in Government and Computer Science and a certificate in Technology Science. She has experience in local government, working with data teams in New York City and her hometown of Los Angeles, CA.

- Joshua Kravitz recently served as a Deputy Data Director on Jon Ossoff’s campaign for U.S. Senate and as the Data Director for Sri Kulkarni’s congressional campaign in TX-22, where he used data to make campaign operations more efficient and scalable. He graduated in June 2020 from Stanford with a B.S. in computer science (focus: systems) and M.S. in statistics (focus: causal inference).

SQL Deep Dive Workshop
Mar
23
7:00 PM

Join CDSS as we take a deep dive into SQL, an essential skill for any data-related role. Structured Query Language (SQL) is the standard language for communicating with relational databases and is used to create, maintain, and retrieve data. This is a hands-on workshop for anyone looking to gain exposure to SQL! Hope to see you there!
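
For a flavor of what creating, maintaining, and retrieving data looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The attendees table, its columns, and the sample names are hypothetical, made up for illustration, and are not taken from the workshop materials.

```python
import sqlite3


def run_demo():
    """Create a table, modify it, and query it; returns the query rows."""
    # An in-memory database: nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Create: define a (hypothetical) table of workshop attendees.
    cur.execute(
        "CREATE TABLE attendees ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, workshop TEXT NOT NULL)"
    )

    # Maintain: insert rows, then update one of them.
    cur.executemany(
        "INSERT INTO attendees (name, workshop) VALUES (?, ?)",
        [("Ada", "SQL"), ("Grace", "SQL"), ("Alan", "R")],
    )
    cur.execute(
        "UPDATE attendees SET workshop = ? WHERE name = ?", ("R", "Grace")
    )

    # Retrieve: count attendees per workshop, in alphabetical order.
    cur.execute(
        "SELECT workshop, COUNT(*) FROM attendees "
        "GROUP BY workshop ORDER BY workshop"
    )
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for workshop, count in run_demo():
        print(workshop, count)
```

The same three kinds of statements (CREATE, INSERT/UPDATE, SELECT) carry over to other relational databases, though exact syntax and data types vary by system.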

Data Visualization in R
Mar
18
6:30 PM

Analyzing data is hard, but making it look good can be even harder. Whether it’s for your research or for an internship, making a great graph is extremely important. 


Join CDSS as we explore Data Visualization in R. Our workshop will be broken up into two sections: a lecture that will involve a gentle introduction to graphing in R with ggplot, and a hands-on portion where we use ggplot and other R libraries to analyze financial datasets. See you there!

Students of Color in Data Science
Mar
11
to Mar 12

Dates/Time:

Industry Panel: Thursday, March 11th 5:30 pm - 6:30 pm EST

Student Panel: Friday, March 12th 1:00 pm - 2:00 pm EST

Coffee Chats: Week of Monday, March 8th, booked by appointment

Coffee Chat Registration: Register Here

The week of March 8th, the Columbia Data Science Society is hosting an event series entitled Students of Color in Data Science. The goal of this event series is both to increase the exposure of students of color at Columbia to data science and to encourage community-building among students breaking into the field.

The event series will have three parts: an industry panel on March 11th composed of representatives from NYT Data Science; a student-led panel on March 12th about how to study data science during your time at Columbia; and virtual “Coffee Chats” throughout the week. The Coffee Chats are 15-30 minute informal 1-on-1 conversations between professionals and students, booked by appointment, meant to foster mentorship and a sense of community among students of color pursuing data science or a related field.

The panelists and representatives all identify as people of color, but all of the events are open to all participants. The event details are below, and you must register to take part in the 1-on-1 Coffee Chats.

Data Ontology 101
Feb
25
6:00 PM

You hear about Big Data all of the time: an organization has more data than can fit in a csv file and it’s too much for one computer to hold, and much more than one person can sort through manually. Before you start fitting your models or presenting your visualizations, how do you even find what you’re looking for? How do you organize Big Data?

In this workshop, we’ll give you the rundown on Data Ontologies, a way of efficiently organizing data such that you can automatically glean the nature and relationships of disparate data elements, and therefore easily find and collect what you’re looking for. Understanding ontologies is an important skill in the data scientist/data engineer toolkit, and ontologies are becoming increasingly common in organizations of all sizes. The workshop, led by the CDSS E-board, will be interactive and will include a Q&A at the end. See you there!

Women in Data Science Panel
Feb
24
8:00 PM

Join CDSS and female data scientists from Instagram, Wolfram Research, and Two Sigma in a conversation about being a woman in data science and their roles in general. With several different fields represented, we aim to inspire and support more female-identifying students to join the data science field. Link to Facebook Event: https://www.facebook.com/events/145064414126905

Wolfram Workshop Series
Jan
28
to Jan 30

Interested in developing a new skill in data science and computation? Want to add a useful certification to your CV/resume? CDSS and Wolfram present a three-part workshop series introducing you to the Wolfram Language and preparing you for the Wolfram Technology Certified Level I Exam. Becoming comfortable in Wolfram Language can be very useful for success in STEM research and classes at Columbia University. Topics addressed will range from data visualization to machine learning. No previous experience is required to participate. It is highly recommended that you attend all three sessions to complete the certification. Please register at https://wolfr.am/ColumbiaDataScienceWorkshop beforehand so we can send you the link! Hope to see you there!

Data Science in Biotechnology Panel w/ EVQLV
Dec
3
5:00 PM

Are you interested in a career at the intersection of biotechnology and analytics? Join CDSS on Thursday, December 3 for our Data Science in Biotechnology Panel. We will be hosting EVQLV co-founders Andrew Satz and Brett Averso, who will talk about their experiences and how they have applied AI and data science across healthcare. They will also discuss their internship program for those who are interested. Find more info and the Zoom link on our Facebook page.

Facebook: https://fb.me/e/1H45WlC9I

2020 Columbia Data Science Hackathon
Sep
19
to Sep 20

We're excited to host our 6th annual Columbia Data Science Hackathon, held in a virtual format for the first time. Come work with novel datasets, present your findings to a panel of judges, and engage with sponsors!
Register here: https://bit.ly/2QnaJFu

This hackathon is open to all participants, though we recommend some experience with data analysis or computer programming. Students from all schools, anywhere in the world, are welcome! Whether you want to compete for the prize money, build your portfolio, or just learn how a hackathon works, we welcome anyone who wants to join. Happy hacking!

https://www.facebook.com/events/308038670274962

Introduction to R Workshop
Mar
7
8:00 PM

Want to get a head start on learning the fundamentals of programming and performing data analysis in R? Do you want to figure out what "TidyVerse" even means? Come join CDSS for our Introduction to R Workshop on Thursday, March 7 from 8-9pm in Math 203. Bring a laptop if you want to follow along. No prior experience in R is required, so all are welcome to join!

https://www.facebook.com/events/302771403729439/

Data Science Case Study for Interview
Oct
30
9:00 PM

Data science interviews often include a case study. Interviewees are frequently expected to come up with a machine learning or statistical approach to solve the problem. This includes working with the interviewer to identify the data needed, the KPIs, and the relevant algorithms. Come join CDSS for a workshop where we show you how to approach cases step-by-step and give you the opportunity to practice one yourself!

Coding for Interview
Oct
25
8:00 PM

As a data scientist, you are not only expected to know your expected values but also your expected runtime. Do you know basic computer science concepts like Big-O, standard data structures, and basic algorithms? Do you think about edge cases and test your code? Worry not! In this session, CDSS will go over important concepts you should know, walk through real interview questions, and share some tips regarding coding.
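
As a small illustration of the Big-O and edge-case ideas mentioned above, here is a minimal sketch comparing linear search, which is O(n), with binary search, which is O(log n) on sorted input. The function names and the comparison counters are illustrative assumptions and are not drawn from the session's actual materials.

```python
def linear_search(items, target):
    """Scan left to right: O(n) comparisons in the worst case.

    Returns (index, comparisons); index is -1 if target is absent.
    """
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search(items, target):
    """Halve the search range each step: O(log n) comparisons.

    Assumes items is sorted in ascending order.
    Returns (index, comparisons); index is -1 if target is absent.
    """
    lo, hi = 0, len(items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if items[mid] == target:
            return mid, comparisons
        if items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1, comparisons


if __name__ == "__main__":
    data = list(range(1024))
    print(linear_search(data, 1023))
    print(binary_search(data, 1023))
    # Edge cases worth testing: empty input and a missing target.
    print(binary_search([], 5))
    print(binary_search(data, -1))
```

Searching for the last element of a 1,024-item list takes 1,024 comparisons with the linear scan but only about log2(1024) = 10 with binary search, which is the kind of gap interviewers tend to probe.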

Machine Learning for Interview
Oct
24
8:30 PM

Machine Learning is the foundation for predictive modeling and is an important component of interviews for Data Scientist roles. Join CDSS as we cover the range of concepts that often get tested and help you ace your technical interviews!

The workshop will cover a range of topics from ML fundamentals to algorithms and specific applications. We will also look at the different ways in which questions are framed in these interviews.

SQL for Interviews
Oct
18
7:30 PM

Data science interviews often test your SQL skills, so to prepare you, CDSS is going to walk through everything you need to know to ace them! We'll start from basic SQL syntax and work our way up to more advanced functions, and we'll walk through how to approach some typical interview questions.

Time: Thursday 10/18, 7:30 - 8:30 PM
Place: Barnard 302

Slides: https://github.com/jillianknoll/SQLinterviews

Resume and Portfolio Workshop
Oct
17
8:00 PM

Want to secure interviews at top tech companies? You need more than solid coding skills. Join our resume/portfolio workshop, where we will show you how to use GitHub to build outstanding repositories and personal websites. Experienced CDSS board members will also be on hand to give personalized suggestions on your resume!

Time: Wednesday 10/17, 8:00 - 9:00 PM
Place: Kent 413

Slides: https://drive.google.com/file/d/1BSYgljgtKOVJMrhEuw8w4S2PAADUNBEP/view

Data Manipulation in R Using dplyr
Sep
25
6:00 PM

  • Brown Institute, School of Journalism

Anyone and everyone is welcome to join. Show us that you are coming by clicking "Going" on this Facebook event!

Data manipulation and feature engineering are crucial steps in Data Science, making them critical skills that every data scientist must possess. Every language has its own tools for accomplishing these tasks and for R, it’s the powerful yet elegant dplyr library.

Get your laptops and join us for a hands-on workshop where we’ll cover the functionality dplyr offers for manipulating data. Knowledge of R is not required for this session. Please have RStudio installed on your computer. We will work on the Titanic dataset from Kaggle (https://www.kaggle.com/c/titanic/data). The dataset is also available here: https://drive.google.com/open?id=1zd5mYiLFXjHNF5eT6yDQdfkSNdiBP5cz

Slides: https://docs.google.com/presentation/d/1JHf-MqTq_Nkt5Z6wscbruwZJeFA_OSQ3Q2Jx4P6omf0/edit?usp=sharing

CDSS X Dataiku: Data for Improved Cities (TechTalk + Recruiting!)
Apr
18
7:30 PM

When: Wednesday, April 18 2018 @ 7:30pm - 9:00pm

Where: Pupin 329, Columbia University

CDSS is hosting our last data science tech talk of the semester with Dataiku! Dataiku’s core product is a complete data science software tool aimed at shortening the time-consuming load-clean-train-test-deploy cycles of building predictive applications. The France-based startup scored a $28 million Series B investment in late 2017, has a super cool office space in downtown Manhattan, and is currently hiring!

At this event, Dataiku’s Lead Data Scientist, Jed Dougherty, will present a project analyzing the largest national dataset on evictions from Kansas City using Dataiku’s platform. Jed will also speak about the full-time and internship opportunities available at Dataiku.

Please RSVP to the event if you can.

Foursquare Tech Talk and Q&A
Apr
11
7:30 PM

About:
Since launching in 2009, Foursquare has collected 12 billion global check-ins, which have formed the cornerstone of its location intelligence. Using this data, Foursquare is able to detect a billion new place visits per month via the activity generated by users and business partners around the world. Foursquare now offers its proprietary location technology to hundreds of other companies, including Apple, Samsung, Microsoft, Twitter, and Airbnb.

We will give an overview of Foursquare's data and how it has evolved, starting with the check-in and expanding to include continuously-detected visits on millions of smartphones. We will also describe Foursquare's core location technology, Pilgrim, how it works, and some of the data science challenges it has generated.

Speaker:
Adam Waksman is the Director of Engineering and Data Science at Foursquare, with oversight of Pilgrim. He has worked in various startup areas, including healthcare technology, fintech, and location intelligence. Most notably, he was Chief Technology Officer at Epickk and was a day-one member at Arcesium. Prior to that, he earned his Ph.D. at Columbia University; his academic papers in computer science and neuroscience have resulted in close to a thousand citations.

Basketball Analytics - Using Machine Learning on Player Tracking Data
Apr
2
to Apr 9

When: Monday, April 2 2018 @ 7:30pm

Where: Fayerweather 313

Join CDSS and Suraj Keshri and Min-hwan Oh, two PhD students in the Operations Research department, for an exciting talk on advanced analytical techniques in basketball. The talk will discuss ongoing work on exploiting optical tracking data to develop new metrics that better characterize player strengths, including understanding defensive assignment, automatic event detection, and combining trajectory modeling with shot efficiency. Methodologically, this work relies on hidden Markov models, logistic regression, deep neural nets, and unidirectional and bidirectional Long Short-Term Memory (LSTM) networks.

Please RSVP.

CDSS x McKinsey/QuantumBlack
Mar
29
7:30 PM

Come join McKinsey’s New Ventures, Advanced Analytics, and QuantumBlack practices on March 29th, 2018 at 7:30pm ET for a conversation about our work in data analytics and machine learning. We will explain the work these groups do and then talk through a recent project that leveraged machine learning to predict and prevent injuries for a professional sports team. We will also discuss the data analytics and machine learning roles at McKinsey and QuantumBlack. Following the talk, there will be a panel discussion to answer any questions. We look forward to meeting you!

Please arrive at 7:30pm sharp, as we’ll be beginning the presentation then.

Agenda:

7:30 – 8:30pm: Introductions, Analytics Case Presentation, Recruiting Process Overview

8:30 – 9:00pm: Q&A

Presenter Bios:

Muneeb Alam - Muneeb is an Analytics Fellow in McKinsey’s Public Sector Analytics group. He joined McKinsey right out of university, with a BA in astrophysics from Columbia and a Master’s in Analytics from Imperial College London. He’s served clients in corrections, tax, and education.

Daniel First – Daniel is a Data Scientist at QuantumBlack, a subsidiary of McKinsey that specializes in machine learning. After pursuing graduate studies at Columbia’s Data Science Institute as an NSF research fellow, he joined McKinsey initially as a management consultant, before moving over to his current role at QuantumBlack. His work has centered around collaborating with doctors and hospitals to design innovative, data-driven solutions to improve outcomes for patients, by forecasting and preventing medical risks. He has also published on the social and political implications of Artificial Intelligence. He holds a master’s degree in philosophy from the University of Cambridge and an undergraduate degree in neuroscience from Yale University. 

Ishneet Kaur – Ishneet is a Risk Advanced Analytics Fellow with experience in risk identification and stress testing. She interned with Risk Analytics in the summer of 2016 before joining McKinsey full time in July of 2017. Ishneet holds a Master’s in Applied Economics from Cornell University.

CDSS x IBM: Science for Social Good
Mar
28
7:30 PM

Join IBM Research on March 28th, 2018 at 7:30pm ET for a conversation about our work using Data Science for Social Good. Members of IBM Research will give an overview of the program and walk through recent projects that leveraged machine learning to produce social good. Projects include using a natural language processing-based methodology to accelerate the workflow of policy experts at UNDP, as well as Accelerate Science Discovery. Following the talk, there will be a panel discussion to answer any questions. We look forward to meeting you!

Speakers’ Bios:
Kush R. Varshney
Kush R. Varshney was born in Syracuse, NY in 1982. He received the B.S. degree (magna cum laude) in electrical and computer engineering with honors from Cornell University, Ithaca, NY, in 2004. He received the S.M. degree in 2006 and the Ph.D. degree in 2010, both in electrical engineering and computer science from the Massachusetts Institute of Technology (MIT), Cambridge. While at MIT, he was a National Science Foundation Graduate Research Fellow.

Dr. Varshney is a research staff member and manager with IBM Research AI at the Thomas J. Watson Research Center, Yorktown Heights, NY, where he leads the Learning and Decision Making group. He is the founding co-director of the IBM Science for Social Good initiative. He applies data science and predictive analytics to human capital management, healthcare, olfaction, computational creativity, public affairs, international development, and algorithmic fairness, which has led to recognitions such as the 2013 Gerstner Award for Client Excellence for contributions to the WellPoint team and the Extraordinary IBM Research Technical Accomplishment for contributions to workforce innovation and enterprise transformation. He conducts academic research on the theory and methods of statistical signal processing and machine learning. His work has been recognized through best paper awards at the Fusion 2009, SOLI 2013, KDD 2014, and SDM 2015 conferences.

Yaoli Mao
Yaoli Mao is a Ben Wood Research Fellow affiliated with the Institute for Learning Technologies and a Ph.D. student in the Cognitive Science in Education program at Teachers College, Columbia University.
She conducts research using both quantitative and qualitative methods at the intersection of cognitive psychology, human-computer interaction, and learning science. Yaoli is interested in social and cognitive-affective aspects of learning (engagement, boredom, gaming, etc.), learning strategies, and behavior patterns. Her dissertation concerns collective intelligence, exploring knowledge sharing and learning among diverse expertise and how human crowds’ intelligence can be properly evaluated, supported, and elevated by machine learning and system design.

Jonathan Galsurkar
Jonathan will graduate in May with an MS in Data Science from Columbia University. He graduated summa cum laude with a bachelor’s degree in Computer Science and Mathematics. Before coming to Columbia, he served as an adjunct lecturer as well as a developer for a health-tech company. Jonathan’s main areas of interest are machine learning and natural language processing, especially their use for social good. DSI students might recognize Jonathan from the 2017 Columbia Data Science Hackathon, in which his team came in first place. As a Science for Social Good Fellow at IBM, Jonathan focused on sentence/paragraph embedding and semantic search techniques and applications.
