The Anatomy of the NBA Ecosystem
Feb 28, 7:00 pm

  • Roone Arledge Cinema, Lerner Hall

For more details and to RSVP, visit the Facebook Event

In a preview of his Sloan Analytics Conference talk, Evan Wasch, SVP of Basketball Strategy & Analytics at the NBA, will break down how the NBA League Office uses data science and analytics to make complex changes to the game and league. A full preview of the talk can be found below:

Behind the scenes, the NBA game comprises an intricate web of decision points, or “levers,” all designed to maximize the quality of the core basketball product. From in-game aspects like game rules and format to competitive considerations like scheduling, playoff structure, and the draft lottery, the levers of the ecosystem are interconnected, and even small changes can have significant ripple effects. This presentation will discuss these complex interactions and give a look under the hood at how the NBA uses analytics to make key strategic decisions related to its core product.

Recommendation Systems and Deep Learning at eBay
Feb 23, 7:00 pm

  • Broadway Room, Lerner Hall

RSVP to the Facebook Event!

Recommender systems have become an integral method for discovery of content on the web, such as music, movies, books, search queries, social media content, and consumer products. In the context of e-commerce websites like eBay, they can be a critical part of a user's shopping experience, helping buyers find the best products for them from among the millions that are available. A technique called collaborative filtering is the foundation of modern recommender systems, where behavioral signals such as item clicks and purchases are used to predict whether a user will find an item relevant. In this talk, we give an overview of standard collaborative filtering techniques, describe the challenges of applying collaborative filtering in a semi-structured marketplace such as eBay, and present how we are leveraging deep learning techniques to overcome the distinctive challenges of building recommender systems for eBay listings.
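As a rough illustration of the collaborative filtering idea described above, here is a minimal item-based sketch using only the Python standard library. The interaction log, user names, and item names are hypothetical, and this is not the method used at eBay or presented in the talk; it only shows how click and purchase signals can be turned into relevance scores.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical implicit-feedback log: (user, item) pairs for clicks or purchases.
interactions = [
    ("ann", "camera"), ("ann", "tripod"), ("ann", "lens"),
    ("bob", "camera"), ("bob", "tripod"),
    ("cat", "camera"), ("cat", "lens"),
    ("dan", "guitar"), ("dan", "amp"),
    ("eve", "guitar"), ("eve", "amp"), ("eve", "camera"),
]

def build_item_users(pairs):
    """Map each item to the set of users who interacted with it."""
    item_users = defaultdict(set)
    for user, item in pairs:
        item_users[item].add(user)
    return item_users

def cosine(users_a, users_b):
    """Cosine similarity between two items viewed as binary user vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))

def recommend(user, pairs, top_n=3):
    """Score unseen items by their summed similarity to the user's items."""
    item_users = build_item_users(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = defaultdict(float)
    for candidate, users in item_users.items():
        if candidate in seen:
            continue
        for owned in seen:
            scores[candidate] += cosine(users, item_users[owned])
    # Sort by score (descending), breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]

print(recommend("bob", interactions))
```

Items that share many users with what a person has already clicked or bought rank highest; items with no overlapping users are dropped.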

Speaker Bio
Daniel Galron is a research scientist and engineer who has worked at eBay since 2014. He earned a PhD in computer science from NYU in 2012, where he worked on machine learning methods for machine translation.

Data Science Case Studies with KPMG
Feb 16, 7:00 pm

  • Hamilton 703

Data Scientists at KPMG work to bring AI, Machine Learning, Advanced Statistical Modeling, and Optimization to our clients. In this talk, we present examples in which we applied Data Science and Analytics to schedule a major sports conference, taught computers to read complex financial documents, and delivered innovative solutions to our Fortune 500 clients.

Speakers: 
John Lee (Director, Data & Analytics): https://www.linkedin.com/in/john-lee-16156417
Arthur Franke (Manager, Data & Analytics): https://www.linkedin.com/in/arthur-franke-271a1643

Job Links:
http://us-jobs.kpmg.com/careers/SearchResults/data%20scientist
http://us-jobs.kpmg.com/careers/SearchResults/%22big%20data%20software%20engineer%22

Women in Data Careers Panel
Feb 7, 7:30 pm

  • Roone Arledge Cinema in Lerner Hall

Thinking about a career in data science? Curious to know what kinds of jobs you can have as a data scientist in Finance, Healthcare, Retail, Media, and more? Join us for the Women in Data Science Career Panel this Thursday! We have invited an accomplished group of ladies from different areas of data science to share their experiences and give us some tips on job search, work life, and how you too can shape the future of data science.

Non-CUID guests must RSVP on Eventbrite

More info on the Facebook Event

Data Science in FinTech
Jan 30, 6:00 pm

  • Davis Auditorium, Schapiro (CEPSR)

Back by popular demand, we've gathered a diverse group of panelists from Betterment, Bloomberg, Kensho, and OnDeck to give their perspectives on the role that data science plays in the FinTech industry. The panelists, as well as representatives from some of the companies, will be available for networking at the conclusion of the discussion. To RSVP and learn more about the panelists, visit the Facebook event!

Introduction to D3
Nov 22, 9:00 pm

  • Math 203

Ever wanted to learn how to make interactive data visualizations?

D3.js is a JavaScript library that lets you create dynamic, interactive data visualizations for the web. It's an incredibly flexible library, handling endless types of customizable visualizations. If you want to learn the basics of D3.js, join CDSS as we welcome Woojin Kim for a workshop.

Woojin Kim is a graduate student in Data Science at Columbia University, graduating this December. He used data visualization and D3 to showcase his work and won a couple of hackathons like the ones CDSS organized. Check out his profile and projects at <http://woojink.com/>.

Data Visualization with Tableau
Nov 21, 7:00 pm

  • Mudd 227

Tableau is one of the leading visualization software products in the business world today. If you want to learn how to make engaging, interactive visualizations with a simple click-and-drag interface, join CDSS as we welcome Tableau expert Adam McCann for a workshop followed by a Q&A session.

Adam McCann is a Specialist Leader with Deloitte Consulting in their Analytics and Information Management group, where he specializes in predictive analytics and data visualization. At Deloitte, he leads a team of consultants at a national intelligence agency developing predictive models and business intelligence solutions. Adam also teaches an information visualization course at Maryland Institute College of Art. His data visualization blog, duelingdata.com, covers topics ranging from movies to sports and politics. He is also a 2015-2017 Tableau Zen Master, a title recognizing the top Tableau practitioners in the world for their mastery of the product.

As a Columbia student, you can download Tableau for free! Just follow the instructions at the link below. Please download Tableau prior to the workshop so you can follow along!

www.tableau.com/academic/students

Scalable Analytics with Spark
Nov 17, 9:00 pm

Apache Spark is one of the most useful big data frameworks. Come learn how you can experiment with it and harness it for large-scale analytics! We’ll cover a conceptual introduction to Spark and the basics of the Python interface, PySpark.

Data Visualization Series - Workshop with Juan Francisco
Nov 16, 7:00 pm

  • Pupin 203

So much of the information we encounter every day is hard to conceptualize. It’s so big and complicated that a visual rendering represents it best. Being a good data designer is crucial to being able to tell the story behind the data.

Last semester, we hosted Juan Francisco Saldarriaga, Mapping and Data Visualization Specialist. He gave us a talk about data visualization techniques, processes, and methods.
This semester, Juan Francisco is back for a workshop! This time we are going to use Processing as a coding environment to master new data visualization techniques. Juan Francisco will guide us through the process of using basic programming skills and concepts to create visually compelling charts and graphs. We will use real data from Citibike to visually analyze the imbalances in the system stations. Students are required to download Processing before the workshop and to have basic knowledge of programming concepts (variables, loops, functions); however, no prior experience with Processing is required for the workshop.

Hacking Data Science Interviews
Nov 15, 9:00 pm

What even is a data science interview? Do they want me to be a developer, or an analyst, or a unicorn? At this workshop we'll go over what to expect in data science interviews, focusing especially on in-person tech interviews. From the basic layout to the right answer to (almost) every "what data structure can you use to make this faster" question, you'll be ready to land the internship or job of your data-science-filled dreams.

Chris Mulligan is currently a Quantitative Researcher at Two Sigma Investments, LP. He builds models to make predictions of financial markets using untraditional data, which is a sentence he never imagined he’d say about himself. Prior to Two Sigma, Chris completed data science internships at Kickstarter, Facebook, and The New York Times, as well as 7 years in political data analysis and modeling, most recently as Director of Analytics at YouGov. Chris received BA and MA degrees in computer science and statistics from Columbia in 2015, where he was a TA for COMS3157 AP and STAT4400 StatML, and cofounded CDSS.

Data Science at Google
Nov 10, 8:00 pm

  • Pupin 214

Mike Jaron, a current QMSS student and Google Data Scientist, will give a talk about his day-to-day work, the culture of data science at Google, and his thoughts on future trends in data science, followed by a Q&A session.

Mike works on the Human/Social Dynamics program, where he specializes in natural language processing and data analysis in Python and R. Bring any questions you have about data science at Google, Mike's career, and anything in between!

RSVP to the FB event here.

Data: Past, Present, and Future
Oct 17, 7:00 pm

  • 750 CEPSR (Schapiro Research Building)

Professors Jones and Wiggins will be holding a discussion and Q+A on the past, present, and future of data in our lives. Each will speak briefly on how students, scholars, and citizens make sense of data in science, public policy, and our personal lives. We invite Columbia University students (all divisions) to RSVP and to offer questions via this form.

Discussion and student questions will guide the direction of the course "Data: Past, Present, and Future" to be taught by professors Jones and Wiggins in Spring 2017, with the support of Columbia's Collaboratory program and the Leibniz Fund.

Participants: 

Professor Matt Jones, James R. Barker Professor of Contemporary Civilization, Department of History

Professor Chris Wiggins, Associate Professor, Department of Applied Physics and Applied Mathematics

Moderator: 

David Madigan, Professor of Statistics, EVP and Dean of the Faculty of Arts and Sciences

Please RSVP on Facebook and the Google form above to guarantee entry. This is an event you won't want to miss!

Network to Get Work
Oct 4, 7:00 pm

  • Roone Arledge Cinema - Lerner Hall

Looking for a job? Trying to launch a business? Need guidance on what to do with your career? Networking is one of the best ways to make professional connections that can assist with employment, starting a business, finding a mentor, and so many other opportunities. But where do you start? How do you build a network? How do you network successfully? 

In this interactive presentation, you will learn how to use LinkedIn to find the right types of connections and how to contact them in a way that gets you a high response rate, how to find those key individuals at companies that you definitely want to speak with, how to get and approach one-on-one meetings, how to navigate networking events, and simple body language and human interaction methods that will take your interpersonal skills to the next level.

CDSS Data Hackathon 2016
Sep 30 to Oct 1

  • Columbia University

We are very excited to be hosting our second annual data science hackathon this Fall! Please check back at the start of September for updates and registration info.

Introduction to Data Science in Python
Sep 26, 9:00 pm

Location: 602 Hamilton

Python is one of the most powerful, easy to learn, and flexible programming languages out there. Why not learn to do data science using Python? We'll be covering the essential tools for doing data science in Python. Hopefully you'll find the material useful leading up to our data science hackathon!

(Note: this is a sibling event to Introduction to Data Science in R. While they won't be 100% analogous, they will be comparable.)

Tentative Topics: Manipulating Data (pandas), Plotting Data (seaborn), Machine Learning Options (scikit-learn)
(Feel free to comment in the event to request specific topics!)

Introduction to Data Science in R
Sep 22, 8:00 pm

Location: 602 Hamilton

Did you enjoy our workshop on Introduction to Programming in R? (If not, you can check out the code we wrote here.) Great: come learn more about data science specifics within R! This will help you build the strong foundation of tools necessary to become a proficient data scientist. Hopefully you'll find the material useful leading up to our Data Science hackathon!

(Note: this is a sibling event to Introduction to Data Science in Python. While they won't be 100% analogous, they will be comparable.)

Tentative Topics: Manipulating Data (dataframes + dplyr), Plotting Data (ggplot2), Machine Learning (lm + caret + e1071)
(Feel free to comment in the event to request specific topics!)

CDSS Town Hall
Sep 12, 8:30 pm

Location: Broadway Room, Lerner Hall

Interested in joining the Columbia Data Science Society Executive Board? Want to learn more about our events for this year or even propose an event yourself? Come meet current board members to discuss formal Executive Board recruitment for the academic year. We are also happy to chat about data science at Columbia and relevant courses being offered this semester. Hope to see you there!

RSVP here: https://www.facebook.com/events/829352830534917

Activities Fair
Sep 9, 12:00 pm

Location: Low Plaza

Come to our table at this year's Activities Fair to learn more about us. You can meet current members, discuss recruitment, and hear about some of the events we will be organizing this year. We are also happy to chat about data science at Columbia and relevant courses being offered this semester. Hope to see you there!

Next Level Data Visualization with Juan Francisco
Apr 20, 9:00 pm

Location: Hamilton 703

So much of the information we encounter every day is hard to conceptualize. It’s so big and complicated that a visual rendering represents it best. Being a good data designer is crucial to being able to tell the story behind the data.

Come push your data visualization techniques, processes, and methods to the next level with Juan Francisco Saldarriaga.

Juan Francisco Saldarriaga is a Mapping and Data Visualization Specialist, and an Architectural Designer and Urban Planner living and working in New York. Juan Francisco has a Master of Science in Urban Planning from Columbia University and a Master of Architecture, also from Columbia. For his undergrad, Juan Francisco studied philosophy at the Université de Paris IV (Sorbonne) and at the Universidad de Los Andes, Bogotá.

Check out some of his work: http://cargocollective.com/juanfrans/

Food will be served!

Data Science at Commonwealth Bank
Apr 19, 8:30 pm

Location: 702 Hamilton

Come hear from Randy Carnevale, the Director of Decision Sciences at Commonwealth Bank! He'll be speaking about a variety of data science initiatives at Commonwealth Bank including graph databases and cloud-based web scraping.

Speaker Bio: https://www.linkedin.com/in/randycarnevale

Tech in Fintech
Apr 12, 5:00 pm

Location: Lerner Auditorium

Tech in Fintech will expose Columbia University’s best and brightest engineers and data scientists to the broad area known as “fintech,” one of the fastest growing industries with its base right here in New York City. Tech in Fintech will operate in a panel format, featuring 4-5 senior-level technical professionals from leading financial technology companies and hosted by a well-regarded Columbia University professor. Panelists will be announced in the coming weeks. The panel will be followed by a networking reception in the same location with panelists and alumni.

Building an Open Source Desktop App
Mar 29, 8:00 pm

Location: Hamilton 702

Today, JavaScript has pretty much taken over your desktop. Many of the apps you use, from Slack to Spotify to Sunrise (alas), are written in JavaScript. It's now easy to develop powerful native desktop apps with the latest JavaScript frameworks and modern web tech. This tech talk will introduce you to new ways to build desktop applications with React, Observables, and Electron, all of which have become really popular of late. ADI presents a talk by Evan Morikawa, an engineer at Nylas. Nylas is a San Francisco-based startup building N1, an extensible email client originally forked from Atom (GitHub's hackable text editor) and now one of the most popular open-source projects on GitHub. This desktop email client is built entirely with modern web tech and allows anyone to build plugins to dramatically enhance what you can do with email. You can check it out at https://www.nylas.com/n1 and see the source code at https://github.com/nylas/N1.

Image Recognition Through Deep Learning with Clarifai
Mar 28, 8:00 pm

Location: Hamilton 703

Come learn about deep learning and image recognition with Matthew Zeiler, a leading expert in machine learning and artificial intelligence. He is the CEO and Founder of Clarifai, a company that specializes in visual recognition and beats the accuracy and speed of the largest tech companies!

Facebook event: https://www.facebook.com/events/775666379234837/

Women in Data Career Panel
Mar 24, 5:45 pm

Location: Roone Arledge Cinema, Lerner Hall

Are you thinking about a career in data science? Are you curious to know what kinds of jobs you can have as a data scientist in Finance, Healthcare, Retail, Media, and more? Join us for the Women in Data Science Career Panel on Thursday March 24th at 5:45pm! We have invited an accomplished group of ladies from different industries and areas of data science to share their experiences and give us some tips on job search, work life, and how you, too, can shape the future of data science.

Our panelists are:
Jiaqi Liu - Data Science at Capital One: https://www.linkedin.com/in/jiaqi-liu-4873b745
Keira Zhou - Data Engineering at Capital One: https://www.linkedin.com/in/keiraqz
Tran Ly - Data Strategist at NY Presbyterian Hospital: https://www.linkedin.com/in/tran-ly-mph-ma-60b78b3
Iva Vukicevic - Data Scientist at Macy's: https://www.linkedin.com/in/ivavuk
Audrey Holmes - Data Scientist at Audible: https://www.linkedin.com/in/audrey-holmes-564a2142
Jiun Kim - Data Scientist at Audible: https://www.linkedin.com/in/jiunkim

There will be a networking dinner following the panel, by application only. Please fill out this form (https://docs.google.com/forms/d/1SuBP7cR21ycO32_Hri9-rMQygwSWgFmoaY4bzv-eaBc/viewform?c=0&w=1&usp=mail_form_link) if you are interested in meeting the panelists in person over a catered dinner!

Undergrad Data Science Panel
Mar 22, 8:00 pm

Location: Math 417

know where to start? Join CDSS E-board members for a panel discussion on what courses and opportunities students should pursue in this growing field. The panelists will include a current sophomore, junior, and senior. Please feel free to submit questions for discussion on this event page. See you on Tuesday!

Intended audience: This event is targeted towards undergraduates.

Invite your friends on Facebook at https://www.facebook.com/events/1702153660065889/.

Dimensionality Reduction with Principal Component Analysis
Mar 21, 9:00 pm

Location: Hamilton 304

Big data is big... really big and also has lots of noise. How do we reduce the dimensionality of these massive datasets to something tractable? Join ADI and learn how to reduce the size of high dimensional datasets using PCA, a popular technique in ML.

No prerequisites are necessary, though some linear algebra background is useful, and we'll take a glance at some Python.

What will I do?
You'll learn about dimensionality reduction, why it matters, a common technique called PCA, and how to use it in Python. Then you can apply it to all the other algorithms you've seen in the Accessible ML series.
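To make the idea concrete, here is a minimal PCA sketch in plain Python. It is not the workshop's demo code: the sample points are made up, and the leading principal component is found with power iteration on the covariance matrix, a simple stand-in for the eigendecomposition or SVD routines a numerical library would provide.

```python
from math import sqrt

def center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def top_component(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalize."""
    dims = len(cov)
    v = [1.0 / sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical 3-D points that mostly vary along the x = y direction.
points = [
    [2.0, 1.9, 0.1],
    [4.0, 4.1, -0.1],
    [6.0, 6.0, 0.0],
    [8.0, 7.9, 0.1],
    [10.0, 10.1, -0.1],
]

centered = center(points)
cov = covariance(centered)
pc1 = top_component(cov)

# Project each 3-D point onto the first component: 3 numbers become 1.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centered]

# Fraction of total variance captured by the first component.
dims = len(cov)
variance_along_pc1 = sum(pc1[i] * cov[i][j] * pc1[j]
                         for i in range(dims) for j in range(dims))
explained = variance_along_pc1 / sum(cov[i][i] for i in range(dims))

print(pc1, explained)
```

Because the sample points lie almost on a line, a single component keeps nearly all of the variance, which is the situation where dimensionality reduction costs the least information.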

Who should come to this event?
Anyone with an interest in machine learning is welcome — no prior experience necessary! We'll start with basic stats and make you a dimensionality reduction pro! Impress your friends with your godly PCA abilities.

What should I bring?
PCA's heavy on concepts, so that's going to be our focus. We'll also have a code demo, but we don't expect you to follow the code so much as see the results. That said, bring a laptop if you want to run the demo on your own machine, in which case we recommend that you install Jupyter notebook (http://jupyter.readthedocs.org/en/latest/install.html) on it prior to the event.

Accessible ML: K-Means Clustering
Mar 2, 9:00 pm

Location: Hamilton 304

Date and time: Wednesday 3/2/2016 9:00 PM

Unsupervised learning requires us to detect underlying patterns in the data without training our models beforehand. Join ADI and CDSS and learn how to use the k-means clustering algorithm to reconstruct images from corrupted datasets! Some statistics understanding is useful, as is experience with Python. We recommend that you bring a laptop and install Jupyter notebook (http://jupyter.readthedocs.org/en/latest/install.html) so you can follow along with the code during the workshop.

What will I do?

You'll write a program in Python which runs the k-means clustering algorithm on an image. Then, you'll be able to reconstruct the image using only the clusters obtained from the data! By the end of the presentation, you'll be able to start applying k-means on a wide variety of unsupervised learning problems.
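The sketch below illustrates the idea on a tiny made-up grayscale "image", using only the Python standard library. It is an assumption-laden toy, not the workshop's notebook: pixel values are clustered in one dimension, the initial centroids are spread evenly over the value range instead of being chosen at random, and the image is then rebuilt using only the cluster centroids.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k groups and return the centroids."""
    # Deterministic start: spread initial centroids evenly across the range.
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster ended up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def reconstruct(image, centroids):
    """Replace every pixel with the centroid closest to it."""
    return [[min(centroids, key=lambda c: abs(p - c)) for p in row]
            for row in image]

# Hypothetical 3x4 grayscale image: a dark left half and a bright right half.
image = [
    [10, 12, 200, 205],
    [11, 14, 198, 210],
    [9, 13, 202, 207],
]

pixels = [p for row in image for p in row]
centroids = kmeans_1d(pixels, k=2)
compressed = reconstruct(image, centroids)

print(centroids)
print(compressed)
```

The reconstructed image contains only two distinct values, one per cluster, yet it preserves the dark/bright structure of the original.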

Who should come to this event?

Anyone with an interest in machine learning is welcome — no prior experience necessary! Some stats background is helpful, but not required. The code will be written in Python.

What should I bring?

Please bring a laptop -- we recommend that you install Jupyter notebook (http://jupyter.readthedocs.org/en/latest/install.html) on it prior to the event.

How can I contact the event organizers?

If you have any questions, feel free to reach out to Kristy (kristy@adicu.com), or Piyali (pm2678@columbia.edu)!

Introduction to Natural Language Processing
Mar 1, 9:00 pm

Location: Hamilton 702

Date and time: Tuesday 3/1/2016 9:00 PM

Join us for a workshop on natural language processing, a field of computer science that approaches problems using textual data in a computational way. In this workshop, we'll go over some fundamental concepts and techniques used in the exciting field of Natural Language Processing!

Who should come to this event?

Anyone remotely interested in learning about natural language processing!

What should I bring?

Bring a laptop!

What should I install beforehand?

Ideally, have Python and Anaconda or Pip (preferably Anaconda) installed. If you want to get ahead, install the nltk and matplotlib libraries (re is part of Python's standard library), but we'll also go through setup!

What topics should I know before coming?

Familiarity with some Python will allow you to get the most from this talk!

Can I use a PC, a Chromebook, or an X computer?

Bring any of the above!

What will I learn?

You'll learn about regular expressions, part-of-speech tagging, sentiment analysis, other NLP techniques, and how to apply them in Python using the nltk and re modules.
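As a small taste of two of these topics, the sketch below uses only the standard-library re module to tokenize text and compute a crude lexicon-based sentiment score. The word lists and function names are hypothetical and chosen for illustration; nltk offers far richer tokenizers, taggers, and sentiment tools than this.

```python
import re

# Hypothetical, tiny sentiment lexicon for illustration only.
POSITIVE = {"great", "love", "useful", "fun"}
NEGATIVE = {"boring", "hate", "confusing", "slow"}

# A word is a run of letters, optionally followed by an apostrophe and
# more letters (so "don't" stays a single token).
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")

def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return TOKEN_PATTERN.findall(text.lower())

def sentiment_score(text):
    """Count positive words minus negative words."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

print(tokenize("Don't worry: NLP is fun!"))
print(sentiment_score("The workshop was great and useful, not boring."))
```

Note that the second sentence scores lower than a human reader would expect, because word counting ignores the negation in "not boring". Handling context like that is one reason to move beyond regular expressions.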

Is there a way to contact any of the event organizers if I have questions not listed here?

Of course! If you have any questions, feel free to reach out to lesley@adicu.com!

Accessible ML: Time Series Forecasting
Feb 29, 8:00 pm

Location: Hamilton 304

Date and time: Monday 2/29/2016 8:00 PM

Time series data presents its own unique challenges and insights. Join ADI and CDSS and learn how to forecast time series data! We will be modeling electricity prices using weather data. Some statistics understanding is useful, as is experience with Python. We recommend that you bring a laptop and install Jupyter notebook (http://jupyter.readthedocs.org/en/latest/install.html) so you can follow along with the code during the workshop.

What will I do?

You'll write a program in Python which runs forecasting techniques on energy and weather data. By the end of the presentation, you'll be able to start using ARIMA models on time series data.
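For a feel of what such a forecast involves, here is a deliberately simplified sketch in plain Python. It fits only a first-order autoregressive model, AR(1), by least squares, which is the simplest autoregressive building block of ARIMA and not a full ARIMA model. The price series is synthetic and the function names are hypothetical; this is not the workshop's code.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x = series[:-1]   # previous values
    y = series[1:]    # next values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast(series, steps, c, phi):
    """Roll the fitted model forward from the last observed value."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions

# Synthetic "prices" generated by x[t] = 5 + 0.8 * x[t-1], starting at 10.
prices = [10.0]
for _ in range(11):
    prices.append(5 + 0.8 * prices[-1])

c, phi = fit_ar1(prices)
next_three = forecast(prices, 3, c, phi)
print(c, phi, next_three)
```

Because the synthetic series follows the model exactly, the fit recovers the generating coefficients and the forecasts drift toward the long-run level c / (1 - phi) = 25. Real electricity prices are noisier and usually need the differencing, moving-average, and exogenous (for example, weather) terms that a full ARIMA-style model adds.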

Who should come to this event?

Anyone with an interest in machine learning is welcome — no prior experience required! Some stats background is helpful, but not required. The code will be written in Python.

What should I bring?

It is very helpful to bring a laptop. We recommend that you install Jupyter notebook (http://jupyter.readthedocs.org/en/latest/install.html) on it prior to the event.

How can I contact the event organizers?

If you have any questions, feel free to reach out to Sunny (sb3436@columbia.edu), one of the event organizers.