So, you want to be a data scientist, but how fast can you answer this question?
If I have two decks of cards each with half blue cards and half green cards, should I draw from the deck of 10 or 100 to maximize the probability of drawing two green cards in a row in the first two draws?
Did it take you longer than you would like? No worries, CDSS has you covered! Technical interviews for data science positions are notorious for throwing hardball statistics questions at you to test not only your math skills, but also your ability to think under pressure. At Probability and Statistics for Interviews, CDSS will walk you through a repeatable process of figuring out the right answer for these kinds of questions. Whether you’re a math wizard or are taking your first probability class, come to our event for an hour of fun and collaborative problem solving!
When : Thursday, November 9 at 8:00pm to 9:00pm
Where : Pupin 214
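For anyone who wants to check their answer to the two-deck question above, here is a minimal sketch of the arithmetic. It assumes the two cards are drawn without replacement and that each deck is exactly half green; the function name p_two_green is made up for this illustration. (If cards were drawn with replacement, both decks would give 1/4 and the choice would not matter.)

```python
from fractions import Fraction


def p_two_green(deck_size: int) -> Fraction:
    """Probability that the first two draws are both green.

    Assumes draws are without replacement and exactly half the deck is green.
    """
    green = deck_size // 2
    first = Fraction(green, deck_size)           # first card is green
    second = Fraction(green - 1, deck_size - 1)  # second is green, given the first was
    return first * second


small = p_two_green(10)
large = p_two_green(100)
print("deck of 10: ", small, float(small))
print("deck of 100:", large, float(large))
print("better deck:", 100 if large > small else 10)
```

Under these assumptions the larger deck wins: removing one green card hurts the odds of the second draw less when there are 99 cards left than when there are 9.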
Are you looking to learn the basics of the pandas library for Python for your next data science project or an upcoming interview? Do you want to find out why Pandas is useful for data science and how it can be used most efficiently? Join us for a short session on the basics of dataframes, file I/O, cleaning and viewing data and preparing dataframes for scikit-learn. We will guide you through the basics of pandas using the Kaggle Titanic dataset.
When : Wednesday, November 8 at 9:00pm to 10:00pm
Where : Hamilton 517
Getting some experience working on real problems and real datasets is a crucial step towards becoming a data scientist, and the best way to do that while you're in school is a summer internship. But finding the perfect internship is tricky, so let us help you! During this panel, you'll hear from some current CDSS board members about their internship experiences at Facebook, Digital Reasoning, and more, from applying and interviewing to preparing for your first day and landing that return offer. Then we'll open up the floor to you, so come with questions!
When : Wednesday, October 1 at 9:00pm to 10:00pm
Where : Hamilton 517
Come on out to the annual CDSS Town Hall meeting! We'll be covering how to join the executive board, events we're planning this semester, and hackathon details. We also want to hear from y'all, so come with questions about anything and everything, as well as ideas for events you want to see this semester.
Dinner will be provided! For more details and to RSVP, visit the Facebook event.
In a preview of his Sloan Analytics Conference talk, Evan Wasch, SVP of Basketball Strategy & Analytics at the NBA, will break down how the NBA League Office uses data science and analytics to make complex changes to the game and league. A full preview of the talk can be found below:
Behind the scenes, the NBA game is comprised of an intricate web of decision points or “levers,” all designed to maximize the quality of the core basketball product. From in-game aspects like game rules and format to competitive considerations like scheduling, playoff structure, and draft lottery, each lever of the ecosystem is interconnected and even small changes can have significant ripple effects. This presentation will discuss these complex interactions and give a look under the hood on how the NBA uses analytics to make key strategic decisions related to its core product.
RSVP to the Facebook Event!
Recommender systems have become an integral method for discovery of content on the web, such as music, movies, books, search queries, social media content, and consumer products. In the context of e-commerce websites like eBay, they can be a critical part of a user's shopping experience, helping buyers find the best products for them from among the millions that are available. A technique called collaborative filtering is the foundation of modern recommender systems, where behavioral signals such as item clicks and purchases are used to predict whether a user will find an item relevant. In this talk, we give an overview of standard collaborative filtering techniques, describe the challenges of applying collaborative filtering in a semi-structured marketplace such as eBay, and present how we are leveraging deep learning techniques to overcome the distinctive challenges of building recommender systems for eBay listings.
Daniel Galron is a research scientist & engineer who has worked at eBay since 2014. He earned a PhD in computer science from NYU in 2012, where he worked on machine learning methods for machine translation.
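To make the collaborative filtering idea from the abstract above concrete, here is a minimal sketch of one textbook variant: item-based filtering with cosine similarity over implicit signals (clicks or purchases). It is not eBay's method and not the deep learning approach the talk covers. The users, items, and function names are all hypothetical, invented for this illustration.

```python
from math import sqrt

# Hypothetical implicit-feedback data: user -> items they clicked or purchased.
interactions = {
    "ana": {"camera", "tripod", "lens"},
    "ben": {"camera", "lens"},
    "cy": {"tripod", "backpack"},
    "dee": {"camera", "backpack"},
}


def item_users(interactions):
    """Invert user -> items into item -> set of users who interacted with it."""
    inverted = {}
    for user, items in interactions.items():
        for item in items:
            inverted.setdefault(item, set()).add(user)
    return inverted


def cosine(users_a, users_b):
    """Cosine similarity between two items, each seen as a binary user vector."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def recommend(user, interactions, top_n=2):
    """Rank unseen items by summed similarity to the items the user already touched."""
    by_item = item_users(interactions)
    seen = interactions[user]
    scores = {}
    for candidate, cand_users in by_item.items():
        if candidate in seen:
            continue
        scores[candidate] = sum(cosine(cand_users, by_item[s]) for s in seen)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("ben", interactions))
```

Two items count as similar when many of the same users interacted with both, so an unseen item scores highly when it co-occurs with things the user already clicked or bought. A sketch like this ignores the marketplace challenges the talk describes, such as listings that appear and disappear quickly.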
Data Scientists at KPMG work to bring AI, Machine Learning, Advanced Statistical Modeling, and Optimization to our clients. In this talk, we present examples of how we have applied Data Science and Analytics to schedule a major sports conference, teach computers to read complex financial documents, and provide innovative solutions to our Fortune 500 clients.
John Lee (Director, Data & Analytics): https://www.linkedin.com/in/john-lee-16156417
Arthur Franke (Manager, Data & Analytics): https://www.linkedin.com/in/arthur-franke-271a1643
Thinking about a career in data science? Curious to know what kinds of jobs you can have as a data scientist in Finance, Healthcare, Retail, Media, and more? Join us for the Women in Data Science Career Panel this Thursday! We have invited an accomplished group of ladies from different areas of data science to share their experiences and give us some tips on job search, work life, and how you too can shape the future of data science.
Non-CUID guests must RSVP on Eventbrite
More info on the Facebook Event
Back by popular demand, we've gathered a diverse group of panelists from Betterment, Bloomberg, Kensho, and OnDeck to give their perspectives on the role that data science plays in the FinTech industry. The panelists, as well as representatives from some of the companies, will be available for networking at the conclusion of the discussion. To RSVP and learn more about the panelists, visit the Facebook event!
Ever wanted to get started making interactive data visualizations?
Woojin Kim is a graduate student in Data Science at Columbia University, graduating this December. He used data visualization and D3 to showcase his work and won a couple of hackathons like the ones CDSS organized. Check out his profile and projects at <http://woojink.com/>.
Tableau is one of the leading visualization software products in the business world today. If you want to learn how to make engaging, interactive visualizations with a simple click-and-drag interface, join CDSS as we welcome Tableau expert Adam McCann for a workshop followed by a Q&A session.
Adam McCann is a Specialist Leader with Deloitte Consulting in their Analytics and Information Management group, where he specializes in predictive analytics and data visualization. At Deloitte, he leads a team of consultants at a national intelligence agency developing predictive models and business intelligence solutions. Adam also teaches an information visualization course at Maryland Institute College of Art. His data visualization blog duelingdata.com covers topics ranging from movies to sports and politics. He is also a 2015-2017 Tableau Zen Master, a title recognizing the top Tableau practitioners in the world for their mastery of the product.
As a Columbia student, you can download Tableau for free! Just follow the instructions at the link below. Please download Tableau prior to the workshop so you can follow along!
So much of the information we encounter every day is hard to conceptualize. It’s so big and complicated that a visual rendering represents it the best. Being a good data designer is crucial to being able to tell the story behind the data.
Last semester, we hosted Juan Francisco Saldarriaga, Mapping and Data Visualization Specialist. He gave us a talk about data visualization techniques, processes, and methods.
This semester, Juan Francisco is back for a workshop! This time we are going to use Processing as a coding environment to master new data visualization techniques. Juan Francisco will guide us through the process of using basic programming skills and concepts to create visually compelling charts and graphs. We will use real data from Citibike to visually analyze the imbalances in the system stations. Students are required to download Processing before the workshop and to have basic knowledge of programming concepts (variables, loops, functions); however, no prior experience with Processing is required for the workshop.
What even is a data science interview? Do they want me to be a developer, or an analyst, or a unicorn? At this workshop we'll go over what to expect in data science interviews, focusing especially on in-person tech interviews. From the basic layout to the right answer to (almost) every "what data structure can you use to make this faster" question, you'll be ready to land the internship or job of your data-science-filled dreams.
Chris Mulligan is currently a Quantitative Researcher at Two Sigma Investments, LP. He builds models to make predictions of financial markets using untraditional data, which is a sentence he never imagined he’d say about himself. Prior to Two Sigma, Chris completed data science internships at Kickstarter, Facebook, and The New York Times, as well as 7 years in political data analysis and modeling, most recently as Director of Analytics at YouGov. Chris received BA and MA degrees in computer science and statistics from Columbia in 2015, where he was a TA for COMS3157 AP and STAT4400 StatML, and cofounded CDSS.
Mike Jaron, current QMSS student and Google Data Scientist, will give a talk about his day-to-day work, the culture of data science at Google, and his thoughts on future trends in data science, followed by a Q&A session.
Mike works on the Human/Social Dynamics program, where he specializes in natural language processing and data analysis in Python and R. Bring any questions you have about data science at Google, Mike's career, and anything in between!
RSVP to the FB event here.
Professors Jones and Wiggins will be holding a discussion and Q+A on the past, present, and future of data in our lives. Each will speak briefly on how students, scholars, and citizens make sense of data in science, public policy, and our personal lives. We invite Columbia University students (all divisions) to RSVP and to offer questions via this form.
Discussion and student questions will guide the direction of the course "Data: Past, Present, and Future" to be taught by professors Jones and Wiggins in Spring 2017, with the support of Columbia's Collaboratory program and the Leibniz Fund.
Professor Matt Jones, James R. Barker Professor of Contemporary Civilization, Department of History
Professor Chris Wiggins, Associate Professor, Department of Applied Physics and Applied Mathematics
David Madigan, Professor of Statistics, EVP and Dean of the Faculty of Arts and Sciences
Please RSVP on Facebook and the Google form above to guarantee entry. This is an event you won't want to miss!
Looking for a job? Trying to launch a business? Need guidance on what to do with your career? Networking is one of the best ways to make professional connections that can assist with employment, starting a business, finding a mentor, and so many other opportunities. But where do you start? How do you build a network? How do you network successfully?
In this interactive presentation, you will learn how to use LinkedIn to find the right types of connections and how to contact them in a way that gets you a high response rate, how to find those key individuals at companies that you definitely want to speak with, how to get and approach one-on-one meetings, how to navigate networking events, and simple body language and human interaction methods that will take your interpersonal skills to the next level.
Location: 602 Hamilton
Python is one of the most powerful, easy to learn, and flexible programming languages out there. Why not learn to do data science using Python? We'll be covering the essential tools for doing data science in Python. Hopefully you'll find the material useful leading up to our data science hackathon!
(Note: this is a sibling event to Introduction to Data Science in R. While they won't be 100% analogous, they will be comparable.)
Tentative Topics: Manipulating Data (pandas), Plotting Data (seaborn), Machine Learning Options (scikit-learn)
(Feel free to comment in the event to request specific topics!)
Location: 602 Hamilton
Did you enjoy our workshop on Introduction to Programming in R? (If not, you can check out the code we wrote here.) Great: come learn more about data science specifics within R! This will help you build the strong foundation of tools necessary to become a proficient data scientist. Hopefully you'll find the material useful leading up to our Data Science hackathon!
(Note: this is a sibling event to Introduction to Data Science in Python. While they won't be 100% analogous, they will be comparable.)
Tentative Topics: Manipulating Data (dataframes + dplyr), Plotting Data (ggplot2), Machine Learning (lm + caret + e1071)
(Feel free to comment in the event to request specific topics!)
Location: Broadway Room, Lerner Hall
Interested in joining the Columbia Data Science Society Executive Board? Want to learn more about our events for this year or even propose an event yourself? Come meet current board members to discuss formal Executive Board recruitment for the academic year. We are also happy to chat about data science at Columbia and relevant courses being offered this semester. Hope to see you there!
Location: Low Plaza
Come to our table at this year's Activities Fair to learn more about us. You can meet current members, discuss recruitment, and hear about some of the events we will be organizing this year. We are also happy to chat about data science at Columbia and relevant courses being offered this semester. Hope to see you there!
Location: Hamilton 703
So much of the information we encounter every day is hard to conceptualize. It’s so big and complicated that a visual rendering represents it the best. Being a good data designer is crucial to being able to tell the story behind the data.
Come push your data visualization techniques, processes and methods to the next level with Juan Francisco Saldarriaga.
Juan Francisco Saldarriaga is a Mapping and Data Visualization Specialist, and an Architectural Designer and Urban Planner living and working in New York. Juan Francisco has a Master of Science in Urban Planning from Columbia University and a Master of Architecture also from Columbia. For his undergrad, Juan Francisco studied philosophy at the Université de Paris IV (Sorbonne) and at the Universidad de Los Andes, Bogotá.
Check out some of his work: http://cargocollective.com/juanfrans/
Food will be served!