B.S. in Data Science

Program Overview

The B.S. in Data Science requires a minimum of 40 credits. The curriculum includes three tracks (Data Application, Algorithms, and Spatial Data Analytics), each of which deepens students’ understanding of data science. The degree program culminates in a capstone experience.

The core curriculum gives students a solid foundation in data science. Students learn data science theory and applications, including critical evaluation of how data can be used to solve novel problems, deliberation (considering the ethical, moral, and societal implications of data science), and communication. Through the core curriculum students learn the basics of programming, modeling, machine learning, data visualization, database structures, and ethics in data science. Students also take one course in linear algebra and two courses in mathematical statistics. The curriculum provides opportunities for students to use their skills and knowledge to manage and analyze large data sets efficiently and effectively and to identify and answer novel questions in a variety of settings.

Students choose a track area to gain knowledge, skills, and abilities specific to particular career aspirations. They are required to take three courses from one of the following tracks: Data Application, Algorithms, or Spatial Data Analytics. Coursework for the Data Application track focuses on teaching additional skills (e.g., working with data that have time dependencies) and providing a more in-depth understanding of analytical and data visualization tools commonly used by data scientists employed in private industry or government. Coursework for the Algorithms track focuses on expanding students’ abilities to develop new software or algorithms for the ingestion or analysis of large sources of frequently near-real-time data (this track is not available to students already majoring in Computer Science). Coursework for the Spatial Data Analytics track focuses on integration of analytical and visualization tools that data scientists typically use when working with data that have spatial dependencies.

In the capstone experience, each student will work closely with a program faculty member to conduct a substantial research project that focuses on synthesis and critical analysis, problem solving in an applied and/or academic setting, creation of original material or original scholarship, and effective communication with diverse audiences.

I. Core Courses
Data Science Courses
  1. CSCI 140 Programming for Data Science, OR CSCI 141 Computational Problem Solving (both 4 credits)
  2. DATA 146 Intro to Data Science – 3 credits
  3. DATA 202 Ethics in Data Science, OR PHIL 215 Right & Wrong in the Contemporary World, OR PHIL 303 Ethics, OR PHIL 330 Ethics and Data Science (all 3 credits)
  4. DATA 211 Data Visualization – 3 credits
  5. DATA 310 Applied Machine Learning – 3 credits
  6. DATA 311 Databases – 3 credits

Capstone Course

Students must select one of the following courses (each also fulfills the COLL 400 requirement). Capstone courses do not count toward credits needed to fulfill track area requirements:

DATA 410 Advanced Applied Machine Learning – 3 credits*
DATA 431 Spatial Data Discovery – 3 credits*
DATA 440 Special Topics in Data Science – 3-4 credits
DATA 442 Neural Networks and Deep Learning – 3 credits*
DATA 444 Agent-Based Modeling – 3 credits*
DATA 490 Independent Research in Data Science – 3-4 credits
GOVT 403 Legal Data in Comparative Context – 4 credits*

*also fulfills writing requirement

Mathematics Requirements

MATH 211 Linear Algebra – 3 credits (prerequisites MATH 111 AND MATH 112, OR MATH 131 AND MATH 132)
MATH 351 Probability & Statistics for Scientists – 3 credits (OR MATH 451 Probability – 3 credits)
MATH 352 Statistical Data Analysis – 3 credits (OR MATH 452 Mathematical Statistics – 3 credits) 

II. Track Areas

Students are required to select a track at the time of major declaration. A track consists of three additional methods-oriented courses. Courses selected to fulfill the Track Area requirement do not count toward credits needed to fulfill the Capstone requirement.

1. Data Application

The purpose of this track is to prepare students for positions in which they will conduct predictive analyses using large, potentially near-real-time data sets from a wide range of sensors and sources. The coursework will allow students to build data pipelines to ingest large quantities of data into computational environments quickly and efficiently, integrate these data into common frames of reference, process the data using statistical and computational modeling techniques, and update models dynamically based on real-time information. Students will be well trained for entry-level jobs in government or private industry in which they formulate novel questions that can be explored with big data.

Students must select three of the following courses:
DATA 330 Applied Time Series Analyses – 3 credits
DATA 340 Topics in Data Science – 3 credits
DATA 410 Advanced Applied Machine Learning – 3 credits
DATA 431 Spatial Data Discovery – 3 credits
DATA 440 Special Topics in Data Science – 3-4 credits
DATA 442 Neural Networks and Deep Learning – 3 credits
DATA 444 Agent-Based Modeling – 3 credits

2. Algorithms

The purpose of this track is to prepare students for positions in which they support the development of new software or algorithms for the ingestion or analysis of large sources of frequently near-real-time data. It provides students with in-depth knowledge of computational efficiency and teaches the basic theory of how computational bottlenecks might be overcome. Students will be well trained for entry-level data engineering jobs focused on the implementation and maintenance of large-scale data warehouses. This track is not available to students already majoring in Computer Science.

Students must take the following three courses:
CSCI 241 Data Structures – 3 credits
CSCI 243 Discrete Structures of Computer Science, OR MATH 214 Foundations of Mathematics (both 3 credits)
CSCI 303 Algorithms – 3 credits

3. Spatial Data Analytics

The purpose of this track is to prepare students for positions that require the large-scale analysis of data with a geospatial component, including both satellite and survey information. Students will be exposed to novel modeling techniques that incorporate spatial dependencies, data warehousing and processing techniques unique to spatial data, and techniques for the visualization of spatial data sources. Coursework will train students for entry-level positions in which they use Geographic Information Systems (GIS) software packages and spatial data to formulate and answer questions.

Students must select three of the following courses:
GIS 201 Introduction to Geographic Information Systems and Spatial Analysis – 3 credits
GIS 405 Geovisualization & Spatial Design – 3 credits
GIS 410 Introduction to Remote Sensing – 3 credits
GIS 420 Advanced GIS – 3 credits
DATA 431 Spatial Data Discovery – 3 credits
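
The 40-credit minimum stated in the Program Overview can be checked against the course lists above. The tally below is an illustrative sketch, not an official degree audit: the variable names are made up for this example, each group uses the lowest credit value listed for its courses, and it assumes no course is counted twice and that prerequisites such as MATH 111 and MATH 112 are excluded.

```python
# Minimum credits per requirement group, read from the course lists above.
# Names are illustrative only; this is not an official degree audit.

# Core: CSCI 140 or 141 (4), DATA 146, one ethics option, DATA 211,
# DATA 310, DATA 311 (3 credits each).
core_credits = [4, 3, 3, 3, 3, 3]

# Capstone: one course; the lowest listed value is 3 credits.
capstone_min = 3

# Mathematics: MATH 211, MATH 351 (or 451), MATH 352 (or 452).
math_credits = [3, 3, 3]

# Track: three courses; the lowest listed value is 3 credits each.
track_min = [3, 3, 3]

minimum_total = (
    sum(core_credits) + capstone_min + sum(math_credits) + sum(track_min)
)
print(minimum_total)
```

Choosing a 4-credit capstone (for example GOVT 403) or a 3-4 credit special topics course at 4 credits raises the total above this minimum.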

For more information, contact a Data Science advisor.