This course extends reproducible research craft into applied data science for infectious disease: ingesting messy surveillance data, wrangling and validating it, and communicating results with honest visualizations. It sequences the site’s Programming and Computing library around real infectious-disease datasets.

The course syllabus is shown below.

Draft syllabus. This is a scaffold for the concentration. Course number, credit hours, dates, and specific assignments are placeholders and will be finalized before the course is offered.


Course title and instructors

Title: Data Science for Infectious Disease
Course Number: BIO 3xx (proposed; confirm with the Department of Biology)
Semester: TBD
Credit Hours: 3
Meeting Time: TBD

Course Director: Michael E. DeWitt, MS
Email: medewitt@wakehealth.edu or dewime23@wfu.edu

Course description

Working with real infectious-disease data means dealing with files in awkward formats, records that are missing or duplicated, and results that must be communicated without overstating what the data support. This course teaches the applied data-handling side of the work. Students ingest and validate data from files and APIs; handle formats, missingness, and secrets safely; structure a reproducible project under version control with tested and documented code; build clear and honest visualizations that show uncertainty; and reason about performance and numerical stability as analyses scale. The material comes from the site’s Programming and Computing library, sequenced around real datasets.

This course overlaps with Research Tools and Methods, which covers reproducible research craft. The intended split is clean: Research Tools and Methods teaches the craft and reproducibility habits, while this course focuses on applied data handling and visualization with infectious-disease data. If either course is proposed for renumbering or absorption, the overlap should be resolved so that the two courses do not duplicate content.

Learning outcomes

Upon successful completion of this course, students will be able to:

Textbook and other resources

There is no single required textbook. Recommended references include:

Additional readings will be assigned throughout the course.

Site resources

This course draws on IDEEEP content pages as assigned readings and lab material:

New concept pages on data-visualization principles and on tidy and relational data are planned and will be linked here once published.

Course structure and schedule

This course meets over 15 weeks and combines lecture with computer labs on real infectious-disease datasets. The schedule below is a draft outline of topics.

Week  Topic
1     Introduction: the infectious-disease data pipeline
2     Data representation and formats
3     Tidy and relational data
4     Data ingestion from files
5     Ingestion from APIs and handling secrets safely
6     Validation and missingness
7     Reshaping and manipulating data
8     Project workflow and structure
9     Version control with Git
10    Testing and debugging scientific code
11    Reproducibility
12    Principles of honest data visualization
13    Visualizing uncertainty
14    Performance, numerical stability, and scaling to HPC
15    Project presentations and wrap-up

Note: Specific dates will be provided at the beginning of the semester. Topics may be adjusted based on class progress and student interests.

Grades and assignments

Activity                          Weight
Participation and lab discussion  20%
Computer labs and assignments     30%
Exam(s)                           20%
Final project                     30%

Final project: Students will build a reproducible analysis of a real infectious-disease dataset from ingestion through validation to visualization, with tested code under version control and honest communication of uncertainty.

Course policies

Attendance: Regular attendance is expected, particularly for discussion sessions. Please alert the instructor if you are unable to attend for any reason.

Late/Makeup work: Assignments are due on the dates provided. We recognize that extenuating circumstances arise, so assignments may be submitted up to two days late without penalty. If you need a longer extension, contact the instructor as soon as possible and before the due date.

Artificial intelligence: Artificial intelligence tools and large language models such as ChatGPT, Claude, and Gemini are now part of the academic and professional landscape, and we encourage you to find ways to use them to enhance your learning. If you use these tools, however, you must cite them and describe in detail how you used them to complete the assignment. These tools cannot take the place of your own work and understanding of the material; they should supplement your learning, not replace it. You are ultimately responsible for your work, including its content and the validity of its citations and references. Using these tools without proper attribution is plagiarism and will be treated as such.

Department/School/University policies

Academic Integrity: Wake Forest University is committed to a culture of academic integrity. As a part of this community, you share the responsibility for creating a place of honesty, intellectual curiosity, and individual accountability. As you committed to with your honor pledge signature, you agree “not to deceive any member of the community; not to steal, cheat, or plagiarize on academic work; and not to engage in any other form of academic misconduct.” If you have questions about documenting your work, working with external sources, or working with peers on assigned work, consult with me as soon as possible. Instances of academic dishonesty will be referred to the Honor and Ethics Council.

Accessibility: Wake Forest University provides reasonable accommodations to students with disabilities. If you are in need of an accommodation, please contact me privately as early in the term as possible. Retroactive accommodations will not be provided. Students requiring accommodations must also consult the Center for Learning, Access, and Student Success (118 Reynolda Hall, 336-758-5929, class.wfu.edu).

Accommodations for Religious or Spiritual Practices: Wake Forest University benefits from the multitude of faiths and spiritual identities held by members of our learning community. Should you need accommodations this semester, email me as soon as possible to ensure we have time to develop equitable alternatives.

Class recordings: If class recordings are provided, they are reserved for students in this class for educational purposes and are protected under FERPA. Recordings must not be shared outside the class in any form.

Syllabus change notice

This syllabus and the dates herein are subject to change.