Syllabus

Course Information

  • Location: In person for Fall 2023
  • Course time: Tuesdays and Thursdays from 1:30-2:50pm (Eastern Daylight Time zone)
  • Course location: W4030
  • Assignments: Four projects

Instructor

I am an Associate Professor in the Department of Biostatistics at the Bloomberg School of Public Health and Department of Biomedical Engineering in the Whiting School of Engineering at Johns Hopkins University, a faculty member of the Johns Hopkins Data Science Lab, and have affiliations with the Malone Center for Engineering in Healthcare, Center for Computational Biology, the Department of Genetic Medicine, and the Department of Biochemistry and Molecular Biology.

My research focuses on developing fast, scalable, statistical methodology and open-source software for genomics and biomedical data analysis for human health and disease. My research is problem-forward: I develop statistical methods and software that are motivated by concrete problems, often with real-world, noisy, messy data. I’m also interested in developing theory for how to incorporate design thinking (alongside statistical thinking) in practice of data analysis.

If you want, you can find me on Twitter. I’m also a co-host of the The Corresponding Author podcast, member of the Editorial Board for Genome Biology, an Associate Editor for Reproducibility at the Journal of the American Statistical Association, and co-founder of R-Ladies Baltimore.

Teaching Assistants

Joe Sartini (jsartin1@jhu.edu) is a third year Ph.D. student in Biostatistics, with interest in models for precision medicine applications. Currently, the focus of his work is extracting meaningful insights from time-series produced by Continuous Glucose Monitoring and other devices worn by Type 2 diabetics. Outside of research, he enjoys participating in endurance sports and weightlifting.

Angela Zhao (azhao29@jh.edu) is a second year ScM student in Biostatistics interested in functional data analysis and stochastic models. Her main projects involve using physical activity data derived from wearable devices to predict time-to-event outcomes. For leisure, she enjoys bouldering and stand-up comedy.

Instructor and TA office hours are announced on CoursePlus. If there are conflicts and/or need to cancel office hours, announcements will be made on CoursePlus.

Learning Objectives

Upon successfully completing this course, students will be able to:

  1. Write unix code with command-line tools
  2. Install and configure software necessary for a statistical programming environment and with version control
  3. Write code for advanced programming topics in R, Python, SQL, and tidyverse
  4. Build and organize a software package with documentation for publishing on the internet
  5. Write code using functional or other programming paradigms and discuss strategies for getting data from APIs or working with large data
  6. Build an interactive web application using Shiny or other dashboard tools

Course logistics

All communication for the course is going to take place on one of three platforms

Courseplus

The primary communication for the class will go through Courseplus. That is where we will post course announcements, share resources, host most of our asynchronous course discussion, and as the primary means of communication between course participants and course instructors

Important

If you are registered for the course, you should have access to Courseplus now. Once you have access you will also be able to find all material and dates/times of drop-in hours. Any zoom links will be posted on Courseplus.

The course will make use of the CoursePlus Discussion Forum in order to ask and answer questions regarding any of the course materials. The Instructor and the Teaching Assistant will monitor the discussion boards and answer questions when appropriate.

GitHub

You can access all course materials (e.g. lectures, project assignments) here

Lectures

This is course has only one section ending in .01, which means it is an Onsite Synchronous course. This means you are expected to attend class in person. While, I will record the lectures via a zoom recording, I do not plan to post the recordings on CoursePlus. If you have an unexpected / emergency event that comes up and you are unable to attend the lecture in person, you can email me to ask for the recording. I just ask that you briefly provide 1 sentence explanation.

Attendance is not taken, but I strongly encourage you to attend class to ask questions and participate in class exercises. You will get as much out of the course as you put into it.

Getting help

In order of preference, here is a preferred list of ways to get help:

  1. We strongly encourage you to use Courseplus to ask questions first, before joining office hours. The reason for this is so that other students in the class (who likely have similar questions) can also benefit from the questions and answers asked by your colleagues.
  2. You are welcome to join office hours to get more group interactive feedback.
  3. If you are not able to make the office hours, appointments can be made by email with the instructor.

Textbook and Other Course Material

There is no required textbook. We will make use of several freely available textbooks and other materials. All course materials will be provided. We will use the R and Python software for data analysis, which is freely available for download.

Software

We will make heavy use of R in this course, so you should have R installed. You can obtain R from the Comprehensive R Archive Network. There are versions available for Mac, Windows, and Unix/Linux. This software is required for this course.

It is important that you have the latest version of R installed. For this course we will be using R version 4.3.1. You can determine what version of R you have by starting up R and typing into the console R.version.string and hitting the return/enter key. If you do not have this version of R installed, go to CRAN and download and install the latest version.

We will also make use of the RStudio interactive development environment (IDE). RStudio requires that R be installed, and so is an “add-on” to R. You can obtain the RStudio Desktop for free from the RStudio web site. In particular, we will make heavy use of it when developing R packages. It is also essential that you have the latest release of RStudio. You can determine the version of RStudio by looking at menu item Help > About RStudio. You should be using RStudio version 2023.09.1+494 (2023.09.1+494) or higher, which requires R version 3.3.0 or higher.

Projects

There will be 4 assignments, due every 2–3 weeks. Projects will be submitted electronically via the Drop Box on the CoursePlus web site (unless otherwise specified).

The project assignments will be due on

  • Project 1: Friday November 10, 11:59pm
  • Project 2: Tuesday November 28, 11:59pm
  • Project 3: Tuesday December 12, 11:59pm
  • Project 4: Friday December 22, 11:59pm

Project collaboration

Please feel free to study together and talk to one another about project assignments. The mutual instruction that students give each other is among the most valuable that can be achieved.

However, it is expected that project assignments will be implemented and written up independently unless otherwise specified. Specifically, please do not share analytic code or output. Please do not collaborate on write-up and interpretation. Please do not access or use solutions from any source before your project assignment is submitted for grading.

Exams

There are no exams in this course.

Grading

Grades in the course will be based on Projects 1–4. Grades for the projects and the final grade will be issued via the CoursePlus grade book.

Relative weights

The grades are based on four projects. The breakdown of grading will be

  • 25% for Project 1
  • 25% for Project 2
  • 25% for Project 3
  • 25% for Project 4

Policy for submitted projects late

Important

The instructor and TA(s) will not accept email late day policy requests.

This is the policy for late submissions that applies to Projects 1-4.

  • Each student will be given three “free late days” for the rest of the course.
  • A late day extends the individual project deadline by 24 hours without penalty.
  • The late days can be applied to just one project (e.g. two late days for Project 2), or they can be split across the two projects (one late day for Project 2 and one late day for Project 3). This is entirely left up to the discretion of the student.
  • A max of two “free late days” can be applied to any one project.
  • Free late days are intended to give you flexibility: you can use them for any reason no questions asked.
  • You do not get any bonus points for not using your late days.

Although the each student is only given a total of three “free late days”, we will be accepting homework from students that pass this limit.

  • We will deduct 5% off the 100% starting point for each day the assignment is late.
  • If you use two “free late days” for project, but need a 3rd day, there will be no penalty for the first two late days and there will be a 5% penalty for the 3rd late day.
  • If you do not have any more late days for the term, we will deduct 5% for the assignment that is <24 hours late, 10% points for the assignment that is 24-48 hours late, and 15% points for the assignment that is 48-72 hours late.
Important

We will not grade assignments that are more than 3 days (or more than 72 hours) past the original due date.

Regrading Policy

It is very important to us that all assignments are properly graded. If you believe there is an error in your assignment grading, please send an email to the instructor within 7 days of receiving the grade. No re-grade requests will be accepted orally, and no regrade requests will be accepted more than 7 days after you receive the grade for the assignment.

Use of AI tools

Use of AI tools (including ChatGPT, Bard, Microsoft Copilot, etc) to assist in completing this assignment/exam is permitted with your writing and/or programming. Be aware, however, that such tools often introduce errors or fabricate information; it is your responsibility to ensure the factual accuracy of whatever you claim as your writing/code. I recommend using such tools particularly for learning to code, just make sure the code does what it is supposed to, and that you understand what the code does.

With respect to writing, as with all sources, proper references and use of quotation marks should be used (if precise language generated by the software is used). The reference must include the website and specific prompts used to generate the referenced output.

Academic Ethics and Student Conduct Code

Students enrolled in the Bloomberg School of Public Health of The Johns Hopkins University assume an obligation to conduct themselves in a manner appropriate to the University’s mission as an institution of higher education. A student is obligated to refrain from acts which he or she knows, or under the circumstances has reason to know, impair the academic integrity of the University. Violations of academic integrity include, but are not limited to: cheating; plagiarism; knowingly furnishing false information to any agent of the University for inclusion in the academic record; violation of the rights and welfare of animal or human subjects in research; and misconduct as a member of either School or University committees or recognized groups or organizations.

Students should be familiar with the policies and procedures specified under Policy and Procedure Manual Student-01 (Academic Ethics), available on the school’s portal.

The faculty, staff and students of the Bloomberg School of Public Health and the Johns Hopkins University have the shared responsibility to conduct themselves in a manner that upholds the law and respects the rights of others. Students enrolled in the School are subject to the Student Conduct Code (detailed in Policy and Procedure Manual Student-06) and assume an obligation to conduct themselves in a manner which upholds the law and respects the rights of others. They are responsible for maintaining the academic integrity of the institution and for preserving an environment conducive to the safe pursuit of the School’s educational, research, and professional practice missions.

Course code of Conduct

We are committed to providing a welcoming, inclusive, and harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, religion (or lack thereof), political beliefs/leanings, or technology choices. We do not tolerate harassment of course participants in any form. Sexual language and imagery is not appropriate for any work event, including group meetings, conferences, talks, parties, Twitter and other online media. This code of conduct applies to all course participants, including instructors and TAs, and applies to all modes of interaction, both in-person and online, including GitHub project repos, Slack channels, and Twitter.

Course participants violating these rules will be referred to leadership of the Department of Biostatistics and the Title IX coordinator at JHU and may face expulsion from the class.

All class participants agree to:

  • Be considerate in speech and actions, and actively seek to acknowledge and respect the boundaries of other members.
  • Be respectful. Disagreements happen, but do not require poor behavior or poor manners. Frustration is inevitable, but it should never turn into a personal attack. A community where people feel uncomfortable or threatened is not a productive one. Course participants should be respectful both of the other course participants and those outside the course.
  • Refrain from demeaning, discriminatory, or harassing behavior and speech. Harassment includes, but is not limited to: deliberate intimidation; stalking; unwanted photography or recording; sustained or willful disruption of talks or other events; inappropriate physical contact; use of sexual or discriminatory imagery, comments, or jokes; and unwelcome sexual attention. If you feel that someone has harassed you or otherwise treated you inappropriately, please alert Stephanie Hicks.
  • Take care of each other. Refrain from advocating for, or encouraging, any of the above behavior. And, if someone asks you to stop, then stop. Alert Stephanie Hicks if you notice a dangerous situation, someone in distress, or violations of this code of conduct, even if they seem inconsequential.

Need Help?

Please speak with Stephanie Hicks or one of the TAs. You can also reach out to Karen Bandeen-Roche, chair of the department of Biostatistics or Margaret Taub, Ombudsman for the Department of Biostatistics.

You may also reach out to any Hopkins resource for sexual harassment, discrimination, or misconduct:

License and attribution

This Code of Conduct is distributed under a Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. Portions of above text comprised of language from the Codes of Conduct adopted by rOpenSci and Django, which are licensed by CC BY-SA 4.0 and CC BY 3.0. This work was further inspired by Ada Initiative’s ‘’how to design a code of conduct for your community’’ and Geek Feminism’s Code of conduct evaluations and expanded by Ashley Johnson and Shannon Ellis in the Jeff Leek group.

Disability Support Service

Students requiring accommodations for disabilities should register with Student Disability Service (SDS). It is the responsibility of the student to register for accommodations with SDS. Accommodations take effect upon approval and apply to the remainder of the time for which a student is registered and enrolled at the Bloomberg School of Public Health. Once a student has been approved for accommodations, the student will receive formal notification and the student will be encouraged to reach out to the instructor.

If you have questions about requesting accommodations, please contact BSPH.dss@jhu.edu.

Prerequisites

Prerequisite for the course is Biostatistics 140.776 or knowledge of material from 140.776 is assumed.

If you did not take the above course, please contact course instructor to get permission to enroll.

General Disclaimers

This syllabus is a general plan, deviations announced to the class by the instructor may be necessary.

Typos and corrections

Feel free to submit typos/errors/etc via the github repository associated with the class: https://github.com/stephaniehicks/jhustatprogramming2023. You will have the thanks of your grateful instructor!