Following is a long list of activities that can help you dive more deeply into Data, Code, and/or Ethics. These will form the core of our work together this week.

Each activity is optional. You choose which ones you do. And that includes other activities you come up with that aren't listed here. (Though if you find a good one, please share with the rest of us!)

Each activity is labeled with one or more categories: Data, Code, and/or Ethics. This will help you strike the right balance between the three, whatever the best balance is for you. (For example, you may be trying to focus equally on each, or there may be one particular area where you are keen to really "level up.")

Each activity is also labeled with information about its difficulty level. This is meant to help you both budget your time and set your expectations. Some of them are things you could do with first-year college students (or possibly even secondary students). Feel free to "steal" those and use them at your home institution. Others are things that will require you to put your graduate training and/or professional experience to work. And some require a bit of technical, social, or other background knowledge.

As discussed on the syllabus, choose the amount and kind of work that is most relevant to you. Leverage our course community for help as you go, as well as to share your insights along the way and/or when you finish.

Finally, don't forget that there are many more activities than I expect anyone would do in a week. That's meant to give you a lot of choices and ideas. I expect everyone to do a minority of the things listed here, possibly a very small minority. If, at the end of the week, there are things here that you wish you had had time to do, this site will stay up for a while. You are more than welcome to follow up with these activities (and add them to syllabi, if you like) after DPL is officially over.

Required activities

Introduce yourself! (DO THIS FIRST!)

Create a video of no more than 30 seconds introducing yourself to the group and upload it to our #data2020 channel in Slack. We have a tendency to go long with these, but keep in mind how many of us there are, and how many opportunities we'll have to go deep with each other throughout the week. Just give us enough to know how to say your name, what you look like, what you sound like, and why you're excited to be here. We'll fill in the rest as we go!

There is no need to go high-budget. A vertical selfie video taken on your phone is just fine for this.

Oh, and don't forget to fill out your Slack profile with a photo, the name you'd like us to call you, and any contact info you're comfortable sharing with the group.

Learn to Code (DATA | CODE)

Using the excellent book R for Data Science, spend some time learning to code in the "Tidyverse" version of the R programming language. I am asking that everyone do at least one of the following:

  • Make it to the end of Chapter 1, having hand-typed (not copied and pasted) and run all of the example code and attempted all of the exercises.
  • HOWEVER, if this is a major challenge to you, I simply ask that you spend about an hour a day, every day this week, going as far as you can at whatever pace is comfortable.

This is usually the most challenging and the most rewarding part of this DPL course, so please devote some time and serious effort to it, and reach out on Slack whenever you need help.

Before you begin, be sure to have the following installed in this order. (If you need help, we can definitely take care of that on Day 1. Just reach out in Slack. The first "office hours" meeting is also a good time to walk through this.)

  • The R programming language ― alongside Python, one of the main languages for work in data science and computational statistics.
  • RStudio ― the most common and most comprehensive development environment for R.

Once you've installed R and RStudio, open up R for Data Science, and start working through it. The first two chapters are short introductions. Once you start coding in Chapter 3, you may find the following video helpful. In it, I show you how to set up your environment, and then simply walk through the textbook from the beginning of Chapter 3 through the first set of exercises. You don't need to watch it, but if you hit any snags, check it out. It may answer your questions.

I have specifically chosen Tidyverse R and this textbook because 1) I use it every day and am well equipped to help, 2) the combination of the book and the framework is the most beginner-friendly of all the languages/teaching materials I have encountered, 3) the online community of Tidyverse R users is the friendliest and most helpful (and the most gender-diverse) of any I've encountered, and 4) the visualization-first approach is both more immediately rewarding for beginners and more sound for aspiring data analysts than the approaches taken in most coding pedagogy.

The goal here is not to become a professional coder or data scientist by the end of the week. However, if you put in a solid effort over the week, by the end you will have a stronger understanding of how code and data work in the tools we use daily. You'll also know if coding is for you! If not, now you know and you can cogently explain why. If it is, you'll have your foot in the door, and you'll have a good idea where to go to take the next step.

Informed decisions about using code and informed decisions about whether or not to make a long-term investment in learning to code. That's the goal.

Other activities

Adtech analysis (DATA | CODE)

Note: this is a more technically advanced activity than most of the others.

The goal of this activity is to learn about how online advertising works behind the scenes, particularly when it comes to the collection and mining of your personal data (usually without your knowing it). This will result in an analysis like one of the following:

To conduct one of these kinds of analyses, follow Bill Fitzgerald's instructions for using an intercepting proxy to intercept and analyze background web connections. Then visit several websites you spend time on regularly and collect and analyze the background web connections through which your personal data flows while on these sites. This can be a more manual analysis, like Bill's, or a more statistical one, like mine (all the code you need can be downloaded here), or a novel approach that you come up with yourself. Write up, make a video about, or otherwise share your results along with the implications of those results for life on the web.
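As a rough illustration of what a "more statistical" pass might look like, here is a minimal sketch in Python (the same idea carries over to R). It assumes you have already exported the request URLs captured by your proxy as a plain list. The function name and every host below are made up for illustration; this is not Bill's method or the downloadable code mentioned above.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_third_party_hosts(request_urls, first_party_domain):
    """Tally hosts contacted in the background that are not the site you visited."""
    counts = Counter()
    for url in request_urls:
        host = urlsplit(url).hostname
        if host is None:
            continue  # skip anything that is not a full URL
        # Treat the visited domain and its subdomains as first-party.
        if host == first_party_domain or host.endswith("." + first_party_domain):
            continue
        counts[host] += 1
    return counts


# Hypothetical captured requests (invented hosts, for illustration only).
captured = [
    "https://news.example.com/article",
    "https://cdn.example.com/style.css",
    "https://tracker.adnetwork.test/pixel?id=123",
    "https://tracker.adnetwork.test/sync",
    "https://metrics.analytics.test/collect",
]

third_party = count_third_party_hosts(captured, "example.com")
for host, n in third_party.most_common():
    print(host, n)
```

Sorting hosts by how often they appear is only a starting point. Which hosts show up across several of the sites you visit is often the more revealing question.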

Advertiser profiles (DATA)

This is a relatively easy activity and can readily be done with students.

Download your Twitter advertiser list, explore your Google advertiser profile, or find out what Facebook knows about you. Optionally, ask a friend/colleague or two to download theirs so you can compare notes. Is your data accurate? How did they find out that information about you? Would you be comfortable with them selling that data to other companies and/or cross-referencing it with your credit card, your purchasing history, your web browsing history, your political affiliation, your past places of residence, and the same data from your family and social network connections? (They do all of those things.) In most cases, companies give you discounts (or free services) in exchange for your data. What would it cost to keep your data private? Is it even possible? What data are you comfortable giving away? What data do you absolutely want to keep private?

Write about, make a video about, or otherwise reflect on, what you learned. It can be informative, so your audience knows what you were able to find, or it can contain recommendations for better privacy, including information about the trade-offs involved.

Course material revamp (DATA | CODE | ETHICS)

Difficulty varies according to situational specifics.

Take an existing assignment, syllabus, or other educational practice for an upcoming course/event and rework it in light of something learned in this course. This could relate to data, code, ethics, or some combination. It could be anything from replacing a reading to adding/replacing a unit of study to creating an entirely new course or faculty development seminar. This is best done later in the week, and likely requires significant time to be budgeted for it.

Covid–19 educational practices (ETHICS)

Difficulty varies according to situational specifics.

The novel coronavirus pandemic has upended many of our plans, and the politics of pandemic response has left schools and universities in the policy crosshairs, particularly in the United States, with faculty, staff, and students in a position of great vulnerability.

Consider a policy or practice that is being required, encouraged, or considered by your institution or government and reflect on the following:

  • Is there data to back it up?
  • Does it run counter to available data or the recommendations of experts?
  • Who is left most vulnerable by the policy/practice?
  • What wiggle room exists within the (proposed) requirements?
  • What policies or practices can you advocate for at your institution/government (digital or otherwise) that have a chance at succeeding and help bring policies/practices more in line with public health data or more in line with the goal of providing high-quality educational opportunities in a safe(r) environment?

Supporting readings:

Crap detection/digital polarization (DATA)

The hardest part of this activity is finding the right claim to investigate and then limiting the time you spend on it. How far can you get in 60 seconds? 90 seconds? 5 minutes?

Examine your social media feed(s) for a controversial or suspect claim. Then follow Mike Caulfield's Four Moves and a Habit guide to ascertain the facts in that particular matter, and if appropriate, when and where the narrative changed on its way to your feed. Share what you found in Slack, or in your social feed.

Mapping your digital trace (DATA | CODE)

This is easy, but takes time and can get frightening, especially if you are concerned about privacy but haven't specifically reflected on it in your own life before. This is a good activity to do with a partner and/or to compare notes on during an "office hours" meeting.

On a piece of paper (or a whiteboard), start by writing down "email address". Then think of who has your email address: your workplace, government agencies, stores you frequent, political candidates/parties, etc. Write each of them down and draw a line between them and "email address". Then take each of those new "nodes" and ask: what other information do they have about me? Phone number, address, income, family members, purchase habits, browsing habits, credit cards, etc. Add those new nodes to your map and draw lines between each of them and all the entities on your map that know them. (For example, my bank would have lines to my email, phone, address, income, and purchase habits. The Internal Revenue Service would have lines to all of those except my purchase habits. Certain stores would have everything but my income. Etc.)

Once you've done a couple rounds of expansion, reflect on how interconnected your personal data is. What would happen if one of those entities were to sell your data to another of those entities? Which of those entities know the most about you? Which are the most trustworthy or secure? Do any stand out as knowing the most about you but being among the least trustworthy/secure?
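If you would rather explore the map in code once it is on paper, one way to sketch it is as a dictionary that links each entity to the pieces of data it holds. The Python below is only an illustration: it uses the bank, Internal Revenue Service, and store example from above, and the function names are invented for this sketch.

```python
# Each entity maps to the set of personal-data "nodes" it has a line to.
trace_map = {
    "bank": {"email", "phone", "address", "income", "purchase habits"},
    "Internal Revenue Service": {"email", "phone", "address", "income"},
    "store": {"email", "phone", "address", "purchase habits"},
}


def rank_by_knowledge(trace):
    """Order entities from the one that knows the most to the one that knows the least."""
    return sorted(trace.items(), key=lambda item: len(item[1]), reverse=True)


def combined_knowledge(trace, first, second):
    """Everything two entities would know if one shared its data with the other."""
    return trace[first] | trace[second]


for entity, known in rank_by_knowledge(trace_map):
    print(entity, len(known))

# What the store would know if the IRS data were merged with its own.
print(sorted(combined_knowledge(trace_map, "store", "Internal Revenue Service")))
```

The second function mirrors the question of one entity selling your data to another: the union of the two sets is what the buyer ends up holding.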

Finally, reflect on two big action steps:

  • What can I do personally to make this map less scary? (Split your online activity among two email addresses; provide a phone number of 999.999.9999, a name of "First Last" (literally), or an email of "name@youdontneedthisinfo.com" where they don't actually need the information to provide the service; drop the service entirely; etc.)
  • What can I do at my institution to make this map less scary for the people I'm responsible for? In other words, do you require students/faculty/staff to compromise their privacy or security? Are they informed in proportion to the risks, or empowered not to take those risks if they don't want to?

Mindfulness ― media consumption (DATA | ETHICS)

This activity is very easy.

After reading about the Attention Economy (and possibly about mindfulness in Net Smart), make a written log of your media consumption for at least two entire days. Every time you check social media, write a blog post, watch TV or Netflix or Hulu, read a book, or even text a friend, write it down ― preferably by hand. Do you notice any patterns? Which things did you do more often (or for longer amounts of time) than you expected? Less often? Reflect on the patterns you noticed and how (digital) media preys on our attention, as well as how we can regain control.

Supportive readings:

Mindfulness ― social media creation (DATA | ETHICS)

This activity is technically easy, but may be personally challenging, depending on your habits. However, the more challenging it is, the more you are likely to take away from it ... if you stick with it for at least two days.

Don't post anything to social media for at least two days. Instead write it down; in the case of photos, print them out and put them in your notebook with the text (don't worry about videos in your notebook). After your time off, take stock of what you produced in your notebook: What do you still want to post and share? What could you easily throw away? What would you like to keep in a diary or journal, but not share publicly? Also, consider any psychological impact of this shift in social media activity. Were there things you wanted to post, but simply decided not to write down? Did the act of writing it down change the nature of your "posts"? Finally, consider what this activity revealed to you about your social media habits. How much of it is deliberate, and how much second-nature? Does it make you want to change anything about how you use social media in the future?

Supportive readings:

Representation (DATA | CODE)

This activity is very easy and fits well in the undergraduate classroom. It is inspired by Safiya Umoja Noble's Algorithms of Oppression and the UMW Domain of One's Own Curriculum. This activity works well with a partner, but can also be done individually.

Perform Google image searches for the following terms: teacher, professor, doctor, nurse, baby, teenager, criminal. What is striking about the results? Are you surprised? How are those results "chosen"? Are they meaningful at all? If we wanted to alter the results, what would it take?

Then choose other terms that you think might show the same kinds of patterns in their search results, and search for them. Were your predictions correct? Why (or why not)? (Noble, Tufekci, and I all discuss the algorithmic framework behind these search results in our books.)

What did you find to be most striking about these results? How does this change the way you interact with search engines in your daily life and your research? How might you change the way you encourage your students/faculty/colleagues to use these tools?

Twitter archive analysis (DATA | ETHICS)

This activity is a bit more challenging than some of the others. Manually reading through the downloaded data is not hard, BUT this is a great opportunity for an end-of-week project, where you take what you learn about coding and apply it to this data.

Download your Twitter archive (or your archive from another social media platform that allows you to download your content in a single file). Dig through it and see what you find. Who did you tweet to/about most often? What words or topics came up the most? How has your usage changed over time (volume, topic, conversations, etc.)? You can do this manually by simply opening the archive in Excel or a text editor, or you can use one of the resources below to do a statistical analysis of your tweets' content.
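To give a feel for the statistical route, here is a minimal sketch in Python that counts mentions and frequent words. It assumes you have already pulled the text of your tweets out of the archive into a list of strings; the archive's actual file layout is not covered here, and the sample tweets and function names are invented for illustration.

```python
import re
from collections import Counter


def top_mentions(tweets, n=3):
    """Count which accounts are mentioned most often (case-insensitive)."""
    mentions = Counter()
    for text in tweets:
        mentions.update(m.lower() for m in re.findall(r"@(\w+)", text))
    return mentions.most_common(n)


def top_words(tweets, n=5, min_length=4):
    """Count the most frequent words, ignoring mentions, links, and short words."""
    words = Counter()
    for text in tweets:
        # Remove @mentions and URLs before splitting into words.
        cleaned = re.sub(r"@\w+|https?://\S+", " ", text.lower())
        words.update(w for w in re.findall(r"[a-z']+", cleaned) if len(w) >= min_length)
    return words.most_common(n)


# Hypothetical tweets standing in for the text column of a real archive.
sample_tweets = [
    "@alice loved your talk on data ethics!",
    "Reading about data privacy with @bob and @alice",
    "New post on privacy and code: https://example.com/post",
]

print(top_mentions(sample_tweets))
print(top_words(sample_tweets, n=2))
```

The minimum word length is a crude stand-in for a proper stop-word list, so expect to adjust it (or swap in a real list) before drawing conclusions from your own tweets.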

What did you discover about yourself, about the platform, or about social media in general?

Supplemental resources

Thanks to @bady_qb for making the header photo freely available on Unsplash.