Task 1

Task 1 : Obtain and review raw data

One day, my old running friend and I were chatting about our running styles, training habits, and achievements, when I suddenly realized that I could take an in-depth analytical look at my training. I have been using a popular GPS fitness tracker called Runkeeper for years and decided it was time to analyze my running data to see how I was doing.

I've been using the Runkeeper app since 2012, and it's great. One key feature is its excellent data export. Anyone with a smartphone can download the app and analyze their own data, as we will in this notebook.

[Image: Runner in blue]

After logging your run, the first step is to export the data from Runkeeper (which I've done already). Then import the data and start exploring to find potential problems. After that, create data cleaning strategies to fix the issues. Finally, analyze and visualize the clean time-series data.

I exported seven years' worth of my training data, from 2012 through 2018. The data is a CSV file where each row is a single training activity. Let's load and inspect it.

Instructions :

Load pandas and the training activities data.

  • Import pandas under the alias pd.

  • Use the read_csv() function to load the dataset (runkeeper_file) into a variable called df_activities. Parse the dates with the parse_dates parameter and set the index to the Date column using the index_col parameter.

  • Display 3 random rows from df_activities using the sample() method.

  • Print a summary of df_activities using the info() method.
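
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's official solution. In the project, runkeeper_file would point to the exported CSV; here it is replaced by a small in-memory stand-in so the snippet runs on its own. The sample rows and every column other than Date are made up for illustration and may not match the real export.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real export. In the project, runkeeper_file
# would be the path to the exported CSV. The rows and the non-Date column
# names below are invented so this sketch is self-contained.
runkeeper_file = io.StringIO(
    "Date,Type,Distance (km)\n"
    "2018-11-11 14:05:12,Running,10.44\n"
    "2018-11-09 15:02:35,Running,12.84\n"
    "2018-11-04 16:05:00,Running,13.01\n"
    "2018-11-01 14:03:58,Running,12.98\n"
)

# Load the data, parse the dates, and use the Date column as the index.
# With parse_dates=True, pandas tries to parse the index column as dates.
df_activities = pd.read_csv(runkeeper_file, parse_dates=True, index_col="Date")

# Display 3 random rows. The selection changes on every run unless a
# random_state is passed to sample().
print(df_activities.sample(3))

# info() prints the summary itself (index type, columns, non-null counts,
# dtypes) and returns None, so it does not need to be wrapped in print().
df_activities.info()
```

If the dates were parsed correctly, the summary from info() should report a DatetimeIndex rather than a plain Index.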

Data Set :
