Commit b692476

connected code, question and answer in challenge
Rephrased introduction, corrected code snippet and extended answer for Looping Over DataFrame challenge, following #347 (comment)
1 parent 4738cb4 commit b692476

1 file changed

Lines changed: 14 additions & 6 deletions

_extras/extra_challenges.md

@@ -8,31 +8,39 @@ permalink: /extra_challenges/

 A collection of challenges that have been either removed from or not (yet) added to the main lesson.

-> ## Looping Over Dataframe
+> ## Looping Over DataFrame
 >
 > (Please refer to lesson `06-loops-and-functions.md`)
 >
 > The file `surveys.csv` in the `data` folder contains 25 years of data from surveys,
-> starting from 1977. We can load the data and print all the years surveyed using a `for` loop:
+> starting from 1977. We can extract data corresponding to each year in this DataFrame
+> to individual CSV files, by using a `for` loop:
 >
 > ~~~
 > import pandas as pd
 >
 > # Load the data into a DataFrame
 > surveys_df = pd.read_csv('data/surveys.csv')
 >
-> # Loop through a sequence of years and print the year
+> # Loop through a sequence of years and export selected data
 > start_year = 1977
 > end_year = 2002
 > for year in range(start_year, end_year+1):
->     print(year)
+>
+>     # Select data for the year
+>     surveys_year = surveys_df[surveys_df.year == year]
+>
+>     # Write the new DataFrame to a CSV file
+>     filename = 'data/surveys' + str(year) + '.csv'
+>     surveys_year.to_csv(filename)
 > ~~~
 > {: .language-python}
 >
 > What happens if there is no data for a year in a sequence? For example,
-> imagine we used `1976` as the year in `surveys_df[surveys_df.year == year]`
+> imagine we used `1976` as the `start_year`
 >
 > > ## Solution
-> > An empty file with only the headers
+> > We get the expected files for all years between 1977 and 2002,
+> > plus an empty `data/surveys1976.csv` file with only the headers.
 > {: .solution}
 {: .challenge}
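
The behaviour described in the updated solution can be checked without the real `surveys.csv`. The following is a minimal sketch, not part of the commit: the three-row DataFrame and the helper `year_to_csv_text` are hypothetical stand-ins, and the CSV is written to an in-memory buffer rather than to `data/surveys<year>.csv`. It assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/surveys.csv (assumption: only a few columns,
# and no rows for 1976, mirroring the situation in the challenge).
surveys_df = pd.DataFrame({
    "record_id": [1, 2, 3],
    "year": [1977, 1977, 1978],
    "species_id": ["NL", "DM", "DM"],
})


def year_to_csv_text(df, year):
    """Return the text that to_csv would write for the rows of one year."""
    surveys_year = df[df.year == year]
    buffer = io.StringIO()
    surveys_year.to_csv(buffer)
    return buffer.getvalue()


# A year with no rows still produces output: the header line only.
csv_1976 = year_to_csv_text(surveys_df, 1976)
# A year with rows produces the header line plus one line per row.
csv_1977 = year_to_csv_text(surveys_df, 1977)

print(csv_1976.splitlines())
print(csv_1977.splitlines())

# If header-only files are unwanted, the selection can be tested first.
has_rows_1976 = not surveys_df[surveys_df.year == 1976].empty
```

In the lesson's loop, a check such as `if surveys_year.empty: continue` placed before `to_csv` would skip years without data; whether that is desirable depends on what the exported files are used for.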
