
Problem C
Windchill


As everyone knows, few things are as delightful as a cold gust of wind on a hot summer day. But it turns out that a cold gust of wind is incredibly dangerous! It is not without reason that it is listed alongside hurricanes and tornadoes in datasets of rare natural disasters. Since events as dangerous as gusts of wind can occur in everyday life, you have been tasked with developing a program that analyzes weather data from the past five days for a given location and determines whether any type of extreme weather will occur. If extreme weather is expected, your program must also classify which of the 36 extreme weather categories may occur.

For this task you are given training data; you are not allowed to find your own training data on the internet.

Input

Download the file with test data; it can be found at the bottom under "Attachments". You will receive a file with weather data for a specific location over the past five days. The data is structured as a table in which each row is one weather entry and each column is a meteorological variable. More precisely, the input contains the columns "dayXtmin", "dayXtavg", "dayXtmax", "dayXprcp", "dayXsnow", "dayXwdir", "dayXwspd", "dayXwpgt", "dayXpres", and "dayXtsun", describing the minimum temperature, average temperature, maximum temperature, precipitation, snowfall, wind direction, wind speed, maximum wind gust, air pressure, and total sunshine duration, respectively. Here X takes the values 1-5, where day1 is five days before the day for which you must predict extreme weather, and day5 is the day immediately before it. Note that some of these values may be missing, for example if a particular measurement was not possible at that location at that time. In the training data, the columns are preceded by a "label" column that indicates which of the 36 categories occurred (0-35), or -1 if none of them happened.
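Under the assumption that the file is a table with the columns named above (the exact delimiter and file name depend on the attachment and are not specified here), the expected column layout can be sketched as follows:

```python
# Meteorological variables listed in the statement, in order.
VARS = ["tmin", "tavg", "tmax", "prcp", "snow", "wdir", "wspd", "wpgt", "pres", "tsun"]

def training_columns():
    """Column names for the training data: a label followed by the
    day1..day5 version of each meteorological variable."""
    cols = ["label"]
    for day in range(1, 6):
        cols += [f"day{day}{v}" for v in VARS]
    return cols

# 1 label column + 5 days * 10 variables = 51 columns in total.
print(len(training_columns()))
```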

Output

For each test case, your program should produce a line with 36 numbers separated by spaces. Each number should be either 0 or 1:

  • 0 indicates that the specific extreme weather category will not occur.

  • 1 indicates that the specific extreme weather category will occur.

These numbers represent whether extreme weather will occur within any of the 36 categories. Note that you may indicate that zero or more categories will occur.
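A minimal sketch of formatting one output line; the trivial all-ones baseline shown here is only a hypothetical starting point (it exploits the asymmetric penalties described under Scoring), not a good solution:

```python
def format_prediction(flags):
    """Format a 36-element 0/1 prediction as one space-separated line."""
    assert len(flags) == 36
    return " ".join(str(int(f)) for f in flags)

# Since a missed event is penalized far more heavily than a false alarm,
# predicting every category is a crude baseline to compare against.
print(format_prediction([1] * 36))
```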

Example

Output:

0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Scoring

Your solution will be evaluated based on how well it predicts extreme weather categories compared to the correct data. Since failing to predict extreme weather that does occur is much more severe than predicting it when it does not, false negative predictions carry a higher penalty than false positive predictions.

The scoring is calculated as follows:

  • For each category you incorrectly predict will occur (a false positive), a penalty of 3.5 points is given.

  • For each category you incorrectly predict will not occur (a false negative), a penalty of 140 points is given.

The average penalty, $penalty$, is calculated by summing the penalty points of all wrong predictions and dividing by the total number of test cases. If $penalty$ is greater than or equal to $130$, you receive $0$ points; otherwise, your final points are calculated as follows:

\[ \text{Points} = \max\left(0, \min\left(100, 101 - \left(\frac{130}{130 - \text{penalty}}\right)^2\right)\right) \]
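The scoring formula above can be checked with a short sketch. The counts of false positives and false negatives here are hypothetical inputs; the formula itself is taken directly from the statement:

```python
def score(false_positives, false_negatives, n_testcases):
    """Contest points for a given number of wrong predictions,
    following the penalty and points formulas in the statement."""
    penalty = (3.5 * false_positives + 140 * false_negatives) / n_testcases
    if penalty >= 130:
        return 0.0
    return max(0.0, min(100.0, 101 - (130 / (130 - penalty)) ** 2))

print(score(0, 0, 10))  # perfect predictions: penalty 0 gives 100 points
```

Note that a single false negative costs as much as 40 false positives, which is why aggressive over-prediction can outperform a cautious classifier under this metric.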

The scoring function may be adjusted during the competition.

At the end of the competition, all solutions will be retested on the remaining 70% of the data. Your final score will be based only on that 70%; the 30% tested during the competition will have no effect. It is guaranteed that the 30% tested during the competition was chosen uniformly at random and is entirely disjoint from the 70% tested at the end. The results on the 30% should therefore be seen as a strong indicator of how well your solution performs, but overfitting your solution to that test data is detrimental.
