Can your data-science skills help Envision Racing take home even more trophies?
Dataset Description
Overview
In the heat of a Formula E race, teams need fast access to insights that can help drivers make split-second decisions and cross the finish line first. Can your data-science skills help Envision Racing, one of the founding teams in the championship, take home even more trophies?
To do so, you will have to build a machine learning model that predicts the Envision Racing drivers’ lap times for the all-important qualifying sessions, which determine the position in which they start the race. Winning races involves a combination of a driver’s skills and data analytics. To help the team, you’ll need to consider several factors that affect performance during a session, including weather, track conditions, and a driver’s familiarity with the track.
Genpact, a leading professional services firm that focuses on digital transformation, is launching ‘Dare in Reality’ in collaboration with Envision Racing, a Formula E racing team, and MachineHack, a digital hackathon platform and a brainchild of Analytics India Magazine. This two-week hackathon allows data science professionals, machine learning engineers, artificial intelligence practitioners, and other tech enthusiasts to showcase their skills, impress the judges, and stand a chance to win exciting cash prizes.
Genpact (NYSE: G) is a global professional services firm that makes business transformation real, driving digital-led innovation and digitally-enabled intelligent operations for its clients.
Dataset Description
- train.csv - 10276 rows x 25 columns (Includes target column as LAP_TIME)
Attributes
- NUMBER: Number in sequence
- DRIVER_NUMBER: Driver number
- LAP_NUMBER: Lap number
- LAP_TIME: Lap time in seconds
- LAP_IMPROVEMENT: Number of lap improvements
- CROSSING_FINISH_LINE_IN_PIT
- S1: Sector 1 in [min:sec.microseconds]
- S1_IMPROVEMENT: Improvement in sector 1
- S2: Sector 2 in [min:sec.microseconds]
- S2_IMPROVEMENT: Improvement in sector 2
- S3: Sector 3 in [min:sec.microseconds]
- S3_IMPROVEMENT: Improvement in sector 3
- KPH: Speed in kilometers per hour
- ELAPSED: Time elapsed in [min:sec.microseconds]
- HOUR: in [min:sec.microseconds]
- S1_LARGE: in [min:sec.microseconds]
- S2_LARGE: in [min:sec.microseconds]
- S3_LARGE: in [min:sec.microseconds]
- DRIVER_NAME: Name of the driver
- PIT_TIME: Time the car spends stopped in the pits while fuel and other consumables are renewed or replenished
- GROUP: Group of driver
- TEAM: Team name
- POWER: Brake horsepower (bhp)
- LOCATION: Location of the event
- EVENT: Free practice or qualifying
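Several attributes above are given as text in [min:sec.microseconds] form, so they usually need converting to a number before modelling. The following is a minimal sketch of one way to do that; the helper name `time_to_seconds` and the sample values are illustrative assumptions, not part of the dataset, and the actual files may contain blanks or other formats that need extra handling.

```python
def time_to_seconds(value: str) -> float:
    """Convert a 'min:sec.fraction' string to seconds.

    Hypothetical helper: it also accepts a plain 'sec.fraction' value
    and an 'hour:min:sec.fraction' value, since each colon-separated
    part is folded in as a base-60 digit.
    """
    seconds = 0.0
    for part in value.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Made-up sample values, for illustration only.
for sample in ["0:30.5", "1:23.456", "45.25"]:
    print(sample, "->", time_to_seconds(sample))
```

The same function could be applied to columns such as S1, S2, S3, and ELAPSED after checking how missing values are encoded.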
- test.csv - 420 rows x 25 columns (Includes target column as LAP_TIME)
- submission.csv - Please check the Evaluation section for more details on how to generate a valid submission.
The challenge is to predict the LAP_TIME for the qualifying groups of locations 6, 7 and 8.
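To inspect the rows that resemble the prediction target, one option is to filter on EVENT and LOCATION. The sketch below assumes that qualifying events contain the word "qualifying" and that LOCATION values end in the location number (for example "Location 6"); both are assumptions about the value format, and the rows shown are made up rather than taken from the files.

```python
import re

TARGET_LOCATIONS = {6, 7, 8}


def is_target_row(row: dict) -> bool:
    """Return True for a qualifying row at location 6, 7, or 8.

    Assumes EVENT contains 'qualifying' (any case) and LOCATION ends
    with the location number; adjust to the real value format.
    """
    is_qualifying = "qualifying" in row["EVENT"].lower()
    match = re.search(r"(\d+)\s*$", row["LOCATION"])
    return (
        is_qualifying
        and match is not None
        and int(match.group(1)) in TARGET_LOCATIONS
    )


# Made-up rows, for illustration only.
rows = [
    {"LOCATION": "Location 6", "EVENT": "Qualifying Group 1"},
    {"LOCATION": "Location 6", "EVENT": "Free Practice 1"},
    {"LOCATION": "Location 2", "EVENT": "Qualifying Group 3"},
    {"LOCATION": "Location 8", "EVENT": "Qualifying Group 2"},
]
targets = [row for row in rows if is_target_row(row)]
print(targets)
```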
Knowledge and Skills
- Multivariate Regression
- Big dataset, underfitting vs overfitting
- Optimizing RMSLE to generalize well on unseen data
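For reference, RMSLE (root mean squared logarithmic error) can be computed as sketched below. This assumes the standard definition, the square root of the mean of squared differences between log(1 + prediction) and log(1 + actual); the Evaluation section remains the authority on how submissions are actually scored, and the function name `rmsle` is only illustrative.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error under the standard definition.

    Inputs must be equal-length, non-empty sequences of values greater
    than -1 (lap times in seconds are positive, so this holds).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up lap times in seconds, for illustration only.
print(rmsle([90.0, 85.5, 92.3], [91.2, 84.0, 95.0]))
```

Because the error is taken on a log scale, it reflects relative rather than absolute differences, so a one-second miss on a short lap counts for more than the same miss on a long lap.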
Related Datasets
- Formula 1 Drivers Dataset (@kaggle)