Envision Racing

Can your data-science skills help Envision Racing take home even more trophies?

@kaggle.pavan9065_envision_racing

About this Dataset

Overview

In the heat of a Formula E race, teams need fast access to insights that can help drivers make split-second decisions and cross the finish line first. Can your data-science skills help Envision Racing, one of the founding teams in the championship, take home even more trophies?

To do so, you will have to build a machine learning model that predicts the Envision Racing drivers’ lap times for the all-important qualifying sessions, which determine the position in which they start the race. Winning races involves a combination of a driver’s skill and data analytics. To help the team, you’ll need to consider several factors that affect performance during a session, including weather, track conditions, and a driver’s familiarity with the track.

Genpact, a leading professional services firm that focuses on digital transformation, in collaboration with Envision Racing, a Formula E racing team, and the digital hackathon platform MachineHack, a brainchild of Analytics India Magazine, is launching ‘Dare in Reality’. This two-week hackathon allows data science professionals, machine learning engineers, artificial intelligence practitioners, and other tech enthusiasts to showcase their skills, impress the judges, and stand a chance to win cash prizes.

Genpact (NYSE: G) is a global professional services firm that makes business transformation real, driving digital-led innovation and digitally enabled intelligent operations for its clients.

Dataset Description

  • train.csv - 10276 rows x 25 columns (Includes target column as LAP_TIME)
    Attributes
  • NUMBER: Number in sequence
  • DRIVER_NUMBER: Driver number
  • LAP_NUMBER: Lap number
  • LAP_TIME: Lap time in seconds
  • LAP_IMPROVEMENT: Number of Lap Improvement
  • CROSSING_FINISH_LINE_IN_PIT
  • S1: Sector 1 in [min:sec.microseconds]
  • S1_IMPROVEMENT: Improvement in sector 1
  • S2: Sector 2 in [min:sec.microseconds]
  • S2_IMPROVEMENT: Improvement in sector 2
  • S3: Sector 3 in [min:sec.microseconds]
  • S3_IMPROVEMENT: Improvement in sector 3
  • KPH: Speed in kilometers per hour
  • ELAPSED: Time elapsed in [min:sec.microseconds]
  • HOUR: in [min:sec.microseconds]
  • S1_LARGE: in [min:sec.microseconds]
  • S2_LARGE: in [min:sec.microseconds]
  • S3_LARGE: in [min:sec.microseconds]
  • DRIVER_NAME: Name of the driver
  • PIT_TIME: time taken to car stops in the pits for fuel and other consumables to be renewed or replenished
  • GROUP: Group of driver
  • TEAM: Team name
  • POWER: Brake Horsepower(bhp)
  • LOCATION: Location of the event
  • EVENT: Free practice or qualifying
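
Several attributes above (S1, S2, S3, ELAPSED and the *_LARGE columns) are described as clock-style strings, while LAP_TIME is in seconds. A minimal sketch of converting such strings to seconds follows. The helper name time_to_seconds is hypothetical, and the exact string layout in the CSV files (for example "1:23.456") is an assumption that should be checked against the data.

```python
def time_to_seconds(value: str) -> float:
    """Convert a clock-style string to seconds.

    Assumes colon-separated fields with the least significant field last,
    e.g. "23.456", "1:23.456" or "1:02:03.25". This layout is an assumption,
    not something the dataset description confirms.
    """
    seconds = 0.0
    for part in value.strip().split(":"):
        # Each field to the left is worth 60 times the field to its right.
        seconds = seconds * 60 + float(part)
    return seconds


if __name__ == "__main__":
    for raw in ("23.5", "1:23.5", "1:02:03.25"):
        print(raw, time_to_seconds(raw))
```

Folding the fields left to right lets one function handle values with or without a minutes or hours part, which may help if the columns are not uniformly formatted.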

test.csv - 420 rows x 25 columns (Includes target column as LAP_TIME)

submission.csv - Please check the Evaluation section for more details on how to generate a valid submission.

The challenge is to predict the LAP_TIME for the qualifying groups of locations 6, 7 and 8.
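
Selecting the rows the challenge targets could look roughly like the sketch below. It uses only the LOCATION and EVENT column names from the attribute list. The sample rows, the value formats ("Location 6", "Qualifying Group 2") and the helper is_target_row are made up for illustration and may not match the real files.

```python
import csv
import io

# Hypothetical sample rows; the real LOCATION and EVENT strings may differ.
SAMPLE = """LOCATION,EVENT,LAP_TIME
Location 5,Qualifying Group 1,71.2
Location 6,Free Practice 1,75.0
Location 6,Qualifying Group 2,70.4
Location 8,Qualifying Group 4,69.9
"""

TARGET_LOCATIONS = {"6", "7", "8"}


def is_target_row(row: dict) -> bool:
    """True for qualifying rows at locations 6, 7 or 8 (assumed formats)."""
    parts = row["LOCATION"].split()
    location_id = parts[-1] if parts else ""
    return location_id in TARGET_LOCATIONS and "qualifying" in row["EVENT"].lower()


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
target_rows = [row for row in rows if is_target_row(row)]

if __name__ == "__main__":
    for row in target_rows:
        print(row["LOCATION"], row["EVENT"], row["LAP_TIME"])
```

With the real files, io.StringIO(SAMPLE) would be replaced by an open file handle on train.csv or test.csv.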

Knowledge and Skills

  • Multivariate Regression
  • Big dataset, underfitting vs overfitting
  • Optimizing RMSLE to generalize well on unseen data
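
The Evaluation section is not reproduced here, so the sketch below uses the standard definition of RMSLE (root mean squared logarithmic error): the square root of the mean squared difference between log(1 + predicted) and log(1 + actual). Whether the hackathon scores submissions in exactly this way is an assumption.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error, standard definition.

    Both sequences must be non-empty, equally long, and contain values
    greater than -1 (lap times in seconds are positive).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    print(rmsle([70.0, 80.0], [72.0, 79.0]))
```

Because the error is computed on logarithms, RMSLE penalizes relative rather than absolute differences, so a one-second miss on a short lap costs more than the same miss on a long lap.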
