
Envision Racing

Can your data-science skills help Envision Racing take home even more trophies?

@kaggle.pavan9065_envision_racing


About this Dataset


Overview

In the heat of a Formula E race, teams need fast access to insights that can help drivers make split-second decisions and cross the finish line first. Can your data-science skills help Envision Racing, one of the founding teams in the championship, take home even more trophies?

To do so, you will have to build a machine learning model that predicts the Envision Racing drivers’ lap times for the all-important qualifying sessions that determine what position they start the race in. Winning races involves a combination of a driver’s skill and data analytics. To help the team, you’ll need to consider several factors that affect performance during a session, including weather, track conditions, and a driver’s familiarity with the track.

Genpact, a leading professional services firm that focuses on digital transformation, is collaborating with Envision Racing, a Formula E racing team, and the digital hackathon platform MachineHack, a brainchild of Analytics India Magazine, to launch ‘Dare in Reality’. This two-week hackathon allows data science professionals, machine learning engineers, artificial intelligence practitioners, and other tech enthusiasts to showcase their skills, impress the judges, and stand a chance to win cash prizes.

Genpact (NYSE: G) is a global professional services firm that makes business transformation real, driving digital-led innovation and digitally-enabled intelligent operations for its clients.

Dataset Description

  • train.csv - 10,276 rows x 25 columns (includes the target column, LAP_TIME)
    Attributes:
  • NUMBER: Number in sequence
  • DRIVER_NUMBER: Driver number
  • LAP_NUMBER: Lap number
  • LAP_TIME: Lap time in seconds
  • LAP_IMPROVEMENT: Number of Lap Improvement
  • CROSSING_FINISH_LINE_IN_PIT
  • S1: Sector 1 in [min:sec.microseconds]
  • S1_IMPROVEMENT: Improvement in sector 1
  • S2: Sector 2 in [min:sec.microseconds]
  • S2_IMPROVEMENT: Improvement in sector 2
  • S3: Sector 3 in [min:sec.microseconds]
  • S3_IMPROVEMENT: Improvement in sector 3
  • KPH: Speed in kilometers per hour
  • ELAPSED: Time elapsed in [min:sec.microseconds]
  • HOUR: in [min:sec.microseconds]
  • S1_LARGE: in [min:sec.microseconds]
  • S2_LARGE: in [min:sec.microseconds]
  • S3_LARGE: in [min:sec.microseconds]
  • DRIVER_NAME: Name of the driver
  • PIT_TIME: Time the car spends stopped in the pits while fuel and other consumables are renewed or replenished
  • GROUP: Group of driver
  • TEAM: Team name
  • POWER: Brake horsepower (bhp)
  • LOCATION: Location of the event
  • EVENT: Free practice or qualifying
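
Several of the attributes above (S1, S2, S3, ELAPSED and others) are described as [min:sec.microseconds] text, so they would need converting to numbers before being used as model features. The following is a minimal sketch, assuming the values are colon-separated with a decimal seconds part. The helper name parse_clock_to_seconds is hypothetical, and the exact string format in the CSV files should be checked before relying on it.

```python
def parse_clock_to_seconds(value):
    """Convert 'min:sec.fraction' (or 'h:min:sec.fraction') text to seconds.

    Assumption: fields are separated by ':' and the last field may carry a
    decimal fraction. Empty or missing values return None. Text that is not
    numeric raises ValueError so bad rows are noticed rather than hidden.
    """
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    seconds = 0.0
    # Each ':' moves one level up in base 60 (seconds -> minutes -> hours).
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds
```

A value with no colon is treated as plain seconds, so the same helper works whether or not a sector time exceeds one minute.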

test.csv - 420 rows x 25 columns (includes the target column, LAP_TIME)

submission.csv - Please check the Evaluation section for more details on how to generate a valid submission.

The challenge is to predict the LAP_TIME for the qualifying groups of locations 6, 7 and 8.

Knowledge and Skills

  • Multivariate Regression
  • Big dataset, underfitting vs overfitting
  • Optimizing RMSLE to generalize well on unseen data
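
For reference, RMSLE (root mean squared logarithmic error) can be computed as below. This is a sketch of the standard definition, using log(1 + x) on both the actual and predicted values. The Evaluation section is not reproduced here, so whether the competition scores submissions in exactly this way is an assumption. The function name rmsle is illustrative.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between two equal-length sequences.

    Uses log1p(x) = log(1 + x), so zero values are allowed. Inputs must be
    greater than -1, which holds for lap times.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        total += (math.log1p(predicted) - math.log1p(actual)) ** 2
    return math.sqrt(total / len(y_true))
```

Because the error is taken on a log scale, the metric penalizes relative rather than absolute differences, so a 2-second miss on a 60-second lap counts for more than the same miss on a 120-second lap.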

Tables

Submission

@kaggle.pavan9065_envision_racing.submission
  • 2.01 kB
  • 420 rows
  • 2 columns
CREATE TABLE submission (
  "lap" BIGINT,
  "time" VARCHAR
);

Test

@kaggle.pavan9065_envision_racing.test
  • 39.25 kB
  • 420 rows
  • 25 columns
CREATE TABLE test (
  "number" BIGINT,
  "n__driver_number" BIGINT,  -- DRIVER NUMBER
  "n__lap_number" BIGINT,  -- LAP NUMBER
  "lap_time" VARCHAR,
  "n__lap_improvement" BIGINT,  -- LAP IMPROVEMENT
  "n__crossing_finish_line_in_pit" VARCHAR,  -- CROSSING FINISH LINE IN PIT
  "n__s1" VARCHAR,  -- S1
  "n__s1_improvement" BIGINT,  -- S1 IMPROVEMENT
  "n__s2" VARCHAR,  -- S2
  "n__s2_improvement" BIGINT,  -- S2 IMPROVEMENT
  "n__s3" VARCHAR,  -- S3
  "n__s3_improvement" BIGINT,  -- S3 IMPROVEMENT
  "n__kph" DOUBLE,  -- KPH
  "n__elapsed" VARCHAR,  -- ELAPSED
  "n__hour" VARCHAR,  -- HOUR
  "s1_large" VARCHAR,
  "s2_large" VARCHAR,
  "s3_large" VARCHAR,
  "driver_name" VARCHAR,
  "pit_time" VARCHAR,
  "group" DOUBLE,
  "team" VARCHAR,
  "power" DOUBLE,
  "location" VARCHAR,
  "event" VARCHAR
);

Test Weather

@kaggle.pavan9065_envision_racing.test_weather
  • 14.01 kB
  • 167 rows
  • 11 columns
CREATE TABLE test_weather (
  "time_utc_seconds" BIGINT,
  "time_utc_str" VARCHAR,
  "air_temp" BIGINT,
  "track_temp" BIGINT,
  "humidity" BIGINT,
  "pressure" BIGINT,
  "wind_speed" BIGINT,
  "wind_direction" BIGINT,
  "rain" BIGINT,
  "location" VARCHAR,
  "events" VARCHAR
);

Train

@kaggle.pavan9065_envision_racing.train
  • 425.71 kB
  • 10,276 rows
  • 25 columns
CREATE TABLE train (
  "number" BIGINT,
  "n__driver_number" BIGINT,  -- DRIVER NUMBER
  "n__lap_number" BIGINT,  -- LAP NUMBER
  "lap_time" BIGINT,
  "n__lap_improvement" BIGINT,  -- LAP IMPROVEMENT
  "n__crossing_finish_line_in_pit" VARCHAR,  -- CROSSING FINISH LINE IN PIT
  "n__s1" VARCHAR,  -- S1
  "n__s1_improvement" BIGINT,  -- S1 IMPROVEMENT
  "n__s2" VARCHAR,  -- S2
  "n__s2_improvement" BIGINT,  -- S2 IMPROVEMENT
  "n__s3" VARCHAR,  -- S3
  "n__s3_improvement" BIGINT,  -- S3 IMPROVEMENT
  "n__kph" DOUBLE,  -- KPH
  "n__elapsed" VARCHAR,  -- ELAPSED
  "n__hour" VARCHAR,  -- HOUR
  "s1_large" VARCHAR,
  "s2_large" VARCHAR,
  "s3_large" VARCHAR,
  "driver_name" VARCHAR,
  "pit_time" VARCHAR,
  "group" DOUBLE,
  "team" VARCHAR,
  "power" DOUBLE,
  "location" VARCHAR,
  "event" VARCHAR
);

Train Weather

@kaggle.pavan9065_envision_racing.train_weather
  • 31.36 kB
  • 914 rows
  • 11 columns
CREATE TABLE train_weather (
  "time_utc_seconds" BIGINT,
  "time_utc_str" VARCHAR,
  "air_temp" DOUBLE,
  "track_temp" DOUBLE,
  "humidity" BIGINT,
  "pressure" DOUBLE,
  "wind_speed" DOUBLE,
  "wind_direction" BIGINT,
  "rain" BIGINT,
  "location" VARCHAR,
  "event" VARCHAR
);
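
The weather tables are keyed by time_utc_seconds and location, so one way to attach weather to a lap is to look up the most recent reading at or before the lap's timestamp for the same location. The sketch below shows that lookup using only the standard library. It assumes each weather row is available as a dict keyed by the column names above, that location strings match between the lap and weather tables, and that a UTC timestamp in seconds has already been derived for each lap (the lap tables store time as text, so that step is not shown). The function names build_weather_index and weather_at are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_weather_index(weather_rows):
    """Group weather rows by location and sort each group by time."""
    by_location = defaultdict(list)
    for row in weather_rows:
        by_location[row["location"]].append(row)
    index = {}
    for location, rows in by_location.items():
        rows.sort(key=lambda r: r["time_utc_seconds"])
        times = [r["time_utc_seconds"] for r in rows]
        index[location] = (times, rows)
    return index


def weather_at(index, location, utc_seconds):
    """Return the latest weather row at or before utc_seconds, else None."""
    if location not in index:
        return None
    times, rows = index[location]
    # bisect_right gives the insertion point after any equal timestamps,
    # so the element just before it is the latest reading not in the future.
    pos = bisect_right(times, utc_seconds) - 1
    return rows[pos] if pos >= 0 else None
```

Building the index once and reusing it keeps each lookup logarithmic in the number of readings per location, which matters little at 914 rows but avoids rescanning the table for every lap.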
