OpenAI HumanEval Code Gen

Handcrafted Python Programming Problems for Accurate Model Evaluation

@kaggle.thedevastator_openai_humaneval_code_gen

About this Dataset

By Huggingface Hub [source]

HumanEval, released by OpenAI, offers developers and researchers the opportunity to accurately evaluate their code generation models in a safe environment. It includes 164 handcrafted programming problems written by OpenAI engineers and researchers, specifically designed to test the functional correctness of generated code. Written in Python, these problems feature docstrings and comments full of natural English text, which can be difficult for models to comprehend. Each problem includes a function signature, a body, and several unit tests. Released under the MIT License, the HumanEval dataset is ideal for any practitioner looking to judge the efficacy of machine-generated code against trusted results.
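
For illustration, a problem in this format might look like the following sketch (a made-up example in the dataset's style, not an actual entry):

# Illustrative example in the HumanEval style (not a real dataset entry).

# prompt: the function signature plus a natural-language docstring
def add_two(x: int, y: int) -> int:
    """Return the sum of x and y."""
    # canonical_solution: the hand-written reference body
    return x + y

# test: unit tests, wrapped in a check() function by convention
def check(candidate):
    assert candidate(1, 2) == 3
    assert candidate(-1, 1) == 0

# entry_point: "add_two", the function name the tests are run against
check(add_two)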

How to use the dataset

The first step is to explore the data included in the set by viewing its columns. This guide focuses on four key columns: prompt, canonical_solution, test, and entry_point.

  • The prompt column contains natural English text describing the programming problem.
  • The canonical_solution column holds the correct solution to each programming problem as determined by OpenAI researchers or engineers who hand-crafted the dataset.
  • The test column contains unit tests designed to check for correctness when debugging or evaluating code generated by neural networks or other automated tools.
  • The entry_point column contains the name of the function to be implemented; the unit tests call this function to verify a solution.

With this information we can now put the dataset to use in our own projects, from building case studies for specific AI algorithms to developing automated programs that generate source code from datasets like HumanEval.
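
A minimal loading sketch, assuming the file test.csv from this dataset has been downloaded locally and pandas is installed:

import pandas as pd

# Load the dataset; each row is one of the 164 programming problems.
df = pd.read_csv("test.csv")
print(df.columns.tolist())
# Expected: ['task_id', 'prompt', 'canonical_solution', 'test', 'entry_point']

# Inspect the first problem: the function to implement and its description.
first = df.iloc[0]
print(first["entry_point"])
print(first["prompt"])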

Research Ideas

  • Training code generation models in a limited and supervised environment.
  • Benchmarking the performance of existing code generation models: HumanEval provides both the canonical solution for each problem and unit tests that can be used to evaluate model accuracy (see the sketch after this list).
  • Using Natural Language Processing (NLP) algorithms on the docstrings and comments within HumanEval to develop better natural language understanding for programming contexts.
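
To make the benchmarking idea concrete, below is a minimal sketch of scoring a model-generated completion against one problem's unit tests. It assumes each value in the test column defines a check() function that takes the entry-point callable, which matches how HumanEval's tests are structured; running exec() on untrusted model output should only ever be done inside a sandbox.

def evaluate_candidate(row, completion: str) -> bool:
    """Return True if the completion passes the problem's unit tests.

    row is one record from test.csv; completion is the model-generated
    function body that continues the prompt.
    """
    namespace = {}
    try:
        # The prompt holds the signature and docstring; the completion
        # supplies the body, so together they form a full definition.
        exec(row["prompt"] + completion, namespace)
        # The test column defines check(candidate), which raises an
        # AssertionError on any failing unit test.
        exec(row["test"], namespace)
        namespace["check"](namespace[row["entry_point"]])
        return True
    except Exception:
        return False

Averaging this boolean over sampled completions for all 164 problems yields pass-rate metrics such as pass@1.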

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission.

Columns

File: test.csv

Column name          Description
task_id              A unique identifier for each programming problem. (String)
prompt               A description of the programming problem, including the function signature and docstring. (String)
canonical_solution   The hand-written reference solution to the problem. (String)
test                 Unit tests that verify the correctness of a solution. (String)
entry_point          The name of the function the unit tests call. (String)

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Tables

Test

@kaggle.thedevastator_openai_humaneval_code_gen.test
  • 83.24 KB
  • 164 rows
  • 5 columns

CREATE TABLE test (
  "task_id" VARCHAR,
  "prompt" VARCHAR,
  "canonical_solution" VARCHAR,
  "test" VARCHAR,
  "entry_point" VARCHAR
);
