1 Million Sudoku Games by Kaggle | Media and Entertainment

About this Dataset

Context

Sudoku is a popular number puzzle in which you fill the blanks in a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 subgrids contains all of the digits from 1 to 9. Sudoku solving has attracted attention from various fields. As a deep learning researcher, I wanted to investigate whether neural networks can solve Sudoku, and I prepared this dataset for that purpose.
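
The rule stated above can be checked mechanically. The following is a minimal sketch, not part of the original dataset code: the function name `is_valid_solution` and the formula used to build an example grid are assumptions made for illustration.

```python
import numpy as np

def is_valid_solution(grid):
    """Return True if a 9x9 grid satisfies the row, column, and 3x3 subgrid rules."""
    grid = np.asarray(grid)
    if grid.shape != (9, 9):
        return False
    digits = set(range(1, 10))
    rows_ok = all(set(row.tolist()) == digits for row in grid)
    cols_ok = all(set(col.tolist()) == digits for col in grid.T)
    boxes_ok = all(
        set(grid[r:r + 3, c:c + 3].ravel().tolist()) == digits
        for r in range(0, 9, 3)
        for c in range(0, 9, 3)
    )
    return rows_ok and cols_ok and boxes_ok

# A synthetic valid grid built from a shifting pattern (illustrative only,
# not taken from the dataset).
example = np.array(
    [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
)
print(is_valid_solution(example))
```

Any 9×9 array from the solutions file could be passed to the same function in place of `example`.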

Content

Dozens of Sudoku generators are available as source code. I picked one of them and ran it. Generating 1 million games (plus solutions) took approximately 6 hours.

A Sudoku puzzle is represented as a 9×9 NumPy array, with the blanks replaced by 0s. You can load and explore the data as follows:

import numpy as np
quizzes = np.load('sudoku_quizzes.npy') # shape = (1000000, 9, 9)
solutions = np.load('sudoku_solutions.npy') # shape = (1000000, 9, 9)
for quiz, solution in zip(quizzes[:10], solutions[:10]):
    print(quiz)
    print(solution)
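
As a sanity check on loaded arrays, one might confirm that every given digit in a quiz agrees with its solution, and count how many cells are blank. This is only a sketch: the helper names `clues_match` and `count_blanks` are hypothetical, and the example uses a small synthetic grid rather than the dataset files.

```python
import numpy as np

def clues_match(quiz, solution):
    """Return True if every non-blank cell of the quiz equals the solution cell."""
    quiz = np.asarray(quiz)
    solution = np.asarray(solution)
    given = quiz != 0  # blanks are stored as 0
    return bool(np.array_equal(quiz[given], solution[given]))

def count_blanks(quiz):
    """Return the number of blank (zero) cells in a quiz."""
    return int(np.count_nonzero(np.asarray(quiz) == 0))

# Synthetic solution and quiz for illustration (not taken from the dataset).
solution = np.array(
    [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
)
quiz = solution.copy()
quiz[::2, ::2] = 0  # blank out every other cell on every other row

print(clues_match(quiz, solution), count_blanks(quiz))
```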

**Updates for Version 3**

I converted the NumPy arrays to CSV so that they are easily accessible irrespective of language. In each line, a Sudoku quiz and its corresponding solution are separated by a comma. If needed, you can restore the CSV file content to NumPy arrays as follows:

import numpy as np

quizzes = np.zeros((1000000, 81), np.int32)
solutions = np.zeros((1000000, 81), np.int32)
with open('sudoku.csv', 'r') as f:
    next(f)  # skip the header line
    for i, line in enumerate(f):
        quiz, solution = line.strip().split(",")
        # each field is a string of 81 digits; 0 marks a blank
        quizzes[i] = [int(c) for c in quiz]
        solutions[i] = [int(c) for c in solution]
quizzes = quizzes.reshape((-1, 9, 9))
solutions = solutions.reshape((-1, 9, 9))
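
To work with a single row rather than the whole file, a digit string can be converted to a grid directly. This is a minimal sketch assuming each field is a string of exactly 81 ASCII digits, as the loop above implies. The function name `parse_grid` and the sample string are hypothetical.

```python
import numpy as np

def parse_grid(digits):
    """Convert an 81-character digit string into a 9x9 int32 array."""
    if len(digits) != 81 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of exactly 81 ASCII digits")
    # Read the ASCII bytes and subtract the code of '0' to get values 0..9.
    flat = np.frombuffer(digits.encode("ascii"), dtype=np.uint8) - ord("0")
    return flat.astype(np.int32).reshape(9, 9)

# Made-up sample: one partially filled row followed by blanks.
sample = "004300209" + "0" * 72
grid = parse_grid(sample)
print(grid[0])
```

Reading the bytes in one step avoids a Python-level loop over the 81 characters, which may matter when converting many rows.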

Acknowledgements

I'm grateful to Arel Cordero, who wrote and shared this great Sudoku generation code: https://www.ocf.berkeley.edu/~arel/sudoku/main.html

Inspiration

Check https://github.com/Kyubyong/sudoku to see whether CNNs can crack Sudoku puzzles.
Reinforcement learning may also be a promising alternative approach to this task.
Feel free to take on the challenge of solving Sudoku puzzles yourself.

Tables

Sudoku

@kaggle.bryanpark_sudoku.sudoku

105.86 MB
1000000 rows
2 columns


CREATE TABLE sudoku (
  "quizzes" VARCHAR,
  "solutions" VARCHAR
);