1 Million Sudoku Games
1 million numpy array pairs of Sudoku games and solutions
@kaggle.bryanpark_sudoku
Sudoku is a popular number puzzle that requires you to fill the blanks in a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 subgrids contains all of the digits from 1 to 9. Sudoku solving has attracted attention from various fields. As a deep learning researcher, I wanted to investigate whether neural networks can solve Sudoku. This dataset was prepared for that purpose.
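The rule above can be checked mechanically. The following is a minimal sketch, not part of the dataset: the function name `is_valid_solution` is hypothetical, and the demo grid is a synthetic pattern built from a formula rather than a puzzle taken from the files.

```python
import numpy as np

def is_valid_solution(grid):
    """Return True if every row, column, and 3x3 subgrid holds the digits 1-9 exactly once."""
    grid = np.asarray(grid)
    if grid.shape != (9, 9):
        return False
    target = np.arange(1, 10)
    rows_ok = all(np.array_equal(np.sort(row), target) for row in grid)
    cols_ok = all(np.array_equal(np.sort(col), target) for col in grid.T)
    # Rearrange so that each row of `boxes` holds the 9 cells of one 3x3 subgrid.
    boxes = grid.reshape(3, 3, 3, 3).swapaxes(1, 2).reshape(9, 9)
    boxes_ok = all(np.array_equal(np.sort(box), target) for box in boxes)
    return rows_ok and cols_ok and boxes_ok

# Synthetic solved grid (an assumption for illustration, not dataset content).
solved = np.array([[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
                   for r in range(9)])

# Duplicate a digit in the first row to break the grid.
broken = solved.copy()
broken[0, 0] = broken[0, 1]

print(is_valid_solution(solved), is_valid_solution(broken))
```

The same function could be applied to any entry of the solutions array once it is loaded.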
Dozens of Sudoku generators are available as source code. I picked one of them and ran it. It took approximately 6 hours to generate 1 million games (plus solutions).
A Sudoku puzzle is represented as a 9×9 NumPy array, with the blanks replaced by 0s. You can load and explore the data by running the following:
import numpy as np
quizzes = np.load('sudoku_quizzes.npy') # shape = (1000000, 9, 9)
solutions = np.load('sudoku_solutions.npy') # shape = (1000000, 9, 9)
for quiz, solution in zip(quizzes[:10], solutions[:10]):
    print(quiz)
    print(solution)
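Because blanks are stored as 0, a quiz and its solution can be compared cell by cell. The sketch below is one possible sanity check under that assumption; the helper names `matches_clues` and `count_blanks` are hypothetical, and the example pair is synthetic so the block runs without the data files.

```python
import numpy as np

def matches_clues(quiz, solution):
    """True if every non-blank (non-zero) quiz cell equals the same cell in the solution."""
    quiz = np.asarray(quiz)
    solution = np.asarray(solution)
    clues = quiz != 0
    return bool(np.array_equal(quiz[clues], solution[clues]))

def count_blanks(quiz):
    """Number of cells marked as blank (0) in a quiz."""
    return int(np.count_nonzero(np.asarray(quiz) == 0))

# Synthetic pair for illustration only (not taken from the dataset).
solution = np.array([[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
                     for r in range(9)])
quiz = solution.copy()
quiz[::2, ::2] = 0  # blank out every other cell of every other row

print(count_blanks(quiz), matches_clues(quiz, solution))
```

With the real arrays, the same helpers could be called on `quizzes[i]` and `solutions[i]`.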
**Updates for Version 3**
I converted the NumPy arrays to CSV so they are easily accessible from any language. In each line, a Sudoku quiz and its corresponding solution are separated by a comma. If needed, you can restore the CSV file content to NumPy arrays as follows:
import numpy as np

quizzes = np.zeros((1000000, 81), np.int32)
solutions = np.zeros((1000000, 81), np.int32)

# Read all lines and skip the first (header) line.
with open('sudoku.csv', 'r') as f:
    lines = f.read().splitlines()[1:]

for i, line in enumerate(lines):
    quiz, solution = line.split(",")
    # Each field is a string of 81 digit characters, one per cell.
    for j, (q, s) in enumerate(zip(quiz, solution)):
        quizzes[i, j] = int(q)
        solutions[i, j] = int(s)

# Restore the 9x9 grid shape.
quizzes = quizzes.reshape((-1, 9, 9))
solutions = solutions.reshape((-1, 9, 9))
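The per-character loop can also be replaced by converting each 81-digit string in one step. This is only a sketch of an alternative, assuming each field contains exactly 81 ASCII digits as described above; `parse_grid` and `parse_line` are hypothetical helper names, and the demo line is synthetic.

```python
import numpy as np

def parse_grid(digits):
    """Convert a string of 81 ASCII digits into a 9x9 int32 array."""
    if len(digits) != 81 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of exactly 81 digits")
    # Read the raw ASCII codes, then subtract the code of '0' to get digit values.
    flat = np.frombuffer(digits.encode("ascii"), dtype=np.uint8) - ord("0")
    return flat.astype(np.int32).reshape(9, 9)

def parse_line(line):
    """Split one 'quiz,solution' line and return both grids."""
    quiz, solution = line.strip().split(",")
    return parse_grid(quiz), parse_grid(solution)

# Synthetic line for illustration only (not taken from sudoku.csv).
solution_str = "".join(str((3 * (r % 3) + r // 3 + c) % 9 + 1)
                       for r in range(9) for c in range(9))
quiz_str = "0" + solution_str[1:]  # blank out the first cell
quiz, solution = parse_line(quiz_str + "," + solution_str)

print(quiz.shape, solution.shape)
```

Working on whole strings avoids assigning 81 cells one at a time per grid, which may matter when parsing a million lines.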
I'm grateful to Arel Cordero, who wrote and shared this great Sudoku generation code: https://www.ocf.berkeley.edu/~arel/sudoku/main.html