A few days ago, during a game of Warhammer 40000, after my opponent rolled 30 dice at once and took forever to read them, he challenged me: “can you find a solution to automatically read the dice roll result?”
After thinking a bit, the only solution that seemed feasible to me (for such a large number of dice) was to use a webcam to capture an image of the dice and process it with computer vision algorithms. In this series of posts, we will proceed experimentally in search of the most effective solution for reading a dice roll.
Before moving on, a small premise: I have sometimes used computer vision for projects at work, but this does not make me an expert. So any suggestions are more than welcome! For me these posts will be an opportunity to learn something new.
What is computer vision?
Computer vision is the scientific field that deals with how computers can understand digital images or videos. In the context of computer vision, understanding means the transformation of raw image data (an array of color values) into high-level descriptions for purposes of control, classification, selection, etc.
In our experiment, for detecting and reading dice, we will use OpenCV.
OpenCV (Open source Computer Vision) is a cross-platform library mainly aimed at real-time computer vision. OpenCV is written in the C++ language, so we will use this language for our program. However, there are bindings for many other languages and therefore, if you want, it should not be too complicated to rewrite the program in your preferred language.
To install and configure OpenCV for your OS/language/IDE, follow the instructions in the Introduction to OpenCV page.
Our goal
Before starting, let’s clarify the final goal of the program we are about to write.
When rolling dice to attack in Warhammer 40K, to know how many hits are successful you need to count how many dice show at least a given score. So the output we want from the program is a table that, for each value from 2 to 6, tells us how many dice got that score or more.
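To fix the idea, here is a small sketch in plain C++ of how that table is derived from a list of die results (the function name and the sample roll are made up for illustration):

```cpp
#include <array>
#include <vector>

// Given the face shown by each die, return how many dice scored
// at least 2, 3, 4, 5 and 6 (the "2+" ... "6+" table).
std::array<int, 5> atLeastTable(const std::vector<int>& roll) {
    std::array<int, 5> table{};
    for (int target = 2; target <= 6; target++) {
        int count = 0;
        for (int face : roll)
            if (face >= target)
                count++;
        table[target - 2] = count;
    }
    return table;
}
```

For the roll {6, 3, 2, 5, 5, 1, 4, 6} this gives 7 dice at 2+, 6 at 3+, 5 at 4+, 4 at 5+ and 2 at 6+. The rest of this post is about getting that list of die values out of a webcam image.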
Experiment setup
The setup for this dice-reading experiment in OpenCV is very simple:
- we will place a webcam at a predefined height above a dark, uniform plane
- we will use a set of white dice with black dots
Since the height of the webcam is fixed and we’ll always use the same set of dice, the dice will always appear at approximately the same size in the image. This will be useful to filter out false positives in dice detection.
Let’s start!
The first method I want to try for reading the dice consists of two “simple” steps:
- detect the square that bounds each single die;
- for each of these squares, detect the dots that are inside and count them.
Let’s start by writing the basic structure of the program, to become familiar with OpenCV:
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
using namespace std;
int main() {
VideoCapture capture;
capture.open(0);
namedWindow("Image");
while (1) {
Mat image;
capture.read(image);
imshow("Image", image);
int c = waitKey(1);
if (c != -1)
break;
}
}
In the first few lines we include all the libraries we need and tell the compiler to use the std and cv namespaces (so we won’t be forced to put the prefix cv:: in front of all OpenCV classes and functions).
Within the main function, the first thing we encounter is the VideoCapture class. This class provides an API for capturing video from cameras or for reading video files and image sequences. Calling its open method with 0 as the argument opens the default camera (if you only have one webcam, it will be that one).
The function namedWindow creates a window that can be used to show an image. Windows are referred to by their names. For now, we want a window to show the original image captured by the camera.
Next, we need a loop that continuously captures frames from the camera. Within this loop, we will call the VideoCapture method read, which takes as its argument an object of type Mat in which the frame will be saved. The class Mat represents an n-dimensional single-channel or multi-channel array. It can be used for multiple purposes, but the important thing to know for now is that Mat objects are used to store image data.
The function imshow is used to show the captured image inside the window we created before.
At the end of the loop, we use the waitKey function to wait for a key press event: it returns the code of the key pressed. If the argument of waitKey is greater than 0, the function waits for that many milliseconds, then returns -1 if no key event was detected. We will exit the loop as soon as any key is pressed. The waitKey function is the only one in OpenCV that can fetch and handle events, so it must be called periodically or the program will not run correctly.
If we run the program now, we will get something like this:

Edge detection
Edge detection refers to the process of identifying and locating sharp discontinuities in an image. It is an essential step in image analysis and is key to solving many complex problems, such as object recognition and target tracking.
To detect edges, the first step is to discard the color information and focus only on the intensity values. So, we will convert our initial image into a grayscale image. To do this, we need to add this code to the inside of the loop, after capture.read(image):
Mat grayImage;
cvtColor(image, grayImage, COLOR_BGR2GRAY);
cvtColor will convert the source image from 3-channels BGR (blue, green, red) to the destination grayscale image grayImage.
To show the computation steps, we will add another window:
namedWindow("Computation");
and near the other imshow call we will add:
imshow("Computation", grayImage);
In the new window we will see something like this:

Before carrying out the actual edge detection, we will perform two additional steps to improve the quality of the detection.
The first is blurring the image to reduce noise, which gives a more stable result:
blur(grayImage, grayImage, Size(3, 3));
The blur function computes, for each pixel, the average of the surrounding square (in this case a 3×3 area):
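To make the averaging concrete, here is a plain C++ sketch of a 3×3 box filter on a tiny grayscale grid. It is an illustration of the idea, not OpenCV's implementation: borders are handled here by clamping coordinates, while OpenCV's blur uses a reflected border by default.

```cpp
#include <algorithm>
#include <array>

// 3x3 box blur on a W x H grayscale image stored row-major.
// Border pixels are handled by clamping coordinates to the image.
template <int W, int H>
std::array<int, W * H> boxBlur3x3(const std::array<int, W * H>& src) {
    std::array<int, W * H> dst{};
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            int sum = 0;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int cx = std::clamp(x + dx, 0, W - 1);
                    int cy = std::clamp(y + dy, 0, H - 1);
                    sum += src[cy * W + cx];
                }
            }
            dst[y * W + x] = sum / 9;  // average of the 3x3 neighborhood
        }
    }
    return dst;
}
```

On a 3×3 image the center pixel simply becomes the mean of all nine values.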

The second is binary thresholding: each pixel whose value is greater than a threshold is assigned the maximum value (white, 255); the others are assigned the minimum value (black, 0):
threshold(grayImage, grayImage, 170, 255, THRESH_BINARY);
We set the threshold to 170, but you can try different values to find the optimal result.

This step is not normally necessary, but in this particular case, where the dice contrast a lot with the background, it further improves the ability to identify edges.
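The rule applied by THRESH_BINARY can be sketched in a few lines of plain C++ (an illustration of the formula, not OpenCV's code; note the strict greater-than comparison, which matches OpenCV's documented behavior):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Binary threshold: pixels strictly above `thresh` become `maxVal`, the rest 0.
std::vector<std::uint8_t> thresholdBinary(const std::vector<std::uint8_t>& src,
                                          std::uint8_t thresh,
                                          std::uint8_t maxVal) {
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); i++)
        dst[i] = (src[i] > thresh) ? maxVal : 0;
    return dst;
}
```

With a threshold of 170, a pixel at exactly 170 stays black; 171 and above become white.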
Finally, let’s call the function that will perform edge detection:
Canny(grayImage, grayImage, 80, 230);
Canny Edge Detection is a popular edge detection algorithm. You can have a look at the wikipedia page if you want to learn more about it.
The two numerical arguments of the function are the thresholds of the algorithm’s hysteresis procedure. You can play with the two values to find out which ones work best for you.
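Roughly, Canny uses the two thresholds like this: gradient magnitudes above the high threshold are definite edges, those below the low threshold are discarded, and those in between are kept only if they are connected to a definite edge. The classification step alone can be sketched as follows (a simplified illustration, not OpenCV's internals; the enum and function names are made up):

```cpp
enum class EdgeClass { Suppressed, Weak, Strong };

// Classify a gradient magnitude against the two hysteresis thresholds.
// Weak pixels are kept later only if connected to a Strong pixel.
EdgeClass classify(double magnitude, double low, double high) {
    if (magnitude >= high) return EdgeClass::Strong;
    if (magnitude >= low)  return EdgeClass::Weak;
    return EdgeClass::Suppressed;
}
```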

Finding bounding rectangles
To find the bounding rectangles we need to follow these steps:
- finding contours: we will use the findContours function, which transforms the previously detected edges into a list of contours, where each contour is a list of points that, connected together, form an edge;
- finding the minimum-area rectangle that encloses all the points of each contour, using the minAreaRect function.
To get feedback on the algorithm behaviour, we will display the rectangles found and their quantity. Here is the code:
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(grayImage, contours, hierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE);
vector<RotatedRect> diceRects;
for (int i = 0; i < contours.size(); i++) {
// For each contour, search the minimum area rectangle
RotatedRect rect = minAreaRect(contours[i]);
diceRects.push_back(rect);
// Draw square over original image
Point2f points[4];
rect.points(points);
for (int j = 0; j < 4; j++) {
line(image, points[j], points[(j + 1) % 4], Scalar(0, 0, 255), 2, LINE_AA);
}
}
// Write dice count on screen
char buffer[32];
snprintf(buffer, sizeof(buffer), "Dice: %d", (int)diceRects.size());
putText(image, buffer, Point(20, 30), FONT_HERSHEY_DUPLEX, 0.8, Scalar(0, 255, 0), 1, LINE_AA);
If you run the program, you will get:

As you can see, we detect many more rectangles than we expected, for example the dots inside the dice and other rectangles in the background.
We must therefore filter the rectangles, excluding those whose size is too different from what we expect for a die, and those whose aspect ratio is too far from 1 (i.e., from a square):
vector<RotatedRect> diceRects;
for (int i = 0; i < contours.size(); i++) {
// For each contour, search the minimum area rectangle
RotatedRect rect = minAreaRect(contours[i]);
// Process only rectangles that are almost square and of the right size.
// Area size depends on the webcam resolution. Mine is 640x480 and a die has
// a side length of about 55 pixels (an area of about 3000 square pixels).
float aspect = fabs(rect.size.aspectRatio() - 1);
if ((aspect < 0.25) && (rect.size.area() > 2000) && (rect.size.area() < 4000)) {
diceRects.push_back(rect);
// Draw square over original image
Point2f points[4];
rect.points(points);
for (int j = 0; j < 4; j++) {
line(image, points[j], points[(j + 1) % 4], Scalar(0, 0, 255), 2, LINE_AA);
}
}
}
If you run the program now, you will see that only the rectangles around the dice are displayed. However, the dice count could still be wrong! This is because multiple contours are detected for each die, so we also have to filter out these excess rectangles. To do this, we check the centers of the rectangles: if a rectangle has its center too close to that of another rectangle, it is considered a duplicate and discarded:
vector<RotatedRect> diceRects;
for (int i = 0; i < contours.size(); i++) {
// For each contour, search the minimum area rectangle
RotatedRect rect = minAreaRect(contours[i]);
// Process only rectangles that are almost square and of the right size.
// Area size depends on the webcam resolution. Mine is 640x480 and a die has
// a side length of about 55 pixels (an area of about 3000 square pixels).
float aspect = fabs(rect.size.aspectRatio() - 1);
if ((aspect < 0.25) && (rect.size.area() > 2000) && (rect.size.area() < 4000)) {
// Check if it's a duplicate rect
bool process = true;
for (int j = 0; j < diceRects.size(); j++) {
float dist = norm(rect.center - diceRects[j].center);
if (dist < 10) {
process = false;
break;
}
}
if (process) {
diceRects.push_back(rect);
// Draw square over original image
Point2f points[4];
rect.points(points);
for (int j = 0; j < 4; j++) {
line(image, points[j], points[(j + 1) % 4], Scalar(0, 0, 255), 2, LINE_AA);
}
}
}
}
Now, if you run the program, the bounding rectangles are correctly detected and you will get something like this:

Counting dots of each dice
To count the dots, we need to follow these steps for each die:
- extract the image of the die
- find the contours of the dots
- find and filter minimum area rects
- count the number of remaining rects
So, the structure of this part will be:
// Counting dots of each die
int diceCounts[6] = { 0, 0, 0, 0, 0, 0 };
for (int i = 0; i < diceRects.size(); i++) {
// Extract die image
// TODO
// Find contours
// TODO
// Find and filter minimum area rects
// TODO
// Save dots count
// TODO
}
Let’s start with the die image extraction:
// Extract die image
Mat rotation, rotated, cropped;
RotatedRect rect = diceRects[i];
rotation = getRotationMatrix2D(rect.center, rect.angle, 1.0);
warpAffine(grayImage, rotated, rotation, grayImage.size(), INTER_CUBIC);
getRectSubPix(rotated, Size(rect.size.width - 10, rect.size.height - 10), rect.center, cropped);
First, we use getRotationMatrix2D to calculate the rotation matrix from the rectangle data, then we apply this rotation using the warpAffine function. In this way the die is aligned with the borders of the image and can be easily extracted.
At this point, we can extract the die image with the function getRectSubPix, using rectangle center, width and height as parameters. Actually, we extract a slightly smaller rectangle to avoid processing the edges of the die.
We can now proceed with contours detection:
// Find contours
vector<vector<Point>> dieContours;
vector<Vec4i> dieHierarchy;
threshold(cropped, cropped, 64, 255, THRESH_BINARY);
findContours(cropped, dieContours, dieHierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE);
Before calling the findContours function that we have already seen before, we will apply the threshold function to the image. We have to do this because the rotation we applied earlier leaves the edges of the dots very blurred and difficult to detect.
Now that we have the contours, we can find all minimum area rectangles that surround them. This part is almost identical to the one where we detect dice bounding rectangles:
// Find and filter minimum area rects
vector<RotatedRect> dotsRects;
for (int k = 0; k < dieContours.size(); k++) {
// For each contour, try to fit a rectangle
RotatedRect dotRect = minAreaRect(dieContours[k]);
// Process only rectangles that are almost square and of the right size.
float aspect = fabs(dotRect.size.aspectRatio() - 1);
if((aspect < 0.4) && (dotRect.size.area() > 8) && (dotRect.size.area() < 150)) {
// Check if it's a duplicate rect
bool process = true;
for (int j = 0; j < dotsRects.size(); j++) {
float dist = norm(dotRect.center - dotsRects[j].center);
if (dist < 10) {
process = false;
break;
}
}
if (process) {
dotsRects.push_back(dotRect);
}
}
}
All we have to do now is save the dots count:
// Save dots count
if (dotsRects.size() >= 1 && dotsRects.size() <= 6) {
diceCounts[dotsRects.size() - 1]++;
}
Final result
To show the result of dice reading, we must simply add these lines:
for (int i = 1; i < 6; i++) {
int count = 0;
for (int j = i; j < 6; j++) {
count += diceCounts[j];
}
snprintf(buffer, sizeof(buffer), "%d+: %d", (i + 1), count);
putText(image, buffer, Point(20, 55 + 25 * i), FONT_HERSHEY_DUPLEX, 0.8, Scalar(0, 255, 0), 1, LINE_AA);
}
and we will get something like this:

Conclusions
This method for reading dice in OpenCV works very well when the dice are separated from each other. However, in a real dice roll it will often happen that some dice end up touching others. In this case, the difficulty in distinguishing one die from another leads to a wrong reading:
In a future post we will try to solve this problem, or if we don’t succeed, completely change the reading method.
The complete code
#include <cstdio>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
using namespace std;
int main() {
namedWindow("Image");
namedWindow("Computation");
VideoCapture capture;
capture.open(0);
while (1) {
Mat image;
capture.read(image);
Mat grayImage;
cvtColor(image, grayImage, COLOR_BGR2GRAY);
blur(grayImage, grayImage, Size(3, 3));
threshold(grayImage, grayImage, 170, 255, THRESH_BINARY);
Canny(grayImage, grayImage, 80, 230);
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(grayImage, contours, hierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE);
vector<RotatedRect> diceRects;
for (int i = 0; i < contours.size(); i++) {
// For each contour, search the minimum area rectangle
RotatedRect rect = minAreaRect(contours[i]);
// Process only rectangles that are almost square and of the right size.
// Area size depends on the webcam resolution. Mine is 640x480 and a die has
// a side length of about 55 pixels (an area of about 3000 square pixels).
float aspect = fabs(rect.size.aspectRatio() - 1);
if ((aspect < 0.25) && (rect.size.area() > 2000) && (rect.size.area() < 4000)) {
// Check if it's a duplicate rect
bool process = true;
for (int j = 0; j < diceRects.size(); j++) {
float dist = norm(rect.center - diceRects[j].center);
if (dist < 10) {
process = false;
break;
}
}
if (process) {
diceRects.push_back(rect);
// Draw square over original image
Point2f points[4];
rect.points(points);
for (int j = 0; j < 4; j++) {
line(image, points[j], points[(j + 1) % 4], Scalar(0, 0, 255), 2, LINE_AA);
}
}
}
}
// Write dice count on screen
char buffer[32];
snprintf(buffer, sizeof(buffer), "Dice: %d", (int)diceRects.size());
putText(image, buffer, Point(20, 30), FONT_HERSHEY_DUPLEX, 0.8, Scalar(0, 255, 0), 1, LINE_AA);
// Counting dots of each die
int diceCounts[6] = { 0, 0, 0, 0, 0, 0 };
for (int i = 0; i < diceRects.size(); i++) {
// Extract die image
Mat rotation, rotated, cropped;
RotatedRect rect = diceRects[i];
rotation = getRotationMatrix2D(rect.center, rect.angle, 1.0);
warpAffine(grayImage, rotated, rotation, grayImage.size(), INTER_CUBIC);
getRectSubPix(rotated, Size(rect.size.width - 10, rect.size.height - 10), rect.center, cropped);
// Find contours
vector<vector<Point>> dieContours;
vector<Vec4i> dieHierarchy;
threshold(cropped, cropped, 64, 255, THRESH_BINARY);
findContours(cropped, dieContours, dieHierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE);
// Find and filter minimum area rects
vector<RotatedRect> dotsRects;
for (int k = 0; k < dieContours.size(); k++) {
// For each contour, try to fit a rectangle
RotatedRect dotRect = minAreaRect(dieContours[k]);
// Process only rectangles that are almost square and of the right size.
float aspect = fabs(dotRect.size.aspectRatio() - 1);
if((aspect < 0.4) && (dotRect.size.area() > 8) && (dotRect.size.area() < 150)) {
// Check if it's a duplicate rect
bool process = true;
for (int j = 0; j < dotsRects.size(); j++) {
float dist = norm(dotRect.center - dotsRects[j].center);
if (dist < 10) {
process = false;
break;
}
}
if (process) {
dotsRects.push_back(dotRect);
}
}
}
// Save dots count
if (dotsRects.size() >= 1 && dotsRects.size() <= 6) {
diceCounts[dotsRects.size() - 1]++;
}
}
for (int i = 1; i < 6; i++) {
int count = 0;
for (int j = i; j < 6; j++) {
count += diceCounts[j];
}
snprintf(buffer, sizeof(buffer), "%d+: %d", (i + 1), count);
putText(image, buffer, Point(20, 55 + 25 * i), FONT_HERSHEY_DUPLEX, 0.8, Scalar(0, 255, 0), 1, LINE_AA);
}
imshow("Computation", grayImage);
imshow("Image", image);
int c = waitKey(1);
if (c != -1)
break;
}
}
3 Comments
Spencer Walker-Fooks · January 3, 2020 at 6:26 am
Looking at the Canny Edge image, and with the assumptions you are making I would think it possible to read merely from the number of pits, without needing to use dice detection as anything except verification? You could also use a pre-known order of pits (1 pit looks like x, 6 dice pits look like y) to detect pit combinations?
Davide Pesce · January 3, 2020 at 10:44 am
There are various alternatives for achieving the desired result. As you said, one possibility is to count how many pits are inside each detected rectangle. Since there is no way to directly check if a point is contained within a RotatedRect, you need to convert the rectangle to a contour and then use the pointPolygonTest function to check it. In the tests I did, this method was slightly slower than what I followed, but your mileage may vary.
You could also use template matching (and in fact the next post in this series will show how to use it to detect dice).
Paul Phillips · December 17, 2020 at 10:50 pm
I came across your article regarding computer vision dice counting while trying to find examples of using computer vision to do image change detection. I’m trying to figure out a way to have a computer watch a gridded area for changes and then announce the most recent change’s relationship to the center of the image. Imagine a large board covered with small square boxes and a camera directly above the board. You then toss in a ping pong ball and it falls into a box. I want the computer to tell me how far from the centermost area of the board the ping pong ball is and in which direction. So it would, for example, announce “5 down and 3 to the right” if that’s where the newest change is. But it should be smart enough to ignore that ball and give the newest change when a new ball is thrown. Do you think that would be difficult to do? I’m not sure where to even begin: for example, should I try to do that with TensorFlow, or a Raspberry Pi and CV libraries, or what? I know almost nothing about machine learning.
Thanks
Paul