download
match5
Match5, a self learning game


Introduction

Match5 is a board game.
It may be played in two modes:
    1. player against computer
    2. player against player
The computer is self-learning.
In case of a loss, the board state is archived to avoid the same error a second time.
Each time the player wins, it becomes a bit harder to win again.
Actually, the player competes with him/her self.

The game

The board counts 9 x 9 fields. A field may hold a red or a blue ball.
The computer plays with red, the player with blue.
Alternately they place a ball of their color in an empty field.
Winner is the first to place a horizontal, vertical or diagonal row of five balls of his color.
Look at the picture below, where the computer plays red and the player blue.
It is clear that the computer still has zero knowledge and is beaten easily by blue.

Installation

Match5 is written for Windows. The programming language is Delphi.
There is no installation procedure.
The Windows registry is not changed.
Simply copy match5.exe to a map of choice.

Playing the game

In single player mode, during the games, the computer builds the so called "knowledge" table in the computer memory.
Regarding this knowledge file there are three possibilities (as displayed at the right top of the window):
    - new.....a new file is under construction
    - owner.....an existing file has been opened and the password has been verified
    - user...an existing file has been opened but the password is not verified
If the password is left blank, it always verifies.
The owner state enables the change and storage of the knowledge data.
Attaching a password to your files makes sure others cannot alter your data.
However, everybody may play against your knowledge files and view the patterns stored so far.

The game itself is either in menu or playing mode.
The right start - stop bottom button switches between these modes.
In menu mode the player may
    - select a name or password
    - select 1 or 2 player mode
    - save / load knowledge data to / from disk (menu buttons at top of page)
    - read help information
    - view the hall-of-Fame with the best players
    - navigate through the error patterns in the knowledge table
    - erase knowledge data (all or just one entry)
    - select check mutations to show knowledge mutations during the game
In menu mode, press start to start a game.
In single player mode, the computer plays with red and the player with blue.

Moving with mouse

Move pointer over field and press mousebutton to place a blue ball.

Moving with keyboard

Use cursorkeys to position rectangle over a field. Then press space or return.

Level

Playing a game is no fun without some sort of reward.
Here, the reward is the achieved level, which is simply the number of error patterns stored,
roughly it is the number of times the player has won from the computer.
Also, a Hall_of_Fame lists the best 100 players.

Files

Default, the knowledge data is saved in map match5data which is created automatically
in the map of match5.exe.
The knowledge files alway have the name of the player, followed by the extension .fivo
The Hall-of-Fame has filename match5_hall_of_fame without extension and is stored in the
same map as the knowledge data.

A .fivo file may be erased by
    - opening the file as owner
    - erasing the knowledge data by pressing the X all (erase all) button
    - storing the data
above action also removes the player from the Hall-of-fame.

Opening the program from a .fivo file

When clicking on a .fivo file, the first time Windows will ask for an application.
Supply the match5.exe program.
This also results in .fivo files showing the match5 icon.

Revisions

The first release of match5 was 1.0
Version 1.1 added buttons to erase entries in the knowledge table or erase the complete table.
The data was separated from the program by placing data files in map match5data
Also the "show mutations" checkbox was added.
In version 1.2 a minor error was corrected, which caused sporadic erasure of an error pattern in the knowledge table.

The Self Learning Algorithm

On purpose, the algorithm has been kept simple.
The computer plays in a defensive way, it's goal is not to lose.
The computer never looks even a single move ahead and never calculates a move.
By using the archived error patterns it is only capable of recognizing threats and take action.
The computer is only capable of detecting a row of five balls of the same color.

Assume, that five blue balls make a row, the computer detects a lost game.

During the game, all moves are stored in a table (the "movetable").
In case of five blue balls in a row, the program removes the last placed blue ball, as this caused the loss.
The pattern becomes:
    BBBB0
    
B is a blue ball and 0 is an empty field.
During a next game, when the pattern becomes BBBB0 the computer must respond by placing
a red ball in this empty field.
The pattern then becomes:
    BBBBR
    
Which information do we have to store?
The size of the pattern, including all blue and empty fields and also the field to place the red ball.
This response field for red we code as X.
The pattern to archive becomes:
    BBBBX
    
which means: if a row of four blue balls is followed by an empty field, the computer has to place a red ball
in this empty field, to avoid loss.

Error patterns are stored in left-top aligned form.
When comparing this patterns against actual board states, the patterns are shifted horizontally and vertically
and also eight different symmetries are tested.
More about the symmetries later.

Please look at the next game.
Blue tries to beat the computer using the same strategy as before however, the computer now recognizes the pattern
BBBB0 and reacts with BBBBR.
But blue places a ball in an empty field left of the row, winning again.
    BBBBBR
    
As before, the computer removes the last blue move, which caused the loss.
However, this move was not the core reason for the loss, because the previous red move was not voluntary but forced
to avoid earlier loss. Therefore, this red move is removed as well together with the blue move preceding it.
Now the error pattern looks like:
    0BBB00
    
The last removed blue field is the place to drop a red ball to avoid loss.
The archived error pattern becomes:
    0BBBX0
    
which means: if a row of three blue ball has an empty field on one side and two empty fields on the
other side, the computer must respond with a red ball in the 'X' marked field.

This is the complete algorithm.
In case of a loss, remove the last blue move and in case of a preceding forced red move,
also remove this red move together with the preceding blue move and so on, until a non-forced
red move was found in the movetable.
The 'X' response field is the field of the last removed blue ball.

Below is a litte more complicated situation.
Some red balls are neglected, the blue balls are of more importance.
Assume that the computer already recognizes the error patterns BBBBX and 0BBBX0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0
    0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0
    0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0
    0 0 0 B B 0 0 0 0   0 0 B B B 0 0 0 0   0 0 B B B R 0 0 0   0 0 B B B R 0 0 0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0
    
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   
    0 0 0 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   
    0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   
    0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   
    0 0 B B B R 0 0 0   0 0 B B B 0 0 0 0   
    0 0 B 0 0 0 0 0 0   0 0 B 0 0 0 0 0 0   
    0 0 R 0 0 0 0 0 0   0 0 R 0 0 0 0 0 0   
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   
    0 0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0 0   
    
Read above board states left to right and top to bottom.
After the first blue move there are two 0BBBX0 error patterns, one horizontal and one vertical.
The computer responds with R to eliminate the threat of loss in the horizontal row.
Then blue makes a row of four vertically.
The computer recognizes this error pattern and responds with R at the bottom of the row.
Then blue wins by placing a ball at the top.

Sensing the lost game, the computer removes the last B and all forced R's before, together with the
preceding B's.
The last removed B field is coded X.
The next error pattern is added to the knowledge data (left top aligned)
       0
       B
       B
       X B B 0
       0
       0
     

Symmetry

When comparing errorpatterns with actual board states, eight symmetries are used.
See tables below:
     R R R R 0 0 0 0 0
     B 0 0 0 0 0 0 0 0
     B 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     R B B 0 0 0 0 0 0
     
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 B
     0 0 0 0 0 0 0 0 B
     0 0 0 0 0 R R R R
     
     0 0 0 0 0 0 B B R
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     
    symmetry -0- symmetry -1- symmetry -2- symmetry -3-


     0 0 0 0 0 R R R R 
     0 0 0 0 0 0 0 0 B
     0 0 0 0 0 0 0 0 B
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 0 0 R
     0 0 0 0 0 0 B B R
     
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     B 0 0 0 0 0 0 0 0
     B 0 0 0 0 0 0 0 0
     R R R R 0 0 0 0 0
     
     R B B 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     R 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0
     
    symmetry -4- symmetry -5- symmetry -6- symmetry -7-
Symmetry -0- is the original pattern
Symmetry -1- to -3- is the result of rotation by 90, 180 or 270 degress around the center.
Symmetry -4- to -7- result from reflection of symmetries -0- tot -3- over a centered vertical line.

Note 1:
Patterns are stored left-top aligned.

Note 2:
When comparing error patterns against board states, horizontal- and vertical shifts are used.

Note 3:
Diagonal pattens are disjunct from horizontal and vertical patterns.
They are no symmetry of each other.
If the computer recognizes error pattern 0BBBX0 the following pattern may still be unknown:
    0  
      B 
        B  
          B  
            X 
              0 
    

Data structures

"Game", "Comp" and "EP" are 2 dimensional arrays with dimensions 9 * 9 (bytes).
Game holds the red and blue fields.
Erasing sets all fields to the value 0.
The fields of Game may have the value:
    0 : empty
    1 : red
    2 : blue
Comp and EP contain error patterns.
The fields of EP and Comp may be:
    0 : empty
    1 : red
    2 : blue
    3 : n/a (no influence)
Comp receives error patterns from the KN (knowledge) table.
Comp is compared with game to recognize an error pattern.

During the games, the error pattern is built in EP.
When losing a game, EP is stored in KN.
EP is erased by setting all fields to the value 3.

The movetable holds all moves of a game so far.

Associated with Game, Comp and EP are the records gameRect, compRect and EPrect,of type TEPrect
type TEPrect = record
                full  : boolean; //active area
                x1,y1 : byte;    //left top
                x2,y2 : byte;    //right bottom
                fx,fy : byte;    //focus (x,y) winner, stopper
                level : byte;
                white : byte;    //number of white fields
                red   : byte;    //          red
                blue  : byte;    //          blue
		   end;	
Their meaning is:
    full : true if the rectangle holds a pattern (a rectangle cannot be set to zero)
    x1,y1,x2,y2 : left-top and right-bottom of the pattern
    fx,fy : coordinates of the X field (focus, action)
    level : explanation follows
    white, red, blue : number of fields of each color, used to speed up the search

Entries in the movetable are of type TMove
    TMove = record
             x : byte;   //coordinate of game field
             y : byte;
             movetype : byte; //0:none; 1:red; 2:blue; 3:red stopper
            end;
    
Note:
movetype 3 is a forced (red) move of the computer.
It is a move in the 'X' field.

Error patterns and associated data are packed in array KN.
KN is a one-dimensional array of type DWORD (= cardinal).
Each DWORD codes a row of 9 fields, so 9 DWORDS are needed to hold an error pattern.
A field needs 2 bits of a DWORD.
Bits 0,1 hold field 0 of a row, bits 2,3 hold field 1 and so on.
Bits 0..17 contain the fields, but bits 18..31 are still unused.
Upper bits 24..31 of the DWORDs hold this information:
    row 0 - x2
    row 1 - y2
    row 2 - fx
    row 3 - fy
    row 4 - level
    row 5 - white
    row 6 - red
    row 7- blue
    row 8 - none
Notes:
- x1,y1 are alway 0 (left-top aligned) so their is no need to store this information
- fx,fy are the coordinates of the 'X' field
- white,red,blue are the counts of these fields
- entry n in KN occupies KN[9*(n-1)]....KN[9*(n-1) + 8]

Game levels

The level of the player is simply the number of entries in the KN array, so the number of stored error patterns.
The KN array also has a level for each entry.
This level indicates the number of removed blue balls.
So, BBBBX is a pattern of level 1.
0BBBX0 is a pattern of level 2.
The KN table sorts patterns from low to high level.
This ensures, that simple patterns are compared first.

Building the Error Patterns.

Error patterns are build during a game in array EP.
At the start of a game, each EP field is set to 3, meaning that it should be ignored.
The full flag is set false: no paterns is active.

EP is changed on trhree occasions.
1.
If blue wins.
The winning fields are copied to EP, the five corresponding EP fields are set to 2 (blue).
2.
If the computer does a forced move.
Such a move results from a compare between Comp and Game and the Comp pattern is copied to EP.
Each field in the Comp rectangle that is unequal to 3 is kopied to EP.
3.
All patterns in EP are deleted if the computer does a non-forced move.

After losing a game, the computer processes the EP array,
removing the last blue move as well as forced red moves and their preceding blue moves.
The resulting EP pattern is then left-top aligned and added to the KN array in a packed form.
In the mean time, the computer has become a bit smarter.

How the computer may win a game

The stored error patterns prevent the computer from losing, but automatically they supply information
for the computer how to win a game.
If an error pattern (blue balls and empty fields) shows up in the game with red balls instead of blue,
then a red move in the 'X' field may result in winning.
The compare function is designed to serve both cases: it tests for patterns in only a single color,
equal color instead of red or blue.
The resultcode of 0,1,or 2 indicates no-, a red- or a blue compare.

Junk removal

The computer never gets smarter than the player.
Possibly, the computer is instructed poorly, having copied fancy moves from the player.
There is a way however to eliminate this junk knowledge.
A new pattern entering the KN array is compared to all other KN entries.
If the new pattern is a subset of an existing pattern, this old pattern is erased.
Subset means here:
    - a 0 in the new pattern is also a 0 in the old pattern
    - a 2 in the new pattern is also a 2 in the old pattern
Note: erasing an existing pattern in KN decreases the playerlevel by 1.
If the erased pattern was of higher level then the new pattern entering the KN table, then junk was removed.

An example:
Imagine that the computer recognizes only error pattern BBBBX.
Now look at this game:
    0 0 0 0 0 0 0 0 0
    B B B B 0 0 0 0 0
    0 B 0 0 0 0 0 0 0
    0 0 B 0 0 0 0 0 0
    0 0 0 B 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    
    0 0 0 0 0 0 0 0 0
    B B B B R 0 0 0 0
    0 B 0 0 0 0 0 0 0
    0 0 B 0 0 0 0 0 0
    0 0 0 B 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    
    0 0 0 0 0 0 0 0 0
    B B B B R 0 0 0 0
    0 B 0 0 0 0 0 0 0
    0 0 B 0 0 0 0 0 0
    0 0 0 B 0 0 0 0 0
    0 0 0 0 B 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    
The EP data stored in KN becomes (level 2)
    B B B X 0
      B
        B
          B
            0
    
If during a later game this pattern
    B
      B
        B
          B
            X	  	  
    
shows up (level 1), which is a subset of the previous pattern, then the old pattern is erased
because it is superfluous.

In practise, a new pattern may erase an old pattern of higher level containing superfluous moves.
I never observed the opposite.
There certainly is room for more study and investigation in this field.
Also a more analytical approach may be interesting.

Intelligence

How "smart" the computer may become?
Not smarter than the player, one would say, as the moves of the player are copied.
I believe this is not true.
If the player does meaningless moves, by accident he will beat the computer now and then.
And at every loss the computer's knowledge increases.
Even in the case of moving in a complety random way, the level of the computer will rise,
be it very slowly.
Evolution does not need higher powers.
It amounts to storage(dna, documents), structures and selection.

This is the end of the description of my self-learning game.
I conclude with a snapshot of a (level 3) error pattern.
A dot indicates an empty field.
Fields outside the active rectangle are colored gray.

Have fun!

David Dirkse
oct. 2012