Go & Neural Net

By: Stenly • Essay • 758 Words • January 15, 2010 • 896 Views

the networks

The authors tried a variety of networks. The paper diagrams one sample network, but the authors experimented with many others, and the sample network does not use every technique the paper mentions.

The networks were trained by the temporal difference algorithm TD(0) to predict the owner of each point on the board at the end of the game, rather than the overall result of the game. (The winner of a go game is the player who controls more points at the end of the game, after a small adjustment that compensates for the first player's advantage.) Predicting every point gives the networks more information to learn from than a single win-or-loss signal would.
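As a rough illustration of training toward per-point ownership, here is a minimal TD(0) sketch. It assumes a linear predictor with one tanh output per board point, which is far simpler than the networks in the paper. The names predict and td0_update, the learning rate, and the +1/-1 encoding of ownership are all assumptions made for this example.

```python
import math

def predict(weights, features):
    """Per-point ownership estimates in (-1, 1): one tanh unit per board point."""
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]

def td0_update(weights, features, next_prediction, lr=0.1):
    """One TD(0) step for every board point at once.

    next_prediction is the prediction made for the successor position, or the
    actual final ownership (+1 or -1 per point) once the game has ended.
    The weights are modified in place and also returned.
    """
    current = predict(weights, features)
    for p, row in enumerate(weights):
        error = next_prediction[p] - current[p]   # TD error for point p
        slope = 1.0 - current[p] ** 2             # derivative of tanh
        for i, x in enumerate(features):
            row[i] += lr * error * slope * x
    return weights
```

Each point contributes its own error term, so a single game yields one training signal per point rather than one for the whole board.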

Features of a go position are approximately translation-invariant. In other words, a configuration of stones is about as valuable in one place as another, all else being equal, although it may depend on how close it is to the edge of the board. A network which does not take this into account will have to learn the value of each configuration at each location it can appear.

The networks therefore learned feature detectors that were scanned over their inputs, so each layer of a network was an explicit feature map. The sample network has two hidden layers, connected in parallel rather than in series, which were added at different points during the network's training.
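Scanning a feature detector over the board can be sketched as follows. This is an illustration, not the paper's code: the function name scan_detector, the +1/-1/0 stone encoding, and the choice to read off-board cells as 0 are assumptions.

```python
def scan_detector(board, kernel, bias=0.0):
    """Slide one feature detector over the board, reusing the same weights everywhere.

    board  : square list of rows, entries +1 (black), -1 (white), 0 (empty)
    kernel : k x k list of weights, k odd; off-board cells are read as 0
    Returns a feature map with the same shape as the board.
    """
    n, k = len(board), len(kernel)
    half = k // 2
    feature_map = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            total = bias
            for dr in range(k):
                for dc in range(k):
                    rr, cc = r + dr - half, c + dc - half
                    if 0 <= rr < n and 0 <= cc < n:
                        total += kernel[dr][dc] * board[rr][cc]
            feature_map[r][c] = total
    return feature_map
```

Because the same kernel is applied at every point, a configuration of stones produces the same response wherever it appears in the interior of the board. Near the edge the response differs, since part of the kernel falls off the board.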

Some networks (but not the sample network) were forced to obey the symmetry of the go board by being constrained to learn symmetrical features. The paper says, "Although this is clearly beneficial during the evaluation of the network against its opponents, it appears to impede the course of learning."
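How the symmetry constraint was imposed is not described here. One possible way to make a single feature detector symmetrical, shown purely as an assumption, is to average its weights over the eight rotations and reflections of the square. The helper names rotate90, mirror and symmetrize are made up for this sketch.

```python
def rotate90(k):
    """Rotate a square matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*k[::-1])]

def mirror(k):
    """Reflect a square matrix left to right."""
    return [row[::-1] for row in k]

def symmetrize(kernel):
    """Average a kernel over the 8 symmetries of the square.

    The result is unchanged by any rotation or reflection, so a detector
    using it responds identically to all 8 orientations of a pattern.
    """
    variants = []
    k = [list(row) for row in kernel]
    for _ in range(4):
        variants.append(k)
        variants.append(mirror(k))
        k = rotate90(k)
    n = len(kernel)
    return [[sum(v[r][c] for v in variants) / 8.0 for c in range(n)]
            for r in range(n)]
```

A detector constrained this way has fewer free weights, which may be related to the slower learning the paper reports, though that is speculation.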

The networks played by making a one-ply search, evaluating every possible move. Nici Schraudolph wrote me that, although he used incremental techniques (not mentioned in the paper) to evaluate networks quickly, they were still too slow for full 19x19 go.
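A one-ply search is simple enough to sketch in a few lines. The callables legal_moves, play and evaluate are placeholders assumed for this example; evaluate stands in for the network and is taken to score a position from the point of view of the player who just moved.

```python
def one_ply_move(position, legal_moves, play, evaluate):
    """Try every legal move once and return the one whose result scores best.

    legal_moves(position) -> iterable of moves
    play(position, move)  -> the position after the move
    evaluate(position)    -> a number, higher is better for the mover
    Returns None if there are no legal moves.
    """
    best_move, best_value = None, float("-inf")
    for move in legal_moves(position):
        value = evaluate(play(position, move))
        if value > best_value:
            best_move, best_value = move, value
    return best_move
```

The cost is one network evaluation per legal move, which makes clear why evaluation speed matters so much on a 19x19 board with hundreds of candidate moves.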

training

Training by self-play alone was found to be slow. Training against a skilled opponent was faster because the networks could learn from their opponents. The opponents used were a random player, useful to start off training; Wally, a weak public domain program; and Many Faces of Go, a comparatively strong commercial program.

Like any learner, the networks learned best from opponents not too far from their own strength. Networks started out knowing nothing, and so needed weak opponents. Wally was modified to play a certain proportion of random moves, and the proportion was reduced as the network improved. Against Many Faces, games were played with standard go handicaps.
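The idea of weakening an opponent with random moves can be sketched like this. The wrapper mixed_opponent and the schedule in next_fraction are assumptions: the text says only that the proportion of random moves was reduced as the network improved, not how.

```python
import random

def mixed_opponent(base_policy, random_fraction, rng=random):
    """Wrap a policy so that a given fraction of its moves are random.

    base_policy(position, legal_moves) -> move chosen by the real opponent
    """
    def policy(position, legal_moves):
        if rng.random() < random_fraction:
            return rng.choice(legal_moves)
        return base_policy(position, legal_moves)
    return policy

def next_fraction(fraction, network_win_rate, step=0.05, target=0.5):
    """Hypothetical schedule: once the network wins more often than `target`,
    make the opponent a little less random."""
    if network_win_rate > target:
        return max(0.0, fraction - step)
    return fraction
```

A schedule of this kind keeps the opponent close to the network's own strength, which is the point of the modification.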

Because go is a deterministic game, there was a risk that a network might never explore some options, because it falsely thought it "knew better". That would leave blind spots in its understanding. The problem was solved by introducing randomness into the network's play, using Gibbs sampling.
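Gibbs sampling over moves means choosing each move with probability proportional to the exponential of its evaluation, so that better moves are favored but every move has some chance. A minimal sketch follows; the temperature parameter and the function names are assumptions, since the text does not give the settings used.

```python
import math
import random

def gibbs_probabilities(values, temperature=1.0):
    """Gibbs (Boltzmann) distribution: p_i is proportional to exp(v_i / T)."""
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def sample_move(moves, values, temperature=1.0, rng=random):
    """Pick one move at random according to the Gibbs distribution."""
    probs = gibbs_probabilities(values, temperature)
    return rng.choices(moves, weights=probs, k=1)[0]
```

A high temperature makes play nearly uniform and a low temperature makes it nearly greedy, so the temperature controls how much the network explores.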

Networks that played too much against one opponent risked over-fitting to that opponent, hurting their results against other opponents.

results

The
