- Code Refractory and Scratch of Classes Design
- Point of completion: Code refactored and classes design sketch
- Deadline: 4d - 3/1/2015
- Description: Refactor the code of the open source pacman in order to facilitate the development and the understanding of anyone who look at this code, also, a simple UML classes design can guide the next steps of the project.
- Connectors
- Point of completion: java-connection and LispConnection classes working good
- Deadline: 8d - 3/9/2015
- Description: This classes will encapsulate all the communication and changes in data format between the java and lisp programs.
- Feature and Reward Extractor
- Point of completion: FeatureExtractor class and getLastReward() method working good.
- Deadline: 15d - 3/24/2015
- Description: The FeatureExtractor.java class will calculate and provide a list of features in a specific state with a specific action to the QAgent in Lisp. For example, the number of ghosts in 1 step of distance. The method getLastReward() provide the difference in the pacman score after last action.
- Q-Learning Agent
- Point of completion: Q-Learning Algorithm controlling the Pacman.
- Deadline: 15d - 4/8/2015
- Description: This phase is the core of the project, here I am going to implement the Q-Learning Algorithm.
- Game Design Improvement and/or Search Algorithms to Improve Distance Features
- Point of completion: A version with the game design improved
- Deadline: 10d - 4/18/2015
- Description: Search Algorithms to Improve Distance Features in the pacman agent and/or Improve Ghosts AI and/or add new levels in the game and/or new audio or graphic features.
- Statistics
- Point of completion: Complete statistic evaluation comparing the random machine, the human player and the q learning algorithm evolution.
- Deadline: 5d - 4/23/2015
- Description: If the LRT Algorithm is good I will implement some chase and scattering techniques to the ghosts and make the game more interesting, if it is not good, I will see what is wrong and make changes.
- Paper
- Point of completion: Project Paper Complete
- Deadline: 11d - 5/5/2015
- Description: This paper is going to describe all the algorithms used in the project.
Monday, February 23, 2015
Scholarship Requires Planning
Scholarship Requires Context
Jarad Cannon, Kevin Rose, Wheeler Ruml. “Real-Time Motion Planning with Dynamic Obstacles” Proceedings of the Fifth Annual Symposium on Combinatorial Search (2012).
Description: Probably it is going to be the main guide for the project, because this article describe the exact kind of algorithm that I need to use and compare some of them, showing pros and cons of each.
Bulitko, Vadim, Yngvi Björnsson, and Ramon Lawrence. "Case-Based Subgoaling In Real-Time Heuristic Search For Video Game Pathfinding." (2014): arXiv. Web. 22 Feb. 2015.
Description: This article is going to be useful because it talk about subgoaling in the context that I am working with. I could assume each dot in the pacman game as a subgoal.
Maxim Likhachev, David Ferguson , Geoffrey Gordon, Anthony (Tony) Stentz, and Sebastian Thrun, "Anytime Dynamic A*: An Anytime, Replanning Algorithm," Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), June, 2005.
Description: The LRTA* algorithm is based in static obstacles, differently of the pacman game. So, the article describe one variation of this algorithm to deal with this situations.
I. Szita and A. Lorincz (2007) "Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man", Volume 30, pages 659-684
Description: This work describe some heuristics and the use of some algorithms in the pacman game.
Korf, R. E. 1990. Real-time heuristic search. Artificial Intelligence 42(2-3):189–211.
Description: This article is a great reference in the real time heuristic search field.
David M. Bond, Niels A. Widger, Wheeler Ruml, Xiaoxun Sun.”Real-Time Search in Dynamic Worlds” Proceedings of the Symposium on Combinatorial Search (SoCS-10), 2010.
Description: This article describe other kind of Real-Time Search in Dynamic Worlds algorithm, the
Garcıa, Adrián Ortega, and Juan Carlos Orendain Canales. "Agent Pac-Man: A Study in A* Search Method." Sistemas Inteligentes: Reportes Finales Ene-May 2014} (2014): 1. http://www.researchgate.net/profile/Gildardo_Sanchez-Ante/publication/262600223_Sistemas_Inteligentes_Reportes_Finales_Ene-May_2014/links/0f317538331511fb40000000.pdf#page=6
Description: This work describe the use of A* search method in the Pacman game and also a different behaivor to the ghosts.Q-Learning Algorithm
Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs
Deep Learning for Reinforcement Learning in Pacman
Technical Note Q-Learning
Reinforcement Learning and Function Approximation∗
Q-Learning with Linear Function Approximation
Q-learning with linear function approximation
An Analysis of Reinforcement Learning with Function Approximation
Double Q-learning
Approximate dynamic programming and reinforcement learning∗
Combining Q-Learning with Artificial Neural Networks in an Adaptive Light Seeking Robot
Extra
Course berkley
Lecture
Slides
Books
Wiki
Tutorial
https://docs.google.com/document/d/1j6A34-NBcdTyjQJGvlcyn0tOgucY4wdiQMXj-rlQtzw/edit?usp=sharing
Sunday, February 22, 2015
Wednesday, February 18, 2015
Sunday, February 15, 2015
Concept Proof - Pacman LISP + JAVA
I have been doing some tests to see if the Java and Lisp would really work good together to the project using the library abcl. I had pretty good results, but probably I will have some trouble to convert java arrays in lisp lists and with the old design patterns adopted in the Open Source PACMAN. Although, it is part of the process of learning. This project already has "smart" ghosts that are unable to turn back in the path. I made some changes in the code/design and add one lisp function to control the Pacman. With the human behavior of "lose in a hurry" It is going randomly and also it is not allowed to turn back in the path.
You can find the source code on GitHub and a "Concept proof" video bellow:
You can find the source code on GitHub and a "Concept proof" video bellow:
Wednesday, February 11, 2015
Monday, February 9, 2015
Candidate topics for a Research/Programming Project
Content Based Recommendation System to Events
Choose which bar, show, theater, play or any sort of event, go or do not go can be a big issue for people who live in a big city with a lot of options, or even to people who are visiting a new city. There are some apps, as Foursquare, that could help in this situations, but none of them gives personal recommendation and none of them are based in events. This project would be the creation of a recommendation system to events, where the user could see their prefer kind of events, and a guess about how they would rate those events. In this way, the user would save time, instead of look at all events in the city to choose which one is the best, the user could simply see the top ten events for him. The system would provides personalized recommendation by matching user’s interests with description and attributes of this events. For example, the place, actors, artists, theme, day of week, time kind of music and etc. Moreover, the system would obtain the user’s interests using their likes or dislikes in previous events, their friends interests or even a initial interview. The machine learning would be an essential part of the system, for example, a decision tree could be used to learn a model that smartly chooses minimum set of question while learning user's preference in the initial interview. Moreover, for the recommendation of the event, some standard techniques of machine learning could be used, such as, logistic regression, support vector machines, decision tree or other. Considering the natural interaction required to this system, it should be web based.
Real-Time Learning with Dynamic Obstacles Applied in A PacMan
Real time learning algorithms are commonly used in situations that the agent has to interleaves planning and acting to have a quick answer on a path finding problem. Also, some of those algorithms include obstacles in their analysis, obstacles that can be dynamic or not. It can be used in many different situations, for example a flying drone or a robust robot motion planning in dynamic environments. The project to the course could be the application of some of these algorithms, such as, LSS-LRTA* or neural networks, to a PacMan game.
Articles that can help
http://rvsn.csail.mit.edu/location/Minkov_collaborative_recommendation.pdf http://www.aaai.org/ocs/index.php/SOCS/SOCS12/paper/viewFile/5412/5174 http://www.gamesitb.com/nnpathgraham.pdfhttp://www.gamesitb.com/pathgraham.pdf
Other Ideas
Smart Othello - Reversi GameTraffic Tweet Classifier
Sunday, February 8, 2015
Tic Tac Toe - Visualization, Analysis, and Statistics
In this picture is possible see the "Visualize" method working:
In this picture is possible see the "Analyze" method working:
Combining them we can see the method "demo-va" working:
Bellow, a image of the use of the method "stats" to display the statistics of the game played randomly showing the games.
Running the algorithm with two random players and 30000 games, we could see the statistic in the view of the first player. The result was: 58.43% of win, 28.74% Lost and 12.82% Draw. As expected for most of class it was about 60% of chance of winning. Bellow you can see a screen shot of the test.
and here the code.
Also, the most interesting part of the code was the method of analysis.
(defmethod analyze((l list))
(defparameter wonlist '( (nw n ne) (w c e) (sw s se) (nw w sw) (n c s) (ne e se) (nw c se) (ne c sw)))
(defparameter result 'd)
(defparameter xlist ())
(defparameter olist ())
(loop
for n from 0
for move in l
if (and (evenp n) (equal result 'd)) do (setf xlist (cons move xlist))
do (loop for lane in wonlist
if (equal (length(intersection xlist lane)) 3)
do (setf result 'w)
)
if (and (oddp n) (equal result 'd)) do (setf olist (cons move olist))
(loop for lane in wonlist
if (equal (length(intersection olist lane)) 3)
do (setf result 'l)
)
)
result
)
Sunday, February 1, 2015
Random Tic Tac Toe
It is a simulation of 5 games in a random tic tac toe:
Impressively, the first player won every game. This is the file of the code.
Subscribe to:
Posts (Atom)