Planning in multi-agent partially observable domains using sparse sampling

dc.contributor	Tadepalli, Prasad
dc.contributor	Fern, Alan
dc.contributor	Cull, Paul
dc.contributor	Poppino, Richard
dc.date	2005-10-17T16:02:41Z
dc.date	2005-09-22
dc.date.accessioned	2013-10-16T07:27:36Z
dc.date.available	2013-10-16T07:27:36Z
dc.date.issued	2013-10-16
dc.identifier	http://hdl.handle.net/1957/511
dc.identifier.uri	http://koha.mediu.edu.my:8181/xmlui/handle/1957/511
dc.description	Graduation date: 2006
dc.description	A large number of sequential decision-making problems in uncertain environments can be modeled as Markov Decision Processes (MDPs). In such settings, an agent observes the state of the environment at each time step and then executes an action, causing a stochastic transition to a new state of the environment and receiving a reward accordingly. In a finite-horizon MDP, the goal of planning is to maximize the expected total payoff over the given horizon. MDPs can be solved using a number of different algorithms whose complexity is generally some low-order polynomial in the number of states and the decision-making horizon. Interactive computer games constitute a great development platform for AI research in learning and planning. Like the real-world problems they simulate, they introduce an additional level of complexity: the agent's sensors provide only partial information about the state of the environment, called an observation. These problems can be modeled as partially observable MDPs (POMDPs). At any point in time, the sequence of observations made by the agent so far determines a probability distribution over states, called a belief state. It has been shown that solving a POMDP can be reduced to solving the corresponding MDP on the set of belief states. This planning problem, however, rapidly becomes intractable in large state spaces with a substantial number of observations. In this thesis, we adapt the work of Kearns, Mansour and Ng on sparse sampling algorithms to factored POMDP representations of multi-agent partially observable domains. Applying this algorithm to two domains based on popular video games, we show empirically that a randomly sampled look-ahead tree covering only a small fraction of the full look-ahead tree is sufficient to compute near-optimal policies in these settings. We compare the performance of this approach to the classical methods and conclude that sparse sampling dramatically reduces the running time of the planning algorithm and scales well with the number of enemy agents.
dc.language	en_US
dc.subject	Artificial intelligence
dc.subject	Planning
dc.subject	Multi-agent
dc.subject	Sparse sampling
dc.subject	Partially observable
dc.subject	Computer games
dc.subject	Bomberman
dc.subject	POMDP
dc.subject	Running time
dc.subject	Belief state
dc.subject	Sampling
dc.title	Planning in multi-agent partially observable domains using sparse sampling
dc.type	Thesis
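
The sparse sampling idea named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it covers only the basic fully observable case with a generative model, not the factored POMDP adaptation the thesis describes. The names sparse_sample_q, plan, simulate, width and depth, and the toy corridor domain, are all assumptions made for illustration. The sketch shows how a look-ahead tree built from a fixed number of sampled successors per action yields action-value estimates at a cost that does not depend on the number of states.

```python
import random


def sparse_sample_q(state, depth, width, actions, simulate, rng, gamma=1.0):
    """Estimate Q(state, a) for every action with a sampled look-ahead tree.

    simulate(state, action, rng) is a hypothetical generative model that
    returns one sampled (next_state, reward) pair.
    """
    if depth == 0:
        # Leaves of the look-ahead tree are valued at zero.
        return {a: 0.0 for a in actions}
    q = {}
    for a in actions:
        total = 0.0
        # Draw only `width` successors instead of enumerating all of them.
        for _ in range(width):
            next_state, reward = simulate(state, a, rng)
            next_q = sparse_sample_q(
                next_state, depth - 1, width, actions, simulate, rng, gamma
            )
            total += reward + gamma * max(next_q.values())
        q[a] = total / width
    return q


def plan(state, depth, width, actions, simulate, rng, gamma=1.0):
    """Return the action with the highest estimated value at the root."""
    q = sparse_sample_q(state, depth, width, actions, simulate, rng, gamma)
    return max(actions, key=q.get)


# Toy domain (an assumption for illustration): a noisy corridor in which
# the intended move succeeds with probability 0.8. Choosing "right" always
# pays 1 and "left" pays 0, so the estimates can be checked by hand.
ACTIONS = ("left", "right")


def corridor_step(state, action, rng):
    moved = rng.random() < 0.8
    step = 1 if action == "right" else -1
    next_state = state + step if moved else state
    reward = 1.0 if action == "right" else 0.0
    return next_state, reward


best = plan(0, depth=3, width=4, actions=ACTIONS,
            simulate=corridor_step, rng=random.Random(0))
print(best)
```

In this sketch each call at depth d makes len(actions) * width simulator calls and recurses once per call, so the tree size is governed by width, depth and the number of actions rather than by the size of the state space.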

This item appears in the following Collection(s)

ScholarsArchive@OSU
