\section{Related Works}
This section reviews related work on deep reinforcement learning, with a focus on its application to playing Atari games. We group the related work into five areas: (1) deep learning for game playing, (2) deep reinforcement learning, (3) actor-critic algorithms, (4) deep Q-networks, and (5) applications of deep reinforcement learning.
\paragraph{Deep Learning for Game Playing}
Deep learning for game playing has progressed significantly in recent years. The seminal work of \citet{mnih2013playing} introduced a deep reinforcement learning approach to playing Atari games that outperformed all previous approaches on six of the seven games tested and surpassed a human expert on three of them. The remaining works in this group advance deep learning more broadly rather than game playing itself. \citet{qi2016pointnet} proposed a neural network that directly consumes point clouds and provides a unified architecture for applications ranging from object classification to scene semantic parsing. \citet{chollet2016xception} proposed a deep convolutional neural network architecture inspired by Inception that slightly outperforms Inception V3 on the ImageNet dataset. \citet{zhang2016understanding} showed that state-of-the-art convolutional networks for image classification, trained with stochastic gradient methods, easily fit a random labeling of the training data. \citet{gulshan2016development} developed a deep learning algorithm with high sensitivity and specificity for detecting referable diabetic retinopathy and diabetic macular edema in retinal fundus photographs from adults with diabetes.
\paragraph{Deep Reinforcement Learning}
Deep reinforcement learning is a subfield of machine learning that combines deep learning with reinforcement learning. \citet{mnih2013playing} first applied it to playing Atari games. \citet{mnih2016asynchronous} proposed a lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. \citet{haarnoja2018soft} proposed soft actor-critic, an off-policy actor-critic algorithm based on the maximum entropy reinforcement learning framework. \citet{rakkini2022comprehensive} surveyed the use of machine learning, deep learning, and reinforcement learning methods for mitigating selfish mining attacks on blockchains.
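As a brief illustration of the maximum entropy framework, and in notation that is our own assumption rather than a quotation from the cited work, the policy $\pi$ is trained to maximize the expected reward augmented by an entropy bonus:
\begin{equation}
J(\pi) = \sum_{t=0}^{T} \mathrm{E}_{(s_t, a_t) \sim \rho_\pi} \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\bigl(\pi(\cdot \mid s_t)\bigr) \right],
\end{equation}
where $\rho_\pi$ denotes the state-action distribution induced by $\pi$, $\mathcal{H}$ is the entropy of the policy at state $s_t$, and the temperature $\alpha \geq 0$ weights the entropy term against the reward. Setting $\alpha = 0$ recovers the conventional expected-return objective.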
\paragraph{Actor-Critic Algorithms}
Actor-critic algorithms are a class of reinforcement learning algorithms that learn both a policy (the actor) and a value function (the critic), often with separate networks. \citet{lillicrap2015continuous} presented a model-free actor-critic algorithm based on the deterministic policy gradient that can operate over continuous action spaces. In the related value-based setting, \citet{hasselt2015deep} proposed an adaptation of the DQN algorithm that reduces the observed overestimation of action values and leads to much better performance on several games. \citet{xiong2018parametrized} proposed a parametrized deep Q-network (P-DQN) framework that handles hybrid discrete-continuous action spaces without approximation or relaxation by combining elements of DQN and DDPG.
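The overestimation fix can be sketched by comparing the two bootstrap targets. The notation here is assumed for illustration: $\theta$ denotes the online network parameters and $\theta^{-}$ the target network parameters. The standard DQN target selects and evaluates the next action with the same network,
\begin{equation}
y^{\mathrm{DQN}} = r + \gamma \max_{a'} Q(s', a'; \theta^{-}),
\end{equation}
whereas the double Q-learning variant selects the action with the online network and evaluates it with the target network,
\begin{equation}
y^{\mathrm{Double}} = r + \gamma \, Q\bigl(s', \arg\max_{a'} Q(s', a'; \theta); \theta^{-}\bigr).
\end{equation}
Decoupling selection from evaluation removes the upward bias that arises when a single noisy estimate is used both to choose the maximizing action and to score it.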
\paragraph{Deep Q-Networks}
Deep Q-networks (DQNs) are a class of deep reinforcement learning algorithms that use a neural network to approximate the Q-value function. \citet{mnih2013playing} presented the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. \citet{haensch2019the} analyzed how nonvolatile memory materials need to be reengineered for optimal performance in deep learning and proposed corresponding design guidelines. \citet{zhu2020deep} showed that a deep-learning CNN can accurately stage the disease severity of COVID-19 lung infection on portable chest x-rays.
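For concreteness, a minimal sketch of the Q-learning objective behind this family of methods follows; the symbols are our own assumptions and may differ from those of the cited papers. With a network $Q(s, a; \theta_i)$ at iteration $i$, a transition $(s, a, r, s')$, and discount factor $\gamma \in [0, 1)$, the parameters are fit by minimizing the squared temporal-difference error
\begin{equation}
L_i(\theta_i) = \mathrm{E}_{(s, a, r, s')} \left[ \bigl( r + \gamma \max_{a'} Q(s', a'; \theta_{i-1}) - Q(s, a; \theta_i) \bigr)^2 \right],
\end{equation}
where the parameters $\theta_{i-1}$ from the previous iteration are held fixed when computing the bootstrap target, so that the gradient flows only through $Q(s, a; \theta_i)$.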
\paragraph{Applications of Deep Reinforcement Learning}
Deep reinforcement learning has been applied to various fields beyond game playing. \citet{sridhar2020eeg} developed a deep learning network based on a sensory motor paradigm that employs a subject-agnostic bidirectional long short-term memory (BLSTM) network to assess cognitive functions. \citet{joonmyun2020application} surveyed application trends of deep learning-based AI techniques for autonomous things. \citet{sung2018facilitating} found that students learning with a 3D experiential gaming system showed better learning achievements, problem-solving tendency, deep learning strategies, and deep learning motive than those who learned with the conventional technology-enh