Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks
https://oist.repo.nii.ac.jp/records/1545
| Name / File | License | Actions |
|---|---|---|
| Han-2020-Self-organization of action hierarchy (2.5 MB) | Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/) | |
| Item type | Journal Article (1) |
|---|---|
| Publication date | 2020-06-12 |
| Title | |
| Language | en |
| Title | Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks |
| Language | |
| Language | eng |
| Resource type | |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Resource type | journal article |
| Author (English) | Han, Dongqi; Doya, Kenji; Tani, Jun |
| Bibliographic information | en: Neural Networks, Vol. 129, pp. 149-162, issued 2020-06-06 |
| Abstract | |
| Description type | Other |
| Description | Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms for performance gain. In this paper, we propose a novel, multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enhances faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals, than when starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics. |
| Publisher | |
| Publisher | Elsevier |
| ISSN | |
| Source identifier type | ISSN |
| Source identifier | 0893-6080 |
| DOI | |
| Relation type | isIdenticalTo |
| Identifier type | DOI |
| Relation identifier | info:doi/10.1016/j.neunet.2020.06.002 |
| Rights | |
| Rights information | © 2020 The Authors. |
| Source | |
| Related name | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| Related site | |
| Identifier type | DOI |
| Relation identifier | https://doi.org/10.1016/j.neunet.2020.06.002 |
| Author's version flag | |
| Version type | VoR |
| Version type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
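The abstract's "multiple-timescale, stochastic RNN" can be illustrated with a minimal sketch. This is NOT the authors' implementation; it is a generic two-timescale leaky-integrator update (in the spirit of multiple-timescale RNNs) with Gaussian noise standing in for the stochastic dynamics. All names (`mtrnn_step`, the `params` keys, `sigma`) are hypothetical.

```python
import numpy as np

def mtrnn_step(h_fast, h_slow, x, params, rng, sigma=0.1):
    """One update of a two-timescale leaky-integrator RNN (illustrative only).

    Fast units (small time constant tau_f) react quickly to the input;
    slow units (large tau_s) integrate over longer horizons, which is the
    kind of structure that can support sub-goal abstraction. Gaussian
    noise on the fast units stands in for stochastic neural dynamics.
    """
    Wf, Ws, Wx = params["Wf"], params["Ws"], params["Wx"]
    tau_f, tau_s = params["tau_f"], params["tau_s"]
    h = np.concatenate([h_fast, h_slow])  # both groups see the full state
    # Pre-activations; noise injected only into the fast pathway here
    pre_f = np.tanh(Wf @ h + Wx @ x) + sigma * rng.standard_normal(h_fast.shape)
    pre_s = np.tanh(Ws @ h)
    # Leaky integration: h <- (1 - 1/tau) * h + (1/tau) * pre-activation,
    # so a larger tau means a smaller per-step change
    h_fast = (1 - 1 / tau_f) * h_fast + (1 / tau_f) * pre_f
    h_slow = (1 - 1 / tau_s) * h_slow + (1 / tau_s) * pre_s
    return h_fast, h_slow
```

With, say, `tau_f = 2` and `tau_s = 100`, the slow units move at most 1/100th of the way toward their target per step, giving the slow/fast separation the abstract attributes to the proposed architecture.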