Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks
https://oist.repo.nii.ac.jp/records/1545
| Name / File | License | Actions |
|---|---|---|
| Han-2020-Self-organization of action hierarchy (2.5 MB) | Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/) | |
| Item type | Journal Article (1) |
|---|---|
| Publication date | 2020-06-12 |
| Title | |
| Language | en |
| Title | Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks |
| Language | |
| Language | eng |
| Resource type | |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Resource type | journal article |
| Author (English) | Han, Dongqi; Doya, Kenji; Tani, Jun |
| Bibliographic information | en: Neural Networks, Vol. 129, pp. 149-162, issued 2020-06-06 |
| Abstract | |
| Description type | Other |
| Description | Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms for performance gain. In this paper, we propose a novel, multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enhances faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals, than when starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics. |
| Publisher | |
| Publisher | Elsevier |
| ISSN | |
| Source identifier type | ISSN |
| Source identifier | 0893-6080 |
| DOI | |
| Relation type | isIdenticalTo |
| Identifier type | DOI |
| Relation identifier | info:doi/10.1016/j.neunet.2020.06.002 |
| Rights | |
| Rights information | © 2020 The Authors. |
| Source | |
| Related name | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| Related site | |
| Identifier type | DOI |
| Relation identifier | https://doi.org/10.1016/j.neunet.2020.06.002 |
| Author's version flag | |
| Version type | VoR |
| Version type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
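The abstract's "multiple-timescale, stochastic RNN" can be illustrated with a minimal sketch. This is NOT the authors' implementation; it is a generic two-timescale leaky-integrator update (in the spirit of multiple-timescale RNNs) with Gaussian noise standing in for the stochastic dynamics. All names (`mtrnn_step`, the `params` keys, `sigma`) are hypothetical.

```python
import numpy as np

def mtrnn_step(h_fast, h_slow, x, params, rng, sigma=0.1):
    """One update of a two-timescale leaky-integrator RNN (illustrative only).

    Fast units (small time constant tau_f) react quickly to the input;
    slow units (large tau_s) integrate over longer horizons, which is the
    kind of structure that can support sub-goal abstraction. Gaussian
    noise on the fast units stands in for stochastic neural dynamics.
    """
    Wf, Ws, Wx = params["Wf"], params["Ws"], params["Wx"]
    tau_f, tau_s = params["tau_f"], params["tau_s"]
    h = np.concatenate([h_fast, h_slow])  # both groups see the full state
    # Pre-activations; noise injected only into the fast pathway here
    pre_f = np.tanh(Wf @ h + Wx @ x) + sigma * rng.standard_normal(h_fast.shape)
    pre_s = np.tanh(Ws @ h)
    # Leaky integration: h <- (1 - 1/tau) * h + (1/tau) * pre-activation,
    # so a larger tau means a smaller per-step change
    h_fast = (1 - 1 / tau_f) * h_fast + (1 / tau_f) * pre_f
    h_slow = (1 - 1 / tau_s) * h_slow + (1 / tau_s) * pre_s
    return h_fast, h_slow
```

With, say, `tau_f = 2` and `tau_s = 100`, the slow units move at most 1/100th of the way toward their target per step, giving the slow/fast separation the abstract attributes to the proposed architecture.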