Optimal mileage-based PV array reconfiguration using swarm reinforcement learning

Bibliographic Details
Published in Energy conversion and management Vol. 232; p. 113892
Main Authors Zhang, Xiaoshun; Li, Chuanzhi; Li, Zilin; Yin, Xueqiu; Yang, Bo; Gan, Lingxiao; Yu, Tao
Format Journal Article
Language English
Published Oxford Elsevier Ltd 15.03.2021
Elsevier Science Ltd
ISSN 0196-8904
1879-2227
DOI 10.1016/j.enconman.2021.113892

Summary:
• A new optimal mileage-based PV array reconfiguration (OMAR) is constructed.
• The OMAR maximizes the total benefit instead of only the generation benefit.
• Decomposing the OMAR into two sub-problems reduces the optimization difficulty.
• Swarm reinforcement learning is used to obtain high-quality optima of the OMAR.
• The proposed method obtains a higher total benefit than six comparative algorithms.

This paper constructs a new optimal mileage-based PV array reconfiguration (OMAR) for a PV power plant under partial shading conditions. It aims to maximize the power output of the plant while minimizing the additional capacity and mileage payments that result from power fluctuation in a performance-based frequency regulation market. To reduce the optimization difficulty, the OMAR is decomposed into two sub-problems: an upper-layer discrete optimization of the PV array reconfiguration and a lower-layer continuous optimization of the real-time generation scheduling. The upper-layer discrete optimization is addressed by the proposed swarm reinforcement learning (SRL), which explores and exploits efficiently with multiple cooperative agents instead of a single learning agent. The remaining lower-layer optimization is handled by a fast interior point method. The effectiveness of the proposed method is evaluated on 10 × 10 total-cross-tied PV arrays under various partial shading conditions. Simulation results demonstrate that the proposed SRL obtains a larger total benefit than the genetic algorithm (GA), particle swarm optimization (PSO), grasshopper optimization algorithm (GOA), Harris hawks optimizer (HHO), butterfly optimization algorithm (BOA), and Q-learning, with the benefit increment ranging from 2.12% (against PSO) to 10.62% (against Q-learning).
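
The two-layer structure described in the summary can be illustrated with a small toy sketch. It is not the authors' SRL: the benefit data (PEAKS, GAINS), the function names, and the cooperation rule are all hypothetical, the learning rule is plain single-state Q-learning, and the lower layer is solved in closed form as a stand-in for the interior point method. It only shows how several cooperating agents might search a discrete upper layer while each candidate is scored by a continuous lower layer.

```python
import random

# Hypothetical toy data, not taken from the paper: each discrete
# configuration has an assumed best continuous set-point and benefit.
PEAKS = [0.2, 0.9, 0.5, 0.7]
GAINS = [1.0, 1.6, 2.0, 1.3]


def lower_layer(config):
    """Continuous sub-problem: maximise GAINS - (x - PEAKS)^2 on [0, 1].

    Solved in closed form here as a stand-in for an interior point solver.
    """
    x = min(1.0, max(0.0, PEAKS[config]))
    return x, GAINS[config] - (x - PEAKS[config]) ** 2


def swarm_q_learning(n_agents=5, episodes=200, alpha=0.3, epsilon=0.2, seed=0):
    """Several single-state Q-learning agents that share the best solution."""
    rng = random.Random(seed)
    n_actions = len(PEAKS)
    q = [[0.0] * n_actions for _ in range(n_agents)]  # one Q-row per agent
    best_config, best_value = None, float("-inf")
    for _ in range(episodes):
        for agent in range(n_agents):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=q[agent].__getitem__)  # exploit
            _, reward = lower_layer(action)  # score the candidate
            q[agent][action] += alpha * (reward - q[agent][action])
            if reward > best_value:
                best_config, best_value = action, reward
        # Assumed cooperation rule: every agent moves its estimate of the
        # swarm's best configuration toward the best benefit found so far.
        for agent in range(n_agents):
            q[agent][best_config] += alpha * (best_value - q[agent][best_config])
    return best_config, best_value


if __name__ == "__main__":
    print(swarm_q_learning())
```

Calling swarm_q_learning() returns the index of the best configuration found and its benefit. In a real setting the lower layer would be a constrained generation-scheduling problem rather than a one-variable quadratic.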