P2P transaction method for distributed energy prosumers based on reputation value

Tao Jiang1, Ting Hua1, Hao Xiao2,3, Linbo Fu1, Wei Pei2,3, Tengfei Ma2,3


1. Key Laboratory of Modern Power System Simulation and Control & Renewable Energy Technology, Ministry of Education (Northeast Electric Power University), Jilin 132012, P.R. China

2. Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing 100190, P.R. China

3. School of Electronic, Electrical and Communication Engineering (EECE), University of Chinese Academy of Sciences, Beijing 100049, P.R. China

Abstract

Adding a reputation incentive system to peer-to-peer (P2P) energy transactions can encourage prosumers to regulate their trading behavior, which is important for ensuring the efficiency and reliability of P2P transactions. This study proposed a P2P transaction mechanism and game optimization model for prosumers involved in distributed energy sources considering reputation-value incentives. First, the deviation of P2P transactions and the non-consumption rate of distributed renewable energy in P2P transactions were established as indicators to quantify the influencing factors of the reputation value, and a reputation incentive model of P2P transactions for prosumers was constructed. Then, the penalty coefficient was applied to the cost function of the prosumers, and a non-cooperative game model of P2P transactions based on the complete information of multi-prosumers was established. Furthermore, the Nash equilibrium problem was transformed into a nonlinear optimization problem by constructing the modified optimal reaction function, and the Nash equilibrium solution of the game was obtained via a relaxation algorithm. Finally, the modified IEEE 33-node test system based on electricity market P2P and an IEEE 123-node test system were used to analyze and verify the cost and P2P participation of prosumers considering the reputation value. The results show that the addition of the reputation incentive system can encourage prosumers to standardize their interactive transaction behavior and actively participate in P2P transactions. It can also improve the operation efficiency of the power grid and help refine the P2P transaction mechanism.

Keywords: P2P; Prosumer; Non-cooperative game; Reputation value; Multi-agent of interest; Operation optimization

0 Introduction

With the significant increase in the proportion of distributed renewable energy sources, such as photovoltaics and wind, in electricity trading, power users have begun to transform from consumers to prosumers in large numbers and are now key players in the production, distribution, and consumption of electricity [1]. In this context, distributed operation and transaction modes such as peer-to-peer (P2P) are gradually emerging. Unlike the traditional centralized electricity market mechanism, P2P transactions allow direct energy trading between adjacent prosumers with a certain amount of power generation capacity, thereby increasing the enthusiasm of prosumers to participate in energy trading and promoting a sustainable and reliable balance between community energy production and consumption [2]. P2P trading is also expected to help the grid by reducing peak demand, reserve demand, and network losses [3]. In particular, a less-centralized P2P network can take into account both the important role of the power grid in the energy network and the needs of P2P transactions. Therefore, P2P transaction mechanisms are one of the future research directions for the interaction and transaction mechanisms of participating distributed energy systems. Owing to the dynamic characteristics of distributed renewable resource generation and the imperfection of P2P transaction mechanisms, P2P transactions face challenges such as reducing energy costs, balancing local supply and demand, encouraging and attracting prosumers, and ensuring transaction security. It is essential to develop more efficient and economical energy trading mechanisms that can be deployed in real networks.

To address these challenges, many scholars have conducted in-depth research on distributed energy P2P. In [4], a new electric vehicle P2P transaction system was proposed, which significantly reduced the impact of the charging process on the power system during working hours and improved the economic benefits for all participating users. In [5], a new multi-energy management strategy was proposed to explore the optimal energy scheduling problems of prosumers. The multi-energy coupling matrix of residential prosumers and the resource task network of industrial consumers were constructed to optimize local P2P market operation. Using alliance blockchain technology, the authors of [6] developed a secure localized P2P transaction system in which charging and discharging electric vehicles could conduct electricity transactions without third parties. In [7], a crowdsourced energy systems model was constructed based on blockchain technology. It considered energy storage, controllable load, and other distributed energy resources when performing market clearing and thus achieved day-ahead market equilibrium, which efficiently implemented seamless power transactions for prosumers. In [8], various incentive models involving P2P energy transactions were constructed based on motivational psychology. Furthermore, a P2P energy transaction scheme was established based on a cooperative game, and the stability of the proposed scheme was verified, which enhanced prosumer enthusiasm to participate in P2P transactions. In [9], a demand-side management game model containing energy storage elements was established, and blockchain was introduced to ensure security management on the demand side. Then, a demand-side decentralized management method was proposed to maintain coordination of supply and demand. Although the above literature can solve the problems related to energy costs to a certain extent and has the advantages of encouraging and attracting prosumers, it does not involve an incentive mechanism that considers the interests of the distribution network. If the power grid company is able to supervise prosumer behavior in the incentive mechanism as a transaction regulator, it is possible to guarantee the distribution network’s interest and promote the internal energy consumption of prosumers at the same time. The reputation value of prosumers can accurately record their historical trading behaviors [10]. Therefore, the reputation value can be an effective indicator of the trading credibility of prosumers.

To date, several studies have begun to explore the introduction of reputation mechanisms into P2P networks. Studies [26-31] introduced different reputation mechanisms or trust models for P2P networks, which can be effective and efficient in terms of a successful transaction rate, the ability to curb malicious behavior, and complexity. The above studies belong to the computer field; the objects of concern were nodes in the network layer and were not mapped to specific power users or prosumers in the physical layer. There have also been studies that explored the introduction of reputation mechanisms into P2P transactions. Based on blockchain technology, a weakly-centralized trading platform for photovoltaic (PV) systems was proposed in [11]. By tracking the reputation value over more than ten days to estimate whether users cooperate effectively, the platform could ensure the active and effective cooperation of users in the P2P PV transaction network. The authors of [12] proposed a transaction mechanism for PV on-site consumption based on blockchain. It encouraged users to absorb PV output and penalized those with low local consumption through a time-shifting load characteristic. In [32], a decentralized electric energy trading mechanism was proposed in which distributed users participated in the distribution network. In the process of trading, the reputation value model was introduced to punish a breach of contract and hence encourage the PV users to keep faith and protect the rights and interests of the power buyers. The above studies only investigated the trading behavior between PV users and distributed PV aggregators during the PV output period (6:00-18:00). Additionally, the reputation value tracking cycle was brief. The authors of [13] proposed a centralized energy management system based on reputation and considered a microgrid composed of households each with a PV system that could inject the surplus PV energy into the main grid without any compensation. This system could solve the energy management problem of energy storage devices and available energy by evaluating the reputation of household users in allocating available energy to shared storage units. Based on the existing storage of a virtual power plant, as well as scheduling and trade issues, [14] proposed a continuous double auction mechanism based on reputation with the traditional transaction mechanism. According to the reputation value, the mechanism divided participants to create a favorable reputation-first trading environment and thus standardized the trading behavior of prosumers. In [15], a distributed automatic control method was proposed to manage the energy transactions between agents in grid-connected microgrids. Interactions between agents used historical information on familiarity, acceptance, and internode value to quantify reputation. The above studies concentrated entirely on users or prosumers in the P2P transaction network while ignoring the interests of the grid company.

In this regard, to further promote the benefits of both prosumers and the grid company and conduct reliable and stable energy trading by improving the reputation value, this study proposed a reputation-incentivized benefit model for distributed energy prosumers in the P2P energy transaction market. The prosumers were participants in P2P transactions involving distributed energy such as wind/PV/storage. Therefore, we could focus on any time of the day at which P2P transactions occurred to ensure continuity of transactions. The reputation value tracking cycle was sufficiently long to discover how the reputation mechanism developed, and the trading behavior could be adjusted in a timely manner according to changes, while considering the benefits of the power grid company. First, considering the non-cooperative game relationship between prosumers, the reputation value was determined via transaction deviation and renewable energy consumption indicators during trading hours. A P2P prosumer game model was established to meet the cost minimization requirements of multiple stakeholders. Second, using the influence of penalty coefficients on the cost of stakeholders, the transaction priority and cost function were changed, which in turn affected the optimization result of the P2P transactions. Simultaneously, the distributed multi-agent optimal response function was modified to make it suitable for the P2P electricity market considering the reputation value. The problem of the optimal response function only being applicable to network nodes was solved through strictly positive definite basic convergence conditions. Finally, using the modified IEEE 33- and 123-node test systems, the P2P prosumer game model was verified based on the reputation value, and the influence of the reputation system on prosumer behaviors, the P2P transaction market, and the input of the grid company was analyzed.

1 Weakly-centralized P2P transaction mechanism considering the reputation value

Given the regulatory needs of the energy market, less-centralized energy transaction mechanisms can regulate and organize the market behavior of prosumers [17] in a better way and effectively guarantee the security of transactions [18]. They are also applicable to the marketization process of the energy market, in contrast with no-trust decentralized transaction platforms [16]. Referring to the current less-centralized energy transaction mechanism [19], instead of interfering with the direct transaction agreements of P2P prosumers, the weakly-centralized management institution is only responsible for transaction intention collection, transaction supervision coordination, and other auxiliary functions. It decentralizes transactions while ensuring energy internet security. The less-centralized P2P energy transaction mechanism considering reputation that is proposed in this study is shown in Fig.1.

Fig.1 Weakly-centralized P2P transaction mechanism considering the reputation value

Suppose there are n prosumers (n≥3) participating in a weakly-centralized P2P transaction. As shown in Fig.1, in the weakly-centralized P2P energy transaction mechanism considering the reputation value, when prosumers enter the market, they can conduct P2P transactions with any prosumer i (i ≤ n) at any time. In addition to maintaining the balance of power supply and demand and recording transaction information in the system, the grid company must also judge and give feedback on the reputation of every transaction. Prosumers with wind-solar-battery and other facilities conduct P2P transactions internally to absorb as much energy as possible, and the part that cannot be absorbed is absorbed by the power grid company. After the transaction is completed, prosumers upload transaction information and the reputation value. The grid company can generate a penalty coefficient as feedback to other prosumers based on the historical reputation value. This will guide prosumers to regulate their market behaviors and thereby reduce the grid company’s market balancing costs and prosumers’ transaction costs [20]. The P2P transaction mechanism considering reputation incentive is shown below.

When prosumers are included in the P2P market, they upload the transacted electricity quantity and acceptable transaction price as transaction information to the P2P transaction platform and conduct the following prosumer game procedure:

(1) Priority ranking.The prosumers are ranked according to the priority of the sales price from low to high, purchase price from high to low, and reputation value from high to low.

(2) Transaction matching determination. Transaction matching is performed in priority order. The outer priority is price. If the sales price is less than the purchase price, the inner priority is entered, which is the reputation value. The higher the reputation is, the higher the priority for transaction, and then the preliminary transaction is reached.

(3) Power matching.

If purchase power=sales power, this group of buyers/sellers completes power matching and exits the market.Then, it will match the next group of transactions.

If purchase power<sales power, the buyer completes power matching and exits the market.The seller does not match the power and stays in the market.The transaction will be matched with the next buyer.Otherwise return to (2).

If purchase power>sales power, the seller completes power matching and exits the market.The buyer does not match the power and stays in the market.The transaction will be matched with the next seller.Otherwise return to (2).

(4) After all the available transactions are matched, the unmatched unbalanced power is traded with the distribution network according to the on-grid price.
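The four steps above can be sketched in code. The following is a minimal illustration rather than the authors' implementation: the `Order` record, the `match_orders` function, and the convention that a larger `reputation` score means higher priority are all assumptions made for the example. Each matched pair trades at the average of the two quoted prices.

```python
from dataclasses import dataclass, replace


@dataclass
class Order:
    """Hypothetical record of one prosumer's quote (not from the paper)."""
    name: str
    price: float       # quoted sales price (seller) or purchase price (buyer)
    power: float       # power still to be traded
    reputation: float  # assumed convention: larger score = higher priority


def match_orders(sellers, buyers):
    """Greedy matching that follows steps (1)-(4) of the procedure."""
    # (1) Priority ranking: price is the outer key, reputation the inner key.
    #     Orders are copied so the caller's objects are not modified.
    sellers = sorted((replace(o) for o in sellers),
                     key=lambda o: (o.price, -o.reputation))
    buyers = sorted((replace(o) for o in buyers),
                    key=lambda o: (-o.price, -o.reputation))
    trades = []
    i = j = 0
    while i < len(sellers) and j < len(buyers):
        s, b = sellers[i], buyers[j]
        # (2) A match requires the sales price to be below the purchase price.
        #     Because both lists are sorted, no later pair can match either.
        if s.price >= b.price:
            break
        # (3) Power matching: the smaller side is filled and leaves the market.
        qty = min(s.power, b.power)
        mid_price = (s.price + b.price) / 2  # average of the two quotes
        trades.append((s.name, b.name, qty, mid_price))
        s.power -= qty
        b.power -= qty
        if s.power == 0:
            i += 1
        if b.power == 0:
            j += 1
    # (4) Whatever is left unmatched is settled with the distribution network.
    residual = {o.name: o.power for o in sellers + buyers if o.power > 0}
    return trades, residual
```

For example, with two sellers quoting 0.40 (powers 5 and 3) and buyers quoting 0.60 (power 6) and 0.35 (power 4), the seller with the higher reputation score is served first, and the buyer quoting 0.35 is left to trade with the grid.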

When the prosumer first enters the P2P market, there is an initial purchase and sales price. Subsequent purchase and sales prices are adjusted from the historical purchase and sales prices according to the penalty coefficient. The transaction price is the average of the buyer’s and seller’s prices after the transaction is reached. In other words, this is the intermediate price mechanism. Because the penalty coefficient varies and the prices are continually referenced to it, the intermediate price also fluctuates.

The reputation system guides prosumers to actively participate in the P2P energy market to reduce transaction costs and standardize the behaviors of prosumers. Therefore, it is important to make it better suited to interactive transactions in P2P energy markets and reduce the cost of balancing supply and demand on the grid.

2 P2P game model of multi-agent prosumers

Prosumers participate in P2P transactions with the strategies and trading information of all other players. This study followed the complete information dynamic game [21]. Under the condition of complete information, the multi-agent P2P game model was established. The strategy of the multi-agent game is to change the priority in the P2P transaction by controlling the price and turning the price advantage into transaction priority, which will determine the turnover.

2.1 P2P game model of prosumers

In the P2P transaction of distributed energy prosumers, all prosumers aim to maximize their own benefits. Although there are several differences in decision variables and decision combinations among prosumers, their models are of the same nature; the objective function, constraints, and trading rules are essentially the same. The multi-agent distributed energy prosumer P2P game model aims to minimize the cost to every prosumer, including the transaction cost to prosumers in the P2P market and the transaction cost to the grid company. The aim is to absorb as much excess energy as possible within the microgrid through P2P transactions while minimizing the cost per prosumer. This addresses the problem that electricity consumption and prosumer cost could not be optimized at the same time in the P2P transaction. The objective function and constraint conditions are shown in equation (1).

where Cop is the internal unit operating cost of the prosumer [22], Com is the maintenance cost of the prosumer’s internal unit, and Cmf is the cost of the prosumer participating in the transaction. When the prosumer sells power, the value of Cmf is positive. When the prosumer purchases power, the value of Cmf is negative. Cq(.) and Ceq(.) are the inequality and equality constraints, respectively, x and s are decision and state variables, respectively, and N is the set of prosumers participating in the P2P energy transaction.

Take any prosumer n∈N as an example. The objective function Cn includes the operating cost of the prosumer’s internal distributed energy Cop, the maintenance cost Com, and the cost of the prosumer participating in the transaction Cmf.
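Read literally, the description above gives the per-prosumer objective as the sum of the three cost terms. The expression below is a reconstruction from that sentence, assuming no additional weighting between the terms:

```latex
C_n = C_{op} + C_{om} + C_{mf}, \qquad n \in N
```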

where the internal operating cost of prosumer Cop is shown below, which is the total operating cost of all distributed energy within time T, that is, the product of the unit generation cost and generated output at time t.

where Cbattery is the equipment loss cost from charge and discharge of the energy storage device, C'op is the operating cost of renewable energy units, Xt, x̄wt, and x̄pv are the battery cycle cost, the average cost of wind power generation, and the average cost of PV generation, respectively, Ptbattery is the battery charge and discharge power, and SOCmax and SOCmin are the battery charging and discharging depth thresholds.

The prosumer’s internal distributed energy includes wind and photovoltaic power, and the total maintenance cost Com is shown below, which is the product of the total power generation of the distributed energy generation group within T and the maintenance cost factor.

where kj is the maintenance cost factor of the distributed energy generator unit j (j∈I), Pt is the gross generation of the prosumer at time t, and j is the type of distributed energy.

The cost of the prosumer participating in the transaction Cmf is shown below, which includes the P2P cost of trading with other prosumers and the cost of trading with the grid.

where Cp2pn is the transaction cost between prosumers, Cp2gn is the transaction cost between the prosumer and grid company, cgrid is the on-grid price, which includes the ceiling price csell and the minimum purchase price cbuy, Ptp2g is the trading power that the prosumer trades with the grid at time t, and cmid,i is the transaction price at time t. Numerically, this is the intermediate price of both prosumers in the transaction [23]. Ptp2p,i is the transaction power that prosumers contribute in the P2P market at time t.

2.2 P2P transaction constraints

The constraints of prosumers participating in the P2P transaction are

(1) The P2P trading power equilibrium constraint at time t:

The prosumer’s trading power includes the trading power with other prosumers and with the grid company.

(2) The prosumer’s power supply and consumption balance constraint:

The power supply and consumption should maintain a dynamic balance.

Equations (8) and (9) represent the power consumption and supply of the prosumer, respectively. Here, ptload is the trading power of the load at time t.

(3) The prosumer’s trading power constraint:

The power required for prosumers to participate in P2P transactions must meet network security constraints.

where pmax and pmin are the maximum and minimum power that are allowed to participate in P2P transactions.

(4) P2P intermediate price constraint:

According to the market transaction mechanism proposed in this paper, after adding the reputation incentive system to the P2P transaction mechanism, the trading power of every prosumer will change with historical reputation. This is explained in detail in Section 3.2. Therefore, the P2P trading price should lie between the minimum trading price cmingrid and the maximum trading price cmaxgrid to ensure that P2P transactions have priority over trading with the grid.

3 P2P game model of multi-agent prosumers considering the reputation value

Considering the uncertainty of the wind power and PV output forecast and the diversity of prosumer choices in P2P energy market transactions, this section further introduces the reputation mechanism in the P2P energy market with a high penetration rate of renewable energy. From this, the multi-agent prosumer P2P game model considering the reputation value was established. It calculates the reputation value based on one day’s worth of the prosumer’s transaction behavior and uses it as the index affecting subsequent trading behavior and income. With the introduction of the reputation system, the new game model increases the incentive of the power market based on the original model, which encourages prosumers to change their behavior to minimize costs.

3.1 Reputation incentive mechanism

The reputation incentive mechanism uses the reputation value as the standard for assessing prosumers. The reputation value, which changes every day, encourages prosumers to give priority to P2P transactions. It represents the quality and enthusiasm of the prosumers participating in P2P transactions. We used the P2P transaction power deviation V and prosumer unconsumed rate φ as the two indicators to measure the reputation value. As an incentive mechanism, the reputation value uses the penalty coefficient to change the cost function of the prosumer as well as the grid company, which is reflected in equations (24) and (26). Based on the objective function (3) of the prosumer, the payment function Cmf and the trading price cmid in the objective function are changed by the penalty coefficient.

Considering the error when the prosumer predicts the wind power and PV output, the actual power of the P2P transaction cannot be completely consistent with the promised power of the P2P transaction; therefore, it is essential to introduce the P2P transaction power deviation V to quantify the error between the actual and planned power of the prosumer participating in the P2P transaction. V is the ratio of the difference between the power that the prosumer actually contributes in the transaction and the planned trading power to the planned (predicted) trading power. Here, the power that the prosumer actually contributes in the transaction is the difference between the load consumption and wind power and PV generation,

where Pactu is the actual power of the prosumer participating in the transaction, and Ppredi is the planned trading power that the prosumer obtains according to the predicted internal heating load, wind power, and PV generation. Owing to the uncertainty of wind power and PV generation, there is a chance that the deviation between the actual and planned trading power will be too large. The smaller V is, the higher the accuracy of the power supply reported by the prosumer, and the higher the reputation of the prosumer. Conversely, the larger V is, the lower the reputation value of the prosumer. If the power provided by the prosumer cannot be fully absorbed via internal P2P, the unconsumed part will be absorbed by the grid company to achieve a balance between supply and demand.
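The deviation indicator can be illustrated with a short sketch. It follows the verbal definition of V as a relative deviation; taking the absolute value (so that over- and under-delivery are treated alike) and the function name `power_deviation` are assumptions of this example.

```python
def power_deviation(p_actual: float, p_planned: float) -> float:
    """Relative deviation V between actual and planned P2P trading power.

    Assumption: the deviation is taken in absolute value, so selling
    (negative power) and purchasing (positive power) are handled alike.
    """
    if p_planned == 0:
        raise ValueError("planned trading power must be non-zero")
    return abs(p_actual - p_planned) / abs(p_planned)
```

A prosumer that planned to trade 10 units and delivered 9 would have V = 0.1 under this reading.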

There is a difference between the planned and actual trading power in P2P transactions, and the excess wind and photovoltaic power of incomplete transactions must be absorbed by the distribution network. Therefore, the prosumer unconsumed rate φ can be introduced to describe the two consumption situations of renewable energy in the P2P market. φ is the product of the ratio of the remaining power to the total power of the market and the self-absorption coefficient of the prosumer. The surplus trading power of the prosumer is the difference between the P2P trading power and the contributed trading power,

where EP is the surplus power of the seller/buyer after a transaction is completed, which re-enters the market as this prosumer’s trading power in a new round with other prosumers, and λ is the unconsumed proportion of renewable energy. It is the ratio of the actual consumed power to renewable energy generation. Owing to the high daily fluctuation of wind power and PV units, the unconsumed power also changes significantly. Therefore, λ can be used to measure the prosumer’s self-absorption ability.

where Psum is the total power of all prosumers participating in this market transaction (assume that there are n prosumers entering the P2P market at time t to participate in the transaction). When Pactu is positive, the wind and PV generation power are fully absorbed, and the prosumer purchases power. When Pactu is negative, the prosumer sells power. The larger λ is, the higher the unconsumed ratio. The larger the unconsumed proportion EP/Psum of renewable energy in the second market absorption, the higher the unconsumed degree of renewable energy. Therefore, the larger φ is, the lower the reputation.
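As a small illustration, the unconsumed rate can be computed directly from its verbal definition as the product of λ and EP/Psum. The function and parameter names are hypothetical, and λ is treated as a given input because its own formula is not restated here.

```python
def unconsumed_rate(surplus_power: float,
                    total_market_power: float,
                    lam: float) -> float:
    """Unconsumed rate phi = lam * EP / Psum (sketch of the verbal definition).

    surplus_power      -- EP, power left over after the P2P transaction
    total_market_power -- Psum, total power of all participating prosumers
    lam                -- lambda, the prosumer's self-absorption coefficient
    """
    if total_market_power <= 0:
        raise ValueError("total market power must be positive")
    # Assumption: the magnitude of the surplus is used, whatever its sign.
    return lam * abs(surplus_power) / total_market_power
```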

The P2P transaction power deviation V and prosumer unconsumed rate φ update once after every transaction. According to V and φ, the intermediate reputation value of prosumer i during time t is the weighted sum of the predicted power deviation and the unconsumed rate of the prosumer, as follows:

where Vti is the predicted power deviation of prosumer i at trading period t, φit is the unconsumed rate of prosumer i at trading period t, and η and ξ are the weights of the two indicators, representing the importance of the planned power deviation and the wind power and PV absorption situation.

The incentive role that reputation and penalty factors play in the trading market can be shown by daily changes in η. Considering the influence of the wind power and PV absorption situation, the day deviation x can be used as the intermediate variable to solve for η. Here, x is the ratio of the total amount of P2P transactions to the actual total amount of transactions in the total time period. The larger x is, the higher the degree of P2P participation of prosumers. η is determined according to the range of x values. The expression for x is as follows:

η can be obtained with x:

Furthermore, η can be defined by the range statistics obtained using Method 1 of the case study, as well as μ in equation (23).The reason for this is explained in detail in Section 4.1.

Prosumers only calculate the intermediate reputation value Ct once during every trading period. The reputation value Ri of prosumer i is the mean value with the extreme values of the intermediate reputation value removed. That is, the average value of the intermediate reputation values should be calculated with the maximum and minimum intermediate reputation values in T removed, as shown in equation (22). The penalty coefficient μ is determined according to the interval of the reputation value, as shown in equation (23).

where T=24 is the time of day during which transactions are made, Ct is the intermediate reputation value when any t∈T, Cmin is the minimum intermediate reputation value, and Cmax is the maximum intermediate reputation value in T. The lower the prosumer’s reputation value, the higher the penalty coefficient. If the penalty coefficient is negative, the reputation of this prosumer is good, and there will be incentives for this prosumer. Furthermore, prosumers can be incentivized to improve performance by improving the reputation.
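The daily reputation calculation can be sketched as follows. The weighted sum and the trimmed mean follow the verbal definitions in this section. The interval table `PENALTY_TABLE` is purely hypothetical: the only property taken from the text is that a reputation value between 0 and 0.1 yields a negative coefficient (Section 3.2), and the remaining thresholds and magnitudes are placeholders.

```python
def intermediate_reputation(v: float, phi: float, eta: float, xi: float) -> float:
    """Intermediate reputation value C_t as a weighted sum of V and phi."""
    return eta * v + xi * phi


def daily_reputation(c_values):
    """Mean of the intermediate values with one minimum and one maximum removed."""
    if len(c_values) < 3:
        raise ValueError("need at least three trading periods")
    trimmed_sum = sum(c_values) - min(c_values) - max(c_values)
    return trimmed_sum / (len(c_values) - 2)


# Hypothetical interval table (upper bound of reputation value, coefficient mu).
# Only the sign in the first interval is taken from the text.
PENALTY_TABLE = [(0.1, -0.05), (0.3, 0.05), (float("inf"), 0.10)]


def penalty_coefficient(reputation: float) -> float:
    """Look up mu from the interval that contains the reputation value."""
    for upper, mu in PENALTY_TABLE:
        if reputation <= upper:
            return mu
    raise ValueError("reputation value outside the table")
```

With T=24, `daily_reputation` would receive the 24 hourly values of Ct and average the 22 that remain after trimming.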

3.2 Non-cooperative game based on the reputation value

After introducing the reputation incentive mechanism, the cost function of the prosumer can be changed via the penalty coefficient. Based on the objective function (3) of the prosumer, the payment function Cmf and trading price cmid in the objective function are changed via the penalty coefficient. Cq(.) and Ceq(.) from the objective function are still the inequality and equality constraints, respectively, and can be used in the P2P game model of multi-agent prosumers considering the reputation value. However, a P2P intermediate price constraint must be added to the constraints because the trading power of every prosumer will change with historical reputation. The changes are as follows:

Specific changes in the payment/benefit function Cmf are reflected in the P2P and grid transaction costs. 1) The penalty coefficient acts on the purchase and sales price and changes the P2P transaction cost by changing the trading price cmid,t. 2) The grid transaction cost Cp2g is changed in the form of (1+μ). The specific change trends are as follows:

(1) The change trend of the P2P transaction cost Cp2p: According to equation (23), on the seller’s side, reputation is positively correlated with priority. The smaller the penalty coefficient μ, the lower the sales price, and the lower the cost of the P2P transaction of the prosumer. Conversely, on the buyer’s side, reputation is negatively correlated with priority. The smaller the penalty coefficient μ, the higher the purchase price, and the higher the cost of the P2P transaction of the prosumer. The price is influenced through the reputation value, which in turn affects the priority of prosumers in P2P transactions. This standardizes the market behavior and encourages prosumers to change their trading behavior to improve their reputation.

(2) The change trend of the grid transaction cost Cp2g: When the reputation is on a scale of 0 to 0.1, the value of the penalty coefficient is negative, and the prosumer’s grid transaction cost for the next day decreases. When the reputation is in the other range, the value of the penalty coefficient is positive, which causes the grid transaction cost of prosumers to increase.
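The effect of the (1+μ) factor on the grid transaction cost can be shown with a one-line sketch. The function name is hypothetical, and the numerical values of μ used in the example are placeholders rather than values from equation (23).

```python
def grid_cost_with_reputation(base_grid_cost: float, mu: float) -> float:
    """Scale the prosumer's grid transaction cost by (1 + mu).

    A negative mu (good reputation) lowers the next-day cost;
    a positive mu (poor reputation) raises it.
    """
    return (1.0 + mu) * base_grid_cost
```

For a base cost of 100, a placeholder coefficient of -0.05 gives roughly 95, while +0.10 gives roughly 110.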

The grid company acts as the mechanism supervisor, and its expenditure is related to the prosumer reputation changes when participating in P2P. The grid costs with or without considering the reputation are shown below.

(1) Without considering reputation:

The grid company is responsible for absorbing the unconsumed power and providing the power shortage in the first round of P2P transactions.The grid cost Cgrid is the difference between the sales and purchase power from the P2P market, which is expressed as

To ensure the interests of the grid company, the minimum purchase price is used to acquire unconsumed power and sell it at the maximum sales price to fill the gap.

(2) Considering reputation:

Owing to the existence of the penalty coefficient, except for transactions with certain prosumers, the penalty part of prosumers becomes the compensation of the grid company. The grid cost Cgrid is the difference between the sales and purchase power from the P2P market and the amount of compensation to the grid company by prosumers due to low reputation, which is represented as

The P2P transaction process considering reputation changes accordingly. The specific process is shown in Fig.2. This procedure covers the entire process of P2P transactions, including the prosumer game procedure as the circular game part, that is, prosumer traversal, transaction matching, and P2P internal traversal.

As shown in Fig.2, when a prosumer enters the P2P transaction platform, it sorts, matches, and trades via the transaction mechanism. The P2P transaction power deviation V and prosumer unconsumed rate φ are calculated only on the first transaction in each period. Then, the intermediate reputation value is determined. After the prosumer’s reputation value for the day is determined, the penalty coefficient is obtained and used as the cost impact factor for the next day.

3.3 Solving method in the P2P game model

A game with reputation is still considered a complete information game. The Nikaido-Isoda function and the optimal response function can be constructed for numerical iteration, and the Nash equilibrium problem is transformed into a nonlinear optimization problem to obtain the Nash equilibrium solution [24].

The Nash equilibrium solution is obtained using a relaxation algorithm. In theory, this algorithm makes it possible to search for the Nash equilibrium solution under nonlinear conditions; in practice, it can be widely used in the power market and other important fields, extending the practical applications of Nash equilibrium.

Fig.2 P2P Transaction flowchart considering the reputation value

The Nikaido-Isoda function represents the sum of the income changes of all prosumers participating in P2P transactions. The benefit change of prosumer i is the change that occurs when prosumer i modifies its strategy from pi to yi while the strategies of the other prosumers remain the same.
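For reference, a commonly used general form of the Nikaido-Isoda function for an N-player game is given here. The notation is an assumption made for illustration: c_i is the benefit of prosumer i, p_{-i} collects the strategies of all other prosumers, and N is the number of prosumers. It follows the benefit convention of the description above and is not a reconstruction of the paper's own numbered equation.

```latex
\Psi(p, y) = \sum_{i=1}^{N} \left[ c_i(y_i \mid p_{-i}) - c_i(p_i \mid p_{-i}) \right]
```

Each summand is the benefit change of prosumer i from deviating unilaterally to y_i. Under this convention, p is a Nash equilibrium exactly when \Psi(p, y) \le 0 for every feasible y, because no prosumer can then gain by changing only its own strategy.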

To construct the Nikaido-Isoda function, the payment function is divided into Cmf(xi), which changes with the prosumer’s strategy, and the constant unit operation and maintenance cost ei. The prosumer payment function φi(pi) is shown below. Cmf(xi) includes the P2P and grid transaction costs. The P2P transaction cost is the sum, over the period T, of the product of the trading price with other prosumers and the supply and demand function. The grid transaction cost is the product of the on-grid price and surplus market power.

where μi is the penalty coefficient of prosumer i, pi is the output of prosumer i at time t, and pload and p'load are the electricity turnover of the buyer and seller, respectively, which are functions of the purchase price cbuy and sales price csell.

In P2P transactions, a buyer with higher priority offers a higher purchase price, whereas a seller with higher priority offers a lower sales price. Therefore, the electricity turnover is a strictly increasing function of cbuy for the buyer and a strictly decreasing function of csell for the seller. The supply and demand function in period t can be represented as the sum of the supply and demand expectation and a price-elasticity term for the purchase and sales prices, as shown in equations (29) and (30).

where and are the supply and demand expectation in period t, respectively, and α and β are the elasticity of supply and demand with respect to the purchase and sales prices, respectively. The purchase and sales prices can be represented as

Next, we construct the Nikaido-Isoda function for the sum of the income changes of prosumers, taking the participation of three prosumers in the P2P market as an example. The function is expressed as the sum of the income of each prosumer when each prosumer either changes only its own strategy or leaves its strategy unchanged.

where ci(yip) represents the benefit of prosumer i when only its own strategy changes, ci(pi) represents the benefit of prosumer i when no prosumer changes strategy, pi is the strategy of prosumer i before a change, pj and p'j are the unchanged strategies of prosumer j (j≠i) as the buyer and seller, respectively, yi is the strategy of prosumer i after a change, is the elasticity of the supply and demand of prosumer i’s unchanged strategy, , are the elasticities of the supply and demand of the unchanged strategies of prosumer j (j≠i) as the buyer and seller, respectively, and y0 is the elasticity of the supply and demand of prosumer i after changing the strategy.

Because the resulting Nikaido-Isoda function ψ(pi, yi) contains the positive square term of the variable pi and the negative square term of yi, equation (31) is a weakly convex-concave function.

Subsequently, the optimal response function of the prosumer can be constructed, which is shown in equation (34).

To find the Nash equilibrium point, a relaxation algorithm is used to solve the optimal response function of the game. The algorithm can converge to the Nash equilibrium for both static and dynamic games. As shown in equation (26), the payment/benefit function of the prosumer is known, and the optimal solution can be found directly via the relaxation algorithm. However, the existence of the Nash equilibrium solution must be established before solving the optimal response function of the prosumer. The decision formula Q(p, y) is the difference between the sum of the change in the original strategy return and the sum of the change in the changed strategy return, which is presented in equation (35). By applying the definiteness test via equation (31), μi is found to be less than 2. When the value of β is positive, equation (35) is strictly definite. Then, the basic convergence condition is satisfied, and the Nash equilibrium solution exists.

According to the relaxation algorithm and the game, an appropriate step size is chosen as the convergence condition. When the change in the step size is sufficiently small, the iteration stops, and the Nash equilibrium solution is obtained [25].
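The iteration described above can be illustrated with a small runnable sketch. It does not use the paper's payment function: the three-player Cournot-style benefit, its parameters `a`, `b`, and `c`, the fixed relaxation weight `alpha = 0.5`, and the stopping rule based on the largest strategy change are all assumptions chosen so that the optimal response has a closed form.

```python
def payoff(i, x, a=10.0, b=1.0, c=1.0):
    """Toy Cournot-style benefit of player i (an assumption, not the paper's model)."""
    return x[i] * (a - b * sum(x)) - c * x[i]


def best_response(i, x, a=10.0, b=1.0, c=1.0):
    """Closed-form maximizer of payoff(i, .) with the other strategies fixed."""
    others = sum(x) - x[i]
    return max(0.0, (a - c - b * others) / (2.0 * b))


def nikaido_isoda(x, y):
    """Sum of unilateral benefit changes when each player i moves from x[i] to y[i]."""
    total = 0.0
    for i in range(len(x)):
        deviated = list(x)
        deviated[i] = y[i]
        total += payoff(i, deviated) - payoff(i, x)
    return total


def relaxation(x0, alpha=0.5, tol=1e-8, max_iter=1000):
    """Iterate x <- (1 - alpha) * x + alpha * Z(x) until the update is small."""
    x = list(x0)
    for iteration in range(1, max_iter + 1):
        z = [best_response(i, x) for i in range(len(x))]  # optimal response Z(x)
        new_x = [(1.0 - alpha) * xi + alpha * zi for xi, zi in zip(x, z)]
        change = max(abs(n - o) for n, o in zip(new_x, x))
        x = new_x
        if change < tol:
            return x, iteration
    raise RuntimeError("relaxation did not converge")


equilibrium, steps = relaxation([0.0, 1.0, 5.0])
z_at_eq = [best_response(i, equilibrium) for i in range(3)]
print(equilibrium, steps, nikaido_isoda(equilibrium, z_at_eq))
```

In this toy game the equilibrium is (a - c) / (4b) for every player, and the Nikaido-Isoda value evaluated at the optimal response approaches zero there. Plain best-response iteration (`alpha = 1`) would oscillate in this example, which is the motivation for averaging the current strategy with the optimal response.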

4 Case studies

In this section, a P2P transaction simulation considering reputation was conducted on multi-agent prosumers using the modified IEEE 33-node and IEEE 123-node test systems to test the accuracy and feasibility of the distributed energy P2P transaction mechanism and multi-agent game model considering the reputation incentive.

4.1 IEEE 33-node test system

The modified IEEE 33-node test system was taken as a case study. Different amounts of wind and photovoltaic power were installed at nodes 11, 21, and 30 to form Prosumers 1, 2, and 3, as shown in Fig.3. The technical parameters of the prosumers are shown in Table 1. The load volumes of Prosumers 1, 2, and 3 were 462.16 MVA, 380.48 MVA, and 408.48 MVA, respectively.

A one-year transaction simulation was conducted on Prosumers 1, 2, and 3 participating in P2P market transactions, and the following two calculation methods were adopted for the reputation value:

Method 1: Obtain the corresponding penalty coefficient μ according to the reputation value of the previous day. This method compiles reputation statistics by day only, and prosumers have high freedom to correct their behavior.

Method 2: Obtain the corresponding penalty coefficient μ according to the reputation value of all known historical statistics. This method takes all previous historical reputation values into account by averaging them.
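The difference between the two methods can be sketched as follows. This is a minimal illustration: the function name and the list-based history are hypothetical, and the mapping from the returned reputation value to the penalty coefficient μ (equation (23) of the paper) is deliberately left out.

```python
def reputation_for_penalty(daily_reputation, method):
    """Return the reputation value used to look up the next day's penalty coefficient.

    daily_reputation: reputation values of past days, oldest first.
    method 1: use only the previous day's value.
    method 2: use the mean of all known historical values.
    """
    if not daily_reputation:
        raise ValueError("at least one past reputation value is required")
    if method == 1:
        return daily_reputation[-1]
    if method == 2:
        return sum(daily_reputation) / len(daily_reputation)
    raise ValueError("method must be 1 or 2")


history = [0.08, -0.02, 0.06, 0.00]  # made-up values, for illustration only
print(reputation_for_penalty(history, 1))
print(reputation_for_penalty(history, 2))
```

Because Method 2 averages over the whole history, a single unusual day moves its result less and less as the history grows, whereas Method 1 follows every daily swing.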

Fig.3 Topology of the IEEE 33-node distribution network

Table 1 Prosumers’technical parameters

With Method 1, because the reputation value and day deviation are both obtained from the previous day, the statistics are also saved as information, and equations (21) and (23) can define η and μ based on the range. In Method 1, the initial reputation value of all prosumers is 0 every day. Because the reputation is obtained from the day before, it has high volatility with an obvious change trend. The change trend of the reputation value of different prosumers in Method 2 is shown in Fig.4. Because the reputation value is an artificially defined variable, it is dimensionless.

Fig.4 Changing trend of the reputation value of prosumers

After a short period of large oscillations and a longer period of small fluctuations, the reputation values of all prosumers were maintained within a stable range. Prosumer 1 remained within [-0.02, 0], Prosumer 2 within [0.05, 0.08], and Prosumer 3 within [0, 0.03]. Subsequently, it was verified from the aspects of prosumer costs and grid benefits that keeping the reputation value in a stable range can stably and continuously feed back prosumer behaviors and guide prosumers toward standardized trading behaviors. This is conducive to building a good transaction environment.

Considering the above two reputation value calculation methods, the following three scenes were set for comparison:

Scene 1: The prosumer participates in the traditional multi-agent prosumer P2P transaction without considering the reputation value.

Scene 2: The prosumer participates in the multi-agent prosumer P2P transaction incentivized by the reputation incentive system of Method 1.

Scene 3: The prosumer participates in the multi-agent prosumer P2P transaction incentivized by the reputation incentive system of Method 2.

Owing to the uncertainty of the distributed energy output, ten typical days in the transaction simulation were selected as examples to study the prosumers’ behavioral changes and the grid benefits in the three scenes.

The cost changes of prosumers participating in P2P transactions in the three scenes are shown in Figures 5-7. Compared to Scene 1, the costs of Prosumers 1, 2, and 3 fluctuated in the ranges of -17.3% to 8.5%, -17.1% to 9.7%, and -12.7% to 7.6%, respectively, in Scene 2, whereas they fluctuated in the ranges of -13.8% to 4%, -7.3% to 9.8%, and -2.4% to 0.7%, respectively, in Scene 3. Prosumer costs fluctuated less in Scene 3 than in the first two scenes. Therefore, the reputation value can be used as a factor affecting the prosumer’s P2P and grid transaction costs. When the reputation tended to be stable, the fluctuation of the prosumer cost clearly decreased.
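A fluctuation range of this kind can be computed as the smallest and largest percentage change of the daily cost relative to Scene 1. The sketch below assumes that this is how the ranges are defined; the function name and the cost values are made up for illustration and are not data from the paper.

```python
def relative_change_range(baseline_costs, scene_costs):
    """Return (min, max) percentage change of scene costs relative to the baseline."""
    if len(baseline_costs) != len(scene_costs) or not baseline_costs:
        raise ValueError("cost series must be non-empty and of equal length")
    changes = [
        100.0 * (scene - base) / base  # baseline costs are assumed to be non-zero
        for base, scene in zip(baseline_costs, scene_costs)
    ]
    return min(changes), max(changes)


# Made-up daily costs for three typical days; not data from the paper.
scene1 = [100.0, 200.0, 150.0]
scene3 = [90.0, 210.0, 150.0]
low, high = relative_change_range(scene1, scene3)
print(f"{low:.1f}% to {high:.1f}%")
```

A narrower interval between `low` and `high` corresponds to the smaller cost fluctuation reported for the scene with the more stable reputation value.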

Fig.5 Cost of prosumer 1

Fig.6 Cost of prosumer 2

Fig.7 Cost of prosumer 3

Prosumer 1’s reputation value was stable in the range [-0.02, 0] in Scene 3. Compared to Scene 1, the cost decreased significantly on seven of the ten typical days. Although the cost increased on three typical days, the cost in Scene 3 was still lower than that in Scene 2. Prosumer 2’s reputation value was stable in the range [0.05, 0.08] in Scene 3. Compared to Scene 1, although the cost decreased on only two typical days, the cost growth rate of Prosumer 2 remained below 5% on the typical days when the cost increased. Prosumer 3’s reputation value was stable in the range [0, 0.03] in Scene 3. On the typical days when the cost increased, Prosumer 3 could effectively keep the cost fluctuation below 1%. The cost reduction and fluctuation range of all prosumers were affected by reputation, which indicates that the changing trend of prosumer cost is related to reputation. If the reputation value is stable at a negative value, the prosumer has a high reputation, and there are more days on which the cost decreases. If the reputation value is stable at a positive value, there are more days on which the cost increases. Moreover, the closer the reputation value is to 0, the less the behavior is affected by reputation; the cost is then closer to that in Scene 1, and the volatility is small.

The stability of reputation directly affects the stability and fluctuation of the prosumer cost. As an influencing factor, the reputation value affects not only P2P transactions but also grid transactions. Because grid transactions account for a large proportion of the total, the change in the grid transaction cost is the main change, and the factor (1 + μ) determines the changing trend of the cost.
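The role of the factor (1 + μ) can be shown with a short sketch. It assumes that the penalty coefficient simply scales the grid transaction cost; the function name and the numerical values are hypothetical and are not taken from the paper.

```python
def adjusted_grid_cost(base_cost, penalty_coefficient):
    """Scale a grid transaction cost by (1 + mu); mu < 0 lowers it, mu > 0 raises it."""
    return (1.0 + penalty_coefficient) * base_cost


base = 1000.0  # hypothetical daily grid transaction cost
for mu in (-0.05, 0.0, 0.05):  # hypothetical penalty coefficients
    print(mu, adjusted_grid_cost(base, mu))
```

A negative coefficient acts as a discount and a positive one as a surcharge, so the sign of μ sets the direction of the cost trend and its magnitude sets the size of the change.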

Fig.8 shows the total revenue changes of the grid under the different scenes, comparing the results before and after the introduction of the reputation system. In Scene 2, the grid company bought power at a high price and sold power at a low price to balance the P2P supply and demand. In Scenes 2 and 3, prosumers chose to trade with other prosumers rather than with the grid company. Therefore, compared to Scene 1, the grid benefit increased on five typical days, namely days 3, 4, 5, 9, and 10, in Scenes 2 and 3. As trading progressed, the reputation value in Scene 3 gradually tended toward stability within -5.8% to 1.9%. A stable reputation value allows the grid company to regulate market trading behavior and thereby improve the benefits to the grid.

Fig.8 Total revenue of the power grid

4.2 IEEE 123-node test system

The modified IEEE 123-node distribution system was taken as another case study to further verify the applicability of the proposed transaction mechanism and model in large-scale systems. Different numbers of wind power and PV units were installed at nodes 5, 19, 29, 43, 51, 57, 62, 75, 91, and 112 to form Prosumers 1-10, respectively. The topology of the IEEE 123-node distribution network is shown in Fig.9. The relevant technical parameters of every prosumer are shown in Tables 2 and 3. The load volumes of Prosumers 1-10 were 445.84 MVA, 366.88 MVA, 405.76 MVA, 372.16 MVA, 352.32 MVA, 412.08 MVA, 434 MVA, 330.48 MVA, 491.2 MVA, and 456.8 MVA, respectively.

Fig.9 Topology of the IEEE 123-node distribution network

Reputation value calculation Method 2 in Section 4.1 was adopted to simulate P2P transactions for prosumers, and the following two scenes were set for comparison:

Scene 1: The prosumer participates in the traditional multi-agent prosumer P2P transaction without considering the reputation value.

Scene 2: The prosumer participates in the multi-agent prosumer P2P transaction incentivized by the reputation incentive system of Method 2.

The variation trend of the reputation value of every prosumer is shown in Fig.10. Similar to Fig.4, after a period of fluctuation, the reputation values of the prosumers gradually stabilized, with approximately day 250 as the dividing line. Owing to the large number of prosumers, the reputation values in the large-scale system of Section 4.2 stabilized slightly later than those in the small and medium-sized system of Section 4.1. As shown in Figure 10, the reputation values of the prosumers were as follows: those of Prosumers 3, 5, 7, 9, and 10 remained at or above 0.1; those of Prosumers 1, 4, 6, and 8 remained in the range [0, 0.05]; and that of Prosumer 2 was below 0.

Table 2 Technical parameters of prosumers

Table 3 Technical parameters of prosumers

Fig.10 Changing trend of the reputation value of prosumers

To analyze the influence of reputation value on prosumer transaction costs, eight typical days were selected to study the comparative results of the cost changes of the ten prosumers in the two scenes, as shown in Fig.11. The cost increase of Prosumers 3, 5, 7, 9, and 10, whose reputation values were stable at or above 0.1, was more pronounced. The cost of Prosumers 1, 4, 6, and 8, whose reputation values were stable within [0, 0.05], increased slightly. The cost of Prosumer 2, whose reputation value was stable below 0, decreased.

Fig.11 Cost changes of prosumers under different cases

Fig.12 compares the changes in grid benefits under the two scenes. As shown in Fig.12, the grid expenditure in Scene 2 was significantly lower than that in Scene 1. When prosumers with a low reputation accounted for a large proportion, as in this example, the cost of these prosumers generally increased, so the grid company, acting as the system regulator, received a higher income. Conversely, when prosumers with a high reputation accounted for a significant proportion, the grid company offered cost subsidies to them, and the grid’s income decreased.

Fig.12 Total revenue of the power grid

5 Conclusions

This study designed a distributed energy P2P transaction mechanism based on reputation and constructed a related model considering the reputation incentive mechanism. The reputation value was determined using two indexes, the P2P transaction power deviation and the prosumer unconsumed rate, to guide and improve the behavior of prosumers in the form of the reputation penalty coefficient. Multi-agent prosumers in the modified IEEE 33-node and IEEE 123-node systems were considered to analyze and verify the proposed mechanism. The relevant conclusions are as follows:

(1) Compared with the reputation calculation method based on the reputation value of the previous day only, the reputation value calculated using all historical reputation values eventually tended to stabilize, which can better guide prosumers to reasonably participate in P2P market transactions.

(2) The introduction of reputation can affect the prosumer’s cost. The reputation of prosumers can motivate them to change their strategies by influencing their costs, thereby urging them to standardize and improve their own trading behaviors.

(3) The introduction of reputation can affect grid expenditure. As the system regulator, the grid company reduces its total expenditure when prosumers with a lower reputation account for the larger proportion, and it subsidizes the cost of the more reputable prosumers when prosumers with a higher reputation account for the larger proportion, which increases the total expenditure of the grid. A proper introduction of the reputation system can create a win-win situation for both prosumers and the grid company.

This study did not consider the influence of grid security constraints on prosumer P2P transactions. In a subsequent study, the authors will further explore a P2P transaction method that considers the security constraints of the power system under the reputation incentive to improve the entire transaction mechanism.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (U2066211, 52177124, 52107134), the Institute of Electrical Engineering, CAS (E155610101), the DNL Cooperation Fund, CAS (DNL202023), and the Youth Innovation Promotion Association of CAS (2019143).

Declaration of Competing Interest

We declare that we have no conflict of interest.

References

[1] Yang J W, Paudel A, Gooi H B, et al. (2021) A proof-of-stake public blockchain based pricing scheme for peer-to-peer energy trading. Applied Energy, 298: 117154

[2] Mengelkamp E, Gärttner J, Rock K, et al. (2018) Designing microgrid energy markets. Applied Energy, 210: 870-880

[3] Tushar W, Saha T K, Yuen C, et al. (2020) Peer-to-peer trading in electricity networks: an overview. IEEE Transactions on Smart Grid, 11(4): 3185-3200

[4] Alvaro-Hermana R, Fraile-Ardanuy J, Zufiria P J, et al. (2016) Peer to peer energy trading with electric vehicles. IEEE Intelligent Transportation Systems Magazine, 8(3): 33-44

[5] Si F Y, Wang J K, Han Y H, et al. (2018) Cost-efficient multi-energy management with flexible complementarity strategy for energy internet. Applied Energy, 231: 803-815

[6] Kang J W, Yu R, Huang X M, et al. (2017) Enabling localized peer-to-peer electricity trading among plug-in hybrid electric vehicles using consortium blockchains. IEEE Transactions on Industrial Informatics, 13(6): 3154-3164

[7] Wang S, Taha A F, Wang J H, et al. (2019) Energy crowdsourcing and peer-to-peer energy trading in blockchain-enabled smart grids. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 1-12

[8] Tushar W, Saha T K, Yuen C, et al. (2019) A motivational game-theoretic approach for peer-to-peer energy trading in the smart grid. Applied Energy, 243: 10-20

[9] Noor S, Yang W T, Guo M, et al. (2018) Energy demand side management within micro-grid networks enhanced by blockchain. Applied Energy, 228: 1385-1398

[10] Peng H, Zhao D D, Han J M, et al. (2014) Reputation evaluation algorithm based on transitive mode of reputation optimization in P2P system. Journal of Shandong University (Natural Science), 49(9): 97-102, 108

[11] Qi B, Xia Y, Li B, et al. (2019) Photovoltaic trading mechanism design based on blockchain-based incentive mechanism. Automation of Electric Power Systems, 43(9): 132-139, 153

[12] Jin K Y, Yang J H, Chen Z, et al. (2021) Blockchain-based transaction model of distributed photovoltaic generation for local power consumption. Electric Power, 54(5): 7-16

[13] Alskaif T, Luna A C, Zapata M G, et al. (2017) Reputation-based joint scheduling of households appliances and storage in a microgrid with a shared battery. Energy and Buildings, 138: 228-239

[14] Zhang X H, Song Z L, Moshayedi A J (2022) Security scheduling and transaction mechanism of virtual power plants based on dual blockchains. Journal of Cloud Computing, 11(1)

[15] Janko S, Johnson N G (2020) Reputation-based competitive pricing negotiation and power trading for grid-connected microgrid networks. Applied Energy, 277: 115598

[16] Zhang L B, Tong L M, Zhou D Q, et al. (2021) Weakly-centralized electricity transaction mechanism of park energy internet based on blockchain. Guangdong Electric Power, 34(2): 1-9

[17] Dou X B, Cao S J, Liu Z H, et al. (2019) Trading mechanism, model and technical realization of weakly-centralized distribution network market. Automation of Electric Power Systems, 43(12): 104-112

[18] Gao C W, Tong G G (2018) Application analysis of blockchain technology in market-based transaction of distributed generation. Demand Side Management, 20(4): 1-4, 15

[19] Li G, Zhao L Y, Guan X, et al. (2021) Security transaction mechanism of energy blockchain applying game strategy. Electric Power Construction, 42(12): 127-135

[20] Cao J Q, Li S X, Fan B, et al. (2017) Blockchain based energy trading in energy internet. Electric Power Construction, 38(9): 24-31

[21] Abuduwayiti X W, Lv H P, Chao Q (2022) Optimal capacity configuration of wind-photovoltaic-hydrogen microgrid based on non-cooperative game theory. Jiangsu Electrical Engineering, 41(02): 110-118

[22] Jia H, Peng J Q, Li N P, et al. (2021) Optimization and economic analysis of distributed photovoltaic-energy storage system under dynamic electricity price. Acta Energiae Solaris Sinica, 42(5): 187-193

[23] Chao L, Wu J, Zhang C, et al. (2017) Peer-to-peer energy trading in a community microgrid. IEEE Power & Energy Society General Meeting

[24] Kong G W (2008) Algorithm and application of Nash equilibrium and general equilibrium under weak convex-concave condition. Shanghai: Fudan University

[25] Li X S, Yang X Y (2020) Optimization dispatching for joint operation of hydrogen storage-wind power and cascade hydropower station based on bidirectional electricity price compensation. Power System Technology, 44(09): 3297-3306

[26] Li M C, Wang J L, Lu K, et al. (2016) A novel reputation management mechanism with forgiveness in P2P file sharing networks. Procedia Computer Science, 94: 360-365

[27] Meng X F (2018) speedTrust: a super peer-guaranteed trust model in hybrid P2P networks. The Journal of Supercomputing, 74(6): 2553-2580

[28] Hu J L, Zhou B, Wu Q Y (2011) Research on trust management with incentive mechanism in P2P network. Journal on Communications, (05): 22-32

[29] Qin Z G, Yang Y, Yang L, et al. (2013) Using push-pull mode to achieve reputation system in P2P networks. Computer Engineering and Applications, 49(5): 88-92

[30] Ma R F (2019) Super node selection algorithm combining reputation and capability model in P2P streaming media network. Personal and Ubiquitous Computing

[31] Meng X F, Wang D (2010) Collaboration incentive reputation model based on repeated game theory and punishment mechanism in P2P networks. Journal of Computer-Aided Design & Computer Graphics, 22(05): 886-893

[32] Chen Y X, Wen Y, Zhao C, et al. (2023) Decentralized trading mechanism in photovoltaic distribution network based on blockchain. Journal of Electrical Engineering, 1-9

Received: 9 January 2023/ Accepted: 3 April 2023/ Published: 25 June 2023

Hao Xiao

xiaohao09@mail.iee.ac.cn

Tao Jiang

electricpowersys@163.com

Ting Hua

huatingNEEPU@163.com

Linbo Fu

fulinbo123@aliyun.com

Wei Pei

peiwei@mail.iee.ac.cn

Tengfei Ma

flytengma@mail.iee.ac.cn

2096-5117/© 2023 Global Energy Interconnection Development and Cooperation Organization. Production and hosting by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Biographies

Tao Jiang (Senior Member, IEEE) received B.S. and M.S. degrees in electrical engineering from Northeast Electric Power University, Jilin, China, in 2006 and 2011, respectively, and a Ph.D. degree in electrical engineering from Tianjin University, Tianjin, China, in 2015. He is currently a Professor with the Department of Electrical Engineering, Northeast Electric Power University. From 2014 to 2015, he was a Visiting Scholar with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, USA. From October 2018 to October 2019, he was a Visiting Scholar with the Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA. His research interests include power system stability analysis and control, renewable energy integration, demand response, and smart grids.

Ting Hua is currently pursuing a master’s degree in electrical engineering at Northeast Electric Power University, Jilin, China. Her research interests include distribution market transactions and P2P transactions.

Hao Xiao (Member, IEEE) received a B.S. degree from Huazhong University of Science and Technology, Wuhan, China, in 2009, and a Ph.D. degree in electrical engineering from the Chinese Academy of Sciences, Beijing, China, in 2015. He is currently an Associate Professor with the Institute of Electrical Engineering, Chinese Academy of Sciences. His research interests include optimal operation and planning of power systems and the application of artificial intelligence in power systems.

Linbo Fu received a master’s degree in electrical engineering from Northeast Electric Power University, Jilin, China, in 2018, where she is currently pursuing a Ph.D. degree. Her research interests include distribution market operation and transaction.

Wei Pei (Member, IEEE) received B.S. and M.S. degrees in electrical engineering from Tianjin University, Tianjin, China, in 2002 and 2005, respectively, and a Ph.D. degree in electrical engineering from the Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing, China, in 2008. He is currently a Professor and the Director of the Distributed Generation and Power System Research Group, Institute of Electrical Engineering, Chinese Academy of Sciences. His research interests include the impact of the integration of renewable energy sources on the electricity grid and the development of the distribution grid for large-scale renewable integration, active distribution networks, and AC/DC microgrids. During the past 10 years, he has been the principal investigator of several projects, funded mostly by the National Key Research and Development Program of China and the National Natural Science Foundation of China, which were mostly related to new-generation information technology and smart power systems. He has published over 160 papers in peer-reviewed journals and international conferences. He is an Associate Editor of IET Smart Grid and IET Energy Systems Integration and a Young Editor of the CSEE Journal of Power and Energy Systems, Power System Technology, and High Voltage Engineering.

Tengfei Ma received a Ph.D. degree from Beijing Jiaotong University in 2019. He was a postdoctoral researcher at the Institute of Electrical Engineering, Chinese Academy of Sciences, from 2019 to 2021. He was a joint-training doctoral student at the University of Texas at Arlington, USA, from 2017 to 2018. He is now a research assistant at the Institute of Electrical Engineering, Chinese Academy of Sciences. He has published more than ten journal papers and led or participated in more than five National Natural Science Foundation projects. His research interests include the modeling, operation, and planning of integrated energy systems, and engineering game theory in power systems.

(Editor Yanbo Wang)
