Distributed Learning Approach for Channel Selection in Cognitive Radio Networks
Chowdhury Sayeed Hyder and Li Xiao
Department of Computer Science & Engineering, Michigan State University, East Lansing, Michigan 48823
Email: {hydercho, lxiao}@cse.msu.edu

I. INTRODUCTION

Cognitive Radio Network (CRN) is an emerging networking concept introduced to solve the spectrum scarcity problem. There are typically two kinds of users in a CRN: primary users (PUs) and secondary users (SUs). SUs can opportunistically use licensed channels without interfering with the ongoing transmissions of PUs. Successful transmission depends mostly on the selection of idle channels by SUs; hence, channel selection under these constraints is a significant research problem. Accordingly, many researchers have developed channel selection algorithms based on variations of PU channel usage models (e.g., Markov models) or have adopted learning algorithms. However, in real scenarios, SUs do not know the activity of PUs in advance, and channel usage by PUs may not exactly follow any mathematical model. On the other hand, existing learning algorithms (e.g., [1], [2]) need either a central base station or message passing between neighbors for cooperation and collision avoidance. Due to the unique characteristics of CRNs, a centralized solution may not always be feasible, and message passing among nodes introduces excessive overhead. The learning algorithm proposed in [3] overcomes these problems; in that approach, channels are ranked according to g-statistics. Although this algorithm achieves logarithmic regret, its sum regret is still high and it does not scale well as the number of users increases. As a result, performance degrades when the ratio between the number of users and the number of channels exceeds a certain threshold. Moreover, the delay and packet loss introduced by channel switching may degrade the performance of the overall network; in a CRN, this effect can be even more pronounced.
Since every learning algorithm involves exploration and exploitation, ignoring the channel switching cost may radically impact performance. Furthermore, the work in [3] did not consider dynamic channel idle rates. In real life, primary users' activity (e.g., that of cell phone users) varies with time, so channel idle rates also vary accordingly. Our algorithm adapts to such a realistically changing environment and achieves the same logarithmic regret. In this paper, we address the channel selection problem with switching cost and propose a distributed learning approach that
978-1-4577-0103-0/11/$26.00 © 2011 IEEE

minimizes the sum regret while ensuring quick convergence to an optimal solution and logarithmic regret. Our algorithm is adaptive in the sense that it adapts to the changing idle status of channels and achieves logarithmic regret even in a dynamic environment. The experimental results show that our algorithm outperforms the algorithm in [3] in terms of regret, scalability, and channel switching cost.

II. PROBLEM FORMULATION

We consider a slotted network where every SU has only one radio and cannot sense and transmit at the same time. So, each SU goes through a cycle of sensing a channel and transmitting data if it is free. Consider a network with $U \geq 1$ secondary users and $C \geq U$ available channels. We assume that each channel $i$ has a fixed probability $\mu_i$ of being idle (i.e., not being occupied by any PU). Every secondary user can sense only one channel in each transmission slot and transmits if the channel is idle. A successful transmission is counted as a reward; hence, an SU receives a reward if the channel is idle and no other SU transmits over the same channel. We consider a regret bound over the time-averaged throughput (utilization rate) to evaluate the performance of our algorithm, because even a sub-linear regret (with respect to time) implies optimal average throughput [3]. In our setting, we define regret as the expected loss in reward, compared to knowing the channel statistics perfectly, due to not selecting one of the top available channels. Ideally, if the users knew the channel availability, they would select only the top channels. In this case, the expected reward $S(\rho, t)$ will be:
$$S(\rho, t) = t \cdot \sum_{i=1}^{U} r_i \cdot \mu_i^{*}$$
where $\mu_i^{*}$ denotes the idle probability of the $i$-th ranked channel (in terms of availability) and $r_i$ denotes the data rate of the $i$-th channel. For simplicity, we assume the data rates of all channels are equal and commonly denote them by $r$. Any learning approach $\hat{\rho} \neq \rho$ tries to select a channel with higher availability through exploration. Let $S(\hat{\rho}, t)$ denote the expected reward (i.e., the number of successful data transmissions) under a learning approach $\hat{\rho}$. Clearly, $S(\hat{\rho}, t) \leq S(\rho, t)$.
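To make these quantities concrete, the following is a minimal Python sketch (the idle probabilities, horizon, and achieved reward below are hypothetical) of the ideal expected reward of users who always hold the top-$U$ channels, and of the regret of a learning approach relative to it, assuming the equal data rate $r$ introduced above:

```python
# Sketch: ideal expected reward S(rho, t) of users who always occupy the
# top-U of C channels, and the regret of a learning approach relative to it.
# The idle probabilities and horizon below are illustrative only.

def genie_reward(idle_probs, num_users, t, r=1.0):
    """S(rho, t) = t * sum over the top-U channels of r * mu_i*."""
    top = sorted(idle_probs, reverse=True)[:num_users]
    return t * sum(r * mu for mu in top)

def regret(idle_probs, num_users, t, achieved_reward, r=1.0):
    """Expected loss S(rho, t) - S(rho_hat, t) of a learning approach."""
    return genie_reward(idle_probs, num_users, t, r) - achieved_reward

mu = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical channel idle rates
ideal = genie_reward(mu, num_users=2, t=1000)
print(ideal, regret(mu, 2, 1000, achieved_reward=1500.0))
```

Under this baseline, a regret that grows sub-linearly in $t$ means the time-averaged throughput converges to the optimum.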

The expected regret $R(\hat{\rho}, t)$ can be determined using the following equation:

$$R(\hat{\rho}, t) = S(\rho, t) - S(\hat{\rho}, t) + \Gamma(\hat{\rho}, t) \qquad (1)$$

where $S(\rho, t) - S(\hat{\rho}, t)$ and $\Gamma(\hat{\rho}, t)$ denote the expected exploration regret and the channel switching regret, respectively.

1) Calculation of $\Gamma(\hat{\rho}, t)$: Let $\gamma(t)$ denote the total number of switches up to time $t$, $\gamma(t) = \sum_{j=1}^{U} \gamma_j(t)$, where $\gamma_j(t)$ denotes the number of switches by user $j$:

$$\gamma_j(t) = \sum_{i=1}^{t-1} F(a_j(i), a_j(i+1))$$
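The sum above simply counts slot-to-slot channel changes. A minimal Python sketch (the slot-by-slot channel sequence is hypothetical):

```python
# Sketch: gamma_j(t), the number of channel switches made by user j,
# given a_j(i) = channel chosen by user j in slot i.

def switch_count(choices):
    """Count slots in which the chosen channel differs from the previous slot."""
    return sum(1 for a, b in zip(choices, choices[1:]) if a != b)

a_j = [3, 3, 1, 1, 1, 2, 2, 3]   # hypothetical choices over 8 slots
print(switch_count(a_j))          # three switches: 3->1, 1->2, 2->3
```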

where

$$F(a, b) = \begin{cases} 1, & a \neq b \\ 0, & \text{otherwise.} \end{cases}$$

Accordingly, we define the switching cost as the number of packets that could have been transmitted during the time spent switching channels. The switching cost defined in [4] is

$$\text{switching cost} = \frac{\text{switching delay}}{\text{estimated packet transmission time}}$$

Assuming packet size $d$ and switching delay $\tau$, we formally define the switching cost $c$ as

$$c = \frac{\tau \cdot r}{d} \qquad (2)$$

2) Calculation of $S(\hat{\rho}, t)$: In a distributed approach, more than one SU may select the same channel, in which case no user receives any reward. The number of collisions depends on the underlying collision mechanism. Assuming the collision channel model with no avoidance mechanism, the expected reward under a distributed approach $\hat{\rho}_{dist}$, as defined in [3], is
$$S(\hat{\rho}_{dist}, t) = r \cdot \sum_{i=1}^{C} \sum_{j=1}^{U} \mu_i \, E[\Theta_{i,j}(t)] \qquad (3)$$

where $\Theta_{i,j}(t)$ denotes the number of times user $j$ has sole access to channel $i$ up to transmission slot $t$. The expected regret of the distributed approach is then

$$R(\hat{\rho}_{dist}, t) = r \cdot t \cdot \sum_{i=1}^{U} \mu_i^{*} - r \cdot \sum_{i=1}^{C} \sum_{j=1}^{U} \mu_i \, E[\Theta_{i,j}(t)] + \gamma(t) \cdot c \qquad (4)$$

III. DISTRIBUTED LEARNING ALGORITHM

We formulate the channel selection problem as a multi-armed bandit problem with multiple plays and switching cost, and develop a distributed block learning algorithm ($\rho_{DBLA}$).

A. Design of the Algorithm

At the beginning of each interval, the user picks a channel and compares it with the currently selected one. The channel selection process is periodic, i.e., each channel is selected for comparison once every $C$ blocks. The user then continues to use its channel until a collision occurs; if a collision occurs, it randomly selects a channel among the top $U$ channels based on its current idle estimates. Consider a user whose currently selected channel is $p$ and whose channel to be compared at the end of a block is $q$. The user chooses channel $q$ over $p$ if:
• $p$ is one of the top $U$ channels and $q$ is not, but the upper confidence bound of $q$ is greater than the estimated idle rate of $p$; or
• $q$ is one of the top $U$ channels but $p$ is not.
Each user selects a channel and updates its idle estimate and upper bound in each time slot throughout the time period.

1) Calculation of Block and Frame Size: We divide transmission slots into frames. Each frame (except frame 0, which consists of the first $C$ slots) is then divided into multiple equal-length blocks. The block length increases with the frame number: each block length is set equal to the frame number, and the frame length is set according to

$$l_f = C \cdot 2^{N_f}$$

where $l_f$ and $N_f$ denote the frame length and frame number, respectively.

2) Computation of Upper Confidence Bound of a Channel: We use the interval estimation strategy [5] to compute the upper confidence bound of channels. The strategy works on expected reward estimates of channels and confidence estimates of the data, which in our case is sensing data. It uses the standard statistical technique of constructing confidence interval estimates. If $x_{i,j}(t)$ is the number of successes arising from a series of $\pi_{i,j}(t)$ Bernoulli trials with success probability $p$, the upper bound of a $100(1-\alpha)\%$ confidence interval for $p$ can be approximated by $\beta_{i,j}(t)$:

$$\beta_{i,j}(t) = \frac{\dfrac{x_{i,j}(t)}{\pi_{i,j}(t)} + \dfrac{Z_{\alpha/2}^{2}}{2\,\pi_{i,j}(t)} + \dfrac{Z_{\alpha/2}}{\sqrt{\pi_{i,j}(t)}}\sqrt{\dfrac{x_{i,j}(t)}{\pi_{i,j}(t)}\left(1 - \dfrac{x_{i,j}(t)}{\pi_{i,j}(t)}\right) + \dfrac{Z_{\alpha/2}^{2}}{4\,\pi_{i,j}(t)}}}{1 + \dfrac{Z_{\alpha/2}^{2}}{\pi_{i,j}(t)}} \qquad (5)$$

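Equation 5 is the upper limit of the Wilson score interval. A minimal Python sketch (the counts are illustrative; $x$ is the number of idle observations and $n$ the number of sensing trials):

```python
import math

# Sketch: upper confidence bound beta_{i,j}(t) per Equation 5 (upper limit
# of the Wilson score interval), with x = observed idle slots out of n trials.

def upper_bound(x, n, z=1.96):
    """Upper limit of the 100(1 - alpha)% confidence interval for the idle
    rate; z is Z_{alpha/2} (1.96 for a 2.5% upper tail)."""
    if n == 0:
        return 1.0                       # unexplored channels start at bound 1
    p_hat = x / n
    centre = p_hat + z * z / (2 * n)
    spread = (z / math.sqrt(n)) * math.sqrt(p_hat * (1 - p_hat) + z * z / (4 * n))
    return (centre + spread) / (1 + z * z / n)

# The bound tightens toward the empirical idle rate as sensing data accumulates.
print(upper_bound(8, 10), upper_bound(80, 100))
```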
Here, the parameter $Z_{\alpha/2}$ denotes the value that is exceeded by a standard normal variable with probability $\alpha/2$; it controls the size of the confidence intervals. We follow the evaluation function in Equation 5 to compute the upper bound of each channel. In our algorithm, we set $\alpha/2 = 2.5\%$, for which $Z_{\alpha/2} = 1.96$. In our setting, $x_{i,j}(t)$ represents the number of times user $j$ has observed channel $i$ idle up to transmission slot $t$. Initially, each channel has an upper bound of 1; as channels are explored further, their bounds are adjusted closer to their exact values.

B. Non-stationary Environments

We extend our algorithm so that it can adapt to non-stationary environments. If the idle rate of the current channel in the current frame is not consistent with its overall idle rate, we increase the upper bound of every alternative channel to a certain value. In this way, if the selected channel's status changes abruptly, a user will experience less availability on the current channel. This will lead to an increase in the upper bound

Fig. 1. Normalized Regret vs Time Slot (without channel switching cost)

Fig. 2. Regret vs switching cost after 5000 time slots

Fig. 3. (a) Regret and (b) switches per user with varying number of users

Fig. 4. Regret vs Time in a Dynamic Environment
of alternative channels, and thus the probability of selecting a channel from these alternatives will increase. As soon as a user switches to another channel, it again starts acting accordingly.

IV. SIMULATION RESULTS

In this section, we present simulation results to evaluate the performance of our learning approach. We compare our algorithm $\rho_{DBLA}$ with the learning approach $\rho_{RAND}$ proposed in [3], using an identical simulation environment. To obtain average results, we run the simulation 10 times under the same conditions, and each time our algorithm runs for 50,000 time slots. We also implement a centralized version of our approach, $\rho_{CBLA}$, which provides a lower bound on the regret achievable by SUs.

Performance Comparison with $\rho_{RAND}$: We present the comparison results of our algorithm ($\rho_{DBLA}$) and the algorithm in [3] ($\rho_{RAND}$). The simulation results in Figure 1 show the normalized regret of both approaches with respect to the number of transmission slots, without considering switching cost. Our algorithm ensures faster convergence with less regret than [3]. We also plot our centralized algorithm ($\rho_{CBLA}$) to show the lower bound on regret. We then compare the performance of both algorithms with varying switching cost (see Figure 2). As expected, regret increases almost linearly with switching cost under $\rho_{RAND}$, whereas it increases only slightly under our algorithm $\rho_{DBLA}$.

Performance with Varying Number of Users: We compare $\rho_{RAND}$ and our $\rho_{DBLA}$ in terms of regret and the number of channel switches; the results are presented in Figure 3(a) and Figure 3(b), respectively. In both figures, the rate of increase for $\rho_{RAND}$ is much higher than for $\rho_{DBLA}$. In the former algorithm, collisions between users grow with the number of users, because in every slot each user tries to select the top channel, which may lead to more channel switching and more collisions with other users. Our algorithm performs better because it carefully decides about channel switching only after each block.

Performance in Non-stationary Environments: To simulate a non-stationary environment, we assume 4 secondary users and 9 channels in the network. Initially, the channels' idle rates follow Bernoulli distributions with evenly spaced probabilities from 0.1 to 0.9. We run the simulation for 50,000 slots; after every 10,000 slots, we randomly assign each channel a different probability from 0.1 to 0.9. Figure 4 shows that our algorithm achieves logarithmic regret even under dynamically changing channel states.

V. CONCLUSION & FUTURE WORK

In this paper, we propose an efficient distributed learning algorithm that minimizes the sum regret and channel switching cost, and ensures logarithmic regret. Our algorithm also adapts quickly to a dynamic environment and converges to an optimal solution. Compared with the existing approach, our algorithm achieves significantly lower regret and switching cost. We will extend our work to support users' different QoS requirements and channel bandwidths.

REFERENCES
[1] S. H. A. Ahmad and M. Liu. Multi-channel opportunistic access: A case of restless bandits with multiple plays. In Proc. of the 47th Annual Allerton Conference on Communication, Control, and Computing, pages 1361-1368, Sept. 2009.
[2] Y. Gai, B. Krishnamachari, and R. Jain. Learning multiuser channel allocations in cognitive radio networks: A combinatorial multi-armed bandit formulation. In Proc. of IEEE Symposium on New Frontiers in Dynamic Spectrum, pages 1-9, 2010.
[3] A. Anandkumar, N. Michael, and A. K. Tang. Opportunistic spectrum access with multiple users: Learning under competition. In Proc. of IEEE INFOCOM, San Diego, USA, March 2010.
[4] Y. Xiao and F. Hu. Cognitive Radio Networks. CRC Press, 2008.
[5] L. P. Kaelbling. Learning in Embedded Systems. MIT Press, 1993.

