A distributed algorithm for semantic collectors election in wireless sensors networks


Introduction
Energy consumption is considered a major constraint in Wireless Sensor Networks (WSN) (Vieira et al., 2003; Jinwala et al., 2009; Padmavathy and Chitra, 2010), since without energy a sensor node is essentially useless and cannot contribute to the network as a whole. Hence, the wireless sensor nodes, or "motes", must spend their energy carefully to prolong the network lifetime. A sensor node is driven by electronics, and electronics need power. Despite several other possibilities for power, such as solar cells, thermal energy, piezoelectricity and other forms of energy harvesting (Seah et al., 2009; Sudevalayam and Kulkarni, 2011), the most common power source is still a battery.
In this work, we are interested in extending the lifetime of a WSN with battery-powered nodes. To accomplish this, it is necessary to use efficient energy-saving techniques, among which we highlight clustering (Abbasi and Younis, 2007). The main purpose of clustering is to organize the network into groups (clusters) composed of nodes and a cluster head (CH), which collects the data sensed by the nodes and forwards it to the sink through multi-hop communication. Since a cluster's internal nodes are generally closer to their CH than to the sink, the network saves energy. In traditional clustering algorithms (Heinzelman et al., 2000), the intra-cluster neighbourhood is usually formed using the geographical distance between the nodes or the Received Signal Strength Indicator (RSSI). This kind of clustering is known as physical clustering, in which neighbouring nodes can perform area monitoring even if the sensed data are semantically uncorrelated. In security applications (Bruckner et al., 2008), the nodes capture audio-visual metrics and communicate with their neighbours to provide an overview of the environment. However, nodes placed on opposite sides of the same wall monitor different, semantically uncorrelated areas of the environment. Consequently, sharing information between these nodes wastes resources. The complexity of the problem increases in continuous monitoring applications due to the large volume of semantically uncorrelated data.
1 Master Scholarship (PPGETI/DETI/UFC) sponsored by CAPES.
Semantic clustering, however, does not use physical metrics (RSSI, distance) to group nodes in the same cluster. Instead, the similarity of the data collected by the nodes is used as the classification criterion. Nodes that are semantically related are termed semantic neighbours. Semantic clustering, like the physical one, has two levels of hierarchy. On the first level is the semantic collector, which resembles the CH in physical clustering and is responsible for receiving, processing and sending to the sink node the data collected by the semantic neighbours, which form the second level.
In this paper, we propose an algorithm for electing semantic collectors in a distributed way based on a fuzzy inference engine. Since much of the energy expenditure is concentrated on these nodes, our starting hypothesis is that a careful choice of semantic collectors increases the lifetime of the network. To test it, we started from a previous work by the authors of this article (Rocha et al., 2012) and added a new mechanism for electing semantic collectors.
Our case study involves the structural health monitoring (SHM) domain. In SHM applications, sensor nodes are often embedded in or tightly attached to walls, bridge surfaces or containers of hazardous materials (Tong et al., 2010). In these scenarios, energy is a common constraint, since human intervention or robotic remounting of sensors (Tong et al., 2009) may be too dangerous, costly or technically infeasible.

Related work
Among the various proposals for CH election found in the literature, the first is the LEACH clustering algorithm (Heinzelman et al., 2000). LEACH selects CHs through a probabilistic model, without taking into consideration the nodes' residual energy. A drawback of this algorithm is the possibility that a node with little available energy is chosen. Furthermore, because the choice is random, it is possible that no node, or every node, is elected.
Siew et al. (2011) proposed a system based on fuzzy logic for choosing CHs. The input variables of the system are the residual energy and the RSSI to the sink. However, when distances are fixed, the fuzzy engine will prioritize nodes closer to the sink even if a farther node has more energy. Although that work elects cluster heads, we are interested in the election of semantic collectors, which its authors did not address.
With respect to the election of semantic collectors, the scope of this article, the SEMANTK algorithm proposed by Rocha et al. (2012) takes into account only the number of semantic neighbours in the existing physical clusters during the network's initialization. To understand this aspect, it is worth mentioning that in SEMANTK the network must already be grouped physically (through algorithms that use RSSI or another physical distance parameter) before semantic clusters are formed. The semantic clustering phase is event-driven, and it is initialized when the event defined by the network designer is detected. For example, in a fire detection application, the process of semantic organization is initialized if the temperature exceeds a certain threshold. Thus, upon detection of the event, the network that previously had only physical clusters forms semantic clusters using only the nodes that detect this event (semantic neighbours), even if these nodes are in distinct physical clusters. In SEMANTK, the election of semantic collectors occurs among the CHs that are pre-defined in the initial stage of the algorithm. The semantic collector will be the node that has the highest number of semantic neighbours inside its group. However, this election may also fail if a node with low residual energy and/or too low an RSSI has the largest number of semantic neighbours.
The main contribution of this paper is a new distributed algorithm for WSN that extends SEMANTK (Rocha et al., 2012) to increase the survival of semantic collectors through an efficient election. While in SEMANTK the election of semantic collectors takes into account only the number of semantic neighbours within a physical cluster, our algorithm performs a weighted election based on RSSI and residual energy using a fuzzy inference engine.
The proposal, described in the next section, is evaluated through simulations and compared with the related work discussed in this section.
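For reference, the probabilistic selection in LEACH mentioned above is usually stated as a per-round threshold. The form below is the commonly cited one and is not reproduced from this paper; the symbols $P$ (desired fraction of CHs), $r$ (current round) and $G$ (nodes that have not been CH in the last $1/P$ rounds) are introduced here only for illustration.

```latex
T(n) =
\begin{cases}
\dfrac{P}{1 - P\,\bigl(r \bmod \tfrac{1}{P}\bigr)} & \text{if } n \in G,\\[6pt]
0 & \text{otherwise.}
\end{cases}
```

Each node draws a uniform random number in [0, 1) and becomes CH for the round if the number is below $T(n)$. Residual energy does not appear anywhere in the expression, which is the drawback noted above.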

Proposed work
We present a method for electing semantic collectors in a distributed way based on a fuzzy inference engine, having as inputs the CH's residual energy and the RSSI to the sink. After running the algorithm described in Rocha et al. (2012), all CHs know how many semantic neighbours are in their group, and then the process of electing semantic collectors starts. We propose a fuzzy system (Figure 1) that can infer a balanced and unbiased decision, electing as semantic collectors the nodes best suited to this role, in other words, those that have higher energy levels and higher RSSI to the sink.
Unlike classical set theory, the theory of fuzzy sets (Zadeh, 1988) holds that the membership of a value in a given set can be partial or intermediate. Thus, fuzzy inference systems are able to treat problems that involve uncertain or imprecise information. The proposed fuzzy system combines the input linguistic variables Energy and RSSI to assist in decision making concerning the election of the most suitable semantic collector. As in this work there is no mobility of sensor nodes, their distances do not change, which could favour the choice of the candidates closest to the sink regardless of the nodes' residual energy. To work around this, we assign weights to the Energy variable, so that a large distance (corresponding to a low RSSI) does not exclude the possibility of choosing a given device.
The fuzzy system has two linguistic variables as input: the candidate's residual energy and the signal intensity to the sink. The Energy variable has three sets of values, Low, Average and High, and the RSSI variable also has three sets, Near, Average and Far. Figures 2 and 3 show the Energy and RSSI membership functions, respectively.
The output of this fuzzy system is the "Chance" variable, which represents the probability of a candidate being elected semantic collector. This variable has nine sets of values: Very Small, Small, Little Small, Little Average, Average, Very Average, Little Great, Great and Very Great. Figure 4 shows Chance's membership function.
The universe of discourse for all the fuzzy linguistic variables was defined as the closed interval [0, 100], obtained by normalizing the values of Energy and RSSI. For the geometrical shape, we used a model similar to that proposed by Barolli et al. (2011). The ranges of values for each function were defined empirically. The behaviour of the fuzzy inference engine is modelled by rules that were adjusted experimentally until they reached a suitable arrangement. The rule base has nine rules, each composed of two antecedents, Energy and RSSI, and one consequent, Chance. The arrangement covers every combination of the antecedents, each yielding a distinct consequent (Table 1).
After all CHs know their chances of becoming semantic collectors, they start to exchange messages among themselves in order to determine which CH has the highest chance of being a semantic collector. Each CH verifies, from its point of view, whether one of its neighbours has a higher value of the variable Chance. If so, the CH sends a vote message saying that, from its point of view, that neighbour has a better chance of becoming a semantic collector. Otherwise, it votes for itself. Upon receiving a vote, a CH increments its counter of received votes. The CH that receives the most votes becomes a semantic collector. It is important to note that if two candidates have the same chance, the elected one is the one with the larger residual energy. When there are two or more geographically distant events in the network, semantic collectors are elected for each distinct event (Figure 5).
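As a rough illustration of how an input could be mapped onto three overlapping sets, the sketch below fuzzifies an Energy value assumed to be already normalized to [0, 100]. The triangular and shoulder shapes, the breakpoints 0, 50 and 100, and all identifiers are assumptions made for this example; the membership functions actually used are those of Figures 2 and 3.

```c
/* Membership degrees of a value in the three Energy sets. */
typedef struct {
    double low;
    double average;
    double high;
} energy_degrees_t;

/* Triangular set: 0 outside (a, c), rising to 1 at the peak b. */
static double tri(double x, double a, double b, double c)
{
    if (x <= a || x >= c) return 0.0;
    return (x < b) ? (x - a) / (b - a) : (c - x) / (c - b);
}

/* Left shoulder: 1 up to a, falling linearly to 0 at b. */
static double left_shoulder(double x, double a, double b)
{
    if (x <= a) return 1.0;
    if (x >= b) return 0.0;
    return (b - x) / (b - a);
}

/* Right shoulder: 0 up to a, rising linearly to 1 at b. */
static double right_shoulder(double x, double a, double b)
{
    if (x <= a) return 0.0;
    if (x >= b) return 1.0;
    return (x - a) / (b - a);
}

/* Fuzzify a normalized Energy value.
 * ASSUMED breakpoints (0, 50, 100); not the paper's empirical ranges. */
energy_degrees_t fuzzify_energy(double e)
{
    energy_degrees_t d;
    d.low     = left_shoulder(e, 0.0, 50.0);
    d.average = tri(e, 0.0, 50.0, 100.0);
    d.high    = right_shoulder(e, 50.0, 100.0);
    return d;
}
```

With these assumed breakpoints, a node at 25% energy belongs half to Low and half to Average. The RSSI input could be treated the same way, with Near, Average and Far in place of the Energy sets.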
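To make the structure of such a rule base concrete, the sketch below evaluates nine rules, one per (Energy, RSSI) pair, and combines them into a single Chance value. It is a simplified stand-in and not the engine of this paper: the mapping of rules to consequents is hypothetical (it is not Table 1), each Chance set is reduced to an assumed centre value on [0, 100], AND is taken as the minimum, and the output is a weighted average of centres instead of a defuzzification over the membership functions of Figure 4.

```c
enum { E_LOW, E_AVERAGE, E_HIGH, E_COUNT };   /* Energy sets */
enum { R_FAR, R_AVERAGE, R_NEAR, R_COUNT };   /* RSSI sets   */

/* HYPOTHETICAL rule table: centre of the Chance set fired by each
 * (Energy, RSSI) pair, from Very Small (0) to Very Great (100).
 * Rows are ordered so that Energy dominates RSSI. */
static const double chance_center[E_COUNT][R_COUNT] = {
    /*               Far    Average  Near  */
    /* Low     */ {  0.0,   12.5,   25.0 },
    /* Average */ { 37.5,   50.0,   62.5 },
    /* High    */ { 75.0,   87.5,  100.0 },
};

static double min2(double a, double b)
{
    return a < b ? a : b;
}

/* Combine the membership degrees of both inputs into one Chance value. */
double infer_chance(const double energy_deg[E_COUNT],
                    const double rssi_deg[R_COUNT])
{
    double num = 0.0, den = 0.0;
    for (int e = 0; e < E_COUNT; e++) {
        for (int r = 0; r < R_COUNT; r++) {
            /* Firing strength of the rule "Energy is e AND RSSI is r". */
            double w = min2(energy_deg[e], rssi_deg[r]);
            num += w * chance_center[e][r];
            den += w;
        }
    }
    return den > 0.0 ? num / den : 0.0;  /* no rule fired: chance 0 */
}
```

In this hypothetical table a High-energy node that is Far from the sink still scores above an Average-energy node that is Near, which is one way the priority given to Energy could be expressed.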

Figure 6 shows two flowcharts describing the election procedure. The first one (a) shows how the comparison is made. The second (b) shows what happens when a CH receives a vote.
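The voting procedure can be sketched as follows. This is a centralized simulation under stated assumptions, not the Contiki implementation: every candidate CH is assumed to hear every other one, the message exchange is replaced by direct array access, and the type `ch_t` and the function names are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int    id;      /* hypothetical node identifier          */
    double chance;  /* Chance value from the fuzzy engine    */
    double energy;  /* residual energy, used to break ties   */
    int    votes;   /* counter of received votes             */
} ch_t;

/* Returns 1 when candidate a is preferred over candidate b:
 * higher Chance first, larger residual energy on a tie. */
static int better(const ch_t *a, const ch_t *b)
{
    if (a->chance != b->chance) return a->chance > b->chance;
    return a->energy > b->energy;
}

/* Index of the candidate that CH `self` votes for: a better
 * neighbour if one exists, otherwise itself. */
static size_t choose_vote(const ch_t *chs, size_t n, size_t self)
{
    size_t best = self;
    for (size_t i = 0; i < n; i++) {
        if (i != self && better(&chs[i], &chs[best])) best = i;
    }
    return best;
}

/* Runs one election round and returns the index of the CH
 * that received the most votes. */
size_t elect(ch_t *chs, size_t n)
{
    for (size_t i = 0; i < n; i++) chs[i].votes = 0;
    for (size_t i = 0; i < n; i++) chs[choose_vote(chs, n, i)].votes++;

    size_t winner = 0;
    for (size_t i = 1; i < n; i++) {
        if (chs[i].votes > chs[winner].votes ||
            (chs[i].votes == chs[winner].votes &&
             better(&chs[i], &chs[winner]))) {
            winner = i;
        }
    }
    return winner;
}
```

With three candidates whose (Chance, energy) pairs are (40, 3.0), (70, 2.0) and (70, 2.5), all three votes go to the third candidate, since it ties on Chance with the second but has more residual energy.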

Material
All code was implemented in C and executed as Contiki operating system processes (Dunkels et al., 2004). We used the Cooja simulator (Österlind, 2006), which ships with Contiki. Cooja allows simulations of wireless sensor networks using IEEE 802.15.4, whose maximum transmission rate is 250 kbps and maximum output power is less than 1 mW. Moreover, Cooja supports multiple platforms, including MicaZ, which was used in this work.

Energy model
We used the energy model proposed by Jurdak et al. (2008), defined for MicaZ sensors:

$$E_t = P_{sent} \times P_{size} \times T_B \times I_t \times V, \qquad E_r = P_{received} \times P_{size} \times T_B \times I_r \times V \qquad (1)$$

where $E_t$ and $E_r$ are, respectively, the energy cost in mJ (millijoules) for transmitting and receiving; $P_{sent}$ and $P_{received}$ are the numbers of packets sent and received; $P_{size}$ is the size of each packet; and $T_B$ is the time necessary for the CC2420 radio (used in MicaZ) to send 1 byte (32 μs). $I_r$ and $I_t$ are the electric current values in reception and transmission mode (19.7 mA and 17.4 mA, respectively). Finally, $V$ is the voltage supplied to the MicaZ (3 V).
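A minimal sketch of this energy accounting in C is given below. It reads the definitions above as a product of packet count, packet size, per-byte time, current and voltage, and assumes the packet size is expressed in bytes; the function and macro names are hypothetical. With currents in mA and time in seconds, the result is in millijoules.

```c
#define T_BYTE_S  32e-6  /* time for the CC2420 to send 1 byte, in s */
#define I_TX_MA   17.4   /* current in transmission mode, in mA      */
#define I_RX_MA   19.7   /* current in reception mode, in mA         */
#define SUPPLY_V  3.0    /* MicaZ supply voltage, in V               */

/* Energy spent transmitting, in mJ (packet size ASSUMED in bytes). */
double tx_energy_mj(unsigned long packets, unsigned long packet_size)
{
    return (double)packets * (double)packet_size
           * T_BYTE_S * I_TX_MA * SUPPLY_V;
}

/* Energy spent receiving, in mJ (packet size ASSUMED in bytes). */
double rx_energy_mj(unsigned long packets, unsigned long packet_size)
{
    return (double)packets * (double)packet_size
           * T_BYTE_S * I_RX_MA * SUPPLY_V;
}
```

For example, 1000 packets of 100 bytes keep the radio busy for 3.2 s, which under these values costs about 167.04 mJ when transmitting and about 189.12 mJ when receiving.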

Experiments and scenarios
We used the same settings and values as in Rocha et al. (2012) (Figure 7), which simulated Structural Health Monitoring (Brownjohn, 2007) applications with and without damage (Table 2). The network was assembled to represent a five-floor building. The physical groupings are all deterministic; in other words, at network start-up the cluster formation is set manually (directly in the images installed on the nodes). Thus, all nodes knew whether they were CHs or not. Those that were not knew which node was their CH. Furthermore, all CHs had knowledge of their adjacent CHs. All sensors had predetermined transmission times that varied between 500 and 1300 ms. Each round lasted 15 minutes.
During the initial ten seconds of each round, the members of the physical groups send messages containing the healthy modal frequencies, representing a structure without damage. After this period, in the scenario of Figure 7a, all nodes change their frequencies to Damage_2, which means damage on the second floor. In the scenario represented by Figure 7b, the frequencies are changed to Damage_4, meaning damage on the fourth floor. The weight assigned to the nodes during network initialization identifies which nodes are closer to the damage, characterizing these nodes as semantic neighbours. In our scenarios these nodes are identified by IP addresses 172.16.6.0, 172.16.7.0 and 172.16.8.0 in scenario 7a, and 172.16.9.0, 172.16.11.0, 172.16.13.0 and 172.16.14.0 in scenario 7b. The change in modal frequency (from healthy to Damage_2/Damage_4) starts the semantic clustering procedure. After that, all CHs know whether there is any semantic neighbour in their physical cluster (Rocha et al., 2012) and, at this point, the semantic cluster head election proposed in this work starts.
In this work, we want to postpone as much as possible the time at which the first CH completely exhausts its energy, that is, the First Node Death index (Dietrich and Dressler, 2009). We compared our proposal with the election algorithm used in SEMANTK (Rocha et al., 2012), which takes into account only the number of semantic neighbours in the physical clusters, and with the work proposed by Siew et al. (2011), which, like our work, uses the remaining energy and RSSI, but without assigning weights to the input variables. The next section shows the results obtained so far.

Results
Figures 8, 9 and 10 show graphs comparing our proposal, SEMANTK (Rocha et al., 2012) and the work proposed by Siew et al. (2011).
In our scenarios, no node was added during the experiments and we did not consider the death of any physical cluster member. Therefore, with the election used in SEMANTK, node 172.16.2.0 will always be elected semantic collector in scenario 7a, just as 172.16.5.0 will be in scenario 7b. Since taking the role of a leader, semantic or physical, is an onerous task in terms of energy, these nodes tend to die faster than the other nodes. Considering 5 Joules as the initial battery charge, both nodes, 172.16.2.0 and 172.16.5.0, lasted about 240 minutes. As illustrated in Figures 8a and 8b, we observed that a strategy of rotating the leader may increase the time until the first CH has its energy completely depleted.
Looking at Figure 9, the residual energy falls more evenly than in SEMANTK. In the experiments illustrated in Figures 9a and 9b, the first CH took approximately 350 minutes to die, given the same initial energy of 5 Joules as in the previous experiment. We noted that in experiment 9a, where the two devices evaluated were near the base (scenario 7a), the turnover was greater than in experiment 9b. In experiment 9b the nodes are farther away than in experiment 9a; consequently, the RSSI variable has more influence than in the previous case. Therefore, the farthest nodes are elected less often than the closer ones. As can be seen in Table 4, the candidate 172.16.3.0 was elected more times due to its proximity to the sink. As distances are fixed, the devices closer to the sink are chosen more often, even when their residual energy is low, making the fuzzy inference less efficient and leading to non-optimized elections. At about 300 minutes, which corresponds to round 20, the CH 172.16.3.0 was elected even though its residual energy was about 12%. The CH 172.16.5.0, at that moment, had residual energy of around 40%.
With the strategy proposed in this work, we obtained a better rotation of the semantic collectors in both scenarios 7a and 7b. The CHs' energy levels tend to fall in a balanced way due to the weight assigned to the Energy variable, which gives remote devices a better chance of becoming semantic collectors alongside those closest to the sink. It is important to note that, regardless of the candidates' distances, the turnover of semantic collectors is ensured by our proposal. Therefore, the weighting of the input variables (residual energy and RSSI) enabled a balanced behaviour independent of the devices' distances and locations in the scenarios observed, which was not achieved by the related works. The work proposed by Siew et al. (2011) achieved results similar to ours in scenario 7a (Table 3), but was not able to repeat them in scenario 7b. That is because in scenario 7a the distance between the candidates and the sink is small, so their RSSI values do not have a decisive influence on the election (in the fuzzy inference, the two devices' RSSI values have the same membership in the set Near). Consequently, the Energy variable is decisive in the election process.
All semantic collectors kept running for approximately 350 minutes in scenario 7a and 405 minutes in scenario 7b, using the same initial energy of 5 Joules as in the previous experiments. This represents a gain of 45% in survival relative to the election used in SEMANTK in scenario 7a. In 7b, we had a gain of 68.8% compared to SEMANTK and 17.4% compared to Siew et al. (2011), which used fuzzy logic without weighting the input variables.
Although the algorithm proposed in this work was able to carry out elections in a more balanced way, ensuring a better choice of semantic collectors, when the event of interest is detected by many neighbours in a continuous cluster, it triggers an intense exchange of messages among all the semantic collector candidates until a consensus is reached, because only one semantic collector will be elected. A suggestion for improvement would be to decide the right time to partition the semantic clusters.
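The gains relative to SEMANTK can be checked from the approximate lifetimes quoted in this section, taking about 240 minutes as the SEMANTK baseline in both scenarios. The notation $G_{7a}$ and $G_{7b}$ is introduced here only for this check.

```latex
G_{7a} = \frac{350 - 240}{240} \approx 0.458,
\qquad
G_{7b} = \frac{405 - 240}{240} \approx 0.688
```

These values agree with the reported gains of roughly 45% in scenario 7a and 68.8% in scenario 7b.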

Conclusions
In this paper, we propose an algorithm for electing semantic collectors for semantic clustering mechanisms that takes into account the residual energy level and the RSSI to the sink. We use an inference engine with fuzzy weighted inputs that, besides being able to handle imprecise or uncertain information, also enables choices that prioritize the nodes' energy levels. We compare our proposal with two algorithms from the recent literature. Simulation results indicate that the proposed algorithm provides lifetime gains of the order of (i) 17.4% with respect to the work described in Siew et al. (2011) and (ii) 68.8% with respect to SEMANTK (Rocha et al., 2012).
In order to provide a semantic collector election involving all the sensor nodes instead of only the CH nodes, as future work we plan to extend our algorithm so that it can be applied to flat WSN. This will eliminate the need for an initial physical clustering, as required by the SEMANTK algorithm. In addition, we intend to spread the load across the largest possible number of nodes with homogeneous roles in the network. In this way, we also expect to prolong the network lifetime.
Finally, we will also extend our work to consider the trade-off between the number of semantic collectors, overhead and energy saving. When a cluster is large, the semantic collector incurs an overhead in handling the messages sent by its semantic neighbours. Conversely, if there are several small clusters, there are more semantic collectors, and the overhead in this case comes from control messages. Hence, we intend to investigate the most suitable cluster size.

Figure 1. Fuzzy model proposed in this work.

Figure 2. Membership functions for the Energy variable.

Figure 5. Two semantic clusters for two distinct events.

Figure 6. (a) A CH compares its chance of becoming a semantic collector. (b) The counter of votes is incremented.

Figure 7. Scenario used in these experiments.

Figure 10. Election algorithm proposed in this work.

Table 3. Number of times that the nodes were elected as semantic collectors in scenario 7a.

Table 4. Number of times that the nodes were elected as semantic collectors in scenario 7b.