Optimization of Clustering Time by a Group of Autonomous Robots Making Use of an Exclusive Multi-Marking

Problem statement: To solve complex problems, current work increasingly turns to swarm behaviors built on collective interactions. These result from cooperative work that favors exchanges between individuals of the same group at the microscopic level and allows complex collective behaviors to emerge at the macroscopic level. Many models have drawn on such behaviors to find simple rules that guide mobile, autonomous robots with limited capacities through their environment in order to achieve tasks such as exploration, self-assembly and gathering. A multi-marking technique, used as indirect communication within a group of robots, can reduce the time needed for such tasks. Approach: We present a method based on the reversed emergence principle combined with a genetic algorithm. It evolves a global behavior within a group of simulated robots, called robot-agents, with the aim of finding the micro-rules that form a heap according to two approaches. The first approach performs an ordinary grouping; the second, which we propose, is based on the exclusive multi-marking principle. The controller that guides these robot-agents operates on sensorimotor rules that arbitrate between a given number of elementary behaviors with which each robot-agent is initially equipped. Results: Simulation results, implemented according to a reactive agent model, are provided and discussed. They show the consistency of the detected rules and the efficiency of the proposed approach compared with the ordinary one. Optimizing the grouping time of such robots can have a large economic and strategic impact in sectors as important as industry, agriculture and the military.


INTRODUCTION
Collective work is often more effective than individual work, since it reduces the time needed to carry out a divisible task and makes it possible to perform tasks that an individual cannot perform alone. The fact that collective work allows a group to exceed the specific limitations of its members is one of the reasons that push us to organize and to work in cooperation. This is natural in humans and in certain animal groups such as social insects (for instance, ants and bees) or predators and prey that live in communities (for instance, wolves and wildebeests). From this point of view, the cooperative exploration and exploitation of an unknown environment by a group of robots is among the most interesting subjects in collective intelligence in general and collective robotics in particular. The concept of emergent organization is important here to describe how an indivisible phenomenon at the macro level is supported by a structure of interactions at the micro level. The advantage is that the low level, in spite of the simplicity of its components, can self-organize, thus providing flexibility and robustness at the high level.
In designing the controller of a collective, we must distinguish between centralized and decentralized methods. The first refers to the presence of a central agent that controls the other group members. The second, which interests us here, relates to autonomous agents that each behave according to their own set of rules. These rules also allow any individual of the group to communicate locally with the others, either directly or via the environment in which they all move.
Heap formation is one of the most studied tasks in this perspective. Several variants have been derived from it, such as the exclusive positioned heap formation (Abdessemed and Bilami, 2010). The variant proposed here, based on an exclusive multi-marking principle, allows each agent to mark its passage with a print that is specific to its mission or to its type and that dissipates over time. Our objective is to evaluate the formation time of this variant and to compare it with that of the formation found in (Furukawa, 2010). This supposes that we have first discovered the rules governing the two formations through an evolutionary process using an adequate Genetic Algorithm (GA).
The rest of this study is organized as follows. In the Materials and Methods section, we introduce multi-agent systems, focusing on reactive agents, which form the simulation model used in our experiments. We explain the genetic algorithm (GA), the evolutionary approach selected to find the appropriate chromosomes, implemented as a table structure named lookup table LT (Table 1), for the achievement of the tasks mentioned above. We then describe the principle of forming a heap according to the ordinary approach and according to the proposed one, based on the exclusive multi-marking technique. In the Results section, we present simulation results for both approaches. In the Discussion section, we present the rules detected by the two methods and compare them. We then compare these rules with those discovered by ethologists in ants, and we close this article with a conclusion that summarizes the work achieved and outlines some prospects we plan for the future.

Table 1: The sensorimotor rules. Each local perception is mapped to a basic behavior (coded 0 for displacement and 1 for handling an object). The state of the robot-agent is coded 0 for free and 1 for carrying an object, and each of the 5 squares of its local environment (Fig. 2) can contain nothing (coded 00), an object (coded 01) or another robot-agent (coded 10).
Local perception | Internal state | Behavior
00 00 00 00 00 | 0 | 0
00 00 00 00 01 | 0 | 0
00 00 00 00

MATERIALS AND METHODS
Multi-agent systems: Multi-agent systems are a very active research area, located at the intersection of fields such as artificial intelligence and distributed systems. The area is concerned with the collective behaviors produced by the interactions of several autonomous and flexible entities named "agents". These interactions can involve cooperation, competition or coexistence between the agents (Wooldridge, 2002). We generally distinguish two categories of agents: reactive and cognitive. Reactive agents respond only to their perception of the environment and act according to this perception. They represent the lowest level of complexity. This type of agent cannot be described as intelligent, since its operation is based on the stimulus-action principle, which lets agents act through conditioned reflexes. On perceiving a particular stimulus, the agent provides a stereotyped response. Its communication with other agents and with its environment is elementary. Nevertheless, reactive agents show interesting characteristics, such as the simplicity with which local behavior can be described and the emergence of global behaviors that go beyond the simplicity of the agent (Ampatzis et al., 2008). According to Nandhakumar et al. (2009), reactive agents are well suited to represent autonomous, homogeneous robots with limited capacity in a given collective, in order to study problems specific to group robotics. In this case, each real-world robot corresponds, in the simulated world, to an agent that we call a "robot-agent".
A robot-agent has a number of sensors and actuators that enable it to interact with its local environment. To implement these sensorimotor reactions we use an LT table (Table 1) that contains, on one side, all possible configurations of the perceptible local environment and, on the other side, the basic behavior that is supposed to match the local perception made by the robot-agent. The LT table is comparable to a cellular automaton: at each time step, the robot-agent reads its local environment and looks up the entry in its LT that corresponds to this perception to find the action to perform. In this context, group robotics is often described as the search for a set of local rules that produces the desired global behavior.
Genetic algorithm: The genetic algorithm (GA), as an evolutionary approach, is a very simplified adaptation model of natural systems; it is employed successfully in stochastic optimization (Ahmed et al., 2010). The most important aspect of the evolutionary approach used here is the discovery of an optimal lookup table LT that leads a homogeneous group of robot-agents to succeed in the grouping task.
The principle of the GA used here is as follows. Initially, we randomly generate a population of P lookup tables LT, which we evolve over G generations. Each LT carries a binary chromosome Ø, composed of all the discrete values of the field named behavior (Table 1). At each generation, a fitness value is assigned to each LT table. The fitness of an LT determines its probability of belonging to the next generation. Crossover and mutation operators generate new LT tables in the following population. The K best LT tables are copied from one generation to the next. The K worst LT tables are eliminated, and the P-K remaining LT tables are subjected to a single-point crossover at a random position of the chromosome with probability Pc (to ensure diversification of the search) and to a mutation at random sites with probability Pm per site (to find the global optimum). It should be noted that Pc and Pm are selected from a uniform random distribution. This method of searching for the optimal chromosome is valid both for ordinary heap formation and for heap formation by multi-marking.
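The generation step described above can be sketched as follows. This is only an illustrative reading of the text, not the authors' implementation: the function name `next_generation`, the use of fitness-proportional parent selection, and the choice to build the P-K new tables as children of the P-K surviving tables are assumptions.

```python
import random

def next_generation(population, fitnesses, k, pc, pm, rng):
    """Return a new population of behavior chromosomes (lists of 0/1 bits).

    Assumed reading of the text: the K best tables are copied unchanged,
    the K worst are discarded, and the remaining P-K tables act as parents
    for P-K children (single-point crossover, then per-site mutation).
    Chromosomes are assumed to hold at least two bits.
    """
    p = len(population)
    ranked = sorted(range(p), key=lambda i: fitnesses[i], reverse=True)
    elites = [population[i][:] for i in ranked[:k]]        # K best, copied as-is
    parents = [population[i] for i in ranked[:p - k]]      # K worst eliminated
    # Small offset so that a table with zero fitness can still be drawn.
    weights = [fitnesses[i] + 1e-9 for i in ranked[:p - k]]

    children = []
    while len(children) < p - k:
        a, b = rng.choices(parents, weights=weights, k=2)  # fitness-proportional
        child = a[:]
        if rng.random() < pc:                              # single-point crossover
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
        # Flip each site independently with probability pm.
        child = [bit ^ 1 if rng.random() < pm else bit for bit in child]
        children.append(child)
    return elites + children
```

Calling this function once per generation, after evaluating the fitness of every table, keeps the population size constant at P.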

Genetic algorithm used:
For NG generations with P LT tables for each do
For each LT

Ordinary grouping: Algorithms inspired by ant behaviors have been the subject of many works, which derive from them general principles applicable to many combinatorial problems. One well-known problem, on which we focus, is corpse management. In this context, the purpose of the work presented here is to give a computational reality to collective behaviors usually described as emergent. Moreover, the simulation must appear realistic to an observer. The task requires full coordination within a team of reactive agents moving on a surface that represents an unfolded sphere (for simplicity), which initially contains a random number of objects and robot-agents. The mission of the robot-agents is then to form a heap without any external help. To do so, they must be equipped with good sensorimotor rules, stored in the LT table that gives the highest fitness. The fitness function therefore evaluates this heap formation. The space in which the robot-agents move is divided into J zones. Each zone Aj is composed of the same number of cells, and each cell can contain only one robot-agent, one object, or one robot-agent carrying an object. The fitness function uses the following quantities:
I = the number of experiments per LT table
ftotal = the average fitness
fi = the fitness of the i-th experiment
Pj = the regrouping rate of objects in zone Aj
N(Aj) = the number of objects in zone Aj
J = the number of zones
The fitness fi equals 0 when the objects are evenly distributed over all the zones and 1 when all objects are in the same zone (the Shannon entropy principle). We let the collective run for T time steps, then fi is computed. This experiment is repeated I times to calculate the average fitness, which is more representative of the quality of the behaviors that the tested LT table produces.
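One form that satisfies the properties stated above (0 for an even spread, 1 for a single heap) is a normalized Shannon entropy: fi = 1 + (1/log J) Σj Pj log Pj, with Pj = N(Aj) / Σk N(Ak), and ftotal = (1/I) Σi fi. This exact expression is an assumption reconstructed from the listed quantities, not a formula quoted from the text. The function names below are hypothetical.

```python
import math

def heap_fitness(zone_counts):
    """Fitness of one experiment: 1 minus the normalized Shannon entropy
    of the object distribution over the J zones (assumed form)."""
    total = sum(zone_counts)
    j = len(zone_counts)
    if total == 0 or j < 2:
        return 0.0
    entropy = 0.0
    for n in zone_counts:
        if n > 0:                      # empty zones contribute nothing
            p = n / total              # regrouping rate Pj of zone Aj
            entropy -= p * math.log(p)
    return 1.0 - entropy / math.log(j)

def average_fitness(runs):
    """Average of the per-experiment fitness over I experiments."""
    return sum(heap_fitness(counts) for counts in runs) / len(runs)
```

For example, `heap_fitness([60, 0, 0, 0, 0, 0, 0, 0, 0])` gives 1.0, while nine zones holding the same number of objects give a value of essentially 0.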
The total convergence time of this approach was improved by using a dynamic subdivision of the environment. At each evaluation, the initial grid of 3×3 zones (see the central grid in Fig. 1) is shifted in one of eight directions, either by half a zone (vertically or horizontally) or by two half-zones (one vertical and one horizontal). This solves the problem of a heap that is spread over several zones and that, although well formed, is not detected by the fitness function mentioned above.

The blue disc indicates that the robot-agent carries an object, the white disc represents an object, and the white disc on a green one indicates that the robot-agent has deposited its object and is ready to move.

Grouping by an exclusive multi-marking: This approach is based on the behavior of social insects, which use chemical substances (pheromones) to mark their passage and thus communicate indirectly with their congeners, for example to warn them of danger, to guide them towards a food source or back to the anthill, or to delimit their territory and protect their nest (Wurr and Anderson, 2006; Connelly et al., 2009). Under these conditions, stigmergy remains a basic mechanism for accomplishing the targeted task: at every moment, the behavior of each group member influences the behavior of the others, i.e., at each time step the state of the environment stimulates the individuals of the group and dictates the responses or reactions to be undertaken (Serugendo et al., 2006). Robot-agents equipped with a set of basic behaviors and an exclusive multi-marking device, which enables them to leave a trail behind them, organize themselves into short single files that contain only robot-agents in the same state: free or carrying an object (Fig. 3). As a result, a sequence of identical actions emerges in the zone where this sub-group is located, which accelerates the heap formation process.
Robot-agents are released from these files after a random period of time so that they are free to join another file. This avoids infinite loops in which they could circle indefinitely (Fig. 4).
The traces, shown as numbers in green or blue and left behind free robot-agents or busy robot-agents respectively, express the intensity of the deposited mark. A deposited mark dissipates gradually over time, and a new mark deposited at a place that already holds a mark of a different type cancels the effect of the old mark and takes its place. Therefore, there is no ambiguity in detecting the type of a deposited mark.
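The exclusive marking rule described above can be sketched with a dictionary that stores at most one mark per cell. The representation of a mark as a `(kind, intensity)` pair, the function names, and the removal of a mark once its intensity reaches zero are assumptions made for illustration; the sketch also overwrites a mark of the same type, a case the text does not specify.

```python
FREE, CARRYING = "free", "carrying"

def deposit(marks, cell, kind, intensity):
    """Leave a mark on a cell. Any mark already present, in particular one
    of the other type, is replaced, so a cell never holds two mark types."""
    marks[cell] = (kind, intensity)

def dissipate(marks):
    """Decrease every mark by one per time step and drop exhausted marks."""
    for cell in list(marks):           # copy keys: we delete while iterating
        kind, intensity = marks[cell]
        if intensity <= 1:
            del marks[cell]
        else:
            marks[cell] = (kind, intensity - 1)
```

Because each cell holds a single pair, reading `marks.get(cell)` always returns an unambiguous mark type, which is the property the exclusive rule is meant to guarantee.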

RESULTS
Simulation results according to the ordinary approach: Implementation: The space in which the robot-agents evolve is two-dimensional, with square cells of the same size. Each robot-agent can see 6 cells and can carry only one object at a time. In each of the five places around it, we distinguish 3 states (nothing, an object, or a robot-agent). The sixth place, the one that the robot-agent occupies, has two states (carrying an object or free, Fig. 2). This means that there are 3^5 × 2 = 486 entries in the LT table. Two basic behaviors are defined for this task.
To move: The robot-agent advances towards one of the empty cells, considered in this order: in front, on the left, or on the right; otherwise it does nothing.
To handle an object: If the robot-agent is free, it tries to take an object located in one of the five places around it; otherwise it moves. If the robot-agent carries an object, it must deposit it somewhere using the same principle.
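The table size of 3^5 × 2 entries follows from treating the five neighbouring cells as base-3 digits and the internal state as one extra bit. The sketch below shows one possible way to turn a perception into a row number of the LT table; the function name `lt_index`, the numeric state codes and the digit order are assumptions, not taken from the text.

```python
EMPTY, OBJECT, AGENT = 0, 1, 2   # assumed codes for the three cell states

TABLE_SIZE = 3 ** 5 * 2          # 486 entries, as stated in the text

def lt_index(neighbours, carrying):
    """Map five neighbour states (each 0..2) and the internal state
    (free or carrying) to a unique row number in 0..TABLE_SIZE-1."""
    if len(neighbours) != 5:
        raise ValueError("a robot-agent perceives exactly five cells")
    index = 0
    for state in neighbours:     # read the five cells as base-3 digits
        index = index * 3 + state
    return index * 2 + (1 if carrying else 0)
```

With a chromosome stored as a list of 486 bits, the behavior for a perception would then be read as `chromosome[lt_index(neighbours, carrying)]`, with 0 meaning "move" and 1 meaning "handle an object" as in Table 1.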
In our experiments, we used a population size of P = 50 and fixed the number K of chromosomes retained directly at each generation at 5. After 60 generations, with crossover probability Pc = 0.6 and mutation probability Pm = 0.005, a heap structure emerged. In order to find good rules, we used a world of 30×30 places with 30 robot-agents and 60 objects. The training time used is T = 2000 time steps, with J = 9 zones and I = 30 experiments per LT table. After 150 generations we took the best LT table achieving the gathering. Figure 5 shows six snapshots that reveal the emergence of this structure. We notice the formation of small piles of objects at the beginning, which merge over time into larger piles. The curves of the minimum, maximum and average evaluation values at each generation show the convergence of the GA used (Fig. 6).
Scalability: The detected rules (described further) are insensitive to changes in the number of robot-agents, the number of objects and the size of the environment. In other words, our experiments show that these rules always lead to heap formation. To verify this, we took the optimal LT table (the one with maximum fitness) and applied it while scaling the number of robot-agents, the number of objects and the size of the environment. The results obtained are presented in Fig. 7-9 respectively. In the first curve (Fig. 7), the heap formation time tends to decrease, in spite of a slight increase that does not affect the dominant shape of the curve. This indicates that the heap formation time falls as the number of robot-agents grows, since the degree of parallelism increases with the number of robot-agents. In the second curve (Fig. 8), the heap formation time grows with the number of objects: a larger number of objects generally takes more time to be grouped. In the last curve (Fig. 9), we notice that the larger the environment, the more freely the robot-agents evolve, and the lower the probability of encounters between robot-agents that lead to mutual hindrance. This advantage is short-lived, however, since the larger the space, the more time the robot-agents need to move, whether to find an object or to deposit it in the formation zone.
Simulation results according to the second approach: Implementation: The marking is represented by numbers expressing its intensity, which decreases by one at each time step until it reaches the minimal limit. There are two marking intervals: [21, 40] for free robot-agents and [0, 20] for those carrying an object. In each of the five places around a robot-agent (Fig. 2) we distinguish 4 cases: nothing, an object, a free robot-agent, and a robot-agent carrying an object. The sixth place, the one that the robot-agent occupies, has two states: carrying an object and free. This means that there are 4^5 × 2 = 2048 entries in the LT table, still with 2 basic behaviors, defined as follows.
To move: If the robot-agent carries an object, displacement priority is given to the direction containing the most intense mark left by another robot-agent carrying an object; otherwise the behavior is identical to that of the first approach (the absence of a mark is regarded as a mark of intensity zero).
To handle an object: This behavior is identical to the one used in the heap formation of the ordinary approach.
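The move behavior of the second approach can be sketched as a choice among the empty neighbouring cells. The function name `choose_direction`, the direction labels and the way marks are passed in are hypothetical; the sketch only covers the rule stated above for robot-agents carrying an object and falls back to the order of the first approach (front, left, right) in all other cases.

```python
def choose_direction(free_cells, carrying_marks, carrying):
    """Pick the direction to move in.

    free_cells: empty directions, listed in the order front, left, right.
    carrying_marks: direction -> intensity of a mark left by another
        robot-agent carrying an object (a missing entry counts as zero).
    carrying: True if this robot-agent carries an object.
    Returns None when no cell is free (the robot-agent does nothing).
    """
    if not free_cells:
        return None
    if not carrying:
        return free_cells[0]           # same as the ordinary approach
    # max() keeps the first candidate on ties, so with no marks at all
    # the result is again the first free cell in front/left/right order.
    return max(free_cells, key=lambda d: carrying_marks.get(d, 0))
```

A carrying robot-agent that sees marks of intensity 5 on its left and 12 on its right would therefore turn right, while a free robot-agent in the same position would move forward.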
We carried out our experiments under the same conditions as those of the ordinary approach. Figure 3 shows six snapshots that reveal the emergence of the heap structure for the suggested approach. We noticed that the small piles of objects that emerge after a certain time merge quickly into larger piles, because the robot-agents organize themselves in single files. The curves of the minimum, maximum and average evaluation values at each generation are similar to those of the first approach (Fig. 6). This confirms the convergence of the GA used.

Scalability:
The detected rules (described further) are as consistent as those of the ordinary approach. The results of the same scaling as for the first approach are given in Fig. 7-9. We carried out another experiment on the marking intensity to see its influence on the formation time, using a fixed difference of 20 between the mark of free robot-agents and that of robot-agents carrying an object. We observed that the heap formation time decreases as the intensity of the deposited mark increases, up to a certain critical level beyond which the time starts to increase again (Fig. 4). We noticed that this critical level varies with the size of the ground.

Detected rules according to the ordinary approach:
The sensorimotor rules that we could read in the LT table representing our solution are:
• The robot-agent deposits an object in its possession if its local environment contains at least one other object; otherwise it moves with this object
• The robot-agent takes an object if the object is alone or if there are two objects in its local environment; otherwise it moves

Detected rules according to the suggested approach: The generic rules deduced from the optimal LT table for the second approach are:
• The robot-agent deposits an object in its possession if its local environment contains at least one other object or another robot-agent carrying an object; otherwise it moves according to the multi-marking principle described above
• The robot-agent takes an object if the object is alone or if there are two objects in its local environment (one of these two objects can be carried by a robot-agent); otherwise it moves according to the same multi-marking principle described above

Comparison between the two approaches: At this stage of the analysis, we can say that for both approaches the heap formation emerges at a random place. The second approach, moreover, produces an organization in single files resulting from the marks deposited on the ground. From this also emerges a repetition of the same action, performed successively by the robot-agents belonging to the same file, which leads to a heap formation time lower than that of the first approach. For example, in the experiment illustrated in Fig. 7, the average time of the ordinary formation is 1650 time units, whereas it is only 1379 for the formation by multi-marking, illustrated in Fig. 3. We noticed that this approach becomes more effective as the grouping rate of objects increases.
Comparison between the rules of the two approaches and the ant rules: Ordinary approach: According to ethologists, ants such as Pachycondyla apicalis spontaneously form a heap of corpses in stages; they make small piles of corpses, which grow larger over time. The rules that may govern this are:
• When an ant meets a corpse, the probability that it takes it increases when this element is isolated
• When an ant transports a corpse, it deposits it with a probability that increases when the density of elements of the same type in the vicinity is high (Martin et al., 2002)
The analogy with the rules detected for the community of robot-agents is immediate. A corpse in the ant case corresponds to an object in the robot-agent case, and the number of objects in the robot-agent case corresponds to the specific concentration that characterizes a given corpse type in the ant case. The larger this number (in other words, the higher the concentration of corpses), the more the robot-agent tends to deposit the carried object; the smaller it is (in other words, the weaker the specific concentration), the more the robot-agent tends to pick the object up (if it is not already in its possession) or to keep it and move (if it is already in its possession).
Proposed approach: In nature, ants mark their passage with pheromone. This ground marking allows them not only to trace, for example, a path between the anthill and a food source, but to do so in an optimal way (Dorigo and Stutzele, 2004). Pheromone marking also enables them to communicate, because the deposited substance can vary in intensity and nature according to the message to transmit (multi-pheromones). When an ant is at a crossroads, for example, the probability of choosing a path depends, among other things, on the quantity of pheromone deposited on this path: the larger the quantity of the most important pheromone type, the higher the probability of choosing this path (Dorigo and Stutzele, 2004).
The analogy with the rules detected by the proposed marking approach is as follows: the pheromone in the ant case corresponds to the deposited mark in the robot-agent case (each mark type corresponds to a pheromone type). The intensity of this mark is represented, in our simulations, by an integer that expresses the quantity of pheromone deposited by ants and that decreases at each time step to simulate the dissipation of this pheromone. It should be noted that the propagation of the mark is not considered here. Apart from displacement, the deposit and pickup of objects are governed by rules comparable to those of ants presented above for the first approach.