Reflective business games. Reflective games

SUMMARY OF THE CASE (REFLECTION)

"THREE OF MOODS"

A tree is drawn on a sheet of whatman paper, with each branch representing a separate day. In the evening, each child draws a leaf of one of three colors on today's branch: green means the child is in an excellent mood, yellow means good, red means so-so. By the end of the shift you will have a complete picture of how it went for your children.

"A STAR FELL FROM THE SKY"

Children are told that when stars fall you can make a wish: many people, having seen a falling star, make their most cherished wish, and it will certainly come true. The children write on their star (cut out of cardboard) what they expect from this shift. The counselor collects all the stars and hangs them on the wall. At the end of the shift they are taken down, the wishes are read out, and together the group discusses what came true and what did not.

"SUITCASE FOR THE ROAD"

You can put anything in a magic suitcase and it will remain unchanged. Everyone chooses three things they would like to take away from the session: a good mood, a friend, the chair they are sitting on, and so on.

"FINISH THIS SENTENCE"

Today is:

My mood today:

I would like tomorrow:

Based on the children’s statements, the counselor sums up the day.

Let's say goodbye the way they do on the Andaman Islands in the Indian Ocean.

Place your right palm under the palm of the neighbor on your right, and your left palm on the palm of the neighbor on your left. Then, with the kindest and brightest wishes and the most positive energy, we blow on the palm of the neighbor on the right.

"Envelope of Revelations"

The counselor prepares an envelope with a large number of questions in advance. It is desirable that the questions be of a moral and ethical nature, such as:

What do you value most in people?

What is your biggest goal in life?

What character traits of a person are especially unpleasant to you?

Which famous hero of the past (film, book) would you like to be like and why? etc.

"They greet you by your clothes..."

This stage of work arouses interest and mild excitement among group members. While searching for their "portrait," they have to read not one but several pieces of paper, "argue" with other claimants to the same description, and be able to defend their right to it. During the discussion, the facilitator suggests answering several questions:

Are you satisfied with what is written on the piece of paper you received?

What caused surprise, did “discoveries” occur?

What aroused the greatest interest in the process of work?

What difficulties did you experience while doing the exercise?

Horoscope

For days 2-3 of the shift.

Children are divided into groups by zodiac sign. Before each group speaks, a brief description of the sign is given. This is an unusual way of characterizing children and of fixing personalities in memory by highlighting unusual qualities. You can also compare by season of birth, eye color, etc.

Suitable for a squad at a low level of group development.

Situational

(Electric chair)

One participant sits with his back to the audience; everyone writes a note with a brief description of this person, and the notes are then read out by the presenter (who corrects the text if it is tactless towards the person).

It makes it possible for members of the squad to assess the behavior of a particular child without ambition, resentment, or insult to his personal dignity.

Candle - the day gone by.

The guys sit in a circle and, passing the candle to each other, take turns telling how the day went for them and their assessment of it.

Candle - desire.

By passing the candle around, you can add your wishes for tomorrow to your usual assessment of the day. You can start with the words: “I would like that tomorrow...”

Cobweb.

Everyone sits in a circle. The counselor takes a ball of thread, winds the thread around his finger and gives this ball to any child in the circle (“I would like to give this ball to Katya because...”). Next, the second participant winds his thread around his finger and passes the ball to the next one, explaining his choice. And so on until everyone is connected by one thread. You can see the connections that have arisen between the guys in the squad. Everyone can cut off a thread wound around their finger as a keepsake.

"Five Minutes of Revelation"

Throughout the day, the entire squad puts notes in a mailbox-shaped box with questions that they would like to ask their counselors at the final “light.” At the “light” the counselors answer these questions.

"Notes"

Envelopes and small notes are prepared. The number of envelopes is equal to the number of children in the squad. On the notes, each child writes wishes, words of gratitude, and impressions for each person in the squad. These notes are placed in envelopes that are creatively designed and signed, and the envelopes are then presented in creative ways to their owners. The envelopes may be opened and the notes read only the next day.

If the information structure has finite complexity, then it is possible to construct a reflexive game graph that clearly shows the relationship between the actions of the agents (both real and phantom) participating in the equilibrium.

The vertices of this directed graph are the actions $x_\tau$, $\tau \in \Sigma_+$, corresponding to the pairwise non-identical awareness structures $I_\tau$; equivalently, a vertex may be labelled by the component $I_\tau$ of the awareness structure, or simply by the index $\tau$ of a real or phantom agent, $\tau \in \Sigma_+$ (here $\Sigma_+$ denotes the set of all finite non-empty index sequences).

Arcs are drawn between the vertices according to the following rule: into each vertex $x_{\tau i}$, arcs are drawn from the $(n-1)$ vertices corresponding to the structures $I_{\tau i j}$, $j \in N \setminus \{i\}$. If two vertices are connected by two oppositely directed arcs, we depict them as a single edge with two arrows.

We emphasize that the reflexive game graph corresponds to the system of equations (2.3.1) (that is, to the definition of information equilibrium), while a solution of that system may not exist.
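For reference, the information equilibrium conditions referred to as system (2.3.1) can be sketched, in the notation used below, as the requirement that every real and phantom agent best-respond to the actions it ascribes to its opponents:

$$x^*_{\sigma i} \in \operatorname{Arg}\max_{x_i \in X_i} f_i\big(\theta_{\sigma i},\ x^*_{\sigma i 1}, \ldots, x^*_{\sigma i (i-1)},\ x_i,\ x^*_{\sigma i (i+1)}, \ldots, x^*_{\sigma i n}\big), \qquad i \in N,\ \sigma \in \Sigma.$$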

Thus, the graph $G_I$ of a reflexive game $\Gamma_I$ (see the definition of a reflexive game in the previous section) whose awareness structure has finite complexity is defined as follows:

  • vertices of the graph $G_I$ correspond to the real and phantom agents participating in the reflexive game, that is, to the pairwise non-identical awareness structures;
  • arcs of the graph $G_I$ reflect the mutual awareness of the agents: if there is a path from one (real or phantom) agent to another, then the second is adequately informed about the first.

If the vertices of the graph $G_I$ are labelled with the corresponding agents' beliefs about the state of nature, then a reflexive game $\Gamma_I$ with a finite awareness structure $I$ can be specified by the tuple $\Gamma_I = \{N, (X_i)_{i \in N}, (f_i(\cdot))_{i \in N}, G_I\}$, where $N$ is the set of real agents, $X_i$ is the set of admissible actions of the $i$-th agent, $f_i(\cdot): \Theta \times X_1 \times \ldots \times X_n \to \Re^1$ is its objective function, $i \in N$, and $G_I$ is the reflexive game graph.

Note that in many cases it is more convenient (and clearer) to describe a reflexive game in terms of the graph $G_I$ rather than the awareness structure tree.

Let's consider several examples of finding information equilibrium.

Examples 2.4.1-2.4.3. These examples involve three agents with objective functions of the following form:

where $x_i \ge 0$, $i \in N = \{1, 2, 3\}$; $\theta \in \Theta = \{1, 2\}$.
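The explicit expression of the objective functions is not reproduced above; as a sketch, one Cournot-type form consistent with the equilibrium values computed in Examples 2.4.1-2.4.3 (an assumption for illustration, not necessarily the authors' exact formula) is

$$f_i(\theta, x_1, x_2, x_3) = \Big(\theta - \sum_{j \in N} x_j\Big)x_i - \frac{x_i^2}{2}, \qquad i \in N,$$

whose best response to the opponents' actions $x_j$, $x_k$ is $x_i = \max\{0,\ (\theta_i - x_j - x_k)/3\}$, where $\theta_i$ is the state of nature in which agent $i$ believes.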

For brevity, we will call an agent who believes that demand is low ($\theta = 1$) a pessimist, and an agent who believes that demand is high ($\theta = 2$) an optimist. Thus, the situations in Examples 2.4.1-2.4.3 differ only in their awareness structures.

Example 2.4.1. Let the first two agents be optimists and the third a pessimist, with all three equally informed. Then, in accordance with Statement 2.2.5, for any $\sigma \in \Sigma$ the identities $I_{\sigma 1} = I_1$, $I_{\sigma 2} = I_2$, $I_{\sigma 3} = I_3$ hold.

In accordance with property 2 of the definition of information equilibrium, $x^*_{\sigma 1} = x^*_1$, $x^*_{\sigma 2} = x^*_2$, $x^*_{\sigma 3} = x^*_3$ for any $\sigma \in \Sigma$.

It can be seen that any awareness structure is identical to one of the three that form the basis: $\{I_1, I_2, I_3\}$. Therefore, the complexity of this awareness structure is three and its depth is one. The reflexive game graph is shown in Fig. 8.

Fig. 8.
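Under the objective functions sketched above, the equilibrium conditions for Example 2.4.1 reduce to

$$x^*_1 = \frac{2 - x^*_2 - x^*_3}{3}, \qquad x^*_2 = \frac{2 - x^*_1 - x^*_3}{3}, \qquad x^*_3 = \frac{1 - x^*_1 - x^*_2}{3},$$

whose solution is stated below.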


Thus, the actions of the agents in the information equilibrium are $x^*_1 = x^*_2 = 1/2$, $x^*_3 = 0$.

Example 2.4.2. Let the first two agents be optimists and the third a pessimist who considers all the agents to be equally informed pessimists. The first two agents are equally informed, and both are adequately informed about the third agent.

We have: $I_1 \sim I_2$ (the first two agents are equally informed), while $I_1 > I_3$ and $I_2 > I_3$ (each of them is adequately informed about the third). The reflexive game graph is shown in Fig. 9.

Fig. 9.

These conditions can be written in the form of the following identities, which hold for any $\sigma \in \Sigma$ (we use the corresponding definitions and Statements 2.2.1, 2.2.2 and 2.2.5):

$I_{12\sigma} = I_{2\sigma}$, $I_{21\sigma} = I_{1\sigma}$, $I_{13\sigma} = I_{3\sigma}$, $I_{23\sigma} = I_{3\sigma}$, $I_{3\sigma 3} = I_3$, $I_{3\sigma 2} = I_{32}$, $I_{3\sigma 1} = I_{31}$.

Similar relations hold for the equilibrium actions $x^*_\sigma$. The left-hand sides of these identities show that any structure $I_\sigma$ with $|\sigma| > 2$ is identical to some structure $I_\tau$ with $|\tau| \le 2$.

Thus, the complexity of this awareness structure is five and its depth is two.

To find the information equilibrium, we need to solve the following system of equations (see expression (2.3.1)):
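Under the objective functions sketched above, and writing $x^*_{31}$ and $x^*_{32}$ for the actions of the phantom pessimists in agent 3's picture of the game, this system takes the form

$$x^*_{31} = \frac{1 - x^*_{32} - x^*_3}{3}, \quad x^*_{32} = \frac{1 - x^*_{31} - x^*_3}{3}, \quad x^*_3 = \frac{1 - x^*_{31} - x^*_{32}}{3},$$
$$x^*_1 = \frac{2 - x^*_2 - x^*_3}{3}, \quad x^*_2 = \frac{2 - x^*_1 - x^*_3}{3},$$

which gives $x^*_{31} = x^*_{32} = 1/5$ together with the real agents' actions stated below.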


Thus, the actions of the real agents in the information equilibrium are $x^*_1 = x^*_2 = 9/20$, $x^*_3 = 1/5$.

Example 2.4.3. Let all three agents be optimists; the first and second are mutually informed, and the second and third are also mutually informed. According to the first agent, the third considers all three to be equally informed pessimists; likewise, in the opinion of the third agent, the first considers all three to be equally informed pessimists.

We have: $I_1$ and $I_2$ are mutually informed, as are $I_2$ and $I_3$, while $I_{13}$ and $I_{31}$ are the awareness structures of equally informed pessimists.

These conditions can be written in the form of the following identities, which hold for any $\sigma \in \Sigma$ (we again use the corresponding definitions and Statements 2.2.1, 2.2.2 and 2.2.5):

Similar relations hold for the equilibrium actions $x^*_\sigma$.

The left-hand sides of these identities show that any structure $I_\sigma$ with $|\sigma| > 3$ is identical to some structure $I_\tau$ with $|\tau| \le 3$.

Thus, the basis is formed by the following pairwise different structures: $\{I_1, I_2, I_3, I_{31}, I_{13}, I_{132}\}$. The complexity of this awareness structure is six and its depth is three. The graph of the corresponding reflexive game is shown in Fig. 10.

Fig. 10.

To find the information equilibrium, we need to solve the following system of equations (see expression (2.3.1)):
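Under the objective functions sketched above, and writing $x^*_{13}$, $x^*_{31}$ and $x^*_{132}$ for the actions of the phantom pessimists, this system takes the form

$$x^*_{13} = \frac{1 - x^*_{31} - x^*_{132}}{3}, \quad x^*_{31} = \frac{1 - x^*_{13} - x^*_{132}}{3}, \quad x^*_{132} = \frac{1 - x^*_{13} - x^*_{31}}{3},$$
$$x^*_1 = \frac{2 - x^*_2 - x^*_{13}}{3}, \quad x^*_2 = \frac{2 - x^*_1 - x^*_3}{3}, \quad x^*_3 = \frac{2 - x^*_2 - x^*_{31}}{3},$$

which gives $x^*_{13} = x^*_{31} = x^*_{132} = 1/5$ and the real agents' actions stated below.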

Thus, the actions of the real agents in the information equilibrium are $x^*_1 = x^*_3 = 17/35$, $x^*_2 = 12/35$.
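As a cross-check, a minimal numerical sketch (in Python) iterates the best responses of the real and phantom agents to a fixed point. It assumes the Cournot-type objective introduced above; the vertex labels and the belief graph are our own encoding of Fig. 10, not taken from the book.

# A minimal sketch, assuming f_i = (theta - x1 - x2 - x3) * x_i - x_i**2 / 2,
# i.e. best response x_i = max(0, (theta_i - x_j - x_k) / 3).
# "1", "2", "3" are the real agents of Example 2.4.3; "13", "31", "132" are
# the phantom pessimists; `beliefs[v]` lists the two vertices whose actions
# agent v best-responds to (an assumed encoding of the reflexive game graph).

def solve(thetas, beliefs, iters=200):
    x = {v: 0.0 for v in thetas}
    for _ in range(iters):
        for v in thetas:
            others = sum(x[w] for w in beliefs[v])
            x[v] = max(0.0, (thetas[v] - others) / 3.0)
    return x

thetas = {"1": 2, "2": 2, "3": 2, "13": 1, "31": 1, "132": 1}
beliefs = {"1": ("2", "13"), "2": ("1", "3"), "3": ("2", "31"),
           "13": ("31", "132"), "31": ("13", "132"), "132": ("13", "31")}

x = solve(thetas, beliefs)
print({v: round(val, 4) for v, val in x.items()})
# Converges to x1 = x3 = 17/35 (about 0.4857), x2 = 12/35 (about 0.3429),
# and 1/5 for the phantom pessimists, matching the values in the text.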

Having completed the description of the reflexive game graph, we will continue to study the properties of information equilibrium.

Polina Astanakulova
Games for children 5–7 years old. Reflective circles “The Secret of My Self”

GAMES FOR CHILDREN 5-7 years old

REFLECTIVE CIRCLES

"THE SECRET OF MY SELF"

"Me and others".

Goals:

1. Develop self-confidence, the ability to express one's opinion, and the ability to listen attentively to one's comrades.

2. Develop imagination.

3. Cultivate a friendly attitude towards one another.

Material: A ball of thread, calm music.

Content: The children sit in a circle. The teacher holds a ball of thread. Educator: "Let's find out what each of you loves most." With the music playing, the teacher begins, for example, "I like to walk in the forest," and passes the ball to a child; each child in turn says what he or she likes, and the ball returns to the teacher. The result is a web that has woven us into one whole. Now you and I are one. The web is very thin and can break at any moment, so let's make sure no one ever quarrels and breaks our friendship. The children close their eyes and imagine that they are one (the web is wound back into a ball).

"I through the eyes of others".

Goal: Give children an idea of individuality and of the uniqueness of each of them; develop self-confidence and the ability to accept a point of view different from one's own.

Material: pebbles, rugs.

The pebble is passed on with the words: "I'm giving you a pebble because you..."

Summing up: with the help of a pebble, you said many good things about each other.

"The Secret of My 'I'".

Goal: Create a trusting environment in the group that allows children to express their feelings and talk about them; develop empathic communication skills and the ability to accept and listen to another person; develop the ability to understand oneself.

Material: candlestick with candles, matches, mirror, classical music.

The queen took out her magic mirror and ordered: "My little mirror, tell me the whole truth: am I the sweetest, the rosiest and the fairest in the world?" The teacher shows the children a "magic mirror" and says: "I also have a magic mirror, with whose help we can learn a lot of interesting things about each other and answer the question: Who am I? Let's look at the candle flame. It will help us remember our feelings, our successes and failures." The music plays, the teacher talks about himself, and then the children speak. So we have talked about our strengths and weaknesses and can now work on them. Let's be more attentive to each other. The children join hands and blow out the candle.

"Me and my emotions".

Goal: Teach children to talk about their feelings, develop the ability to identify emotions using schematic images, and enrich the children's vocabulary.

Material: pictogram, rug, music.

Content: The children sit in a circle on the mats. In the center are cards depicting various shades of mood. The teacher suggests that each child take the card that best suits his or her mood. After the children have chosen, the teacher sums up what mood the children are in - sad, cheerful, thoughtful. What does it take to improve your mood? Let's laugh and forget about the bad mood.

"Me and others".

Goal: Form a friendly attitude towards each other;

develop in children the ability to express their attitude towards others (critically if necessary, but tactfully).

Material: a ball of thread, calm music.

Content: The children sit in a circle. The teacher holds a ball of thread. Educator: "You have been friends for many years and you all know each other. You are all different, and you know each other's strengths and weaknesses. What could you wish for each other so that each becomes better?" Music plays and the children say wishes to each other. The teacher says a wish to the child sitting next to him (for example: that he cry less and play more with the other children), then passes the ball to that child; the child says a wish to the person sitting next to him, and so on until the ball returns to the teacher. The children close their eyes and imagine that they are one.

"The world of my fantasies".

Goal: Develop imagination, spontaneity and communication skills; foster a friendly attitude towards each other.

Material: a chair for each child, a seven-petalled flower.

Fly, fly, petal,

Through the west to the east,

Through the north, through the south,

Come back, having made a circle.

As soon as you touch the ground,

Let it be done my way!

Educator: Imagine that there is a wizard who will fulfill any wish. To do this, you need to tear off a petal, make a wish and tell us about your dream. The children take turns tearing off the petals and saying what they would like.

Educator: Children, which wish did you like best?

Everyone had different wishes: some about themselves, others connected with friends or parents. But all your wishes will definitely come true.

“How can I change the world for the better?”

Goal: Develop children's imagination and the ability to listen to another's opinion and accept a point of view different from one's own; build group cohesion.

Material: "Magical" glasses.

Content: The children sit in a circle. The teacher shows the "magic" glasses: "Whoever puts them on will see only the good in other people, even what is not always immediately noticeable. Each of you will try on the glasses and look at the others." The children take turns putting on the glasses and naming each other's strengths. Educator: "And now we'll put on our glasses again and look at the world with different eyes. What would you like to change in the world to make it a better place?" (The children answer.)

This all helps us see something good in others.

“What is joy?”

Goal: Develop the ability to express one's own emotional state adequately and to understand the emotional state of another person.

Material: photos of children's joyful faces, a "joy" pictogram, a paper sun, a red felt-tip pen.

Educator:

What feeling is depicted in them? (Joy.)

What do we need to do to show it? (Smile.)

Say hello to each other. Each child turns to the friend on the right, calls him by name and says that he is glad to see him.

Educator: Now tell me, what is joy? Finish the sentence: "I rejoice when...". (The children finish the sentence.) The teacher writes the answers down on pieces of paper and attaches them to the sun's rays. Everyone has their own joy, but it is passed on from one to another.

Which "I"»

Goal: Create a positive emotional mood, build group cohesion and increase personal self-esteem.

Material: mirror.

What color are your eyes?

What are they like (large, small)?

What color is your hair?

What is it like (long, short, straight, wavy)?

What shape is your face (round, oval)?

"My name"

Goal: The game helps children remember their comrades' names, evokes positive emotions and builds a sense of group unity.

Content: The children sit in a circle. The presenter chooses one child, and the others come up with affectionate forms of his name. Then the child says which name he was most pleased to hear. In this way names are found for each child. Next, the presenter talks about how names grow together with children: "When you grow up, your name will also grow and become full, and you will be called by your first name and patronymic. The word 'patronymic' comes from the word 'father'; it is formed from the father's name." The children say their first names and patronymics.

"Do as I do"

Goal:

"Understand me"

Goal: development of imagination, expressive movement and group cohesion.

"Me in the future"

Goal: development of group cohesion and imagination.

"We are different"

Goal: The game makes each child feel important, evokes positive emotions and increases self-esteem.

Which of us is the tallest?

Which of us is the shortest?

Which one of us has the darkest (lightest) hair?

Who has a bow, etc.

The presenter sums up that we are all different, but we are all very good, interesting and most importantly - we are together!

Novikov D.A., Chkhartishvili A.G.
Reflective games
M.: SINTEG, 2003.- 160 p.

Materials provided by the site "Theory of Management of Organizational Systems"

Annotation

The monograph is devoted to a discussion of modern approaches to the mathematical modeling of reflection. The authors introduce into consideration a new class of game-theoretic models - reflexive games that describe the interaction of subjects (agents) making decisions based on a hierarchy of ideas about essential parameters, ideas about representations, etc.

Analysis of the behavior of phantom agents that exist in the representations of other real or phantom agents, and the properties of the information structure reflecting the mutual awareness of real and phantom agents, allows us to propose an information equilibrium as a solution to a reflexive game, which is a generalization of a number of well-known equilibrium concepts in non-cooperative games.

Reflective games provide an opportunity to:

- model the behavior of reflective subjects;
- explore the dependence of agents' gains on the ranks of their reflection;
- set and solve reflexive management problems;
- uniformly describe many phenomena associated with reflection: hidden control, information control through the media, reflection in psychology, works of art, etc.

The book is addressed to specialists in the field of mathematical modeling and management of socio-economic systems, as well as university and graduate students.

INTRODUCTION
CHAPTER 1. Information in decision making
1.1. Individual decision making: a model of rational behavior
1.2. Interactive decision making: games and equilibria
1.3. General approaches to describing awareness
CHAPTER 2. Strategic reflection
2.1. Strategic reflection in two-person games
2.2. Reflection in bimatrix games
2.3. Limitation of the rank of reflection
CHAPTER 3. Information reflection
3.1. Information reflection in two-person games
3.2. Information structure of the game
3.3. Information balance
3.4. Reflexive game graph
3.5. Regular awareness structures
3.6. Reflection rank and information balance
3.7. Reflexive management
CHAPTER 4. Applied models of reflective games
4.1. Hidden control
4.2. Media and information management
4.3. Reflection in psychology
4.3.1. Psychology of chess creativity
4.3.2. Transactional Analysis
4.3.3. Johari window
4.3.4. Model of ethical choice
4.4. Reflection in works of art
CONCLUSION
LITERATURE


Russian Academy of Sciences, V.A. Trapeznikov Institute of Control Sciences. D.A. Novikov, A.G. Chkhartishvili. Reflective Games. Moscow: SINTEG, 2003. ISBN 5-89638-63-1. Reviewers: Doctor of Technical Sciences, Prof. V.N. Burkov; Doctor of Technical Sciences, Prof. A.V. Shchepkin.

- Minnows frolic freely; that is their joy!
- You are not a fish; how do you know what its joy is?
- You are not me; how do you know what I know and what I do not know?
From a Taoist parable

- The point, of course, venerable archbishop, is that you believe what you believe because you were brought up that way.
- That may be so. But the fact remains that you believe that I believe what I believe because I was brought up that way, for the reason that you were brought up that way.
From D. Myers' book "Social Psychology"

INTRODUCTION

This work is devoted to a discussion of modern approaches to the mathematical modeling of reflection and, first of all, to the introduction of a new class of game-theoretic models - reflexive games, which describe the interaction of subjects making decisions based on a hierarchy of ideas about essential parameters, ideas about those ideas, and so on.

Reflection. One of the fundamental properties of human existence is that, along with natural ("objective") reality, there exists its reflection in consciousness. At the same time, there is an inevitable gap, a discrepancy, between natural reality and its image in consciousness (we will consider this image to be part of a special, reflexive reality). The purposeful study of this phenomenon is traditionally associated with the term "reflection", to which the "Philosophical Dictionary" gives the following definition: "REFLECTION (Latin reflexio - turning back). A term meaning reflection, as well as the study of the cognitive act." The term "reflection" was introduced by J. Locke; in different philosophical systems (J. Locke, G. Leibniz, D. Hume, G. Hegel, etc.) it had different content. A systematic description of reflection from the point of view of psychology began in the 1960s (the school of V.A. Lefebvre). In addition, it should be noted that there is another understanding of reflection, related to "reflex" - the body's response to the stimulation of receptors. This work uses the first (philosophical) definition of reflection.

To clarify the understanding of the essence of reflection, let us first consider a situation with one subject.
He has ideas about natural reality, but he can also be aware of (reflect, reflect) these ideas, as well as be aware of the awareness of these ideas, etc. This is how reflective reality is formed. Reflection of the subject regarding his own ideas about reality, the principles of his activities, etc. called autoreflection or reflection of the first kind. Note that in most humanities research we're talking about , first of all, about autoreflection, which in philosophy is understood as the process of an individual’s reflection on what is happening in his mind. Reflection of the second kind takes place regarding ideas about reality, principles of decision-making, self-reflection, etc. other subjects. Let us give examples of reflection of the second kind, illustrating that in many cases correct one’s own conclusions can be made only if one takes the position of other subjects and analyzes their possible reasoning. The first example is the classic "Dirty Face Game", sometimes called the "wise men and caps problem" or "husbands and unfaithful wives" problem. Let's describe it by following. “Let's imagine that Bob and his niece Alice are in the compartment of a Victorian carriage. Everyone's face is dirty. However, no one blushes with shame, although any Victorian passenger would blush knowing that another person sees him as dirty. From this we conclude that none of the passengers knows that his face is dirty, although everyone sees the dirty face of his companion. At this time, the Conductor looks into the compartment and announces that there is a man with a dirty face in the compartment. After that, Alice blushed. She realized that her face was dirty. But why did she understand this? Didn't the Guide tell her what she already knew? 5 Let's follow Alice's chain of reasoning. Alice: Let's say my face is clean. Then Bob, knowing that one of us is dirty, must conclude that he is dirty and blush. Since he doesn’t blush, it means that my premise about my clean face is false, my face is dirty and I should blush. The conductor added information about Bob's knowledge to the information known to Alice. Before that, she didn't know that Bob knew that one of them was dirty. In short, the conductor’s message turned the knowledge that there was a person with a dirty face in the compartment into general knowledge.” The second textbook example is the “Coordinated Attack Problem”; there are problems close to it on the optimal protocol for information exchange - Electronic Mail Game, etc. (see reviews in). The situation looks like this. There are two divisions on the tops of two hills, and the enemy is located in the valley. Victory can only be won if both divisions attack the enemy at the same time. The general - commander of the first division - sends a messenger to the general - commander of the second division with the message: “We attack at dawn.” Since the messenger can be intercepted by the enemy, the first general needs to wait for a message from the second general that the first message has been received. But since the second message can also be intercepted by the enemy, the second general needs to receive confirmation from the first that he has received confirmation. And so on ad infinitum. The task is to determine after what number of messages (confirmations) it makes sense for the generals to attack the enemy. The conclusion is the following: under the described conditions, a coordinated attack is impossible, and the solution is to use probabilistic models. 
The third classic problem is the “two broker problem” (see also speculation models in). Let's assume that two brokers playing on the stock exchange have their own expert systems that are used to support decision making. It happens that a network administrator illegally copies both expert systems and sells his opponent’s expert system to each broker. After this, the administrator tries to sell each of them the following information: “Your opponent has your expert system.” Then the administrator tries to sell information - “Your opponent knows that you have his expert system,” etc. The question is how should brokers use the information they receive from the administrator, and what information is relevant at which iteration? Having completed our consideration of examples of reflection of the second kind, we will discuss in what situations reflection is essential. If the only reflective subject is an economic agent who seeks to maximize its objective function by choosing one of the ethically permissible actions, then natural reality is included in the objective function as a certain parameter, and the results of reflection (ideas about ideas, etc.) are not arguments of the objective function. Then we can say that self-reflection is “not needed” since it does not change the action chosen by the agent. Note that the dependence of the subject’s actions on reflection can occur in a situation where the actions are ethically unequal, that is, along with the utilitarian aspect, there is a deontological (ethical) aspect - see. However, economic decisions, as a rule, are ethically neutral, so let us consider the interaction of several subjects. If there are several subjects (the decision-making situation is interactive), then the goal function of each subject includes the actions of other subjects, that is, these actions are part of natural reality (although they themselves, of course, are determined by reflexive reality). In this case, reflection (and, therefore, the study of reflexive reality) becomes necessary. Let us consider the main approaches to mathematical modeling of reflection effects. Game theory. Formal (mathematical) models of human behavior have been created and studied for more than a century and a half (see review in) and are increasingly used both in control theory, economics, psychology, sociology, etc., and in solving specific applied problems. The most intensive development has been observed since the 40s of the 20th century - the moment of the emergence of game theory, which is usually dated to 1944 (the publication of the first edition of the book by John von Neumann and Oscar Morgenstern “Game Theory and Economic Behavior”). 7 In this work, by game we mean the interaction of parties whose interests do not coincide (note that another understanding of the game is possible - as “a type of unproductive activity, the motive of which lies not in its results, but in the process itself” - see also, where the concept of play is interpreted much more broadly). Game theory is a branch of applied mathematics that studies decision-making models in conditions of divergent interests of the parties (players), when each party seeks to influence the development of the situation in its own interests. Further, the term “agent” is used to refer to the decision-maker (player). This paper considers non-cooperative static games in normal form, that is, games in which agents choose their actions once, simultaneously and independently. 
Thus, the main task of game theory is to describe the interaction of several agents whose interests do not coincide, and the results of activity (winning, utility, etc.) of each depend, in the general case, on the actions of all. The result of such a description is a forecast of a reasonable outcome of the game - the so-called solution of the game (equilibrium). The description of the game consists of setting the following parameters: - a set of agents; - preferences of agents (dependence of payoffs on actions): it is assumed (and this reflects the purposefulness of behavior) that each agent is interested in maximizing his payoff; - sets of permissible actions of agents; - awareness of agents (the information they have at the time of making decisions about the chosen actions); - order of operation (order of moves – sequence of choice of actions). Relatively speaking, a set of agents determines who participates in the game. Preferences reflect what agents want, sets of acceptable actions - what they can do, awareness - what they know, and order of functioning - when they choose actions. 8 The listed parameters define the game, but they are not sufficient to predict its outcome - the solution of the game (or the equilibrium of the game), that is, a set of actions of agents that are rational and stable from one point of view or another. Today in game theory there is no universal concept of equilibrium - by accepting certain assumptions about the principles of decision-making by agents, one can obtain various solutions. Therefore, the main task of any game-theoretic research (including this work) is to construct an equilibrium. Since reflexive games are defined as such interactive interaction of agents in which they make decisions based on the hierarchy of their ideas, the awareness of agents is essential. Therefore, let us dwell on its qualitative discussion in more detail. The role of awareness. General knowledge. In game theory, philosophy, psychology, distributed systems and other areas of science (see review in), not only agents’ beliefs about essential parameters are important, but also their beliefs about the beliefs of other agents, etc. The set of these ideas is called the hierarchy of beliefs and in this work is modeled by the tree of the information structure of a reflexive game (see section 3.2). In other words, in interactive decision-making situations (modeled in game theory), each agent must predict the behavior of its opponents before choosing its action. To do this, he must have certain ideas about how his opponents see the game. But opponents must do the same, so uncertainty about the game that will be played gives rise to an infinite hierarchy of ideas among the participants in the game. Let's give an example of a hierarchy of views. Let us assume that there are two agents - A and B. Each of them can have their own non-reflective ideas about the uncertain parameter q, which we will further call the state of nature (state of nature, state of the world). Let us denote these representations qA and qB, respectively. But each of the agents, as part of the first-rank reflection process, can think about the opponent’s ideas. We denote these representations (second-order representations) as qAB and qBA, where qAB are agent A’s representations about agent B’s representations, 9 qBA are agent B’s representations about agent A’s representations. 
But this is not limited to this - each of the agents within the framework of the process of further reflection (reflection second rank) may think about what his opponent’s ideas are about his ideas. This is how third-order representations are generated – qABA and qBAB. The process of generating representations of higher orders can continue indefinitely (there are no logical restrictions on increasing the rank of reflection). The set of all representations – qA, qB, qAB, qBA, qABA, qBAB, etc. – forms a hierarchy of representations. A special case of awareness is when all ideas, ideas about ideas, etc. coincide to infinity - is general knowledge. More correctly, the term “common knowledge” was introduced to denote a fact that satisfies the following requirements: 1) it is known to all agents; 2) all agents know 1; 3) all agents know 2, etc. ad infinitum A formal model of general knowledge has been proposed and developed in many works - see. In game theory, this work is practically entirely devoted to models of agents’ awareness—the hierarchy of representations and general knowledge—in game theory, so we will give examples illustrating the role of general knowledge in other areas of science—philosophy, psychology, etc. (see also review). From a philosophical point of view, general knowledge was analyzed in the study of conventions. Consider the following example. The Traffic Rules state that each participant traffic must comply with these rules, and also has the right to expect that other road users comply with them. But other road users also need to make sure that others follow the rules, etc. to infinity. Therefore, the agreement to “obey traffic rules” should be common knowledge. In psychology, there is the concept of discourse - “(from Latin discursus - reasoning, argument) - human verbal thinking mediated by past experience; acts as a process of connected logical 10 reasoning, in which each subsequent thought is conditioned by the previous one.” The role of general knowledge in understanding discourse is illustrated in the following example. Two people leave the cinema. One asks the other: “How did you like the movie?” In order for the other person to understand the question, he must understand that he is being asked about the movie that they just watched together. In addition, he must understand that the first one understands it. The person asking the question, in turn, must be sure that the second one will understand that we are talking about the movie they watched, etc. That is, for adequate interaction (communication), the “film” must be shared knowledge (people must reach an agreement on the use of language). Mutual awareness of agents is also essential in distributed computing systems, artificial intelligence and other areas. In game theory, as a rule, it is assumed that all1 parameters of the game are common knowledge, that is, each agent knows all the parameters of the game, and also that this is known to all agents, etc. to infinity. This assumption corresponds to an objective description of the game and makes it possible to use the concept of Nash equilibrium2 as the predicted outcome of a non-cooperative game (that is, a game in which negotiations between agents are impossible to create coalitions, exchange information, joint actions, redistribute winnings, etc.). Thus, the shared knowledge assumption allows us to say that all agents know what game they are playing and their beliefs about the game are the same. 
Instead of an agent's action, we can consider something more complex - its strategy, that is, the mapping of the information available to the agent into the set of its permissible actions. Examples include: strategies in a multi-stage game, mixed strategies, strategies in Howard meta-games (see also information1 If the original model contains uncertain factors, then uncertainty elimination procedures are used that allow one to obtain a deterministic model. 2 The vector of agents’ actions is a Nash equilibrium if none of them benefits from a unilateral (that is, provided that the remaining agents choose the appropriate components of the equilibrium) deviation from the equilibrium - see the correct definition below. 11 game expansions). However, even in these cases, the rules of the game are common knowledge. Finally, the game can be considered to be randomly selected according to some distribution that is common knowledge - so-called Bayesian games. In the general case, each of the agents can have their own ideas about the parameters of the game, each of which corresponds to some subjective description of the game. In this case, it turns out that the agents participate in the game, but objectively do not know which one, or have different ideas about the game being played - its rules, goals, roles and awareness of opponents, etc. There are no universal approaches to constructing equilibria with insufficient general knowledge in game theory today. On the other hand, within the framework of the “reflective tradition” of the humanities, for each agent the world around him contains (includes) other agents, and ideas about other agents are reflected in the process of reflection (differences in ideas may be due, in particular, to unequal information). However, until now, no constructive formal results have been obtained in this area. Consequently, there is a need to develop and study mathematical models of games in which the awareness of agents is not general knowledge and the agents make decisions based on the hierarchy of their ideas. We will call this class of games reflexive games (the formal definition is given in Section 3.2 of this work). It should be recognized that the term “reflexive games” was introduced by V.A. Lefebvre in 1965 in . However, this work, as well as the work of the same author, contains mainly a qualitative discussion of the effects of reflection in the interaction of subjects, and no general solution concept has been proposed for this class of games. The same remark is also true for , in which a number of special cases of awareness of game participants were considered. Thus, it is relevant to study reflexive games and build a unified concept of equilibrium for them, which motivates the present study. 12 Before moving on to presenting the main content of the work, we will discuss at a qualitative level the main approaches used below. Basic approaches and structure of work. The first chapter, “Information in Decision Making,” which is mainly of an overview and introductory nature, presents models of individual and interactive decision making, analyzes the information required to implement certain well-known equilibrium concepts, and also discusses well-known models of general knowledge and hierarchy of representations. As defined above, a reflexive game is one in which the agents' awareness is not shared knowledge3 and the agents make decisions based on the hierarchy of their beliefs. 
From the point of view of game theory and reflexive decision-making models, it is advisable to separate strategic and information reflection. Information reflection is the process and result of an agent’s thoughts about what the values ​​of uncertain parameters are, what his opponents (other agents) know and think about these values. In this case, there is no actual “game” component, since the agent does not make any decisions. Strategic reflection is the process and result of an agent’s reflection on what decision-making principles his opponents (other agents) use within the framework of the awareness that he attributes to them as a result of information reflection. Thus, information reflection is usually associated with insufficient mutual awareness, and its result is used in decision-making (including strategic reflection). Strategic reflection takes place even in the case of complete information, preceding the agent’s decision on the chosen action. In other words, information and strategic reflection can be studied independently, but in conditions of incomplete and insufficient information, both of them take place. 3 If in the model under consideration awareness is general knowledge, then all the results of the study of reflexive games go into the corresponding classical results of game theory - see below. 13 Strategic reflection is discussed in the second chapter of this work. It turns out that if we assume that the agent, modeling the behavior of opponents, ascribes to them and to himself certain ranks of reflection, then the original game turns into a new game in which the agent’s strategy is to choose a rank of reflection. If we consider the process of reflection in a new game, we get a new game, etc. At the same time, even if in original game the set of possible actions was finite, then in the new game the set of possible actions—the number of different ranks of reflection—is infinite. Consequently, the main task solved when studying strategic reflection is to determine the maximum appropriate rank of reflection. The answer to this question was obtained in the second chapter for bimatrix games (section 2.2) and models that take into account the limited capabilities of humans to process information (section 2.3). Let's give an example of strategic reflection - “Penalty” (see also the examples “Game of Hide and Seek” and “Demolition on the Minor” in section 2.2). The agents are the kicker and the goalkeeper. Let's assume for simplicity that the player has two actions - “shoot into the left corner of the goal” and “shoot into the right corner of the goal.” The goalkeeper also has two actions - “catch the ball in the left corner” and “catch the ball in the right corner”. If the goalkeeper guesses which corner the player is kicking into, then he catches the ball. Let's model the agents' reasoning. Let the goalkeeper know that this player usually shoots into the right corner. Therefore, he needs to catch the ball in the right corner. But, if the goalie knows that the player knows that the goalie knows what the player usually does, then the goalie should model the player's reasoning. He may think like this: “The player knows that I know his usual tactics. So he expects me to catch the ball in the right corner and maybe shoot to the left corner. In this case, I need to catch the ball in the left corner.” If the player has sufficient depth of reflection, then he can guess the goalkeeper's reasoning and try to outsmart him by hitting him in the right corner. 
The goalkeeper can follow the same line of reasoning and, on this basis, catch the ball in the right corner. Both the player and the goalkeeper can increase the depth of reflection indefinitely, conducting reasoning for each other, and neither of them has rational grounds to stop at some final step. Therefore, within the framework of modeling reciprocal 14 reasoning, it is impossible to determine a priori the outcome of the game in question. The game itself, in which each agent has two possible actions, can be replaced by another game in which the agents choose reflection ranks assigned to their opponent. But in this game there is no reasonable solution, since each agent can model the behavior of the opponent, considering a “doubly reflexive” game, etc. to infinity. The only thing that can be done to help the agents in the situation under consideration is to limit the depth of their reflection, noting that starting from the second rank of reflection (due to the finiteness of the initial set of possible actions), the situation begins to repeat itself - being both at zero and at the second (and, in general, at any even level of reflexion, the player will hit the right corner. Consequently, the goalkeeper is left to guess the parity of the player’s level of reflection. The maximum rank of reflection that an agent should have in order to cover the entire variety of game outcomes (by overlooking some of the opponent’s strategies, the agent risks reducing his winnings) will be called the maximum appropriate rank of reflection. It turns out that in many cases this rank is finite - the corresponding formal results are given in sections 2.2 and 3.6). In the “Penalty” example, the maximum reasonable rank of agents’ reflection is two. If the goalkeeper does not have information about where the attacker usually kicks, the latter’s actions are symmetrical (the left and right angles are “equal”). However, there remain opportunities to artificially introduce asymmetry in order to try to take advantage of it for one’s own purposes. For example, the goalkeeper can move towards one of the corners, as if inviting the attacker to hit the other (and rushes precisely to that “far” corner). A more complex strategy is as follows. A player from the goalkeeper’s team approaches him and shows where the attacker is going to shoot, and does it in such a way that the attacker sees it (after which, at the moment of impact, the goalkeeper catches the ball not in the corner that his teammate pointedly showed him, but in the opposite one) . Note that both described techniques were taken “from life” and turned out to be successful. The first took place in an international match of the USSR national team, the second - in the final of the USSR Football Cup in a penalty shootout. 15 The third chapter is devoted to the study of formal models of information reflection. Since the key factor in reflexive games is the awareness of agents - the hierarchy of representations, then for its formal description the concept of an information structure is introduced - a tree (in the general case - infinite), the vertices of which correspond to information (representations) of agents about essential parameters, representations of other agents, etc. .d. (see example view hierarchy above). The concept of the structure of awareness (information structure) allows us to give a formal definition of some intuitively clear concepts, such as: adequate awareness of one agent about another, mutual awareness, equal awareness, etc. 
One of key concepts , used in this work to analyze reflexive games, is the concept of a phantom agent. Let us discuss it at a qualitative level (postponing the strict mathematical definition until Section 3.2). Let two agents interact in a certain situation - A and B. It is quite natural that in the minds of each of them there is a certain image of the other: A has an image of B (let's call it AB), and B has an image of A (let's call it BA). These images may coincide with reality, or may differ from it. In other words, an agent, for example, A, may have an adequate idea of ​​B (this fact can be written in the form of the identity AB = B), or may not have it. Here the question immediately arises: can the identity AB = B be satisfied in principle, since B is a real agent, and AB is only his image? Without going into a discussion of this essentially philosophical question, we note the following two circumstances. Firstly, we are not talking about a complete understanding of personality in its entirety, but about its modeling in a given specific situation. At the ordinary, everyday level of human communication, we are constantly faced with situations of both adequate and inadequate perception by one person of another. Secondly, within the framework of formal (game-theoretic) modeling of human behavior, an agent – ​​a participant in a situation – is described by a relatively small set of characteristics. And these characteristics can be fully known to another agent to the same extent that they are known to the researcher. 16 Let us consider in more detail the case when there is a difference between B and AB (this difference may stem, formally speaking, from the incompleteness of A’s information about B, or from trust in false information). Then A, when deciding on any of his actions, does not mean B, but the image of him that he has, that is, AB. We can say that subjectively A interacts with AB. Therefore, AB can be called a phantom agent. It does not exist in reality, but it is present in the consciousness of the real agent A and, accordingly, influences his actions, that is, reality. Let's give a simple example. Let A believe that he and B are friends, and B, knowing this, is A’s enemy (this situation can be described by the word “betrayal”). Then, obviously, in the situation there is a phantom agent AB, who can be described as follows: “B, who is a friend of A”; in reality there is no such subject. Note that in this case, B is adequately informed about A, that is, BA = A. Thus, in addition to real agents actually participating in the game, it is proposed to consider phantom agents, that is, agents that exist in the minds of real and other phantom agents. Real and phantom agents, as part of their reflection, endow phantom agents with a certain awareness, which is reflected in the information structure. There can be an infinite number of real and phantom agents participating in the game, which means the potential infinity of the implementation of acts of reflexive reflection (infinite depth of the information structure tree). Indeed, even in the simplest situation, it is possible to endlessly develop reasoning of the form “I know...”, “I know that you know...”, “I know that you know that I know...”, “I know that you know that I know that you know...”, etc. However, in practice, such “bad infinity” does not take place, since starting from a certain moment, ideas “stabilize”, and increasing the rank of reflection does not give anything new. 
Thus, in real situations, the structure of awareness has finite complexity: the corresponding tree has a finite number of pairwise different subtrees. In other words, a finite number of real and phantom agents4 participate in the game. The introduction of the concept of phantom agents allows us to define a reflexive game as a game of real and phantom agents, and also to define information equilibrium as a generalization of the Nash equilibrium for the case of a reflexive game, within which it is assumed that each agent (real and phantom) when calculating its subjective equilibrium (equilibrium in the game that he plays from his point of view) uses his existing hierarchy of ideas about objective and reflexive reality. A convenient tool for studying information equilibrium is the reflexive game graph, in which the vertices correspond to real and phantom agents, and each agent vertex includes arcs (their number is one less than the number of real agents) coming from the agent vertices, on whose actions the winnings in subjective equilibrium depend this agent. A reflexive game graph can be constructed without specifying the agents’ target functions. At the same time, it reflects, if not the quantitative relationship of interests, then the qualitative relationship of the awareness of reflecting agents, and is a convenient and expressive means of describing the effects of reflection (see section 3.4). For the example of two agents described above, the reflexive game graph has the form: B ¬ A « AB – real agent B (traitor) is adequately informed about agent A, who interacts with the phantom agent AB (B, who is A’s friend). Let us give another example of a graph that reflects reflexive interaction (although it is not formally a reflexive game graph in the sense of the definition introduced above). The cover of this book features the painting “Death's Head” by E. Burne-Jones, painted in 1886-1887. based on the myth of Perseus and Andromeda. There are three real agents involved in the situation: Perseus (let’s denote him by the letter P), Andromeda (A) and the gorgon Medusa (M). In addition, 4 In the limiting case - when there is general knowledge - the phantom agent of the first level coincides with its real prototype and the tree has unit depth (more precisely, all other subtrees repeat trees of a higher level). 18 there are the following “phantom” agents: reflection of Perseus (OP), reflection of Andromeda (OA) and reflection of Medusa (OM). The graph is shown in Figure 1. M P A OP OA OM Fig. 1. Graph of the painting “Deadly Head” by E. Burne-Jones (see cover) 19 The awareness of real agents in the example under consideration is as follows: Perseus sees Andromeda; Andromeda does not see Perseus, but sees his reflection, her reflection and the reflection of the gorgon Medusa; the reflection of Perseus sees the reflection of Andromeda; Andromeda's reflection sees all the real agents. Fortunately, none of the real agents sees the gorgon Medusa herself. The introduction of an information structure, information equilibrium and a reflexive game graph, firstly, allows us to describe and analyze various situations of collective decision-making by agents with different information from a unified methodological position and using a unified mathematical apparatus, to study the influence of reflexive ranks on the winnings of agents, to study the conditions existence and feasibility of information equilibria, etc. Numerous examples of application models are given below. 
Secondly, the proposed model of a reflexive game makes it possible to study the influence of reflexive ranks (depth of information structure) on the payoffs of agents. The results obtained in sections 2.2, 3.5 and 3.6 of this work indicate that, with minimal assumptions, it is possible to show the limitation of the maximum expedient rank of reflection. In other words, in many cases an unlimited increase in the rank of reflection is inappropriate from the point of view of the agents' payoffs. Thirdly, the presence of a reflexive game model allows us to determine the conditions of existence and properties of information equilibrium, as well as constructively and correctly formulate the problem of reflexive control, which consists in the search by the governing body of such an information structure that the information equilibrium realized in it is most beneficial from its point of view. The problem of reflexive control is posed and solved for a number of cases in Section 3. 7. The theoretical results of its solution are used in a number of applied models given in the fourth chapter - hidden control, information control through the media, etc. And, finally, fourthly, the language of reflexive games (information structures, reflexive game graphs, etc.) is convenient to describe the effects of reflection as in psychology (as illustrated by the example chess game , transactional analysis, 20 models of ethical choice, etc.), and in works of art - see the fourth chapter of this work. Having completed a qualitative review of the content of the work, we note that several approaches can be proposed to familiarize yourself with the material in this book. The first is linear, consisting in sequential reading of all four chapters. The second is intended for the reader more interested in formal models, and consists of reading the second and third chapters and briefly familiarizing themselves with the examples in the fourth chapter. The third is aimed at the reader who does not want to delve into mathematical subtleties, and consists of reading the introduction, chapter four and conclusion. CHAPTER 1. INFORMATION IN DECISION MAKING The first chapter of this work provides a model of individual decision making (section 1.1), reviews the basic concepts of solving non-cooperative games, discusses the assumptions used in these concepts about the awareness and mutual awareness of agents (section 1.2), and analyzes well-known models awareness and general knowledge (section 1.3). 1.1. INDIVIDUAL DECISION MAKING: A MODEL OF RATIONAL BEHAVIOR Let us describe, following , a model of decision making by a single agent. Let an agent be able to choose some action x from the set X of admissible actions. As a result of choosing the action x Î X, the agent receives a payoff f(x), where f: X ® Â1 is a real-valued objective function reflecting the agent’s preferences. Let us accept the hypothesis of rational behavior, which consists in the fact that the agent, taking into account all the information available to him, chooses the actions that are most preferable from the point of view of the values ​​of his objective function (this hypothesis is not the only possible one - see, for example, the concept of bounded rationality). According to the hypothesis of rational behavior, the agent chooses an alternative from a set of “best” alternatives. In the case under consideration, this set is the set of alternatives that achieve the maximum of the objective function. 
Consequently, the agent's choice of action is determined by the rule of individual rational choice P(f, X) ⊆ X, which singles out the set of actions that are most preferable from the agent's point of view (whenever maximums and minimums are used, it is assumed that they are attained): P(f, X) = Arg max_{x ∈ X} f(x).
Let us complicate the model: assume that the agent's payoff is determined not only by his own action, but also by the value of an uncertain parameter θ ∈ Ω, the state of nature. That is, as a result of choosing the action x ∈ X and of the realization of the state of nature θ ∈ Ω, the agent obtains the payoff f(θ, x), where f: Ω × X → ℝ¹. If the agent's payoff depends, besides his own action, on an uncertain parameter (the state of nature), then in general there is no unambiguously "best" action: when deciding which action to choose, the agent has to "predict" the state of nature. We therefore introduce the hypothesis of determinism, which states that the agent strives to eliminate the existing uncertainty, taking into account all the information available to him, and to make his decision under complete information (in other words, the final criterion that guides the decision-making agent must contain no uncertain parameters). That is, in accordance with the hypothesis of determinism, the agent must eliminate the uncertainty with respect to the parameters that do not depend on him (possibly by adopting certain assumptions about their values). Depending on the information I that the agent has about the uncertain parameters, one distinguishes:
- interval uncertainty (when only the set Ω of possible values of the uncertain parameters is known);
- probabilistic uncertainty (when, in addition to the set Ω of possible values of the uncertain parameters, their probability distribution p(θ) is known);
- fuzzy uncertainty (when, in addition to the set Ω of possible values of the uncertain parameters, a membership function on their values is known).
This work considers the simplest, "point", case, in which the agents hold beliefs about the specific value of the state of nature. The possibility of generalizing the results obtained to the case of interval or probabilistic uncertainty is discussed in the conclusion. Let us introduce the following assumption about the uncertainty-elimination procedures used by the agent: interval uncertainty is eliminated by computing the maximum guaranteed result (MGR), probabilistic uncertainty by computing the expected value of the objective function, and fuzzy uncertainty by constructing the set of maximally non-dominated alternatives. (The introduced assumptions are not the only ones possible: using other assumptions, for example replacing the MGR hypothesis with the optimism hypothesis or with a "weighted optimism-pessimism" hypothesis, leads to other solution concepts, but the process of obtaining them follows the general scheme implemented below.) Let us denote by f →_I f̂ the uncertainty-elimination procedure, that is, the transition from the objective function f(θ, x) to an objective function f̂(x) that no longer depends on the uncertain parameters. In accordance with the introduced assumption, in the case of interval uncertainty f̂(x) = min_{θ ∈ Ω} f(θ, x), in the case of probabilistic uncertainty f̂(x) = ∫_Ω f(x, θ) p(θ) dθ, and so on. Having eliminated the uncertainty, we obtain a deterministic model, that is, the rule of individual rational choice takes the form P(f, X, I) = Arg max_{x ∈ X} f̂(x),
where I is the information used by the agent in eliminating the uncertainty f →_I f̂. So far we have considered individual decision-making. Let us now turn to game uncertainty, for which the essential element is the agent's assumptions about the set of possible values of the game environment (the actions of the other agents, chosen by them within the framework of certain principles of behavior that are not exactly known to the agent in question).

1.2. INTERACTIVE DECISION MAKING: GAMES AND EQUILIBRIA

Game model. To describe the collective behavior of agents it is not enough to specify their preferences and the rules of individual rational choice separately. As noted above, when there is a single agent in the system, the hypothesis of his rational (individual) behavior assumes that the agent behaves so as to maximize the value of his objective function by his choice of action. When there are several agents, their mutual influence must be taken into account: a game arises, that is, an interaction in which the payoff of each agent depends both on his own action and on the actions of the other agents. If, by virtue of the hypothesis of rational behavior, each agent strives to maximize his objective function by choosing an action, then it is clear that with several agents the individually rational action of each of them depends on the actions of the others. (Game-theoretic models assume that the rationality of the players, that is, their adherence to the hypothesis of rational behavior, is common knowledge; this assumption is also accepted in this work.) Let us consider a game-theoretic model of the interaction of n agents. Each agent chooses an action x_i belonging to an admissible set X_i, i ∈ N = {1, 2, …, n}, the set of agents. The agents choose their actions once, simultaneously and independently. The payoff of the i-th agent depends on his own action x_i ∈ X_i, on the vector of actions x_{-i} = (x_1, x_2, …, x_{i-1}, x_{i+1}, …, x_n) ∈ X_{-i} = ∏_{j ∈ N\{i}} X_j of his opponents, and on the state of nature θ ∈ Ω (the state of nature may, in particular, be a vector whose components reflect individual characteristics of the agents), and is described by a real-valued payoff function f_i = f_i(θ, x), where x = (x_i, x_{-i}) = (x_1, x_2, …, x_n) ∈ X′ = ∏_{j ∈ N} X_j is the vector of actions of all agents. For a fixed value of the state of nature, the collection Γ = (N, (X_i)_{i ∈ N}, (f_i(⋅))_{i ∈ N}) of the set of agents, the sets of their admissible actions and their objective functions is called a game in normal form. A solution of the game (an equilibrium) is a set of action vectors that are stable in one sense or another. By virtue of the hypothesis of rational behavior, each agent will strive to choose the actions that are best for him (from the point of view of the value of his objective function) in the situation at hand. For him, the situation is the combination of the game environment x_{-i} ∈ X_{-i} and the state of nature θ ∈ Ω. Consequently, the principle of deciding which action to choose can be written as follows (BR denotes the best response):
(1) BR_i(θ, x_{-i}) = Arg max_{x_i ∈ X_i} f_i(θ, x_i, x_{-i}), i ∈ N.
Let us consider the possible principles of decision-making by the agents, each of which generates a corresponding equilibrium concept, that is, determines in what sense the predicted outcome of the game should be stable. Along the way we will discuss the awareness necessary for realizing each equilibrium. Equilibrium in dominant strategies.
If for some agent the set (1) does not depend on the situation, then it constitutes the set of his dominant strategies (a profile of dominant strategies of the agents is called an equilibrium in dominant strategies, EDS). If each agent has a dominant strategy, then the agents can make their decisions independently, that is, choose their actions without having any information about the situation and without making any assumptions about it. Unfortunately, an EDS does not exist in every game. For the agents to realize an equilibrium in dominant strategies, when it exists, it is sufficient that each of them know only his own objective function and the admissible sets X′ and Ω.
Guaranteeing equilibrium. The same awareness is sufficient for the agents to realize a guaranteeing (maximin) equilibrium, which exists in practically all games:
(2) x_i^g ∈ Arg max_{x_i ∈ X_i} min_{x_{-i} ∈ X_{-i}} min_{θ ∈ Ω} f_i(θ, x_i, x_{-i}), i ∈ N.
If for at least one agent the set (1) does depend on the situation (that is, no EDS exists), the situation is more complicated. Let us examine the corresponding cases.
Nash equilibrium. Define the set-valued mapping
(3) BR(θ, x) = (BR_1(θ, x_{-1}); BR_2(θ, x_{-2}); …; BR_n(θ, x_{-n})).
A Nash equilibrium for the state of nature θ (more precisely, a parametric Nash equilibrium) is a point x*(θ) ∈ X′ satisfying the condition
(4) x*(θ) ∈ BR(θ, x*(θ)).
The inclusion (4) can also be written in the form: ∀ i ∈ N, ∀ y_i ∈ X_i: f_i(θ, x*(θ)) ≥ f_i(θ, y_i, x*_{-i}(θ)).
The set E_N(θ) of all points of the form (4) can be described as follows:
(5) E_N(θ) = {x ∈ X′ | x_i ∈ BR_i(θ, x_{-i}), i ∈ N}.
For the case of two agents, an alternative, equivalent way of defining the set E_N(θ) is to specify it as the set of pairs of points (x_1*(θ), x_2*(θ)) that simultaneously satisfy the relations
(6) x_1*(θ) ∈ BR_1(θ, BR_2(θ, BR_1(θ, … BR_2(θ, x_2*(θ)) …))),
(7) x_2*(θ) ∈ BR_2(θ, BR_1(θ, BR_2(θ, … BR_1(θ, x_1*(θ)) …))).
Let us consider what information the agents must possess in order to realize the Nash equilibrium by choosing their actions simultaneously and independently. By definition, the Nash equilibrium is a point from which a unilateral deviation is unprofitable for any agent (provided that the remaining agents choose the corresponding components of the Nash equilibrium action vector). If the agents choose their actions repeatedly, then the Nash point is stable in a certain sense (see the detailed discussion in the cited literature) and can be considered realizable when, as in the case of the EDS, each agent knows only his own objective function and the admissible sets X′ and Ω (in this case, however, additional assumptions must be introduced about the principles by which the agents choose their actions depending on the history of the game). In this work consideration is limited to one-step games; when the agents choose their actions once, knowledge of only their own objective functions and the sets X′ and Ω is no longer sufficient for realizing the Nash equilibrium. We therefore introduce the following assumption, which will be considered to hold throughout the subsequent presentation: information about the game Γ, the set Ω and the rationality of the agents is common knowledge.
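To make the notions just introduced more tangible, the following sketch computes the best responses (1) and the parametric Nash equilibria (4)-(5) for a small finite game. Everything in it (the action sets, the set of states of nature and the payoff function) is an illustrative placeholder rather than an example taken from this work.

import numpy as np
from itertools import product

# Sketch of the best response (1) and of the parametric Nash equilibrium (4)-(5)
# for a finite two-agent game with a finite set of states of nature.
# The payoff function and the sets below are illustrative assumptions.

X = [range(3), range(3)]          # admissible action sets X_1, X_2
Theta = [0.5, 1.0, 1.5]           # possible states of nature

def f(i, theta, x):
    """Illustrative payoff of agent i at the action profile x = (x_1, x_2)."""
    return -(x[i] - theta * x[1 - i]) ** 2 + x[i]

def best_response(i, theta, x_opp):
    """BR_i(theta, x_{-i}) from (1): actions maximizing f_i for fixed x_{-i}."""
    values = {a: f(i, theta, (a, x_opp) if i == 0 else (x_opp, a)) for a in X[i]}
    best = max(values.values())
    return {a for a, v in values.items() if v == best}

def nash_equilibria(theta):
    """E_N(theta) from (5): profiles where each component is a best response."""
    return [x for x in product(*X)
            if all(x[i] in best_response(i, theta, x[1 - i]) for i in range(2))]

for theta in Theta:
    print(theta, nash_equilibria(theta))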
In substance, the introduced assumption means that each agent is rational, knows the set of participants in the game, the objective functions and the admissible sets of all agents, and also knows the set of possible values of the state of nature. In addition, he knows that the other agents know this, and that they know that he knows it, and so on to infinity (see above). Such awareness can, in particular, be achieved by a public announcement of the relevant information (that is, an announcement made simultaneously to all agents gathered together), which makes it possible for all agents to attain an infinite rank of information reflection. Note that the introduced assumption says nothing about the agents' awareness of the specific value of the state of nature. If the value of the state of nature is common knowledge, then this is sufficient for realizing the Nash equilibrium. To substantiate this claim, let us use a two-person game to simulate the reasoning of the first agent (the second agent reasons in exactly the same way, and his reasoning will be considered separately only where it differs from that of the first agent). He reasons as follows (see expression (6)): "My action, by virtue of (1), should be the best response to the action of the second agent under the given state of nature. Therefore I need to model his behavior. About him I know (by virtue of the assumption that the objective functions and admissible sets are common knowledge) that he will act within the framework of (1), that is, he will look for the best response to my action for the given state of nature (see (7)). For this he needs to model my actions, and he will (again, by virtue of the introduced assumptions that the objective functions and admissible sets are common knowledge) reason in the same way as I do, and so on ad infinitum (see (6))." In game theory a successful physical analogy with reflection in mirrors is used for such reasoning (see, for example, the cited literature). Thus, for realizing the Nash equilibrium it is sufficient that all parameters of the game, as well as the value of the state of nature, be common knowledge (a weakening of this assumption is considered in the literature). The reflexive games considered in this work are characterized by the fact that the value of the state of nature is not common knowledge, and in the general case each agent has his own beliefs about this value, about the beliefs of the other agents, and so on.
Subjective equilibrium. The types of equilibrium considered above are special cases of subjective equilibrium, which is defined as a vector of the agents' actions each component of which is the best response of the corresponding agent to the game environment that, from his subjective point of view, may be realized. Let us consider the possible cases. Suppose that the i-th agent expects the realization of the game environment x^B_{-i} ("B" stands for beliefs; the terms "conjecture" and "guess" are also sometimes used) and of the state of nature θ^B_i; then he will choose
(8) x_i^B ∈ BR_i(θ^B_i, x^B_{-i}), i ∈ N.
The vector x^B is a point of subjective equilibrium. Note that this definition of "equilibrium" does not require that the agents' assumptions about the actions of their opponents be valid, that is, it may turn out that ∃ i ∈ N: x^B_{-i} ≠ (x^B_j)_{j ≠ i}. A justified subjective equilibrium, that is, one for which x^B_{-i} = (x^B_j)_{j ≠ i}, i ∈ N, is a Nash equilibrium (for this it is sufficient, in particular, that all parameters of the game be common knowledge and that each agent, in constructing x^B_{-i}, model the rational behavior of his opponents).
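A point subjective equilibrium (8) is easy to compute once the agents' conjectures are fixed. The following minimal sketch does this for two agents; the action sets, the payoff function, the beliefs about the state of nature and the conjectured opponent actions are all hypothetical assumptions introduced only for illustration.

# Sketch of a point subjective equilibrium (8): each agent best-responds to his
# own conjectures about the state of nature and about the opponent's action.
# Beliefs and payoffs below are illustrative assumptions, not from the book.

X = [list(range(4)), list(range(4))]   # admissible actions of the two agents

def f(i, theta, x_i, x_opp):
    """Illustrative payoff of agent i."""
    return -(x_i - theta - x_opp) ** 2

def best_response(i, theta, x_opp):
    return max(X[i], key=lambda a: f(i, theta, a, x_opp))

# Agent 1 believes theta = 1 and conjectures that agent 2 will play 0;
# agent 2 believes theta = 2 and conjectures that agent 1 will play 3.
beliefs_theta = [1, 2]
conjectured_opp_action = [0, 3]

xB = [best_response(i, beliefs_theta[i], conjectured_opp_action[i]) for i in (0, 1)]
print(xB)  # the point of subjective equilibrium

In this run the conjectures are not justified (each agent's guess about the opponent differs from what the opponent actually chooses), so the resulting point is a subjective equilibrium but not a Nash equilibrium.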
In the special case when the best response of each agent does not depend on his assumptions about the situation, the subjective equilibrium is an equilibrium in dominant strategies. In a more general case, the i-th agent may count on his opponents choosing actions from a set X_{-i}^B ⊆ X_{-i} and on the realization of a state of nature from a set Ω_i ⊆ Ω, i ∈ N. Then his best response is a guaranteeing subjective equilibrium:
(9) x_i^g(X_{-i}^B, Ω_i) ∈ Arg max_{x_i ∈ X_i} min_{x_{-i} ∈ X_{-i}^B} min_{θ ∈ Ω_i} f_i(θ, x_i, x_{-i}), i ∈ N.
If X_{-i}^B = X_{-i} and Ω_i = Ω, i ∈ N, then x_i^g(X_{-i}^B, Ω_i) = x_i^g, i ∈ N, that is, the guaranteeing subjective equilibrium is the "classical" guaranteeing equilibrium. A variety of guaranteeing subjective equilibrium is the P-equilibrium, described in detail in the cited literature. In an even more general case, the best response of the i-th agent may be taken to be the probability distribution π_i(x_i), with π_i(⋅) ∈ Δ(X_i), the set of all probability distributions on X_i, that maximizes the agent's expected payoff given his beliefs μ_i(x_{-i}) ∈ Δ(X_{-i}) about the distribution of the actions chosen by the other agents and his beliefs q_i(θ) ∈ Δ(Ω) about the distribution of the state of nature (we obtain the Bayesian decision-making principle):
(10) π_i(μ_i(⋅), q_i(⋅), ⋅) = arg max_{π_i ∈ Δ(X_i)} ∫_{X′ × Ω} f_i(θ, x_i, x_{-i}) π_i(x_i) q_i(θ) μ_i(x_{-i}) dθ dx, i ∈ N.
Thus, for implementing a subjective equilibrium the agents need only minimal awareness: each of them must know his own objective function f_i(⋅) and the admissible sets Ω and X′. However, with such awareness the set of the agents' assumptions about the state of nature and about the behavior of their opponents may turn out to be inconsistent. To achieve consistency, that is, for the assumptions to be justified, additional assumptions about the mutual awareness of the agents are necessary. The strongest such assumption is the assumption of common knowledge, which turns a point subjective equilibrium into a Nash equilibrium and a collection of Bayesian decision principles into a Bayes-Nash equilibrium.
Bayes-Nash equilibrium. If the game involves incomplete information (see the cited literature), then the corresponding Bayesian game is described by the following collection:
- the set N of agents;
- the set K′ = ∏_{i ∈ N} K_i of possible vectors of the agents' types, where the type of the i-th agent is k_i ∈ K_i, i ∈ N, and the type vector is k = (k_1, k_2, …, k_n) ∈ K′;
- the set X′ = ∏_{i ∈ N} X_i of admissible vectors of the agents' actions;
- the set of utility functions u_i: K′ × X′ → ℝ¹;
- the beliefs μ_i(⋅|k_i) ∈ Δ(K_{-i}), i ∈ N, of the agents.
A Bayes-Nash equilibrium in a game with incomplete information is defined as a collection of the agents' strategies of the form s_i: K_i → X_i, i ∈ N, that maximize the corresponding expected utilities
(11) U_i(k_i, s_i(⋅), s_{-i}(⋅)) = ∫_{K_{-i}} u_i(k, s_i(k_i), s_{-i}(k_{-i})) μ_i(k_{-i}|k_i) dk_{-i}, i ∈ N, where K_{-i} = ∏_{j ≠ i} K_j.
In Bayesian games it is usually assumed that the beliefs (μ_i(⋅|⋅))_{i ∈ N} are common knowledge. For this it is sufficient, in particular, that they be consistent, that is, derived by each agent via Bayes' rule from a distribution μ(k) ∈ Δ(K′) that is itself common knowledge. For Bayesian games in which (μ_i(⋅|⋅))_{i ∈ N} is common knowledge, the concept of rationalizable strategies is introduced: sets D_i ⊆ Δ(X_i), i ∈ N, such that D_i ⊆ BR_i(D_{-i}), i ∈ N. In two-person games, the set of rationalizable strategies coincides with the set of strategies obtained by iterative elimination of strictly dominated strategies.
Generalization of rationalizable strategies to the case of maximin 9 Recall that a strategy of an agent is called strictly dominated such that there is another strategy of his that, in any situation, provides this agent with strictly bigger win . Iterative elimination of strictly dominated strategies consists of their sequential (in general infinite) elimination from the set of agent strategies under consideration, which leads to finding the “weakest” solution to the game - the set of non-dominated strategies. 30 (guaranteing) equilibrium is carried out in . It is possible to complicate the constructions of subjective equilibrium by introducing prohibitions on certain combinations of actions of agents, etc. Thus, the implementation of the RDS, guarantee and subjective equilibria (if they exist) requires that each agent has, at a minimum, information about its objective function and all admissible sets, and the implementation of the Nash equilibrium, if it exists, additionally requires that the values ​​of all essential parameters were common knowledge. Let us note once again that the realizability of the Nash equilibrium implies the ability of agents (and the governing body - the center, or the operations researcher, if they have the appropriate information) to a priori and independently calculate the Nash equilibrium and, in a one-step game, immediately choose Nash equilibrium actions (in this case, a separate question is in which of the equilibria the agents and the center will choose if there are several Nash equilibria). Qualitatively, general knowledge is necessary so that each of the agents (and the center) can model the principles of decision-making by other agents, including those taking into account its own principles of decision-making, etc. Therefore, we can conclude that the concept of solving a game is closely related to the awareness of agents. Decision concepts such as RDS and Nash equilibrium are, in a sense, limiting cases - the first requires minimal information, the second requires an infinite rank of information reflection of all agents. Therefore, below we will describe other (“intermediate”) cases of agents’ awareness—of a hierarchy of representations—and construct game solutions corresponding to them. Before implementing this program, we will review well-known models of general knowledge and hierarchy of representations. 1.3. GENERAL APPROACHES TO DESCRIBING INFORMATION In the equilibrium concepts discussed in the previous section (with the possible exception of Nash and Bayes-Nash equilibria, which assume the presence of general knowledge), reflection is absent31, since each agent does not try to take the position of its opponents. Reflection occurs in the case when an agent has and uses a hierarchy of representations when making decisions - its representations about the representations of other agents, their representations about its representations and the representations of each other, etc. The analysis of ideas about uncertain factors corresponds to information reflection, and the analysis of ideas about the principles of decision-making corresponds to strategic reflection. In terms of subjective equilibrium, strategic reflection corresponds to the agent’s assumptions that the opponent will calculate this or that specific, for example, subjective guaranteeing, equilibrium, and information reflection corresponds to what specific assumptions about the situation the opponent will use. 
Let us consider currently known10 approaches to describing the hierarchy of representations and general knowledge. As noted in, there are two approaches to describing awareness - syntactic and semantic (recall that “syntactics is the syntax of sign systems, that is, the structure of the combination of signs and the rules of their formation and transformation, regardless of their meanings and functions of sign systems”, “semantics - studies sign systems as a means of expressing meaning; its main subject is the interpretation of signs and symbol combinations." The foundations of these approaches were laid in mathematical logic. In the syntactic approach, the hierarchy of representations is described explicitly. If representations are specified by a probability distribution, then the hierarchy of representations at a certain level of the hierarchy corresponds to distributions on the product of a set of states of nature and distributions reflecting representations of previous levels. An alternative is to use “formulas” (in the logical sense), that is, rules for transforming elements of the original set based on the application of logical 10 It should be noted that representation hierarchies and general knowledge have become the subject of research in game theory quite recently - the pioneering books are the above-mentioned books by D. Lewis (1969) and article by R. Aumann (1976). An analysis of the chronology of publications (see bibliography) indicates a growing interest in this problem area. 32 operations and operators of the form “player i believes that the probability of the event ... is not less than a.” In this case, knowledge is modeled by sentences (formulas) constructed in accordance with certain syntactic rules. Within the semantic approach, agents' representations are specified by probability distributions on a set of states of nature. The hierarchy of representations is generated based only on these distributions. In the simplest deterministic case, knowledge is represented by a set W of possible values ​​of an uncertain parameter and partitions (Ri)i О N of this set. The partition element Ri, including q О W, represents the knowledge of the ith agent - a set of values ​​of an uncertain parameter that are indistinguishable from his point of view given the known fact q. The correspondence (relatively speaking, “equivalence”) between the syntactic and semantic approaches is established in. Of particular note are the experimental studies of representation hierarchies in - see review in. Conducted short review indicates that there are two “extremes”. The first “extreme” is general knowledge (the merit of J. Harsanyi is that he reduced all information about the agent that influences his behavior to his only characteristic - type - and built an equilibrium (Bayes-Nash) within the framework of the hypothesis that the probability distribution of types is common knowledge). The second “extreme” is an endless hierarchy of consistent or inconsistent ideas. An example of the latter is the construction given in, which, on the one hand, describes all possible Bayesian games and all possible hierarchies of representations, and, on the other hand, (due to its generality) is so cumbersome that it does not allow constructively posing and solving specific problems. Most studies of awareness are devoted to answering the question: in what cases the hierarchy of agents’ beliefs describes general knowledge and/or adequately reflects the awareness of agents. 
The dependence of the game solution on a finite hierarchy of consistent or inconsistent beliefs of the agents (that is, on the entire range between the two "extremes" noted above) has remained practically unstudied. The exceptions include, firstly, the work in which Bayes-Nash equilibria were constructed for three-level hierarchies of inconsistent probabilistic beliefs of two agents under the assumption that at the lowest level of the hierarchy the beliefs coincide with those of the previous level (see also assumptions of type Pm and the corresponding equilibria in the cited literature). Secondly, the third chapter of this work, which describes arbitrary (finite or infinite, consistent or inconsistent) hierarchies of "point" beliefs, for which an information equilibrium, the equilibrium of a reflexive game, is constructed and studied (the possibility and expediency of generalizing the results obtained to the case of interval or probabilistic beliefs of the agents are discussed in the conclusion). Thus, both the study of strategic reflection (Chapter 2 of this work) and the construction of a solution of a reflexive game, together with the study of the dependence of this equilibrium on the hierarchy of the agents' beliefs (Chapter 3 of this work), are relevant.

CHAPTER 2. STRATEGIC REFLECTION

This chapter examines game-theoretic models of strategic reflection. Section 2.1 studies a model of strategic reflection in a two-person game, which in Section 2.2 makes it possible to solve the problem of the maximum expedient rank of strategic reflection in bimatrix games. Section 2.3 is devoted to a discussion of the finiteness of the reflection rank caused by the limited ability of a person to process information.

2.1. STRATEGIC REFLECTION IN TWO-PERSON GAMES

Let us consider, in order of increasing awareness, reflexive decision-making models in two-person games.
Zero rank of reflection. Consider the problem of an agent making a decision in the complete absence of information about the state of nature (recall that the assumption that the objective functions and admissible sets are common knowledge is considered to hold). On the one hand, the principle of decision-making based on the maximum guaranteed result seems reasonable: according to it, the i-th agent chooses the strategy that guarantees his payoff over the state of nature and over the action of the opponent:
(12) x_i^{g,1} = arg max_{x_i ∈ X_i} min_{θ ∈ Ω} min_{x_{-i} ∈ X_{-i}} f_i(θ, x_i, x_{-i}).
On the other hand, the decision-making principle (12) is not the only one possible: the agent may count on his opponent choosing not the worst action for him, but the opponent's own guaranteeing strategy (note that each agent can compute the opponent's guaranteeing strategy). Then the best response is
(13) x_i^{g,2} = arg max_{x_i ∈ X_i} min_{θ ∈ Ω} f_i(θ, x_i, x_{-i}^{g,1}).
But the opponent of the agent in question may reason in a similar way. If the agent in question allows for this possibility, then his guaranteeing strategy becomes
(14) x_i^{g,3} = arg max_{x_i ∈ X_i} min_{θ ∈ Ω} f_i(θ, x_i, x_{-i}^{g,2}),
where x_{-i}^{g,2} is computed in accordance with (13) with the index "i" replaced by "-i" and vice versa. The chain of increasing the "rank of reflection" (the agent's assumption about the opponent's rank of reflection) can be continued further (see the analogies in the dynamic models discussed in the cited literature), defining recursively
(15) x_i^{g,k} = arg max_{x_i ∈ X_i} min_{θ ∈ Ω} f_i(θ, x_i, x_{-i}^{g,k-1}), k = 2, 3, …,
where x_i^{g,1}, i = 1, 2, are determined by (12).
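The recursion (12)-(15) is straightforward to evaluate numerically once the action sets and the set of states of nature are discretized. The sketch below does this for two agents; the grids and the payoff function are illustrative assumptions, not taken from this work.

import numpy as np

# Minimal sketch of the recursion (12)-(15): reflexive guaranteeing strategies
# for two agents on discretized action sets and a discretized set of states of
# nature. The payoff function is an illustrative placeholder.

X = [np.linspace(0.05, 1.0, 20), np.linspace(0.05, 1.0, 20)]  # X_1, X_2
Omega = np.linspace(0.5, 1.5, 5)                              # states of nature

def f(i, theta, x_own, x_opp):
    """Illustrative payoff of agent i (placeholder functional form)."""
    return x_own - theta * x_own ** 2 / (2 * x_opp)

def guaranteeing(i, opp_action=None):
    """Rank-1 guaranteeing strategy (12) if opp_action is None, otherwise the
    best response (13)-(15) to a fixed opponent action, guaranteed over theta."""
    best_val, best_x = -np.inf, None
    for x_own in X[i]:
        if opp_action is None:
            worst = min(f(i, th, x_own, x_opp) for th in Omega for x_opp in X[1 - i])
        else:
            worst = min(f(i, th, x_own, opp_action) for th in Omega)
        if worst > best_val:
            best_val, best_x = worst, x_own
    return best_x

def reflexive_guaranteeing(i, k):
    """k-th rank reflexive guaranteeing strategy of agent i, as in (15)."""
    if k == 1:
        return guaranteeing(i)
    return guaranteeing(i, opp_action=reflexive_guaranteeing(1 - i, k - 1))

for k in range(1, 5):
    print(k, reflexive_guaranteeing(0, k), reflexive_guaranteeing(1, k))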
We will call a collection of actions of the form (15) a set of reflexive guaranteeing strategies. Let us look at an illustrative example.
Example 1. Let the agents' objective functions have the form f_1(x_1, x_2) = x_1 - x_1²/(2x_2), f_2(x_1, x_2) = x_2 - x_2²/(2(x_1 + δ)), where δ > 0. As for the admissible sets, assume that X_1 = X_2 = [ε, 1], 0 < ε < 1. We will assume that each of the constants ε and δ is much smaller than one. The guaranteeing strategies of the agents are given in Table 1.

Table 1. Guaranteeing strategies of the agents in Example 1
k | x_1^{g,k} | x_2^{g,k}
1 | ε | ε + δ
2 | ε + δ | ε + δ
3 | ε + δ | ε + 2δ
4 | ε + 2δ | ε + 2δ
5 | ε + 2δ | ε + 3δ
6 | ε + 3δ | ε + 3δ
7 | ε + 3δ | ε + 4δ
… | … | …

It can be seen that, firstly, the values of the guaranteeing actions increase with the "rank of reflection". Secondly, different "ranks of reflection" of the agents correspond, in general, to different guaranteeing actions (note that the Nash equilibrium in this example is the vector (1; 1)). (As an aside, note that if in this example the second agent's objective function had the form f_2(x_1, x_2) = x_2 + x_2²/(2x_1), then he would have a dominant strategy (equal to one), and the sequence of the first agent's guaranteeing strategies would stabilize already at its second term; if the first agent can compute his opponent's dominant strategy, then choosing that second term appears rational for him. The symbol "·" here and below marks the end of an example or proof.) The question of which action the agent should choose remains open. All one can state is that, possessing information only about the set of possible values of the state of nature, the i-th agent can choose one of the actions x_i^{g,k}, i = 1, 2; k = 1, 2, …, determined by expressions (12) and (15). The rational choice of the agent in the model under consideration can be further specified as follows. If the agent did not know the opponent's objective function (which is ruled out by the assumption that the objective functions and admissible sets are common knowledge), then his only rational action would be the choice (12), that is, the classical MGR. Under the introduced assumptions the agent knows the opponent's objective function, and also knows that the opponent knows this fact, and so on. Therefore, from the agent's point of view it is irrational to use the classical MGR, and he should count, at a minimum, on the opponent using the MGR, which leads to the choice of x_i^{g,2}. But, again, since the objective functions are common knowledge, the agent may suppose that this line of reasoning can be reconstructed by the opponent, which makes the choice of x_i^{g,3} expedient, and so on ad infinitum. Consequently, from the agent's point of view there remains uncertainty about the opponent's "rank of reflection". He has no information whatsoever about this parameter (if the agent does have some beliefs on this score, then the corresponding subjective equilibrium may be realized), which makes it rational to use the guaranteed result over the opponent's "rank of reflection":
(16) x′_i = arg max_{x_i ∈ X_i} min_{j = 1, 2, …} min_{θ ∈ Ω} f_i(θ, x_i, x_{-i}^{g,j}).
Note that, firstly, x′_i may differ from the classical guaranteeing strategy x_i^{g,1} determined by expression (12). Secondly, when strategy (16) is used, the fact that the opponent has a dominant strategy will be taken into account by the agent (see the aside in Example 1). Table 2 gives the values of the first agent's objective function in Example 1 as a function of the opponent's "rank of reflection", together with the corresponding actions of the opponent.
It can be seen that when strategy (16) is used, the payoff of the first agent equals ε + δ, which exceeds the payoff ε obtained with the classical MGR.

Table 2. Payoffs of the first agent in Example 1
j | x_2^{g,j} | f_1(BR_1(x_2^{g,j}), x_2^{g,j})
1 | ε + δ | ε + δ
2 | ε + δ | ε + δ
3 | ε + 2δ | ε + 2δ
4 | ε + 2δ | ε + 2δ
5 | ε + 3δ | ε + 3δ
6 | ε + 3δ | ε + 3δ
7 | ε + 4δ | ε + 4δ

(In other words, the original game can be replaced by a game in which the agents choose the ranks of their reflection. For the new game, reflexive analogues can in turn be constructed, and so on ad infinitum; see the examples "Penalty" in the introduction and "Hide and Seek" and "Discard at Misère" in Section 2.2. One possible way of combating such "infinity" is to use the guaranteed result over the opponent's reflection rank. Another possible way, effective for finite games, is to determine the maximum expedient rank of reflection of the agents; see Section 2.2.) Thus, in the model under consideration, the agent's use of strategy (15) or (16) can be considered rational.
First rank of reflection. Let us now assume that the agent has certain information about the state of nature, which he believes to be true, and that he knows nothing else for certain. Under the existing uncertainty, by virtue of the principle of determinism, the agent carrying out strategic reflection has two alternatives: either to assume that his opponent has no information at all, or to assume that the latter has the same information as he himself does. (This principle, and its generalizations, will be widely used below when defining finite information structures: indeed, having the information I_i, the i-th agent can, under uncertainty, ascribe to the other agents only information consistent with I_i.) If the agent makes no assumptions about the opponent's awareness and principles of behavior, then he is forced to apply the principle of the maximum guaranteed result, since no additional information about the opponent has been added compared with the zero-rank model considered above (of course, the agent may assume that the opponent has some information, but since this information does not appear in the model, we will not consider such assumptions); that is, he counts on the second agent making the choice that is worst for him from the set of strategies of type (16). The guaranteeing strategy will then be
(17) x_i^g(θ_i) = arg max_{x_i ∈ X_i} min_{j = 1, 2, …} f_i(θ_i, x_i, x_{-i}^{g,j}).
Note that, being in the information situation corresponding to the model under consideration and computing (17), the agent regards his opponent as being in the information situation corresponding to the previous model. This general principle (having some information, an agent can regard the opponent as having either the same information or a reflexive rank one lower) will be used in a number of other reflexive decision-making models below. If the first agent believes that his opponent has the same information as he does (the second agent can reason similarly; see assumption P1 in the cited literature), then he computes a subjective equilibrium (that is, the "Nash equilibrium" for the corresponding subjective description of the game) E_N(θ_1) = {(x*_{11}(θ_1), x*_{12}(θ_1))} of the following form:
(18) ∀ x_1 ∈ X_1: f_1(θ_1, x*_{11}(θ_1), x*_{12}(θ_1)) ≥ f_1(θ_1, x_1, x*_{12}(θ_1)),
∀ x_2 ∈ X_2: f_2(θ_1, x*_{11}(θ_1), x*_{12}(θ_1)) ≥ f_2(θ_1, x*_{11}(θ_1), x_2).
In essence, this system of inequalities reflects the first agent's computation of "his own" Nash equilibrium and his choice of the corresponding coordinate of this equilibrium.
In the general case, the agent and his opponent will calculate different equilibria - a coincidence is possible if the awareness is such that xij* (qi) = x*jj (qj), i, j = 1, 2. Thus, rational in the model of the first rank of reflection can be consider the agent's choice of either a reflexive guaranteeing strategy (17) or a subjective equilibrium (18). Subjective equilibrium (18), determined by the first agent, can be conventionally depicted as a graph with two vertices x12 x1 and x1 and x12, corresponding to the first agent and his ideas about the second agent16 (see Fig. 1. Subjective node 1). Incoming arrows during equilibrium in the first model reflect the information of strategic rank that each agent uses from the agents’ reflection about the opponent. Second rank of reflection. In the model of the second rank of reflection, the ith agent has information about the opponent’s ideas qij about the state of nature and about his own ideas qii about the state of nature (we will assume that qi = qii - see the axiom of self-information below). The agent can expect that his opponent will choose a strategy that guarantees (within the knowledge of qij). Then the best answer would be 16 Such agents that exist in the representations of other agents are called phantom agents. 39 (19) 2 xiг = arg max fi(qi, xi, x-г i (qij)), xi О X i г -i where x (qi,-i) is defined by (17). In addition to the guaranteeing strategy (19), the first agent can calculate a subjective equilibrium * * EN(q1, q12) = ((x11 (q1, q12), x12 (q1, q12))) of the following form: * * * (q1,q12) , x12 (q1,q12)) ³ f1(q1, x1, x12 (q1,q12)), (20) " x1 О X1 f1(q1, x11 * * * " x2 О X2 f2(q12, x121 (q1, q12), x12 (q1,q12)) ³ f2(q12, x121 (q1,q12), x2), * * * " x1 О X1 f1(q12, x121 (q1,q12), x12 (q1,q12)) ³ f2(q12, x1, x12 (q1,q12)). As in the previous model, in the general case, the first agent and his opponent will calculate different equilibria. Thus, rational in the model of the second rank of reflection can be considered the choice by the agent of either a reflexive guaranteeing strategy (19), or subjective equilibrium (20). Note that the first two systems of inequalities in (20) reflect the Nash equilibrium from the point of view x12 x1 of the first agent, and the second and third systems of inequalities reflect the Nash equilibrium, which must be determined by the second agent from the point from the perspective of the first agent - see graph in Figure 3, in which Fig. is circled with a dotted line. 3. Subjective “model” of the second agent, which is used by the first agent during equilibrium in the RDM2 decision-making model. The analysis of the simplest models of strategic reflection of the first few ranks indicates that in the case of several agents and their lack of information, it is possible to consider their decision-making processes independently - each of them models the behavior of their opponents, that is, strives to build their own closed model of the game (see discussion of the differences in subjective and an objective description of the game in). In the case of general knowledge, subjective models coincide. 40 Above we looked at reflection of zero, first and second ranks. The ranks of reflection can be further increased by analogy. Essential in all models are the agent’s assumptions about what rank of reflection his opponent has, that is, in fact, the rank of the agent’s reflection is determined by what rank of reflection he attributes to his opponent. 
A priori, no reasonable recommendations that limit the growth of the rank of one’s own reflection can be offered to the agent. From this point of view, it can be stated that there is no universal concept of equilibrium for games with strategic reflection. The only way out is to use in this case either MGR based on the ranks of the opponent’s reflection, or subjective equilibrium, within the framework of which each agent introduces certain assumptions about the rank of the opponent’s reflection and chooses its action, which is optimal within the framework of these assumptions. Therefore, we will concentrate our main attention on studying cases when an unlimited increase in the rank of reflection does not occur. There are two reasons why the rank of reflection may be finite. Firstly, it is inappropriate to increase the rank of reflection beyond a certain point from the point of view of the agent’s gain (when a further increase in the rank of reflection obviously does not lead to an increase in gain). Secondly, a person’s ability to process information is limited, and the infinite rank of reflection is nothing more than a mathematical abstraction. Therefore, in the subsequent sections of this chapter, models are presented that take into account both of these reasons - in section 2.2, using the example of bimatrix games, the maximum appropriate rank of strategic reflection is determined, and in section 2.3 the role of information restrictions is explored. 2.2. REFLECTION IN BIMATRIX GAMES The main idea developed in this section is that in bimatrix games17 in which there is no Nash equilibrium, or in which, given the existing Nash equilibrium, agents choose subjective guaranteeing strategies (see. 17 Recall that finite games of two persons are called bimatrix. 41 previous section of this work), the payoff of each agent depends both on its reflection rank and on the opponent’s reflection rank. In addition, it is shown that an unlimited increase in the rank of strategic reflection does not lead to an increase in winnings. Let's move on to the formal description. Let us consider a bimatrix game18 in which the payoffs of the first and second agents are given by the matrices A = ||aij|| and B = ||bij|| dimensions n ´ m respectively. Let us denote19 I = (1, 2, …, n) is the set of actions of the first agent (choosing a row), J = (1, 2, …, m) is the set of actions of the second agent (choosing a column). In the game under consideration, the guaranteeing strategies of the agents are as follows: i0 О Arg max min aij, j0 О Arg max min bij. iÎI jÎJ jÎJ iÎI Let us introduce the following assumptions. Let the payoff matrices be such that each action of each agent is the best response to some action of the opponent, and let, in addition, the best response to each action of the opponent is unique (if there are several best answers, then a rule can be introduced that further determines the choice of the agent).20 Consequently, when determining the best answers, instead of the expressions “i… Î Arg max …” and iÎI “j… Î Arg max …”, you can use, respectively, the expressions jÎJ “i… = arg max …” and “j… = arg max …”. iÎI jÎJ Let us denote a0 = max min aij, b0 = max min bij – the maximum iÎI jÎJ jÎJ iÎI guaranteed results (MGR) of the first and second agents, respectively. 18 Since matrix games(antagonistic finite games of two persons) are a special case of bimatrix games, then all the results given in this section are also valid for matrix games. 
(We hope that using the same, historically established, designation I both for the information structure and for the set of the first agent's actions will not lead to confusion. If the assumptions just introduced are abandoned, all the results obtained in this section remain valid, since these assumptions serve only to obtain an upper bound on the maximum expedient rank of strategic reflection.) Let us define a reflexive bimatrix game MG_kl as a bimatrix game with matrices A and B in which the first and second agents have reflection ranks equal to k and l respectively, k, l ∈ ℕ, where ℕ is the set of natural numbers. Let us explain what is meant by the rank of reflection (more precisely, the rank of strategic reflection) in bimatrix games. In bimatrix games (and not only in them; see the cited literature), the agents' choice of actions can be based on knowledge of the opponent's reflection ranks. Reflection ranks are determined as follows. "An agent has zero reflection rank if he knows only the payoff matrix. An agent has the first rank of reflection if he believes that his opponents have zero rank of reflection, that is, that they know only the payoff matrix. In general, an agent with the k-th rank of reflection assumes that his opponents have the (k-1)-th rank of reflection. He carries out for them the reasoning needed to choose a strategy and chooses his own strategy on the basis of knowledge of the payoff matrix and of an extrapolation of his opponents' actions." Let us give an illustrative example.
Example 2 (the game "Hide and Seek"). The first agent hides in one of several rooms of different illumination, and the other agent must choose the room in which he will look for him. The degrees of illumination are known to both agents. The agents' strategies are as follows. The seeker, other things being equal, prefers to search where it is lighter (it is easier to find someone there). It is clear to the hider that in a darker room the chances of being found are lower than in a lit one. An increase in the rank of reflection means that it becomes clear to the agent that this is also clear to his opponent, and so on. Let us present the agents' reflection ranks and the corresponding room choices in the form of Table 3.

Table 3. Reflection ranks of the agents and the corresponding choices of room
Reflection rank | Room chosen by the hider | Room chosen by the seeker
0 | The darkest | The lightest
1 | Any except the lightest | The darkest
2 | Any except the darkest | Any except the lightest
3 | The lightest | Any except the darkest
4 | The darkest | The lightest

It can be seen that after the second rank of reflection the entire set of admissible actions is exhausted, and after the third rank the room-selection strategies begin to repeat. This illustrates that in a two-person game increasing the rank of reflection beyond a certain level objectively gives nothing new, although subjectively the complexity of the reasoning may continue to grow. The influence of a mismatch between reflection ranks on the success of the activity is as follows. Let the hider have rank 0 (he hides in the darkest room). If the seeker has rank 1, then he always wins (he searches in the darkest room). But if the seeker has rank 3 (he searches in any room except the darkest), then he always loses to the hider of rank 0, who, as we recall, does not bother to reason about what the opponent thinks and hides precisely in the darkest room, where the seeker, after a chain of reflexive reasoning, will never look.
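The cyclic structure of this example is easy to reproduce numerically. In the sketch below the rooms, the uniform randomization over the "any room except ..." strategies and the win/lose criterion are hypothetical assumptions introduced only to estimate how often the seeker finds the hider for a given pair of reflection ranks.

import random

# Illustrative sketch of Example 2 ("Hide and Seek"): how the outcome depends on
# the pair of reflection ranks. Rooms, tie-breaking and payoffs are assumptions.

rooms = [0, 1, 2, 3]  # room indices ordered from darkest (0) to lightest (3)

def hider_choice(rank):
    """Rooms the hider may choose at a given reflection rank (period 4)."""
    return [[0], rooms[:-1], rooms[1:], [3]][rank % 4]

def seeker_choice(rank):
    """Rooms the seeker may choose at a given reflection rank (period 4)."""
    return [[3], [0], rooms[:-1], rooms[1:]][rank % 4]

def seeker_win_probability(hider_rank, seeker_rank, trials=10000):
    """Monte-Carlo estimate of how often the seeker picks the hider's room,
    assuming each side randomizes uniformly over its admissible rooms."""
    wins = 0
    for _ in range(trials):
        h = random.choice(hider_choice(hider_rank))
        s = random.choice(seeker_choice(seeker_rank))
        wins += (h == s)
    return wins / trials

# A rank-1 seeker always finds a rank-0 hider; a rank-3 seeker never does.
print(seeker_win_probability(0, 1))  # ~1.0
print(seeker_win_probability(0, 3))  # ~0.0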
Thus, it is impossible to state unequivocally that a higher rank of reflection is better than a lower one. The preferability of a particular rank is determined by how it interacts with the opponent's reflection rank. ·
Since in bimatrix games it is assumed that each agent holds a certain belief about the opponent's reflection rank, this allows the use of the concept of a subjective guaranteeing strategy. Let us define the subjective guaranteeing strategies in the bimatrix game MG_kl:
(21) i_k = arg max_{i ∈ I} a_{i j_{k-1}}, j_l = arg max_{j ∈ J} b_{i_{l-1} j}, k, l ∈ ℕ.
Thus, the game MG_00 coincides with the original game, and the "equilibrium" in the game MG_kl is (a_{i_k j_l}; b_{i_k j_l}), k, l ∈ ℕ. Let us note two interesting facts. Firstly, the payoff of either agent in the game MG_kl with k ≥ 1, l ≥ 1 may turn out to be less than his maximum guaranteed payoff (see the example "Discard at Misère" below). Secondly, each agent's attribution to the opponent of a reflection rank one less than his own is contradictory, since in the game MG_kl with k ≥ 1, l ≥ 1 this means that l = k - 1 and k = l - 1 must hold simultaneously, which is obviously impossible. Consequently, the equilibrium in a reflexive game is essentially subjective, and a priori the agents do not know which game they are playing (the reflection ranks of both agents cannot be common knowledge, since this would contradict the very definition of the reflection rank). Therefore, a promising direction for future research is the study of information reflection concerning the agents' reflection ranks in bimatrix games. The internal inconsistency of strategic reflection in bimatrix games can be illustrated by the following diagram: Figure 4a gives a subjective description of the game MG_kl, in terms of the reflexive game graph, from the point of view of the first agent, and Figure 4b gives a subjective description of the same game from the point of view of the second agent. [Fig. 4a. Subjective description of the game MG_kl from the point of view of the first agent: vertices i_0, j_0, i_1, j_1, …, i_{k-1}, j_{k-1}, i_k, with the vertex paired to i_k marked "?". Fig. 4b. Subjective description of the game MG_kl from the point of view of the second agent: vertices i_0, j_0, i_1, j_1, …, i_{l-1}, j_{l-1}, j_l, with the vertex paired to j_l marked "?".] Looking somewhat ahead (see Section 3.4), we note that the graph of a reflexive game has the property that the number of arcs entering each of its vertices must be one less than the number of agents (that is, in bimatrix games it equals one). The subjective equilibrium actions are shown in bold and lead to the "equilibrium" (i_k, j_l). The actions i_{k-1} for the first agent and j_{l-1} for the second are not used in the corresponding subjective descriptions of the game (see the question marks in Figure 4); that is, each of these descriptions turns out not to be internally closed. Having completed this brief discussion of the internal inconsistency in determining the rank of strategic reflection in bimatrix games, let us return to studying the dependence of the subjective equilibrium and of the agents' payoffs on their reflection ranks. Let us denote I_K = ∪_{k = 0, 1, …, K} i_k, J_L = ∪_{l = 0, 1, …, L} j_l, K, L = 0, 1, 2, … . By I_∞ and J_∞ we mean the corresponding unions over all reflection ranks from zero to infinity. If one agent (or both agents) does not know the opponent's reflection rank, then it is expedient to consider the game MG_∞∞, in which each agent computes the guaranteed result over the opponent's reflection rank.
Let us introduce guaranteeing strategies corresponding to complete uncertainty regarding the opponent’s reflection rank: (22) i¥ = arg max min aij, j¥ = arg max min bij. iÎI jÎJ ¥ jÎJ iÎI ¥ Similarly, one can define guaranteeing strategies within the framework of information that the rank of the opponent’s reflection does not exceed a known value (that is, the first agent believes that the rank of the second’s reflection is not higher than L, and the second – that the rank of the first’s reflection is not higher than K ): (23) iL = arg max min aijl , jK = arg max min bik j . iÎI lÎJ L jÎJ kÎI K Note that in (23), unlike (21), the strategy of each agent does not depend on its own reflexion rank, but is determined by information about the opponent’s reflexion rank. Expressions (21)-(23) do not exhaust the entire variety of possible situations, since, for example, the first agent may assume that the second will choose j¥, and then his best answer will be arg max aij¥, etc. In addition, although only “strong” agents are capable of increasing the rank of reflection iÎI, it is intuitively clear that with an increase in this rank, that is, with a lengthening of the chain of reasoning “I think that he thinks that I think...” there is a danger of “overthinking " A strong agent with a high reflexion rank overestimates the enemy, assuming that he also has a high reflexion rank. But, if the opponent's rank is actually low, this leads to a loss to a weaker opponent - see the examples "Hide and Seek" and "Demolition on Minor". Therefore, a systematic study of the ratio of agents' payoffs depending on the type of game played is necessary. Let us present the results of this study. Essential for our consideration is the presence or absence of a Nash equilibrium, as well as the choice by agents (and the use in constructing subjective equilibria) of guaranteeing strategies or actions that are Nash equilibrium. Thus, the following four situations are possible. Option 1 (Nash equilibrium exists in pure strategies, and agents are guided by Nash equilibrium actions). Let us denote (i*; j*) the numbers of Nash equilibrium pure strategies. Then, if, by analogy with (21), we assume that in a reflexive game each agent chooses its best response to the opponent’s choice of the corresponding equilibrium component, we obtain that (24) ik = arg max aij* , jl = arg max bi* j , k, l О А. iÎI jÎJ From (24), due to the definition of Nash equilibrium, it follows that ik = i*, jl = j*, k, l О А, that is, within the framework of option 1, strategic reflection is meaningless21 (except, perhaps, for the case when the best answers are determined in such a way that agents choose components of different Nash equilibria in the case where there are several of them). Option 2 (Nash equilibrium exists in pure strategies, but agents choose guaranteeing strategies (21)). 21 By the meaninglessness of strategic reflection in bimatrix games we mean the case when the equilibrium in a reflexive game with any combination of non-zero agent reflection ranks coincides with the equilibrium in the original game. 47 If guaranteeing strategies form a Nash equilibrium (as is the case in antagonistic games with a saddle point), then we find ourselves in the conditions of option 1. Consequently, strategic reflection makes sense only if, within the framework of option 2, the Nash equilibrium does not coincide with the equilibrium in guaranteeing strategies (i0, j0). 
Option 3 (a Nash equilibrium in pure strategies does not exist, and the agents are guided by Nash equilibrium mixed strategies; recall that in bimatrix games a Nash equilibrium in mixed strategies always exists). If the agents, when determining their best responses by analogy with (24), count on the opponent choosing his Nash equilibrium mixed strategy, then it is easy to show that the maximum expected payoff of each agent is achieved when he also chooses the corresponding Nash equilibrium mixed strategy. Consequently, within option 3 any equilibrium coincides with the Nash equilibrium in mixed strategies, that is, strategic reflection is again meaningless.
Option 4 (a Nash equilibrium in pure strategies does not exist, and the agents are guided by the guaranteeing strategies (21)). In the fourth option the analysis of reflection obviously makes sense. Thus, having considered all four possible variants of the agents' behavior, we have justified the following statement.
Statement 1. Strategic reflection in bimatrix games makes sense if the agents use subjective guaranteeing strategies (21) that are not Nash equilibrium strategies.
Let us denote
(25) K_min = min {K ∈ ℕ | I_K = I_∞},
(26) L_min = min {L ∈ ℕ | J_L = J_∞}.
In essence, K_min and L_min are the minimum reflection ranks of the first and second agents at which their sets of subjective equilibrium actions coincide with the maximum possible sets of subjective guaranteeing strategies in the game under consideration. By definition, ∀ K, L ∈ ℕ: I_K ⊆ I_{K+1}, J_L ⊆ J_{L+1}. Hence ∀ K ≥ K_min: I_K = I_∞, and ∀ L ≥ L_min: J_L = J_∞. If the reflection ranks of the first and second agents do not exceed K and L respectively, then the sets of subjective guaranteeing strategies of the first and of the second agent from the opponent's point of view are I_{L-1} and J_{K-1} respectively. This means that an increase in the reflection ranks can lead to an expansion of the set of subjective guaranteeing strategies if
(27) L - 1 < K_min,
(28) K - 1 < L_min.
Note that, from this point of view, the maximum expedient rank of reflection of the first agent (by the maximum expedient rank of reflection of an agent we mean the value such that increasing the reflection rank above it does not produce new subjective, from the point of view of the given agent, equilibria) depends on the properties of the subjective guaranteeing strategies of the second agent (see (28)), and vice versa. On the other hand, it makes no sense for an agent to increase his reflection rank if he has already "exhausted" his own set of possible subjective equilibrium actions. From this point of view, an increase in the reflection ranks can lead to an expansion of the set of subjective guaranteeing strategies if
(29) K < K_min,
(30) L < L_min.
Combining (28) and (29), as well as (27) and (30), we obtain that it makes no sense for the first agent to increase his reflection rank above
(31) K_max = min {K_min, L_min + 1},
and it makes no sense for the second agent to increase his reflection rank above
(32) L_max = min {L_min, K_min + 1}.
Let us denote
(33) R_max = max {K_max, L_max}.
Thus, the validity of the following statement has been proved.
Statement 2. It makes no sense for the agents in a bimatrix game to use ranks of strategic reflection higher than (31) and (32) respectively. (That is, for any reflection rank exceeding these bounds there is a reflection rank satisfying them that leads to the same subjective equilibrium.)
Statement 2 makes it possible, in each specific case (for the specific game being played), for each agent (and for the operations researcher) to compute the maximum expedient ranks of strategic reflection of both agents.
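For a concrete pair of payoff matrices, the quantities (25), (26) and the bounds (31), (32) can be computed directly by iterating the subjective guaranteeing strategies (21). The sketch below does this; the matrices A and B are illustrative (a game without a pure-strategy Nash equilibrium), and I_∞, J_∞ are approximated by a sufficiently long initial segment of the sequences, which suffices for small matrices.

import numpy as np

# Sketch of how the bounds of Statement 2 could be computed for a concrete
# bimatrix game: iterate (21), accumulate the sets I_K and J_L, and find
# K_min, L_min and the bounds (31)-(32). The matrices are illustrative.

A = np.array([[3, 0], [1, 2]])   # payoffs of the first agent (row chooser)
B = np.array([[0, 2], [3, 1]])   # payoffs of the second agent (column chooser)

# i_0, j_0: classical guaranteeing (maximin) strategies
i = [int(np.argmax(A.min(axis=1)))]
j = [int(np.argmax(B.min(axis=0)))]

# (21): i_k is the best response to j_{k-1}, j_l is the best response to i_{l-1}
for k in range(1, 20):
    i.append(int(np.argmax(A[:, j[k - 1]])))
    j.append(int(np.argmax(B[i[k - 1], :])))

def rank_of_saturation(seq):
    """Smallest K such that {seq[0..K]} already equals the whole set of values
    appearing in the sequence, approximating (25)-(26)."""
    full, seen = set(seq), set()
    for K, s in enumerate(seq):
        seen.add(s)
        if seen == full:
            return K
    return len(seq) - 1

K_min, L_min = rank_of_saturation(i), rank_of_saturation(j)
K_max = min(K_min, L_min + 1)   # (31)
L_max = min(L_min, K_min + 1)   # (32)
print(K_min, L_min, K_max, L_max)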
Since the quantities (31)-(33) depend on the game (on the payoff matrices), let us obtain estimates of these quantities in terms of the dimensions of the payoff matrices (obviously $|I_\infty| \le |I| = n$ and $|J_\infty| \le |J| = m$, while for games of dimension two a sharper estimate holds – see Statement 3). To this end, we introduce the best-response graph.

The best-response graph G = (V, E) is a finite bipartite directed graph whose vertex set is $V = I \cup J$ and in which an arc is drawn from each vertex (corresponding to an action of one of the agents) to the opponent's best response to that action.

Let us describe the properties of this graph:
1. From every vertex of the set I an arc goes to a vertex of the set J (the second agent has a best response to any action of the first agent), and from every vertex of the set J an arc goes to a vertex of the set I (the first agent has a best response to any action of the second agent).
2. Exactly one arc enters each vertex of V (since every action of every agent is a best response to some action of the opponent).
3. If a path passes through the same vertex twice, then by the definition of best responses part of it is a cycle, and no new vertices appear further along this path.
4. The maximum number of pairwise distinct actions of the first agent contained in a path starting at the vertex i_0 is min(n, m + 1).
5. The maximum number of pairwise distinct actions of the second agent contained in a path starting at the vertex i_0 is min(n, m).
6. The maximum number of pairwise distinct actions of the first agent contained in a path starting at the vertex j_0 is min(n, m).
7. The maximum number of pairwise distinct actions of the second agent contained in a path starting at the vertex j_0 is min(n + 1, m).

These properties of the best-response graph allow us to obtain upper bounds on the reasonable ranks of strategic reflexion in bimatrix games.

Statement 3. In 2 × 2 bimatrix games that have no Nash equilibrium, $I_\infty = I$ and $J_\infty = J$.

Proof. Consider an arbitrary 2 × 2 bimatrix game with no Nash equilibrium. Let $X_1 = \{x_1, x_2\}$, $X_2 = \{y_1, y_2\}$. Compute the guaranteeing strategies i_0 and j_0; for definiteness put $x_1 = i_0$, $y_1 = j_0$. Two mutually exclusive cases are possible: $j_1 = y_1$ and $j_1 = y_2$.

If $j_1 = y_1$, then $i_1 = i_2 = x_2$ (otherwise $(x_1, y_1)$ would be a Nash equilibrium). Then $j_2 = j_3 = y_2$ (otherwise $(x_2, y_1)$ would be a Nash equilibrium). Consequently $i_3 = i_4 = x_1$ (otherwise $(x_2, y_2)$ would be a Nash equilibrium). So in the first case $I_\infty = I$, $J_\infty = J$.

If $j_1 = y_2$, then $i_2 = x_2$ (otherwise $(x_1, y_2)$ would be a Nash equilibrium). Then $j_3 = y_1$ (otherwise $(x_2, y_2)$ would be a Nash equilibrium). Consequently $i_4 = x_1$ (otherwise $(x_2, y_1)$ would be a Nash equilibrium). So in the second case as well $I_\infty = I$, $J_\infty = J$. ∎

Qualitatively, Statement 3 means that in a 2 × 2 bimatrix game with no Nash equilibrium, any outcome can be realized as a subjective equilibrium.

A promising direction for further applied research is the analysis of subjective equilibria in the basic ordinal two-person 2 × 2 games (recall that there are 78 structurally distinct ordinal games, that is, games in which each of the two agents, each having two feasible actions, can strictly order its own payoffs from best to worst).
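The proof of Statement 3 can be checked by brute force. The sketch below (Python/NumPy; the function names, the random ordinal games, and the tie-free construction are my own illustration, not the author's code) builds the best-response graph and verifies, on randomly generated ordinal 2 × 2 games without a pure-strategy Nash equilibrium, that the best-response walks started from i_0 and j_0 together visit all four actions, i.e. I_∞ = I and J_∞ = J.

```python
import numpy as np

def pure_nash(a, b):
    """All pure-strategy Nash equilibria of the bimatrix game (a, b)."""
    return [(i, j) for i in range(a.shape[0]) for j in range(a.shape[1])
            if a[i, j] == a[:, j].max() and b[i, j] == b[i, :].max()]

def best_response_graph(a, b):
    """Arcs of the best-response graph: ('I', i) -> ('J', j) and ('J', j) -> ('I', i).
    Only the first maximizer is kept, i.e. best responses are assumed unique."""
    arcs = {('I', i): ('J', int(np.argmax(b[i, :]))) for i in range(a.shape[0])}
    arcs.update({('J', j): ('I', int(np.argmax(a[:, j]))) for j in range(a.shape[1])})
    return arcs

def reachable(arcs, start):
    """Vertices visited by following best responses from `start` until the path cycles."""
    seen, v = [], start
    while v not in seen:
        seen.append(v)
        v = arcs[v]
    return seen

# Crude check of Statement 3 on random ordinal 2 x 2 games without a pure Nash equilibrium.
rng = np.random.default_rng(0)
for _ in range(1000):
    a = rng.permutation(4).reshape(2, 2); b = rng.permutation(4).reshape(2, 2)
    if pure_nash(a, b):
        continue
    arcs = best_response_graph(a, b)
    i0 = int(np.argmax(a.min(axis=1))); j0 = int(np.argmax(b.min(axis=0)))
    visited = set(reachable(arcs, ('I', i0))) | set(reachable(arcs, ('J', j0)))
    assert len(visited) == 4   # every action occurs, i.e. I_inf = I and J_inf = J
```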
Statement 3 suggests that perhaps in all bimatrix games with no Nash equilibrium one has $I_\infty = I$, $J_\infty = J$. A counterexample is the best-response graph of a 4 × 4 game shown in Figure 5, in which the vertices i_0 and j_0 are shaded.

[Fig. 5. An example of the best-response graph of a 4 × 4 bimatrix game in which $I_\infty \subset I$ and $J_\infty \subset J$.]

Having the rough upper bounds $|I_\infty| \le n$, $|J_\infty| \le m$ on the "sizes" of the sets I_∞ and J_∞, let us examine how quickly (at what minimum ranks of strategic reflexion) these sets are "covered" by the corresponding subjective equilibria. The third property of the best-response graph means that in a bimatrix game a reasonable increase of the strategic reflexion rank, starting from the second step, necessarily changes the set of strategies that must be subjectively guaranteeing at reflexion ranks less than or equal to the given one.

Since in bimatrix games the sets of feasible actions are finite, the sets I_∞ and J_∞ are finite; hence, by properties 4-7 of the best-response graph, the quantities L_min and K_min are finite as well, that is, in bimatrix games an unbounded increase of the reflexion rank is certainly not reasonable. Again owing to the finiteness of the feasible sets, the quantities (31) and (32), which determine the maximum reasonable reflexion ranks, are easy to compute for any specific bimatrix game. But the properties of the best-response graph make it possible to obtain explicit upper bounds on the maximum reasonable reflexion ranks.

In an n × m bimatrix game, the guaranteed estimates25 of the quantities (31)-(33) will obviously depend on the dimensions of the payoff matrices, that is, $K_{\min} = K_{\min}(n)$, $L_{\min} = L_{\min}(m)$. Consequently,

(34) $K_{\max}(n, m) = \min \{K_{\min}(n), L_{\min}(m) + 1\}$,
(35) $L_{\max}(n, m) = \min \{L_{\min}(m), K_{\min}(n) + 1\}$.

Expression (33) then takes the form

(36) $R_{\max}(n, m) = \max \{K_{\max}(n, m), L_{\max}(n, m)\}$.

From properties 4-7 of the best-response graph and expressions (34)-(36), the following statement follows.

Statement 4. In n × m bimatrix games, the maximum reasonable ranks of strategic reflexion of the first and second agents satisfy the inequalities

(37) $K_{\max}(n, m) \le \min \{n, m + 1\}$,
(38) $L_{\max}(n, m) \le \min \{m, n + 1\}$,
(39) $R_{\max}(n, m) \le \max \{\min \{n, m + 1\}, \min \{m, n + 1\}\}$.

Corollary 1. In an n × n bimatrix game, n ≥ 2, the maximum reasonable rank of strategic reflexion of either agent26 satisfies $R_{\max}(n, n) \le n$.

For the case of two feasible actions (because of how widespread it is in applied models), we state a separate corollary.

Corollary 2. In a 2 × 2 bimatrix game, the maximum reasonable reflexion rank does not exceed two.

Let us note once again that the estimates (37)-(39) are upper bounds: the existence of several best responses to the same action, or the presence in the original game of a Nash equilibrium or of dominated strategies, may lead to a decrease in the maximum reasonable reflexion ranks.
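The counterexample behind Figure 5 can also be probed numerically. The sketch below (Python/NumPy; the function names are mine, and I_∞, J_∞ are again identified with the actions visited by the best-response walks from i_0 and j_0, which is my reading of (21)) searches random 4 × 4 games without a pure-strategy Nash equilibrium for one in which these walks fail to cover all actions, i.e. I_∞ ⊂ I or J_∞ ⊂ J.

```python
import numpy as np

def pure_nash_exists(a, b):
    """True if the bimatrix game (a, b) has a pure-strategy Nash equilibrium."""
    return any(a[i, j] == a[:, j].max() and b[i, j] == b[i, :].max()
               for i in range(a.shape[0]) for j in range(a.shape[1]))

def walk(a, b, start_is_row, start):
    """Follow best responses from a starting action until the path cycles;
    returns the sets of row and column actions visited."""
    rows, cols, is_row, v, seen = set(), set(), start_is_row, start, set()
    while (is_row, v) not in seen:
        seen.add((is_row, v))
        (rows if is_row else cols).add(v)
        v = int(np.argmax(b[v, :])) if is_row else int(np.argmax(a[:, v]))
        is_row = not is_row
    return rows, cols

# Search random 4 x 4 games without a pure Nash equilibrium for one in which the
# best-response walks from i_0 and j_0 do NOT cover all actions (cf. Figure 5).
rng = np.random.default_rng(2)
for attempt in range(10000):
    a, b = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
    if pure_nash_exists(a, b):
        continue
    i0 = int(np.argmax(a.min(axis=1))); j0 = int(np.argmax(b.min(axis=0)))
    r1, c1 = walk(a, b, True, i0)
    r2, c2 = walk(a, b, False, j0)
    if len(r1 | r2) < 4 or len(c1 | c2) < 4:
        print("Found after", attempt + 1, "attempts: I_inf =", sorted(r1 | r2),
              "J_inf =", sorted(c1 | c2))
        break
else:
    print("No example found in 10000 attempts")
```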

Solitaire