University of Modena and Reggio Emilia
Why bother?
Social pressure: tedious mental work and unskilled, time-consuming activities might be delegated to artificial systems
Who does what?
Mostly, this is no longer an issue: artificial systems are generally very welcome to do whatever we like them to do
Who takes the decision?
Autonomy is at least as much about deliberation as it is about action
Is there just one single notion of autonomy?
How do we model autonomy in artificial / computational systems?
How do we engineer autonomy in artificial / computational systems?
From [Unmanned Systems Integrated Roadmap FY 2011-2036]:
Systems are self-directed toward a goal in that they do not require outside control, but rather are governed by laws and strategies that direct their behavior
These control algorithms are created and tested by teams of human operators and software developers. However, if machine learning is utilized, autonomous systems can develop modified strategies for themselves by which they select their behavior
The level of autonomy of a system determines how much and how often humans need to interact with, or intervene in, the autonomous system:
Levels of driving automation for cars, from [SAE J3016, 2018], range from Level 0 (no driving automation) through Level 1 (driver assistance), Level 2 (partial automation), Level 3 (conditional automation), and Level 4 (high automation), up to Level 5 (full driving automation)
ADS: automated driving system, DDT: dynamic driving task, ODD: operational design domain, OEDR: object and event detection and response
Autonomy related to decision making:
centralised decision making, as in service-oriented and object-based applications, achieves the global goal by design, via delegation of control
distributed decision making, as in agent-based applications, achieves the global goal via delegation of responsibility (agents can say no)
A software agent is a component that is goal-oriented, autonomous, and situated in an environment
Agents can autonomously decide to activate towards the pursuit of their goal, without the need for any specific event or solicitation:
proactivity is a sort of extreme expression of autonomy
Clearly, it is not always black and white, as modern objects have features that can make them resemble agents
In effect, several systems for "agent-oriented programming" can be considered simply as advanced tools for object-oriented programming or for more dynamic forms of service-oriented programming
For many researchers, agents do not simply have to be goal-oriented, autonomous, and situated: they also have to be "intelligent", which typically means integrating "artificial intelligence" techniques: neural networks, logic-based reasoning, conversational capabilities (interacting via natural language), ...
Treating a program as if it were intelligent (e.g. "the program wants the input in a different format") is called the intentional stance, and it is helpful for us as programmers to think this way
The intentional stance leads us to program agents at the knowledge level (Newell), which means reasoning about programs in terms of goals, actions, and the knowledge connecting them, rather than data structures and procedures
A possible definition of intelligence is:
"the capability to act purposefully towards the achievement of goals"
Hence for an agent to be regarded (observed) as being intelligent it is enough simply to know how to achieve a given goal, which implies some sort of reasoning
E.g. a "smart" thermostat:
if temp < 27:
    temp.increase()
elif temp > 27:
    temp.decrease()
Too simplistic, too low level of abstraction
Imagine a self-driving car: could it be programmed as a set of such simple condition-action rules?
Obviously not: it requires both theoretical and practical reasoning
Our focus will be on the latter
Newell's Principle of rationality:
if an agent has the knowledge that an action will lead to the accomplishment of one of its goals (or to the maximization of its utility), then it will select that action (Game Theory / Decision Theory)
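As a toy illustration of this principle, here is a minimal Java sketch (all action names and utility values are hypothetical, not taken from any agent framework): the agent selects, among the applicable actions, one that maximises its utility.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.ToDoubleFunction;

public class RationalChooser {

    // Newell's principle of rationality: select the action with maximal utility, if any
    public static <A> Optional<A> selectAction(List<A> applicableActions, ToDoubleFunction<A> utility) {
        return applicableActions.stream().max(Comparator.comparingDouble(utility));
    }

    public static void main(String[] args) {
        // Hypothetical actions of a thermostat-like agent
        List<String> actions = List.of("sprayHot", "sprayCold", "doNothing");
        // Hypothetical utilities: how much each action contributes to reaching the target temperature
        Map<String, Double> u = Map.of("sprayHot", -0.5, "sprayCold", 0.9, "doNothing", 0.1);
        selectAction(actions, u::get).ifPresent(a -> System.out.println("Selected action: " + a));
    }
}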
BDI is a very successful and general model to "think" about software agents:
The agent has beliefs (information about the world), desires (goals it would like to achieve), and intentions (desires it has committed to pursue, through plans)
What types of architectures can we conceive for agents?
JADE stands for Java Agent DEvelopment Framework
JADE is a notable example of a distributed, object-based, agent-oriented infrastructure, hence an interesting example of how to face a design / programming paradigm shift
Despite being Java objects, JADE agents have a wide range of features promoting their autonomy: each agent runs in its own thread of control, owns a private message queue for asynchronous communication, and schedules its own behaviours
A behaviour can be seen as an activity to perform with the goal of completing a task
It can represent a proactive activity, started by the agent on its own, as well as a reactive activity, performed in response to some events (timeouts, messages, etc.)
Behaviours are Java objects executed cooperatively (non-preemptively) by a round-robin scheduler within the agent's single thread of control
Get the code here: https://github.com/smarianimore/phdcourse-2020-jade
import jade.core.Agent;
import jade.core.behaviours.FSMBehaviour;
import jade.core.behaviours.OneShotBehaviour;

public class FSMAgent extends Agent {
// State names
private static final String STATE_A = "A";
private static final String STATE_B = "B";
private static final String STATE_C = "C";
private static final String STATE_D = "D";
private static final String STATE_E = "E";
private static final String STATE_F = "F";
protected void setup() {
FSMBehaviour fsm = new FSMBehaviour(this) {
public int onEnd() {
System.out.println("FSM behaviour completed.");
myAgent.doDelete();
return super.onEnd();
}
};
// Register state A (first state)
fsm.registerFirstState(new NamePrinter(), STATE_A);
// Register state B
fsm.registerState(new NamePrinter(), STATE_B);
// Register state C
fsm.registerState(new RandomGenerator(3), STATE_C);
// Register state D
fsm.registerState(new NamePrinter(), STATE_D);
// Register state E
fsm.registerState(new RandomGenerator(4), STATE_E);
// Register state F (final state)
fsm.registerLastState(new NamePrinter(), STATE_F);
// Register the transitions (the numeric labels are the onEnd() exit values of the RandomGenerator states)
fsm.registerDefaultTransition(STATE_A, STATE_B);
fsm.registerDefaultTransition(STATE_B, STATE_C);
fsm.registerTransition(STATE_C, STATE_C, 0);
fsm.registerTransition(STATE_C, STATE_D, 1);
fsm.registerTransition(STATE_C, STATE_A, 2);
fsm.registerDefaultTransition(STATE_D, STATE_E);
fsm.registerTransition(STATE_E, STATE_F, 3);
fsm.registerDefaultTransition(STATE_E, STATE_B);
addBehaviour(fsm);
}
/**
Inner class NamePrinter.
This behaviour just prints its name
*/
private class NamePrinter extends OneShotBehaviour {
public void action() {
System.out.println("Executing behaviour "+getBehaviourName());
}
}
/**
Inner class RandomGenerator.
This behaviour prints its name and exits with a random value
between 0 and a given integer value
*/
private class RandomGenerator extends NamePrinter {
private int maxExitValue;
private int exitValue;
private RandomGenerator(int max) {
super();
maxExitValue = max;
}
public void action() {
super.action(); // prints the behaviour name (inherited from NamePrinter)
exitValue = (int) (Math.random() * maxExitValue);
System.out.println("Exit value is "+exitValue);
}
public int onEnd() {
return exitValue;
}
}
}
Jason is a BDI agent programming language and development framework
The Jason language is a variant of AgentSpeak, whose main constructs are beliefs, goals, and plans
Jason agent reasoning cycle: perceive the environment and update the belief base, generate events, select an event, find the relevant and applicable plans, commit to one of them as an intention, execute one step of an intention
Get the code here: https://gitlab.com/pika-lab/courses/as/ay1920/jason-agents
// initial belief: the target temperature
target(20).

// whenever a new temperature percept arrives, pursue the regulation goal
+temperature(X) <- !regulate_temperature(X).

// plan: more than 0.5 degrees above target, cool down
+!regulate_temperature(X) : target(Y) & X - Y > 0.5 <-
    .print("Temperature is ", X, ": need to cool down");
    spray_air(cold).

// plan: more than 0.5 degrees below target, warm up
+!regulate_temperature(X) : target(Y) & Y - X > 0.5 <-
    .print("Temperature is ", X, ": need to warm up");
    spray_air(hot).

// plan: within 0.5 degrees of the target, nothing to do
+!regulate_temperature(X) : target(Y) & Z = X - Y & Z >= -0.5 & Z <= 0.5 <-
    .print("Temperature is ", X, ": it's ok.").

// goal-failure handler: if spraying air fails, retry the goal
-!regulate_temperature(X) <-
    .print("Failed to spray air. Retrying.");
    !regulate_temperature(X).
(kindly provided by Giovanni Ciatto)
What is adaptiveness?
The capability of a system and/or of its individuals to change behaviour according to contingent situations
Autonomy (of decision making) enables adaptiveness, and adaptiveness is a way of exhibiting autonomy, in turn!
Can be achieved via context-awareness: knowledge of the environment and of the conditions under which the agent operates
As situated components, agents are inherently context-aware!
Where agents cannot reach, suitable middleware provides desired services...
Multiagent Systems (MAS):
systems or "organizations" of autonomous agents, possibly distributed over multiple computers and/or in an environment, possibly belonging to different stakeholders / organization (federated / open systems), collaborating and/or competing to, respectively, achieve a shared global goal or maximise their own utility, possibly interacting with a computational or physical environment (that could also mediate interactions)
MAS are "paradigmatic" of modern distributed systems:
made up of decentralized autonomous components (sensors, peers, mobile devices, etc.) interacting with each other in complex ways (P2P networks, MANETs, pervasive computing environments) and situated in some environment, computational or physical
In a MAS, agents participate by providing the capability of achieving a goal in autonomy (vs. objects/services offering interfaces)
Assume we have two or more agents, each choosing one action from a set of available actions, where the combination of the chosen actions determines the outcome
How should agents decide which action to carry out (their strategy), assuming they cannot communicate?
To reply we need game theory:
"analysis of strategies for dealing with competitive situations where the outcome of a participant's choice of action depends critically on the actions of other participants"
Notice that agents influence each other even if they do not communicate, as long as they act within a shared environment!
Agents are rational (recall Newell's principle?): they favour actions that maximise their utility $u$; this behaviour of rational agents is called preference
Utility functions are used to model preferences: $$u_{i} : \Omega \rightarrow \mathbb{R} \; \text{where} \; \Omega = \{\omega_{1}, \dots, \omega_{M}\}$$
Utility functions allow us to define preference orderings over outcomes (hence, over actions): $$\omega \succeq_i \omega' \iff u_i(\omega) \geq u_i(\omega')$$ as utility $u_i$, and hence preferences, may vary from agent to agent
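A minimal Java sketch of these two definitions (outcomes and utility values are made up for illustration): a utility function over $\Omega$ induces a preference ordering by sorting outcomes by utility.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Preferences {
    public static void main(String[] args) {
        // Omega = {w1, w2, w3}: hypothetical outcomes
        List<String> omega = List.of("w1", "w2", "w3");
        // u_i : Omega -> R (illustrative values)
        Map<String, Double> ui = Map.of("w1", 0.2, "w2", 1.5, "w3", -0.7);
        // The preference ordering is obtained by comparing utilities
        List<String> ordering = omega.stream()
                .sorted(Comparator.comparingDouble((String w) -> ui.get(w)).reversed())
                .collect(Collectors.toList());
        System.out.println("Most to least preferred: " + ordering); // [w2, w1, w3]
        // w is preferred to w' iff u_i(w) >= u_i(w')
        System.out.println("w2 preferred to w1? " + (ui.get("w2") >= ui.get("w1"))); // true
    }
}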
Utility is one of the many things that can work as a driver to adaptation: the lower the utility of an action, the greater the chance for the agent to change action / behaviour / goal...
As the utility of an action may vary dynamically with the context (environment), the agent's adaptation may vary dynamically, too
If multiple agents either act (almost) simultaneously or have no means to observe each other's actions, the outcome of their behaviours will still be some combination of each individual's outcome
The criteria according to which actions are chosen reflect the agent's strategy (e.g. rational vs random)
In zero-sum games the utilities add up to $0$: $$u_i(\omega) + u_j(\omega) = 0 \; \forall \omega \in \Omega$$
Zero-sum games are strictly competitive: no agent can gain something unless another loses something (basically, every win-lose game)
In these games there is no rational choice without information about other players' strategies!
Real-life situations are usually non-zero-sum: there is always room for some "compromise"
Check out this scene from the movie "A Beautiful Mind"
Nash equilibrium: each player's predicted strategy is the best response to the predicted strategies of other players
In other words, two strategies $s_1$ and $s_2$ are in Nash equilibrium if: under the assumption that one agent plays $s_1$, the other can do no better than play $s_2$, and under the assumption that the other plays $s_2$, the first can do no better than play $s_1$
Unfortunately there are both games with no Nash equilibrium in pure strategies (e.g. some zero-sum games, such as matching pennies) and games with more than one Nash equilibrium (hence no trivial strategy to adopt)
In games where all players have a dominant strategy, the combination of those dominant strategies is a Nash equilibrium
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating
We can measure utilities in terms of, e.g., years of prison avoided with respect to the worst case of 4 years (any other measure would do)
The individual rational action is defect: it guarantees a payoff of no worse than 2, whereas cooperating guarantees a payoff of at most 1
This apparent paradox is the fundamental problem of multiagent interactions: it appears to imply that cooperation will not occur in societies of self-interested agents...
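This reasoning can be checked mechanically. Below is a Java sketch using one common textbook payoff matrix (3 for mutual cooperation, 2 for mutual defection, 5/0 for unilateral defection; the slide's numbers may differ): it verifies that defect dominates cooperate and enumerates the pure-strategy Nash equilibria, finding only (defect, defect).

public class PrisonersDilemma {
    // actions: 0 = cooperate, 1 = defect
    // payoff[rowAction][colAction]; values are one common textbook choice, assumed for illustration
    static final int[][] ROW = { {3, 0}, {5, 2} }; // row player's payoffs
    static final int[][] COL = { {3, 5}, {0, 2} }; // column player's payoffs (symmetric game)

    public static void main(String[] args) {
        // defect dominates cooperate: it pays at least as much whatever the opponent does
        boolean defectDominates = ROW[1][0] >= ROW[0][0] && ROW[1][1] >= ROW[0][1];
        System.out.println("Defect dominates cooperate: " + defectDominates);

        // enumerate pure-strategy Nash equilibria: no player gains by deviating unilaterally
        for (int r = 0; r <= 1; r++) {
            for (int c = 0; c <= 1; c++) {
                boolean rowBest = ROW[r][c] >= ROW[1 - r][c];
                boolean colBest = COL[r][c] >= COL[r][1 - c];
                if (rowBest && colBest) {
                    System.out.println("Nash equilibrium: (" + name(r) + ", " + name(c) + ")");
                }
            }
        }
    }

    static String name(int action) { return action == 0 ? "cooperate" : "defect"; }
}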
A possible answer: play the game repeatedly
Suppose you play iterated prisoner's dilemma against a set of opponents: what strategy should you choose, so as to maximize your overall (long-term) payoff?
Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner's dilemma: many different agents using different strategies, interacting hundreds of times with other agents
Strategies included, e.g.: ALL-D (always defect), RANDOM, TIT-FOR-TAT (cooperate on the first round, then do whatever the opponent did on the previous round), TESTER, JOSS
In the long run, TIT-FOR-TAT is best strategy: cooperation wins!
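A minimal Java simulation of the iterated game suggests why (same hypothetical payoffs as in the previous sketch; only three of the many tournament strategies are shown): TIT-FOR-TAT is exploited at most once by a defector, yet cooperates whenever the opponent does.

public class IteratedPD {
    static final int C = 0, D = 1;
    static final int[][] PAYOFF = { {3, 0}, {5, 2} }; // PAYOFF[myMove][opponentMove], assumed values

    interface Strategy { int next(int opponentLastMove); } // opponentLastMove = -1 on the first round

    public static void main(String[] args) {
        Strategy titForTat = last -> last < 0 ? C : last; // cooperate first, then copy the opponent
        Strategy allDefect = last -> D;
        Strategy allCooperate = last -> C;

        System.out.println("TFT vs TFT:   " + play(titForTat, titForTat, 100));    // 300: mutual cooperation
        System.out.println("TFT vs ALL-D: " + play(titForTat, allDefect, 100));    // 198: exploited only once
        System.out.println("TFT vs ALL-C: " + play(titForTat, allCooperate, 100)); // 300: never exploits
    }

    // returns the first player's total payoff over the given number of rounds
    static int play(Strategy a, Strategy b, int rounds) {
        int total = 0, lastA = -1, lastB = -1;
        for (int i = 0; i < rounds; i++) {
            int moveA = a.next(lastB), moveB = b.next(lastA);
            total += PAYOFF[moveA][moveB];
            lastA = moveA;
            lastB = moveB;
        }
        return total;
    }
}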
Emerging "rules":
The Axelrod Tournament shows that a group of agents can change behaviour (i.e., strategy) to eventually learn the most suitable way of behaving so as to maximise their own success and/or the overall success of the group
A whole research topic on its own: Multi-agent Reinforcement Learning
Long story short: no. However...
Unlike in game theory, let's now assume that agents can interact:
As they have a means to affect each other's actions and beliefs, they can strategically act based on what the other agents do (observation), and agree on common courses of action
Interactions may imply, e.g., coordination, cooperation, competition, negotiation, ...
In distributed algorithms, too, there is a need to reach agreement on how to act or on a common perspective of the world (leader election, mutual exclusion, validation of a blockchain transaction, etc.)
However, there is no concept of "goal", "utility" of actions, etc.: either the algorithm works, or it fails
An interaction protocol defines the rules of the encounter between agents: the set of available interaction actions, their dependencies, how they affect the "state of the world", etc.
In this context, an agent's strategy concerns the freedom to decide what to do at each step of the protocol, among the admissible actions
E.g. "battle of the sexes":
Designing an interaction protocol implies devising a sequence of interactions with desired properties
Strategy: as soon as the counter-proposal reaches a sufficiently high utility, agree
"Negotiation is an economically-inspired form of distributed decision making where two or more partners jointly search a space of possible solutions to reach a common consensus" (P. Maes)
Applications
import jade.core.Agent;
import jade.lang.acl.ACLMessage;
import jade.proto.ContractNetInitiator;
import java.util.Vector;

public class Initiator extends Agent {
@Override
protected void setup() {
this.addBehaviour(new ContractNetInitiator(this, new ACLMessage(ACLMessage.CFP)) {
@Override
protected Vector prepareCfps(ACLMessage cfp) {
/* prepare ACL message Call for Proposals */
return super.prepareCfps(cfp);
}
@Override
protected void handlePropose(ACLMessage propose, Vector acceptances) {
/* handle proposal coming from responders */
super.handlePropose(propose, acceptances);
}
@Override
protected void handleRefuse(ACLMessage refuse) {
/* handle refusals coming from responders */
super.handleRefuse(refuse);
}
@Override
protected void handleAllResponses(Vector responses, Vector acceptances) {
/* when all replies to CFP have been collected, do something (select winner) */
super.handleAllResponses(responses, acceptances);
}
@Override
protected void handleInform(ACLMessage inform) {
/* handle confirmations of tasks carried out coming from responders */
super.handleInform(inform);
}
});
}
}
Get the code here: https://github.com/smarianimore/phdcourse-2020-jade
import jade.core.Agent;
import jade.domain.FIPANames;
import jade.domain.FIPAAgentManagement.FailureException;
import jade.domain.FIPAAgentManagement.NotUnderstoodException;
import jade.domain.FIPAAgentManagement.RefuseException;
import jade.lang.acl.ACLMessage;
import jade.lang.acl.MessageTemplate;
import jade.proto.ContractNetResponder;

public class Responder extends Agent {
@Override
protected void setup() {
MessageTemplate template = MessageTemplate.and(
MessageTemplate.MatchProtocol(FIPANames.InteractionProtocol.FIPA_CONTRACT_NET),
MessageTemplate.MatchPerformative(ACLMessage.CFP) );
this.addBehaviour(new ContractNetResponder(this, template) {
@Override
protected ACLMessage handleCfp(ACLMessage cfp) throws RefuseException, FailureException, NotUnderstoodException {
/* react to reception of call for proposals */
return super.handleCfp(cfp);
}
@Override
protected ACLMessage handleAcceptProposal(ACLMessage cfp, ACLMessage propose, ACLMessage accept) throws FailureException {
/* react to acceptance of own proposal (do the task, then inform result) */
return super.handleAcceptProposal(cfp, propose, accept);
}
});
}
}
Get the code here: https://github.com/smarianimore/phdcourse-2020-jade
When agents have competing interests and no interest in cooperating, the only way to obtain cooperation is to "pay" for the actions/tasks/resources that agents provide to others
Auctions are negotiation mechanisms that determine the value of a good/resource/action to be "sold" by an offering (seller) agent to buyer agent(s)
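As a sketch of one simple such mechanism, here is a first-price sealed-bid auction in Java (bids are made up; other auction types, e.g. English or Vickrey, change the bidding rules and the price paid): the highest bidder wins and pays its own bid.

import java.util.Map;

public class SealedBidAuction {
    public static void main(String[] args) {
        // Hypothetical sealed bids for a resource
        Map<String, Double> bids = Map.of("agentA", 12.0, "agentB", 15.5, "agentC", 9.0);

        // First-price sealed-bid: the highest bidder wins and pays its own bid
        Map.Entry<String, Double> winner = bids.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .orElseThrow();

        System.out.println("Winner: " + winner.getKey() + ", pays " + winner.getValue());
    }
}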
All the forms of direct interaction seen so far can be replicated as indirect ones, based on some sort of environment mediation
Get the code here: https://gitlab.com/pika-lab/courses/ds/aa1920/lab-02
public class SteAgent extends AbstractTucsonAgent {
public static void main(String[] args) throws TucsonInvalidAgentIdException {
new SteAgent().go();
}
public SteAgent() throws TucsonInvalidAgentIdException {
super("ste_agent");
}
@Override
protected void main() {
try {
SynchACC helloOps = getContext();
final TucsonTupleCentreId defaultTC = new TucsonTupleCentreId("default", "localhost", "20504");
final LogicTuple helloTuple = LogicTuple.parse("msg(gio,hello)");
final LogicTuple steTemplate = LogicTuple.parse("msg(ste,_)");
helloOps.out(defaultTC, helloTuple, Long.MAX_VALUE); // insert tuple msg(gio,hello) into the tuple centre
helloOps.in(defaultTC, steTemplate, Long.MAX_VALUE); // block until a tuple matching msg(ste,_) appears, then remove it
} catch (final OperationTimeOutException | TucsonInvalidTupleCentreIdException | InvalidLogicTupleException | TucsonOperationNotPossibleException | UnreachableNodeException e) {
e.printStackTrace();
}
}
}
(kindly provided by Giovanni Ciatto)
Get the code here: https://gitlab.com/pika-lab/courses/ds/aa1920/lab-02
public class GioAgent extends AbstractTucsonAgent {
public static void main(String[] args) throws TucsonInvalidAgentIdException {
new GioAgent().go();
}
public GioAgent() throws TucsonInvalidAgentIdException {
super("gio_agent");
}
@Override
protected void main() {
try {
SynchACC helloOps = getContext();
final TucsonTupleCentreId defaultTC = new TucsonTupleCentreId("default", "localhost", "20504");
final LogicTuple worldTuple = LogicTuple.parse("msg(ste,world)");
final LogicTuple gioTemplate = LogicTuple.parse("msg(gio,_)");
helloOps.in(defaultTC, gioTemplate, Long.MAX_VALUE); // block until a tuple matching msg(gio,_) appears, then remove it
helloOps.out(defaultTC, worldTuple, Long.MAX_VALUE); // insert tuple msg(ste,world) into the tuple centre
} catch (final OperationTimeOutException | TucsonInvalidTupleCentreIdException | InvalidLogicTupleException | TucsonOperationNotPossibleException | UnreachableNodeException e) {
e.printStackTrace();
}
}
}
(kindly provided by Giovanni Ciatto)
In several cases, ensembles of very simple, purely "reactive" components may globally exhibit adaptive behaviour as a system (e.g. single cell vs. organism)
In these cases, adaptiveness is a capability that emerges not from individual behaviour in isolation, but from individuals' interactions (nature, medium, rate, ...)
NetLogo: a programmable modeling environment for simulating complex systems ($\approx$ many individuals, many interactions, emergent behaviour)
Either download or use online: https://ccl.northwestern.edu/netlogo/
Autonomous systems already are "among us"
Individual behaviour is only half of the story: interactions, hence collective behaviour, are equally relevant (or even more so)
Many ways to design individual autonomy: BDI, game theory, learning (not explored here), ...
Many ways to design system autonomy: interaction protocols, mediated interaction, game theory, ...
Engineering autonomous yet predictable, controllable, and understandable systems is still an open challenge