Actor model

From Wikipedia, the free encyclopedia
The Actor model in computer science is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent digital computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received. The Actor model originated in 1973.[1] It has been used both as a framework for a theoretical understanding of computation, and as the theoretical basis for severalpractical implementations of concurrent systems.[2] The relationship of the model to other work is discussed in Indeterminacy in concurrent computation and Actor model and process calculi.

Contents

History

According to Carl Hewitt, unlike previous models of computation, the Actor model was inspired by physics including general relativity and quantum mechanics. It was also influenced by the programming languages LispSimula and early versions of Smalltalk, as well as capability-based systems and packet switching. Its development was "motivated by the prospect of highly parallel computing machines consisting of dozens, hundreds or even thousands of independent microprocessors, each with its own local memory and communications processor, communicating via a high-performance communications network."[3] Since that time, the advent of massive concurrency through multi-core computer architectures has revived interest in the Actor model.
Following Hewitt, Bishop, and Steiger's 1973 publication, Irene Greif developed an operational semantics for the Actors model as part of her doctoral research.[4] Two years later, Henry Baker and Hewitt published a set of axiomatic laws for Actor systems.[5] Other major milestones include William Clinger's dissertation, in 1981, introducing adenotational semantics based on power domains,[3] and Gul Agha's 1985 dissertation which further developed a transition-based semantic model complementary to Clinger's.[6]This resulted in the full development of actor model theory.
Major software implementation work was done by Russ Atkinson, Giuseppe Attardi, Henry Baker, Gerry Barber, Peter Bishop, Peter de Jong, Ken Kahn, Henry Lieberman, Carl Manning, Tom Reinhardt, Richard Steiger, and Dan Theriault, in the Message Passing Semantics Group at Massachusetts Institute of Technology (MIT). Research groups led by Chuck Seitz at California Institute of Technology (Caltech) and Bill Dally at MIT constructed computer architectures that further developed the message passing in the model. See Actor model implementation.
Research on the Actor model has been carried out at Caltech Computer Science, Kyoto University Tokoro Laboratory, MCCMIT Artificial Intelligence LaboratorySRI,Stanford UniversityUniversity of Illinois at Urbana-Champaign,[7] Pierre and Marie Curie University (University of Paris 6), University of PisaUniversity of Tokyo Yonezawa Laboratory and elsewhere.

Fundamental concepts

The Actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages, but differs in that object-oriented software is typically executed sequentially, while the Actor model is inherently concurrent.
An actor is a computational entity that, in response to a message it receives, can concurrently:
  • send a finite number of messages to other actors;
  • create a finite number of new actors;
  • designate the behavior to be used for the next message it receives.
There is no assumed sequence to the above actions and they could be carried out in parallel.
Decoupling the sender from communications sent was a fundamental advance of the Actor model enabling asynchronous communication and control structures as patterns ofpassing messages.[8]
Recipients of messages are identified by address, sometimes called "mailing address". Thus an actor can only communicate with actors whose addresses it has. It can obtain those from a message it receives, or if the address is for an actor it has itself created.
The Actor model is characterized by inherent concurrency of computation within and among actors, dynamic creation of actors, inclusion of actor addresses in messages, and interaction only through direct asynchronous message passing with no restriction on message arrival order.

Formal systems

Over the years, several different formal systems have been developed which permit reasoning about systems in the Actor model. These include:
There are also formalisms that are not fully faithful to the Actor model in that they do not formalize the guaranteed delivery of messages including the following (See Attempts to relate Actor semantics to algebra and linear logic):

Applications

The Actors model can be used as a framework for modelling, understanding, and reasoning about, a wide range of concurrent systems. For example:
  • Electronic mail (e-mail) can be modeled as an Actor system. Accounts are modeled as Actors and email addresses as Actor addresses.
  • Web Services can be modeled with SOAP endpoints modeled as Actor addresses.
  • Objects with locks (e.g. as in Java and C#) can be modeled as a Serializer, provided that their implementations are such that messages can continually arrive (perhaps by being stored in an internal queue). A serializer is an important kind of Actor defined by the property that it is continually available to the arrival of new messages; every message sent to a serializer is guaranteed to arrive.
  • Testing and Test Control Notation (TTCN), both TTCN-2 and TTCN-3, follows Actor model rather closely. In TTCN, Actor is a test component: either parallel test component (PTC) or main test component (MTC). Test components can send and receive messages to and from remote partners (peer test components or test system interface), the latter being identified by its address. Each test component has a behaviour tree bound to it; test components run in parallel and can be dynamically created by parent test components. Built-in language constructs allow the definition of actions to be taken when an expected message is received from the internal message queue, like sending a message to another peer entity or creating new test components.

Message-passing semantics

The Actor model is about the semantics of message passing.

Unbounded nondeterminism controversy

Arguably, the first concurrent programs were interrupt handlers. During the course of its normal operation, a computer needed to be able to receive information from outside (characters from a keyboard, packets from a network, etc.). So when the information arrived, execution of the computer was "interrupted" and special code called an interrupt handler was called to put the information in a buffer where it could be subsequently retrieved.
In the early 1960s, interrupts began to be used to simulate the concurrent execution of several programs on a single processor.[15] Having concurrency with shared memorygave rise to the problem of concurrency control. Originally, this problem was conceived as being one of mutual exclusion on a single computer. Edsger Dijkstra developedsemaphores and later, between 1971 and 1973,[16] Tony Hoare[17] and Per Brinch Hansen[18] developed monitors to solve the mutual exclusion problem. However, neither of these solutions provided a programming-language construct that encapsulated access to shared resources. This encapsulation was later accomplished by the serializer construct([Hewitt and Atkinson 1977, 1979] and [Atkinson 1980]).
The first models of computation (e.g. Turing machines, Post productions, the lambda calculusetc.) were based on mathematics and made use of a global state to represent a computational step (later generalized in [McCarthy and Hayes 1969] and [Dijkstra 1976] see Event orderings versus global state). Each computational step was from one global state of the computation to the next global state. The global state approach was continued in automata theory for finite state machines and push down stack machines, including their nondeterministic versions. Such nondeterministic automata have the property of bounded nondeterminism; that is, if a machine always halts when started in its initial state, then there is a bound on the number of states in which it halts.
Edsger Dijkstra further developed the nondeterministic global state approach. Dijkstra's model gave rise to a controversy concerning unbounded nondeterminism. Unbounded nondeterminism (also called unbounded indeterminacy), is a property of concurrency by which the amount of delay in servicing a request can become unbounded as a result of arbitration of contention for shared resources while still guaranteeing that the request will eventually be serviced. Hewitt argued that the Actor model should provide the guarantee of service. In Dijkstra's model, although there could be an unbounded amount of time between the execution of sequential instructions on a computer, a (parallel) program that started out in a well defined state could terminate in only a bounded number of states [Dijkstra 1976]. Consequently, his model could not provide the guarantee of service. Dijkstra argued that it was impossible to implement unbounded nondeterminism.
Hewitt argued otherwise: there is no bound that can be placed on how long it takes a computational circuit called an arbiter to settle (see metastability in electronics).[19]Arbiters are used in computers to deal with the circumstance that computer clocks operate asynchronously with input from outside, e.g. keyboard input, disk access, network input, etc. So it could take an unbounded time for a message sent to a computer to be received and in the meantime the computer could traverse an unbounded number of states.
The Actor Model features unbounded nondeterminism which was captured in a mathematical model by Will Clinger using domain theory.[3] There is no global state in the Actor model.

Direct communication and asynchrony

Messages in the Actor model are not necessarily buffered. This was a sharp break with previous approaches to models of concurrent computation. The lack of buffering caused a great deal of misunderstanding at the time of the development of the Actor model and is still a controversial issue. Some researchers argued that the messages are buffered in the "ether" or the "environment". Also, messages in the Actor model are simply sent (like packets in IP); there is no requirement for a synchronous handshake with the recipient.

Actor creation plus addresses in messages means variable topology

A natural development of the Actor model was to allow addresses in messages. Influenced by packet switched networks [1961 and 1964], Hewitt proposed the development of a new model of concurrent computation in which communications would not have any required fields at all: they could be empty. Of course, if the sender of a communication desired a recipient to have access to addresses which the recipient did not already have, the address would have to be sent in the communication.
A computation might need to send a message to a recipient from which it would later receive a response. The way to do this is to send a communication which has the message along with the address of another actor called the resumption (sometimes also called continuation or stack frame) along with the message. The recipient could then cause a response message to be sent to the resumption.
Actor creation plus the inclusion of the addresses of actors in messages means that Actors have a potentially variable topology in their relationship to one another much as the objects in Simula also had a variable topology in their relationship to one another.

Inherently concurrent

As opposed to the previous approach based on composing sequential processes, the Actor model was developed as an inherently concurrent model. In the Actor model sequentiality was a special case that derived from concurrent computation as explained in Actor model theory.

No requirement on order of message arrival

Hewitt argued against adding the requirement that messages must arrive in the order in which they are sent to the Actor. If output message ordering is desired, then it can be modeled by a queue Actor that provides this functionality. Such a queue Actor would queue the messages that arrived so that they could be retrieved in FIFO order. So if an Actor X sent a message M1 to an Actor Y, and later X sent another message M2 to Y, there is no requirement that M1 arrives at Y before M2.
In this respect the Actor model mirrors packet switching systems which do not guarantee that packets must be received in the order sent. Not providing the order of delivery guarantee allows packet switching to buffer packets, use multiple paths to send packets, resend damaged packets, and to provide other optimizations.
For example, Actors are allowed to pipeline the processing of messages. What this means is that in the course of processing a message M1, an Actor can designate the behavior to be used to process the next message, and then in fact begin processing another message M2 before it has finished processing M1. Just because an Actor is allowed to pipeline the processing of messages does not mean that it must pipeline the processing. Whether a message is pipelined is an engineering tradeoff. How would an external observer know whether the processing of a message by an Actor has been pipelined? There is no ambiguity in the definition of an Actor created by the possibility of pipelining. Of course, it is possible to perform the pipeline optimization incorrectly in some implementations, in which case unexpected behavior may occur.

Locality

Another important characteristic of the Actor model is locality.
Locality means that in processing a message an Actor can send messages only to addresses that it receives in the message, addresses that it already had before it received the message and addresses for Actors that it creates while processing the message. (But see Synthesizing Addresses of Actors.)
Also locality means that there is no simultaneous change in multiple locations. In this way it differs from some other models of concurrency, e.g., the Petri net model in which tokens are simultaneously removed from multiple locations and placed in other locations.

Composing Actor Systems

The idea of composing Actor systems into larger ones is an important aspect of modularity that was developed in Gul Agha's doctoral dissertation,[6] developed later by Gul Agha, Ian Mason, Scott Smith, and Carolyn Talcott.[9]

Behaviors

A key innovation was the introduction of behavior specified as a mathematical function to express what an Actor does when it processes a message including specifying a new behavior to process the next message that arrives. Behaviors provided a mechanism to mathematically model the sharing in concurrency.
Behaviors also freed the Actor model from implementation details, e.g., the Smalltalk-72 token stream interpreter. However, it is critical to understand that the efficient implementation of systems described by the Actor model require extensive optimization. See Actor model implementation for details.

Modeling other concurrency systems

Other concurrency systems (e.g. process calculi) can be modeled in the Actor model using a two-phase commit protocol.[20]

Computational Representation Theorem

There is a Computational Representation Theorem in the Actor model for systems which are closed in the sense that they do not receive communications from outside. The mathematical denotation denoted by a closed system S is constructed from an initial behavior S and a behavior-approximating function progressionS. These obtain increasingly better approximations and construct a denotation (meaning) for S as follows [Hewitt 2008; Clinger 1981]:
DenoteS ≡  \lim_{i \to \infty} progressionSi(⊥S)
In this way, S can be mathematically characterized in terms of all its possible behaviors (including those involving unbounded nondeterminism). Although DenoteS is not an implementation of S, it can be used to prove a generalization of the Church-Turing-Rosser-Kleene thesis [Kleene 1943]:
A consequence of the above theorem is that a finite Actor can nondeterministically respond with an uncountable number of different outputs.

Relationship to mathematical logic

The development of the Actor model has an interesting relationship to mathematical logic.[21] One of the key motivations for its development was to understand and deal with the control structure issues that arose in development of the Planner programming language. Once the Actor model was initially defined, an important challenge was to understand the power of the model relative to Robert Kowalski's thesis that "computation can be subsumed by deduction". Kowalski's thesis turned out to be false for the concurrent computation in the Actor model (see Indeterminacy in concurrent computation). This result is still somewhat controversial and it reversed previous expectations because Kowalski's thesis is true for sequential computation and even some kinds of parallel computation, e.g. the lambda calculus.
Nevertheless attempts were made to extend logic programming to concurrent computation. However, Hewitt and Agha [1991] claimed that the resulting systems were not deductive in the following sense: computational steps of the concurrent logic programming systems do not follow deductively from previous steps (see Indeterminacy in concurrent computation). Recently, Logic Programming has been integrated into the Actor Model in a way that maintains logical semantics.[22]

Migration

Migration in the Actor model is the ability of Actors to change locations. E.g., in his dissertation, Aki Yonezawa modeled a post office that customer Actors could enter, change locations within while operating, and exit. An Actor that can migrate can be modeled by having a location Actor that changes when the Actor migrates. However the faithfulness of this modeling is controversial and the subject of research.[citation needed]

Security

The security of Actors can be protected in the following ways:

Synthesizing addresses of actors

A delicate point in the Actor model is the ability to synthesize the address of an Actor. In some cases security can be used to prevent the synthesis of addresses (see Security). However, if an Actor address is simply a bit string then clearly it can be synthesized although it may be difficult or even infeasible to guess the address of an Actor if the bit strings are long enough. SOAP uses a URL for the address of an endpoint where an Actor can be reached. Since a URL is a character string, it can clearly be synthesized although encryption can make it virtually impossible to guess.
Synthesizing the addresses of Actors is usually modeled using mapping. The idea is to use an Actor system to perform the mapping to the actual Actor addresses. For example, on a computer the memory structure of the computer can be modeled as an Actor system that does the mapping. In the case of SOAP addresses, it's modeling the DNS and rest of the URL mapping.

Contrast with other models of message-passing concurrency

Robin Milner's initial published work on concurrency[23] was also notable in that it was not based on composing sequential processes. His work differed from the Actor model because it was based on a fixed number of processes of fixed topology communicating numbers and strings using synchronous communication. The original Communicating Sequential Processes model[24] published by Tony Hoare differed from the Actor model because it was based on the parallel composition of a fixed number of sequential processes connected in a fixed topology, and communicating using synchronous message-passing based on process names (see Actor model and process calculi history). Later versions of CSP abandoned communication based on process names in favor of anonymous communication via channels, an approach also used in Milner's work on theCalculus of Communicating Systems and the π-calculus.
These early models by Milner and Hoare both had the property of bounded nondeterminism. Modern, theoretical CSP ([Hoare 1985] and [Roscoe 2005]) explicitly provides unbounded nondeterminism.

Influence

The Actor Model has been influential on both theory development and practical software development.

Theory

The Actor Model has influenced the development of the Pi-calculus and subsequent Process calculi. In his Turing lecture, Robin Milner wrote:[25]
Now, the pure lambda-calculus is built with just two kinds of thing: terms and variables. Can we achieve the same economy for a process calculus? Carl Hewitt, with his Actors model, responded to this challenge long ago; he declared that a value, an operator on values, and a process should all be the same kind of thing: an Actor.
This goal impressed me, because it implies the homogeneity and completeness of expression ... But it was long before I could see how to attain the goal in terms of an algebraic calculus...
So, in the spirit of Hewitt, our first step is to demand that all things denoted by terms or accessed by names--values, registers, operators, processes, objects--are all of the same kind of thing; they should all be processes.

Practice

The Actor Model has had extensive influence on commercial practice. For example Twitter has used actors for scalability.[26] Also, Microsoft has used the Actor Model in the development of its Asynchronous Agents Library.[27] There are numerous other Actor libraries listed in the Actor Libraries and Frameworks section below.

Current issues

According to Hewitt [2006], the Actor model faces issues in computer and communications architecture, concurrent programming languages, and Web Services including the following:
  • scalability: the challenge of scaling up concurrency both locally and nonlocally.
  • transparency: bridging the chasm between local and nonlocal concurrency. Transparency is currently a controversial issue. Some researchers have advocated a strict separation between local concurrency using concurrent programming languages (e.g. Java and C#) from nonlocal concurrency using SOAP for Web services. Strict separation produces a lack of transparency that causes problems when it is desirable/necessary to change between local and nonlocal access to Web Services (seedistributed computing).
  • inconsistency: Inconsistency is the norm because all very large knowledge systems about human information system interactions are inconsistent. This inconsistency extends to the documentation and specifications of very large systems (e.g. Microsoft Windows software, etc.), which are internally inconsistent.
Many of the ideas introduced in the Actor model are now also finding application in multi-agent systems for these same reasons [Hewitt 2006b 2007b]. The key difference is that agent systems (in most definitions) impose extra constraints upon the Actors, typically requiring that they make use of commitments and goals.
The Actor model is also being applied to client cloud computing.[28]

Early Actor researchers

There is a growing community of researchers working on the Actor Model as it is becoming commercially more important. Early Actor researchers included:[2]
  • Important contributions to the implementation of Actors have been made by: Bill Athas, Russ Atkinson, Beppe Attardi, Henry Baker, Gerry Barber, Peter Bishop, Nanette Boden, Jean-Pierre Briot, Bill Dally, Peter de Jong, Jessie Dedecker, Travis Desell, Ken Kahn, Carl Hewitt, Henry Lieberman, Carl Manning, Tom Reinhardt, Chuck Seitz, Richard Steiger, Dan Theriault, Mario Tokoro, Carlos Varela, Darrell Woelk.

Programming with Actors

A number of different programming languages employ the Actor model or some variation of it. These languages include:

Early Actor programming languages

Later Actor programming languages

Actor libraries and frameworks

Actor libraries or frameworks have also been implemented to permit actor-style programming in languages that don't have actors built-in. Among these frameworks are:

NameStatusLatest releaseLicenseLanguages
ActorActive2013-05-30MITJava
Actor FrameworkActive2013-06-05Apache 2.0.NET
AkkaActive2013-05-14Apache 2.0Java and Scala
Ateji PXActive ? ?Java
F# MailboxProcessorActivesame as F# (built-in core library)Apache LicenseF#
KorusActive2010-02-01GNU GPL 3Java
Kilim[41]Active2011-10-13[42]MITJava
ActorFoundry (based on Kilim)Active?2008-12-28 ?Java
ActorKitActive2011-09-14[43]BSDObjective-C
Cloud HaskellActive2012-11-07[44]BSDHaskell
NActActive2012-02-28LGPL 3.0.NET
RetlangActive?2011-05-18[45]New BSD.NET
JActorActive2013-01-22LGPLJava
JetlangActive2012-02-14 [46]New BSDJava
Haskell-ActorActive?2008New BSDHaskell
GParsActive2012-12-19Apache 2.0Groovy
PARLEYActive?2007-22-07GNU GPL 2.1Python
PykkaActive2012-12-12 [47]Apache 2.0Python
Termite SchemeActive?2009LGPLScheme (Gambit implementation)
TheronActive2012-08-20[48]MIT[49]C++
LibactorActive?2009GPL 2.0C
Actor-CPPActive2012-03-10[50]GPL 2.0C++
S4Active2011-11-28[51]Apache 2.0Java
libcppaActive2012-08-22[52]LGPL 3.0C++11
CelluloidActive2012-07-17[53]MITRuby
LabVIEW Actor FrameworkActive2012-03-01[54] ?LabVIEW
QP frameworks for real-time embedded systemsActive2012-09-07[55]GPL 2.0 and commercial (dual licensing)C and C++
libprocessActive2012-07-13Apache 2.0C++