I am wondering conceptually how load-balancing works on the EJB-level (not web session replication) with Java EE containers like Glassfish. From what I have gleaned your remote interface is a proxy that delegates your call to one of many servers you may have in an environment.
You are right. In Glassfish, the initial lookup tries to contact one of the servers listed in the jndi.properties file. That server then knows all the other nodes in the cluster, which will be used for round robin. The remote reference (proxy) does that for you transparently. Theoretically, nodes can be added to or removed from the cluster dynamically. See Glassfish RMI-IIOP load balancing and fail-over.
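To make the "proxy that delegates round robin" idea concrete, here is a minimal sketch in plain Java. It is not how Glassfish implements it: the Greeter interface, the roundRobin helper and the in-process "nodes" are all made up for illustration, and a real remote reference would go over RMI-IIOP rather than call local objects.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinProxyDemo {

    /** Hypothetical business interface, standing in for an EJB remote interface. */
    public interface Greeter {
        String greet(String name);
    }

    /** Builds a proxy that sends each call to the next node in the list (assumed helper). */
    @SuppressWarnings("unchecked")
    public static <T> T roundRobin(Class<T> iface, List<T> nodes) {
        AtomicInteger next = new AtomicInteger();
        InvocationHandler handler = (proxy, method, args) -> {
            // Pick the next node; floorMod keeps the index valid even after int overflow.
            int i = Math.floorMod(next.getAndIncrement(), nodes.size());
            return method.invoke(nodes.get(i), args);
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        // Two in-process objects play the role of cluster nodes.
        Greeter nodeA = name -> "node-a: hello " + name;
        Greeter nodeB = name -> "node-b: hello " + name;
        Greeter greeter = roundRobin(Greeter.class, List.of(nodeA, nodeB));
        for (int i = 0; i < 4; i++) {
            System.out.println(greeter.greet("world")); // alternates node-a, node-b
        }
    }
}
```

The caller only ever sees a Greeter; which node serves a given call is decided inside the proxy, which is the sense in which the balancing is transparent to the client.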
If things fail are they supposed to be able to "finish" on another server? I want to understand the basic theory behind this load balancing, why is it better than a bunch of servers all running a plain web application with session affinity on a load-balancer?
If the bean is stateless, you don't even need any kind of affinity and the request can be processed on any node. Each remote reference acts as a load balancer on its own.
If the bean is stateful, it's hairier. The cluster will try to maintain two replicas of the bean, and requests are dispatched to these two replicas. If one of the nodes crashes, the cluster will recreate another replica until the node is back. It's indeed similar to HTTP session replication with session affinity.
Unlike a web server, however, beans are transactional components. So if an exception occurs, the transaction is rolled back and the stateful bean is invalidated because its state may not be consistent any longer.
As pointed out by Pascal, there is some kind of fail-over for certain kinds of failure. If the node is not available, the request can be re-routed to another node. But if the node fails while the request is being processed, I don't know whether it can be resubmitted somewhere else.
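The "node is not available, so re-route" case can be sketched like this. Again, this is only an illustration under assumptions: NodeUnavailableException, Calculator and withFailover are hypothetical names, and the sketch only covers a node that refuses the call up front, not one that dies halfway through processing it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

public class FailoverProxyDemo {

    /** Hypothetical signal that a node could not be reached at all. */
    public static class NodeUnavailableException extends RuntimeException {
        public NodeUnavailableException(String message) {
            super(message);
        }
    }

    /** Hypothetical business interface. */
    public interface Calculator {
        int add(int a, int b);
    }

    /** Builds a proxy that tries each node in order until one accepts the call. */
    @SuppressWarnings("unchecked")
    public static <T> T withFailover(Class<T> iface, List<T> nodes) {
        InvocationHandler handler = (proxy, method, args) -> {
            NodeUnavailableException last = null;
            for (T node : nodes) {
                try {
                    return method.invoke(node, args);
                } catch (InvocationTargetException e) {
                    if (e.getCause() instanceof NodeUnavailableException) {
                        // Node unreachable: remember the error and try the next node.
                        last = (NodeUnavailableException) e.getCause();
                        continue;
                    }
                    // Any other failure is passed to the caller unchanged.
                    throw e.getCause();
                }
            }
            throw last != null ? last : new NodeUnavailableException("no nodes configured");
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        Calculator down = (a, b) -> {
            throw new NodeUnavailableException("node-a is down");
        };
        Calculator up = (a, b) -> a + b;
        Calculator calc = withFailover(Calculator.class, List.of(down, up));
        System.out.println(calc.add(2, 3)); // prints 5, served by the second node
    }
}
```

Only the "unavailable" failure triggers a re-route here; an application exception is reported straight to the caller, since re-running a half-executed request elsewhere is not safe in general.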
If you want to know more, I suggest you read Guide to GlassFish High Availability and Cluster Support in Glassfish.
If things fail are they supposed to be able to "finish" on another server?
Failover (you are referring to failover here, not load-balancing) is not part of the spec as far as I know. However, most vendors support failover, and multiple EJB containers can be clustered to provide this feature. Basically, the progress of each open transaction is transmitted to backup server(s) and, if the primary container fails while the transaction is still open, a backup server can take over and, under some circumstances, might be able to continue the transaction (for example, WebLogic requires methods to be declared as idempotent). Most often, the backup container will roll back the transaction and signal the client to retry its original request.
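To illustrate why idempotency matters for this, here is a rough sketch of a client-side proxy that re-runs a call on a backup only when the method is marked as safe to repeat. The @Idempotent annotation, ContainerFailedException and withBackup are hypothetical names invented for this example; this is not WebLogic's actual mechanism or API, and a real container tracks transaction state rather than just catching an exception.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class IdempotentRetryDemo {

    /** Hypothetical marker: calling the method twice has the same effect as once. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Idempotent {}

    /** Hypothetical signal that the primary container failed during the call. */
    public static class ContainerFailedException extends RuntimeException {
        public ContainerFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical business interface: reading is repeatable, depositing is not. */
    public interface Account {
        @Idempotent
        int balance();

        void deposit(int amount);
    }

    /** Simulated primary whose container fails on every call. */
    public static class BrokenAccount implements Account {
        public int balance() {
            throw new ContainerFailedException("primary crashed");
        }

        public void deposit(int amount) {
            throw new ContainerFailedException("primary crashed");
        }
    }

    /** Simulated healthy backup. */
    public static class HealthyAccount implements Account {
        private int balance = 100;

        public int balance() {
            return balance;
        }

        public void deposit(int amount) {
            balance += amount;
        }
    }

    /** Re-runs a failed call on the backup only if the method is marked @Idempotent. */
    @SuppressWarnings("unchecked")
    public static <T> T withBackup(Class<T> iface, T primary, T backup) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(primary, args);
            } catch (InvocationTargetException e) {
                boolean containerFailed = e.getCause() instanceof ContainerFailedException;
                if (containerFailed && method.isAnnotationPresent(Idempotent.class)) {
                    try {
                        return method.invoke(backup, args); // safe to repeat
                    } catch (InvocationTargetException e2) {
                        throw e2.getCause();
                    }
                }
                // Not safe to repeat: report the failure so the client can decide to retry.
                throw e.getCause();
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        Account account = withBackup(Account.class, new BrokenAccount(), new HealthyAccount());
        System.out.println(account.balance()); // prints 100, answered by the backup
        try {
            account.deposit(10);
        } catch (ContainerFailedException e) {
            System.out.println("deposit not retried: " + e.getMessage());
        }
    }
}
```

The deposit is deliberately not repeated: if the primary had applied it before crashing, running it again on the backup would apply it twice, which is why the client is told to retry instead.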
I want to understand the basic theory behind this load balancing, why is it better than a bunch of servers all running a plain web application with session affinity on a load-balancer?
Too many concepts are mixed up here to provide an answer. Failover != load-balancing, and session affinity is not really related to failover (it just means a request will be sent to the server which holds the session). Failover can be achieved at the web layer using HTTP session state replication (in-memory replication, in a database, etc.). You need to clarify the question.