Wednesday, May 2, 2012

Follow-up on the 2nd Cloud Workshop

The second OSGi Cloud Workshop was held during EclipseCon/OSGi DevCon 2012 last March. It was a very interesting morning with some good presentations and some great discussion. You can still find the presentations linked from here: http://www.osgi.org/Design/Cloud.

We learned that people are already widely using OSGi in Cloud environments, and part of the morning was spent discussing what OSGi could do to make it even more suitable for use in the Cloud. As a result, a number of topics were proposed for people active in the OSGi Alliance to look at. You can find a summary of these topics here: https://mail.osgi.org/pipermail/cloud-workshop/2012-March/000112.html.

Last week the OSGi Enterprise Expert Group and the Residential Expert Group met to discuss these topics and to find potential routes to address them. Below you can find the results of these discussions. In this list I'll start each topic with the requirement as posted earlier to the cloud-workshop mailing list. The follow-ups below describe the thinking that we came to during the recent EEG/REG meeting.


1. Topic: Make it possible to describe various Service unavailable States. A service may be unavailable in the cloud because of a variety of reasons:

  • Maybe the number of invocations available to you is exhausted for today.
  • Maybe your credit card expired.
  • Maybe the node running the service crashed. 
  • etc...

It should be possible to model these various failure states and it should also be possible to register 'callback' mechanisms that can deal with these states in whatever way is appropriate (blacklist the service, wait a while, send an email to the credit card holder, etc).

1. Follow-up: A potential new RFP is under discussion around monitoring and management. This RFP is currently being discussed in the Residential Expert Group, but it should ultimately be useful to all contexts in which OSGi is run. The requirements in this RFP could address some of the service quality issues referred to in this topic.

Additionally, there was a discussion about whether it would make sense to extend the OSGi ServiceException so that various types of service failures could be reported (e.g. payment needed, quota exceeded, etc).
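
To make the idea of typed failure states and callbacks a bit more concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical: FailureReason, ServiceUnavailableException and FailureCallbacks are names invented for illustration and are not part of any OSGi specification or of the RFP under discussion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these types exist in OSGi. */
final class ServiceFailureSketch {

    /** Illustrative failure reasons, taken from the examples in the topic. */
    enum FailureReason { QUOTA_EXCEEDED, PAYMENT_REQUIRED, NODE_CRASHED }

    /** Assumed exception type that carries the reason a service is unavailable. */
    static final class ServiceUnavailableException extends RuntimeException {
        final FailureReason reason;

        ServiceUnavailableException(FailureReason reason, String message) {
            super(message);
            this.reason = reason;
        }
    }

    /** Maps each failure reason to the callbacks registered for it. */
    static final class FailureCallbacks {
        private final Map<FailureReason, List<Consumer<ServiceUnavailableException>>> handlers =
                new EnumMap<>(FailureReason.class);

        void register(FailureReason reason, Consumer<ServiceUnavailableException> handler) {
            handlers.computeIfAbsent(reason, r -> new ArrayList<>()).add(handler);
        }

        /** Runs every callback registered for the failure's reason; returns how many ran. */
        int dispatch(ServiceUnavailableException failure) {
            List<Consumer<ServiceUnavailableException>> matching =
                    handlers.getOrDefault(failure.reason, Collections.emptyList());
            for (Consumer<ServiceUnavailableException> handler : matching) {
                handler.accept(failure);
            }
            return matching.size();
        }
    }

    public static void main(String[] args) {
        FailureCallbacks callbacks = new FailureCallbacks();
        callbacks.register(FailureReason.PAYMENT_REQUIRED,
                f -> System.out.println("Email the card holder: " + f.getMessage()));
        callbacks.register(FailureReason.NODE_CRASHED,
                f -> System.out.println("Blacklist the service: " + f.getMessage()));

        callbacks.dispatch(new ServiceUnavailableException(
                FailureReason.PAYMENT_REQUIRED, "credit card expired"));
    }
}
```

The point of the sketch is only that the reason travels with the failure, so the reaction (blacklist, wait, send an email) can be chosen per reason rather than hard-coded at each call site.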


2. Topic: WYTIWYR (What you test is what you run) It should be possible to quickly deploy and redeploy.

2. Follow-up: One of the requirements that this expresses is the need to remotely run a test suite in an existing (remote) framework. There are OSGi test frameworks that support this kind of behavior today (Pax Exam, Arquillian and others), but possibly they need to be enhanced with a remote deployment/management solution that is cloud-friendly, for example the REST-based OSGi Framework management as is being discussed in RFC 182.


2b. Topic: There was an additional use-case around reverting the data (and configuration) changes made during an upgrade. If we need to downgrade after an upgrade then we may need to convert the data/configuration back into its old state.
2b. Follow-up: It might be possible to achieve this by providing an OSGi API to snapshot the framework state. This API could allow the user to save the current state and to retrieve a past saved state. When reverting to a past deployment this operation could be combined with a pluggable compensation process that converts the data back, if applicable.
The idea of snapshotting the framework state will be explored in a new RFP that is to be created soon. The data compensation process itself is most likely out of scope for OSGi.
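
As a rough illustration of the save/retrieve idea combined with a pluggable compensation step, here is a minimal in-memory sketch. It is purely an assumption about what such an API could look like: FrameworkSnapshots and its methods are invented names, and the framework state is modelled simply as a map from bundle symbolic name to version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical sketch; no such snapshot API exists in OSGi today. */
final class SnapshotSketch {

    /** In-memory stand-in for a framework: bundle symbolic name -> installed version. */
    static final class FrameworkSnapshots {
        private final Map<String, String> current = new HashMap<>();
        private final Map<String, Map<String, String>> saved = new HashMap<>();

        void install(String bundle, String version) {
            current.put(bundle, version);
        }

        /** Returns a copy of the current state so callers cannot modify it. */
        Map<String, String> currentState() {
            return new HashMap<>(current);
        }

        /** Saves a copy of the current state under the given snapshot id. */
        void save(String snapshotId) {
            saved.put(snapshotId, new HashMap<>(current));
        }

        /**
         * Reverts to a saved state. The compensation callback is invoked first with
         * (state being left, state being restored) so it can convert data back.
         */
        void restore(String snapshotId,
                     BiConsumer<Map<String, String>, Map<String, String>> compensation) {
            Map<String, String> target = saved.get(snapshotId);
            if (target == null) {
                throw new IllegalArgumentException("Unknown snapshot: " + snapshotId);
            }
            compensation.accept(currentState(), new HashMap<>(target));
            current.clear();
            current.putAll(target);
        }
    }

    public static void main(String[] args) {
        FrameworkSnapshots framework = new FrameworkSnapshots();
        framework.install("com.example.app", "1.0.0");
        framework.save("before-upgrade");
        framework.install("com.example.app", "2.0.0");

        framework.restore("before-upgrade", (from, to) ->
                System.out.println("Convert data from " + from + " back to " + to));
        System.out.println("Restored state: " + framework.currentState());
    }
}
```

Keeping the compensation step as a callback reflects the split described above: the snapshot mechanism could live in the framework, while the data conversion stays with the application.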


3. Topic: Come up with a common and agreed architecture for Discovery. This should include consideration of Remote Services, Remote Events and Distributed Configuration Admin.

3. Follow-up: This is the topic of the new RFC 183 Cloud Discovery.


4. Topic: Resource utilization. It should be possible to measure/report this for each cloud node. Number of threads available, amount of memory, power consumption etc… Possibly create OSGi Capability namespaces for this. 

4. Follow-up: This relates to the monitoring RFP mentioned above.


5. Topic: OBR scaling. Need to be able to use OBR in a highly available manner. Should support failover and should hook in with discovery. 

5. Follow-up: The Repository service as defined in OSGi Enterprise R5 spec chapter 132 (see http://www.osgi.org/News/20120326 for download instructions of the latest draft) provides a stateless API which can work with well-known HA solutions (replication, failover, etc). Additionally, the Repository supports the concept of referrals, allowing multiple, federated repositories to be combined into a single logical view.
The discovery piece is part of RFC 183.


6. Topic: We need subsystems across frameworks. Possibly refer to them as 'Ecosystems'. These describe a number of subsystems deployed across a number of frameworks. 

6. Follow-up: While the general usefulness of this isn't disputed, nobody is driving it at this point in time. If people feel strongly that it should be addressed, they should come forward and help define a solution.


7. Topic: Asynchronous services and asynchronous remote services. 

7. Follow-up: This is the topic of RFP 132 which was recently restarted. RFP 132 is purely about asynchronous OSGi services. Once this is established, asynchronous remote services can be modeled as a layer on top.


8. Topic: Isolation and security for applications 
  • For multi-tenancy 
  • Protect access to file system 
  • Lifecycle handling of applications 
  • OBR - isolated OBR (multiple tenants should not see each other's OBR) 
This all needs to be configurable.

8. Follow-up: Clearly, separate VMs provide the best isolation, while separate Java VMs within a single OS-level VM also provide fairly strong isolation (however, be aware of possible side effects of native code and possible resource exhaustion). Nested OSGi Frameworks and Subsystem Regions also provide isolation to a certain degree (see Graham's post on Subsystems), but the level of protection needed depends on the security requirements of the given application. The deployer can choose from these options as a target for deploying bundles and/or subsystems.


9. Topic: It should be possible to look at the cloud system state: 
  • where am I (type of cloud, geographical location)? 
  • what nodes are there and what is their state? 
  • what frameworks are available in this cloud system? 
  • where's my OBR? 
  • what state am I in? 
  • what do I need here in order to operate? 
  • etc… 
9. Follow-up: This is part of what is being discussed in RFC 183 Cloud Discovery.


10. Topic: There should be a management mechanism for use in the cloud 
  • JMX? Possibly not 
  • REST? Most likely 
Management of application state should also be possible in addition to bundle/framework state 

10. Follow-up: A cloud-friendly REST-based management API for the framework is currently being worked on in RFC 182. Once that is established it can also form the baseline for Subsystems management technology which can be used for application-level management.


11. Topic: Deployment - when deploying replicated nodes it should be possible to specify that a replica should not be deployed on certain nodes, so that the replicas do not all end up on the same node.

11. Follow-up: This also relates to discovery as discussed in RFC 183. A management agent can use this information to enforce such a constraint.
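
As an illustration of how a management agent might enforce such a constraint once discovery tells it which nodes exist, here is a minimal sketch. The place method and its parameters are hypothetical and are not taken from RFC 183; the sketch assumes the simplest policy, which is at most one replica per node, skipping any explicitly excluded nodes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a placement rule a management agent could apply. */
final class PlacementSketch {

    /**
     * Assigns each replica to a distinct node, never using an excluded node.
     * Nodes are used in the order given; duplicate node names are ignored.
     */
    static Map<String, String> place(List<String> replicas,
                                     List<String> discoveredNodes,
                                     Set<String> excludedNodes) {
        Deque<String> free = new ArrayDeque<>();
        for (String node : new LinkedHashSet<>(discoveredNodes)) {
            if (!excludedNodes.contains(node)) {
                free.add(node);
            }
        }
        if (free.size() < replicas.size()) {
            throw new IllegalStateException("Need " + replicas.size()
                    + " distinct nodes but only " + free.size() + " are eligible");
        }
        Map<String, String> placement = new LinkedHashMap<>();
        for (String replica : replicas) {
            placement.put(replica, free.poll());
        }
        return placement;
    }

    public static void main(String[] args) {
        Map<String, String> placement = place(
                Arrays.asList("replica-1", "replica-2"),
                Arrays.asList("node-a", "node-b", "node-c"),
                Collections.singleton("node-b"));
        System.out.println(placement);
    }
}
```

A real agent would likely weigh more than node identity (for example geographical location or load), but the shape stays the same: discovery supplies the candidates and the agent filters them against the deployment constraints.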


12. Topic: Single Sign-on for OSGi.

12. Follow-up: One member company has done a project in relation to this on top of the User Admin Service. A new RFP will be created to discuss this requirement further.


So there you are: the ideas from the cloud workshop were greatly appreciated and provide very useful input into future work. If you're interested in following the progress, we're planning, as usual, to release regular early access drafts of the documents that are relatively mature. Or, if you're interested in taking part in the creation of these specs, join in! For more information see http://www.osgi.org/About/Join, or contact me (david at redhat.com) or anyone else active in OSGi.

2 comments:

  1. Hi David, is there any ppt or pdf available from this workshop, if yes please share.

    Thanks
    Javin

  2. The documentation of this event is very good. Thanks for the reference docs.
    @javin, it is available at osgi event.
    The first time I came across OSGi was in reference to Eclipse IDE.
