31 May 2012

MIDDLEWARE INSIGHT - C2B2 Newsletter Issue 1



MIDDLEWARE INSIGHT
at the Hub of the Middleware Industry
delivered to you by C2B2 Consulting - The Leading Independent Middleware Experts

Welcome to the very first issue of Middleware Insight – a newsletter that brings you only the most recent, most important and most interesting news from the middleware industry.
It can be difficult to keep up with the dynamic world of Java middleware technology – working out what is really important from the huge amount of information we are bombarded with every day is hard and time consuming. That is why, as the independent middleware experts, we have decided to share our experience and expertise with you by selecting only the most important news about the industry, products and what is happening out there, and sending it to you in the form of a bi-monthly newsletter. We are more than happy to share our knowledge and opinions, or to give you some advice and help, either via our blog or personally, if you wish to contact us directly.
In this issue of Middleware Insight we are giving you an overview of what vendors are up to, what interesting stuff is going on in the open source world, bringing you some of the latest news about Java and reminding you about conferences and other events you may want to attend.
There have been a lot of things happening lately in the middleware world, including various new releases (Apache CXF 2.6.0, JBoss EAP 6 Beta and more), updates (Oracle patches, Tomcat upgrades), and reports coming out (TechTarget IT Forecast), so if you want to find out more - read on!
We hope that you will find this newsletter interesting and informative. If you have any questions, feedback or suggestions – please do not hesitate to contact us at info@c2b2.co.uk

Many thanks
C2B2 Consulting Team




Vendor News

JBoss EAP6 Beta is here! Read more on the Red Hat website
JBoss Data Grid 6 is now available, click here to find out more from the Red Hat website and visit the C2B2 blog to read about getting started with JBoss EDG in an overview by Mark Addy
Oracle released a number of the new security patches across their middleware products family - click here to find out more
Hyperic Upgrades Management of Apache Tomcat  - read VMware blog 


Open Source News

Apache announced the release of Apache TomEE v1.0 expanding Tomcat to provide JEE6 web profile support - read C2B2 blog post by Steve Millidge to find out why this is important
Apache CXF 2.6.0 released with enhanced SAML and OAuth support! Find out more
OpenShift - Red Hat's PaaS offering - is now Open Source, so you can build your own internal PaaS; read a blog post by Sean Kerner and find out more on the official OpenShift website

Tomcat 7.0.27 released with a number of significant new features including support for the WebSockets protocol. Tomcat joins GlassFish, which also recently announced WebSockets support - find out more

Other Middleware & Java News

Java 7 update 4: G1GC is now a supported garbage collector and HotRockit convergence starts - read more
Java 7u4 has been released with some cool new features - read the blog post by Steve Millidge
Make Over for Infinispan's Distributed Execution Framework - by Mark Addy, Infinispan expert  and JUDCon Boston speaker, read more on the C2B2 blog 


Analyst Happenings

Gartner Predicts 2012: Cloud and In-Memory Drive Innovation in Application Platforms - read more
2012 Global IT Forecast - see the TechTarget report to find out more about the direction of IT budgets and emerging technologies - the data tells the story!

Events

London JBoss User Group next meetup announced - 13th of June, 'Infinispan from POC to Production' by Mark Addy, click here to find out more and register
JUDCon 2012 - Boston (June) - C2B2 is speaking! - register here
VMworld 2012 - San Francisco (August) and Barcelona (October) - register here
SOA, Cloud Computing & Service Technology Symposium - London (September) - C2B2 is speaking! - register here
JavaOne 2012 - San Francisco (September & October) - register here
Jax 2012 - London (October) - C2B2 is speaking! - register here
SpringOne 2GX 2012 - Washington (October) - register here 
Devoxx 2012 - Antwerp, Belgium (November) - register & submit your papers here
UK Oracle User Group Conference - Birmingham (December) - register & submit your papers here
JAX Innovation Awards - nominations revealed, Steve Millidge - C2B2 Director - nominated for the Top Java Ambassador, community voting opens on June 1st click here to vote
C2B2 Webinar: 'iAS to WebLogic Migration - Increase Performance, Scalability and Reliability' - see the webinar recording on the C2B2 YouTube Channel


Job Opportunities at C2B2
Due to projected growth and after a record 2011 we are looking to recruit the best middleware engineers in the UK. Currently we are looking for Senior Consultants, Consultants, Junior Engineers and Java Middleware Engineers. See our current opportunities page to find out more

Make Over for Infinispan's Distributed Execution Framework


Several years ago Infinispan succeeded JBossCache as JBoss's de facto caching solution; in fact, JBossCache 3 was renamed, rewritten and released as Infinispan version 4. One of the most important changes was that support for the old buddy replication model was dropped in favour of the distributed cache, thereby earning it the right to be classified as a Data Grid. What buddy replication addressed to some degree, and what Infinispan and other data grids do far better, is overcoming the inherent limitations of scaling fully replicated caches. As the size of the cluster increases, the network and CPU overhead associated with the replication process also increases until you hit a brick wall.

So the distributed caching model lets you cache more data than you could imagine, even in your wildest dreams. Inevitably you'll want to make full use of this additional capacity to make your application content richer and deliver it even faster, so chances are you'll be increasing your cache size dramatically. Careful though, as this can generate a whole new set of issues of its own...

Let's consider a standard cache.get(K) operation on an Infinispan distributed cache and a high-level overview of the steps:

  1. Build a list of the nodes owning the data
  2. Check whether the data is held locally
  3. If yes, return the data from the local node
  4. If no, issue n remote get commands to the nodes owning the data, where n = the configured number of data owners (numOwners)
So if requests for data are round-robin'ed over a distributed cache with 10 nodes and numOwners=2 then 80% of requests will result in 2 remote get commands being issued to retrieve the required information from the owning cache nodes.  If you've got a high volume of requests and a lot of cached data you may find yourself running into the same old network latency issues as with the replication model.
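The 80% figure follows from simple arithmetic, sketched below in plain Java. It assumes keys are spread evenly over the nodes and requests are spread evenly too; the class and method names are made up for illustration and are not part of Infinispan.

```java
// Illustrative arithmetic only; no Infinispan classes are used.
final class RemoteGetEstimate {

    /** Fraction of get() calls that arrive at a node which does not own the key. */
    static double remoteFraction(int clusterSize, int numOwners) {
        if (clusterSize <= 0 || numOwners <= 0 || numOwners > clusterSize) {
            throw new IllegalArgumentException("need 0 < numOwners <= clusterSize");
        }
        // Each key lives on numOwners of the clusterSize nodes.
        return 1.0 - (double) numOwners / clusterSize;
    }

    /** Average number of remote get commands issued per request. */
    static double expectedRemoteGets(int clusterSize, int numOwners) {
        // A non-local request fans out to all numOwners owners.
        return remoteFraction(clusterSize, numOwners) * numOwners;
    }

    public static void main(String[] args) {
        int clusterSize = 10;
        int numOwners = 2;
        System.out.printf("remote fraction: %.2f%n", remoteFraction(clusterSize, numOwners));
        System.out.printf("remote gets per request: %.2f%n", expectedRemoteGets(clusterSize, numOwners));
    }
}
```

Under these assumptions the remote fraction only gets worse as the cluster grows while numOwners stays fixed.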

And this brings us to one of the key features of any enterprise-ready data grid, and to the recent announcement by the Infinispan team here: the ability to send data-processing jobs over the grid network to the nodes owning the data, rather than retrieving the data for processing locally. This is more commonly known as Grid Execution or Map Reduce.

This approach can be used to make the previously impossible, possible.  Tasks can be sent to the data owners in parallel, data is processed as required and the results returned for aggregation by the initiating node.  Network latency again becomes an issue of the past and you can process huge amounts of data in record time.
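To make the idea concrete, here is a minimal plain-JDK simulation of sending the work to the data. It is not the Infinispan API: each Map stands in for the entries owned by one grid node, a thread pool stands in for the grid, and the class, method and sample data are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-JDK simulation: each Map stands in for the data owned by one grid node.
final class GridExecutionSketch {

    /** Runs one counting task per "node" in parallel and sums the partial results. */
    static long countValuesContaining(List<Map<String, String>> nodes, String needle) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (Map<String, String> localData : nodes) {
                // The task travels to the data; only a single number travels back.
                Callable<Long> task = () -> localData.values().stream()
                        .filter(v -> v.contains(needle))
                        .count();
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // aggregation on the initiating node
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical sample data spread over three "nodes". */
    static List<Map<String, String>> demoNodes() {
        Map<String, String> node1 = new HashMap<>();
        node1.put("k1", "red apple");
        node1.put("k2", "green pear");
        Map<String, String> node2 = new HashMap<>();
        node2.put("k3", "red cherry");
        Map<String, String> node3 = new HashMap<>();
        node3.put("k4", "yellow banana");
        node3.put("k5", "red plum");
        return Arrays.asList(node1, node2, node3);
    }

    public static void main(String[] args) {
        System.out.println(countValuesContaining(demoNodes(), "red"));
    }
}
```

The point of the design is the size of what crosses the network: a small task goes out and a small result comes back, however much data each node holds.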

All the players in this arena offer the ability to process grid data in this way, for example Coherence's Entry Processors, GemFire's FunctionService and Infinispan's MapReduce and DistributedExecutionService models. Without them it's impossible to reap all of the benefits a data grid can offer.

Up until now Infinispan's original implementation had a major drawback: should a node running an executable task fail, there was no fail-over or migration of the task to another node owning the data. Now it looks like it's full steam ahead for the implementation of this feature, which should provide guaranteed execution and bring Infinispan closer in functionality to its more established competitors.

Mark Addy

30 May 2012

TomEE 1.0 is released. Why we think this is important.

TomEE 1.0 was released recently ( http://cloud.dzone.com/articles/apache-tomee-tomcat-cloud ), expanding Tomcat to have full support for EJBs, JPA, JSF, web beans etc.

We think this is a very important development for customers for a number of reasons:

If you are developing JEE6 web profile compliant applications, this now enables you to include Tomcat in your list of deployment platforms, and since most of the other open source application servers (JBoss, Glassfish etc.) embed Tomcat, why not just use Tomcat? If you are currently evaluating which modern JEE6 application server to use for your latest developments, you must now add TomEE to the list of platforms under consideration. Chances are your devs love Tomcat, so it could be a no-brainer to get them to move to JEE6.

If you have been using Tomcat as a lightweight servlet container with Hibernate, you can now consider using the full JEE6 web profile capabilities without having to change to a radically different deployment platform. It also gives your applications a level of portability across all the major JEE6 web profile application servers, something which couldn't be done before without moving to Geronimo, and frankly we never see Geronimo deployed in the wild. Once you add in other components of the Apache stack, including ActiveMQ, you can deliver a full JEE6 Apache experience.

If you still class JEE as heavyweight and competitor frameworks as lightweight, you should revisit JEE6 and actually look at it rather than repeating late-noughties dogma and mythology. TomEE 1.0 may be the catalyst some of you need, as you keep your "light-weight" container and start to use JEE6.

Steve Millidge

25 May 2012

C2B2 Webinar: 'iAS to WebLogic Migration - Increase Performance, Scalability and Reliability'

Learn how, by switching to Oracle WebLogic Server, you can reduce costs and risks, increase operational efficiency and gain rich new functionality.

You can now watch the recording of the webinar delivered on the 24th of May 2012 by Matt Brasier - C2B2 Head of Consultancy (follow Matt on Twitter @mbrasier )



17 May 2012

Getting Started with JBoss Enterprise Data Grid

JBoss have recently granted early access to Enterprise Data Grid (EDG), the supported version of Infinispan. EDG is a high performance key/value based cache supporting local, replicated, invalidation and distributed modes. Distributed mode is the one we'll be focusing on in this blog post because it's the one that's gaining lots of attention at the moment due to its elastic scaling properties.

You can sign up now for a copy of EDG to download and have a play around with here: http://www.jboss.com/edg6-early-access/. To help you get started, let's create a distributed grid and check out the Hotrod client-server protocol.

Fully replicated caches don't tend to scale too well past a relatively small number of nodes as the overhead of full replication has a significantly detrimental effect on performance.  Distributed caches alleviate this bottleneck by providing redundancy through a configurable number of back-ups or copies of the data allowing you to scale caching architectures linearly.  When grid nodes join and leave the cluster data is re-balanced between the nodes to maintain the configured level of redundancy (number of back-ups) and an even distribution.
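A rough capacity comparison shows why this matters. The sketch below is back-of-the-envelope arithmetic in plain Java, assuming every node can hold the same number of entries and data is spread perfectly evenly; the names are hypothetical and nothing here is EDG code.

```java
// Back-of-the-envelope capacity comparison; not EDG code.
final class GridCapacity {

    /** Unique entries a fully replicated cache can hold: every node stores everything. */
    static long replicatedCapacity(int nodes, long entriesPerNode) {
        return entriesPerNode; // adding nodes adds copies, not capacity
    }

    /** Unique entries a distributed cache can hold when each entry is stored 'copies' times. */
    static long distributedCapacity(int nodes, long entriesPerNode, int copies) {
        if (nodes <= 0 || copies <= 0 || copies > nodes) {
            throw new IllegalArgumentException("need 0 < copies <= nodes");
        }
        return nodes * entriesPerNode / copies;
    }

    public static void main(String[] args) {
        long perNode = 1_000_000L;
        for (int nodes = 2; nodes <= 8; nodes *= 2) {
            System.out.println(nodes + " nodes: replicated=" + replicatedCapacity(nodes, perNode)
                    + ", distributed(2 copies)=" + distributedCapacity(nodes, perNode, 2));
        }
    }
}
```

With a fixed number of copies, doubling the node count doubles the distributed capacity, which is the linear scaling referred to above.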

EDG can be run in embedded or client-server mode, with Memcached, REST and Hotrod clients provided as part of the distribution. Hotrod is a smart Java-based client which provides in-built connection pooling, failover and smart-routing. The smart-routing feature of Hotrod optimises client calls into the EDG server grid, as the client has up-to-date knowledge of the server side topology and is able to route requests directly to the nodes owning the data.

Once you've signed up and downloaded the distribution (we are using JBoss Data Grid Server 6.0.0 Beta1), unzip it into a location of your choice.


If you take a look around you'll soon discover that the directory structure is a bit different to Infinispan. In fact it bears a striking resemblance to JBossAS7 (or JBossEAP 6 if you are paying), and that's because the EDG server runs inside JBoss's new modular application server framework.

JBossAS7 runs in two modes, standalone (a single unmanaged instance) or domain (a centrally managed set of servers). Domain mode permits a number of server instances to be set up and started quickly with only a small amount of configuration - so let's do that then...

Firstly we need to create a management user so we can access the domain console to start and stop managed server instances.  There's an add-user script in the bin directory at the root of the installation, we can run this to perform this task:


Now let's configure the domain to create 4 servers. Managed servers are defined in the host.xml file located in the $EDG_HOME/domain/configuration directory. By default two are already configured, but two servers don't constitute a grid so we will have to increase this. Here is the initial configuration; I'd already changed the highlighted "port-offset" field from 150 to 100...


And this is what it looked like after I added two more servers:


Note that for each of these servers the port-offset increases by another 100 (so we don't get port binding conflicts), both are set to "auto-start" (so the domain controller will start these servers automatically when it starts) and also the "topology.machine" is different (not important for this example but EDG can intelligently distribute data to ensure back-ups or copies are held on separate machines to ensure redundancy).
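The port-offset arithmetic is worth spelling out, because it determines the addresses a Hotrod client has to use. The sketch below is plain Java with invented names; it assumes the first server listens for Hotrod on port 11222 with no offset (the port the client later in this post uses for its first server) and that each further server adds another 100.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; not part of EDG or the Hotrod client.
final class HotrodPorts {

    // Assumed Hotrod port of the server that has no port-offset applied.
    static final int BASE_HOTROD_PORT = 11222;

    /** Builds a semicolon-separated host:port list for servers offset by a fixed step. */
    static String serverList(String host, int servers, int offsetStep) {
        List<String> entries = new ArrayList<>();
        for (int i = 0; i < servers; i++) {
            entries.add(host + ":" + (BASE_HOTROD_PORT + i * offsetStep));
        }
        return String.join(";", entries);
    }

    public static void main(String[] args) {
        // Four local servers with port-offsets 0, 100, 200 and 300.
        System.out.println(serverList("127.0.0.1", 4, 100));
    }
}
```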

Now we have 4 servers or data nodes ready for our grid, next we should define a distributed cache to store our key/value data in.  As we will be running EDG in domain mode we can make this change globally in the domain.xml configuration file, again this is located in the $EDG_HOME/domain/configuration directory.  EDG already comes with some predefined caches but we will add our own anyway named "test-dist-cache"


Notice the attribute start="EAGER" tells EDG to initialize this cache on start up and virtual-nodes="10" will improve the data distribution across the nodes by sub-dividing positions on EDG's internal key hashing algorithm used to assign individual keys to data nodes.
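To see why virtual nodes help, here is a generic consistent-hash ring in plain Java. It is only a sketch of the general technique, not EDG's actual hashing algorithm: the MD5-based hash, the node names and the class itself are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Generic consistent-hash ring with virtual nodes; not EDG's implementation.
final class VirtualNodeRing {

    private final TreeMap<Integer, String> ring = new TreeMap<>();

    VirtualNodeRing(String[] nodes, int virtualNodes) {
        if (nodes.length == 0 || virtualNodes < 1) {
            throw new IllegalArgumentException("need at least one node and one virtual node");
        }
        for (String node : nodes) {
            // Each physical node takes several positions on the ring.
            for (int v = 0; v < virtualNodes; v++) {
                ring.put(hash(node + "#" + v), node);
            }
        }
    }

    /** The owner is the first ring position at or after the key's hash, wrapping around. */
    String ownerOf(String key) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Counts how many of the keys key0..key(n-1) each node owns. */
    Map<String, Integer> distribution(int keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < keys; i++) {
            counts.merge(ownerOf("key" + i), 1, Integer::sum);
        }
        return counts;
    }

    /** First four bytes of an MD5 digest, used here only to spread positions evenly. */
    static int hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16) | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] nodes = {"node1", "node2", "node3", "node4"};
        System.out.println("1 virtual node:   " + new VirtualNodeRing(nodes, 1).distribution(10_000));
        System.out.println("10 virtual nodes: " + new VirtualNodeRing(nodes, 10).distribution(10_000));
    }
}
```

Running it prints the per-node key counts for both settings, so you can compare how evenly the keys land with one position per node versus ten.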

OK, so let's start up the EDG servers using the domain controller:


And use the user credentials we created in the first step to access the domain console application at the default URL http://localhost:9990


You should see all 4 data grid nodes running and may also notice that you can stop and start individual nodes directly from the console. This is a great management feature and a vast improvement over previous JBoss console incarnations. As you've also seen, it's super quick to get a cluster of servers up and running.

For more detailed information on the EDG cache instances connect jconsole to the running server instances and navigate the MBean tree to the jboss.infinispan domain, from here we can view the number of entries, hit ratios etc.


So we now have an EDG distributed cache up and running, we can control the individual nodes using the JBoss management console and we can view cache statistics using any JMX based tool.  All we need now is a client!

Here's a simple Hotrod client that will put a bunch of records into the grid and then run on-demand checks to ensure all the records can still be found.


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Properties;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.impl.ConfigurationProperties;

public class HotrodClient {

    public static void main(String[] args) throws IOException {

        int records = 100;
        String cache = "test-dist-cache";

        // Connection pool settings and the initial list of EDG servers
        Properties properties = new Properties();
        properties.setProperty("testOnBorrow", "true");
        properties.setProperty("testWhileIdle", "true");
        properties.setProperty(ConfigurationProperties.SERVER_LIST, "127.0.0.1:11222;127.0.0.1:11322;127.0.0.1:11422;127.0.0.1:11522");

        System.out.println("Starting the Hotrod Client\n");

        RemoteCacheManager remoteCacheManager = new RemoteCacheManager(properties);
        RemoteCache<String, String> remoteCache = remoteCacheManager.getCache(cache);

        // Load the test records into the grid
        for (int i = 0; i < records; i++) {
            remoteCache.put("key" + i, "value" + i);
        }

        System.out.println("Loaded " + records + " records into the EDG cache\n");

        BufferedReader bs = new BufferedReader(new InputStreamReader(System.in));

        System.out.println("Press Enter to check the records in the cache or 'X' to exit");
        String line;
        // Stop on 'X' or when the input stream is closed (readLine returns null)
        while ((line = bs.readLine()) != null && !line.equalsIgnoreCase("X")) {
            System.out.println("Checking to see how many of the " + records + " records can be found in the cache");
            int found = 0;
            for (int i = 0; i < records; i++) {
                if (remoteCache.get("key" + i) != null) {
                    found++;
                }
            }
            System.out.println("Found " + found + " of " + records + " records.");
        }

        remoteCacheManager.stop();
        System.out.println("Exiting");
    }
}

The dependencies for the client application can be found in the client/java folder of the distribution:


Let's take a quick look at some of the interesting parts in the code.

As I mentioned earlier the EDG Hotrod client provides connection pooling, this is provided by the Apache commons-pool library so we can set connection pool properties for the client using the set of available parameters for this library.  See http://commons.apache.org/pool/ for a full list.


properties.setProperty("testOnBorrow", "true");
properties.setProperty("testWhileIdle", "true");



We also need to tell the client how to connect to the EDG grid, this is done using the Hotrod client property "infinispan.client.hotrod.server_list".

properties.setProperty(ConfigurationProperties.SERVER_LIST,"127.0.0.1:11222;127.0.0.1:11322;127.0.0.1:11422;127.0.0.1:11522");


Don't be alarmed that all the data grid nodes are listed here, the Hotrod client only needs to be able to connect to one server in the list, once a connection is established the entire current EDG topology view is returned.  In practice you only need specify one EDG server address that is always available to ensure the client can connect to the whole grid.  A full list of client properties can be found here: http://docs.jboss.org/infinispan/5.1/apidocs/org/infinispan/client/hotrod/RemoteCacheManager.html

If we run the client we should see output similar to the following:


You can see that the EDG cluster topology is returned immediately to the client on start up.  If we check the number of records in the cache now we'll get this response, hopefully!


Let's stop one of the EDG servers using the console:


 and repeat the test:


Look at the exception: the client attempted to use a connection to the server we just stopped. There's no need to panic because it's intelligent enough to discard the dead connection from the pool and also remove server 4 from the client's view of the topology. The data held on the EDG server is re-balanced and all records are still found.
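The behaviour just described can be pictured with a toy model. The sketch below is plain Java and is not how the Hotrod client is implemented: it only shows the idea of dropping an unreachable server from a local topology view and retrying on the next one. The class, the Predicate standing in for a network call and the server names are all hypothetical, and a real client also receives topology updates from the servers, which this model leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of client-side failover; not the Hotrod client's implementation.
final class FailoverSketch {

    private final List<String> topology;
    private final Predicate<String> isAlive; // stand-in for a real network call

    FailoverSketch(List<String> servers, Predicate<String> isAlive) {
        this.topology = new ArrayList<>(servers);
        this.isAlive = isAlive;
    }

    /** Returns the server that handled the request, dropping dead servers on the way. */
    String execute() {
        while (!topology.isEmpty()) {
            String candidate = topology.remove(0);
            if (isAlive.test(candidate)) {
                topology.add(candidate); // move to the back so requests rotate over servers
                return candidate;
            }
            // Dead server: it stays removed from the client's view of the topology.
        }
        throw new IllegalStateException("no servers left in the topology view");
    }

    /** Copy of the servers the client currently believes are available. */
    List<String> topologyView() {
        return new ArrayList<>(topology);
    }

    public static void main(String[] args) {
        List<String> servers = List.of("server1", "server2", "server3", "server4");
        FailoverSketch client = new FailoverSketch(servers, s -> !s.equals("server1"));
        System.out.println("handled by " + client.execute());
        System.out.println("topology view " + client.topologyView());
    }
}
```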

And if we start the server we just stopped using the console and repeat the test we should see something like this in the client output:


Note that the restarted server is added to the client's topology view and into the connection pool.

You should try stopping more servers, restarting, checking the cache statistics out in jconsole etc. All the original records should still be present with their redundant copies.

So that's it: a quick look at JBoss Enterprise Data Grid and the Hotrod client, a powerful client-server caching architecture.

Mark Addy



15 May 2012

Understanding Strengths and Weaknesses in JMS Implementations

If your goal is shifting information from point A to point B then JMS is a mature mechanism that offers a number of powerful features such as guarantees for zero message loss and resilience against hardware failure.

There are many vendors of JMS implementations and it can be hard to determine which is the right choice. In this blog we'll have a look at some of the options that are available to you and highlight some personal pros and cons of each.


If speed is your goal, HornetQ is a powerful solution. HornetQ is a highly available messaging platform that supports real-time monitoring and dynamic scalability. As well as a number of runtime optimisations, HornetQ leverages asynchronous file IO for disk interaction. In terms of raw throughput it's hard to beat, as proven by SPECjms2007 testing (linked on the HornetQ blog http://planet.jboss.org/post/8_2_million_messages_second_with_specjms ) which showed message throughput of up to 8.2 million messages a second - highly impressive. It should be noted that this test used 4 JMS clients producing and sending messages within a single blade enclosure across Gigabit Ethernet. You should not expect the same throughput across wider network links. Busy, spinning disks can impact your deployment too, again reducing your expected performance. Finally, the runtime optimisations that grant the fastest speeds bring the potential for message loss; depending on your usage requirements this may be acceptable under certain conditions, and configuration changes can also be made to tune this risk out of your implementation.
Personally I am a big admirer of the work undertaken by Clebert and the team on this product; my main reservation stems simply from the fact that it is no silver bullet. I see too many instances of people expecting high returns from stock configurations, and being unsure of the reasons for some of its runtime behaviours, which are undertaken in the name of performance.


JBoss Messaging is the precursor to HornetQ. Whilst technically no longer the messaging platform of choice for the JBoss Application Server going forwards, it remains in considerable use in production systems. Another highly available system, the main feature that sets JBoss Messaging apart from HornetQ is its default utilisation of database persistence compared to HornetQ's file based mechanism. This makes centralisation of persisted message data trivial in an enterprise. Thus it becomes easier to process messages when JMS Servers fail; neighbour nodes can easily identify and adopt the messages of a node which has left the cluster. Cluster failure detection becomes a critical enabler for this process to successfully occur. This is often ignored during enterprise rollout, causing obscure failure scenarios and making disaster recovery a manual process.


ActiveMQ from Apache (and FuseSource as Fuse Message Broker) is a mature JMS solution offering a nice suite of features. Disk and database persistence mechanisms are available, it is highly available and it contains a number of libraries that support clients written in multiple languages. I have seen it widely used and well proven in a number of deployment topologies and environments.
What sets it apart for me is the out-of-the-box coupling with Apache Camel, allowing easy transformation of your messaging fabric into a smart routing and decision making engine. Effectively it serves as a lightweight ESB with many extension points and possibilities. The default connectors enable a number of enterprise integration patterns, but the ability to spin your own plugins makes this platform truly powerful, if you have the time to deploy, test and support them.


RabbitMQ from VMware is another option in this space. Written in Erlang, this product is easy to set up and get going. Multiple clients are supported, along with plugins to enable monitoring and provide a WAN style clustering configuration. It is a little different from the others here, being AMQP ( http://www.amqp.org/ ) based. In a nutshell, this theoretically creates a common on-the-wire representation of your data, allowing for full interoperation between disparate messaging systems. In my opinion this is a little difficult to achieve in practice, as part of what makes a fast implementation fast, or a scalable system dynamic, is all the custom configuration attributes and proprietary data exchanges between brokers and clients. It will be hard to standardise these sorts of attributes. Furthermore, RabbitMQ is currently a little bit lacking when it comes to the client view of clusters, requiring things such as hardware load balancers or DNS hostname resolution for accessing and failing over within your cluster topology. Still, its integration with the vFabric Java Stack makes it an interesting and well integrated choice.

WebLogic Messaging is a powerful product, supporting full transitional failover of its infrastructure thanks to the segregated but associated concept of JMS Servers and File System resources. Projects such as OpenMQ and WSO2 Message Broker are further robust choices, but lack some of the more advanced features you will find in the other products mentioned here.

There are plenty of alternatives for enterprise messaging systems, but few offer the suite of features present in many of the above products. ZeroMQ for instance, whilst quick, lacks any ability to monitor the health of the messaging platform, such as peer health or queue depth. I would not want to have to answer questions from management if messages start disappearing without access to those diagnostics.

In conclusion, we encourage you to think hard about what you need from your messaging platform. Is lightning performance a big requirement? Or perhaps a fault tolerant system with 100% uptime and a guarantee of no message loss? If I could encourage you to pick one cherry from my post today, it is simply that it is very, very unlikely that there is one system which will outperform all of the others across all of your requirements. Take your time and assess your options; there is no such thing as a free lunch.

Nick Wright