Wednesday, 7 July 2010

IIS, WCF and MSMQ load balancing proposal

So, the system I'm currently re-architecting is essentially a bunch of WCF (TCP and MSMQ) Windows services sat on a single box, plus various types of client. For various reasons it's approaching capacity.

We have been looking at various options to provide the magic scalable, available, reliable platform. Some are obvious (and previously spoken about), and simply involve replacing certain heavy layers with simpler (designed for use) push systems and efficient data access (don't get me started on loading the entire data set to get a customer name)...

Anyway, after a lot of research, looking at hardware and software options, I think we have finally settled on an idea (at least until I have prototyped it all).

So, a couple of thoughts before I divulge. Firstly, we only actually use netTcpBinding to provide simple binary data transport, which is in fact also available over HTTP by specifying binary message encoding in the binding (a custom binding with binaryMessageEncoding over httpTransport). HTTP is significantly more flexible, and opens up our options greatly. Secondly, the TCP binding by definition doesn't strictly support per-call load balancing: it's a session-oriented transport, so a balancer distributes connections rather than individual calls, and it needs careful tweaking to get anything close to good per-call scaling. Finally, after spending far too much time playing, getting IIS 7 and Windows Process Activation Service (WAS) to host these services is fiddly at best.
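
To make the per-call point concrete, here is a rough simulation in plain Python. It is not WCF code, and the server names and call counts are made up for illustration. It compares a balancer that picks a box once per connection (session style) with one that picks a box for every call.

```python
from collections import Counter
from itertools import cycle

# Hypothetical farm members, purely for illustration.
SERVERS = ["box-a", "box-b", "box-c"]


def balance_per_connection(calls_per_client):
    """Session style: pick a server once per client connection.

    Every call made on that connection then lands on the same server.
    """
    picker = cycle(SERVERS)
    load = Counter({server: 0 for server in SERVERS})
    for calls in calls_per_client:
        server = next(picker)
        load[server] += calls
    return load


def balance_per_call(calls_per_client):
    """Per-call style: pick a server for every individual call."""
    picker = cycle(SERVERS)
    load = Counter({server: 0 for server in SERVERS})
    for calls in calls_per_client:
        for _ in range(calls):
            load[next(picker)] += 1
    return load


# Made-up workload: six clients, one of them very chatty.
CLIENT_CALLS = [900, 10, 10, 40, 20, 20]

per_connection = balance_per_connection(CLIENT_CALLS)
per_call = balance_per_call(CLIENT_CALLS)

print("per connection:", dict(per_connection))
print("per call:      ", dict(per_call))
```

With this workload the per-connection picker leaves box-a with 940 of the 1000 calls, while the per-call picker spreads them 334/333/333. Real traffic is messier, but the shape of the problem is the same.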

So an HTTP binding is the way to go: IIS support out of the box, and a lot of options opened up. So I started to look at IIS 7.5 and web farms. The first thing I found is shared configuration and content. Store your content in a secure shared location (obvious, and supported in IIS 6 with some playing), but the important bit is the shared config (there are some great videos out there demoing the tech, so get googling). Basically, set your site/service up on your first (master) server. Then go to Shared Configuration (in IIS Manager) and export the config to a shared location (creating a password as appropriate), then switch on shared config, pointing it at the export you created, and check everything still works.
Now on to the interesting part... Go to the additional servers in the farm and, in Shared Configuration, switch on sharing (pointing at the same exported config again). IIS needs to restart, and once it has you'll find that all the sites and applications you set up on the master have been configured for you on the additional boxes.
OK, so that takes care of deployment, but what about load balancing? NLB doesn't really hack it for most uses. If you're hammering the boxes and CPU is high, you need to check your code and architecture (it could probably be done better). It is, though, very easy to max out the threads on a WCF service, especially if it has relatively long-running processes, and this is why we want a web farm with per-instance load balancing. Yes, you can go for something like a Cisco ACE module for hardware balancing, but that has a big price tag attached. NLB isn't a true load balancer: (and this bit is only my opinion) it hammers one box until that box is unhappy (very high load) before it moves over to the next in the cluster.
In comes a much more sophisticated software solution from Microsoft: the Application Request Routing (ARR) module for IIS. It only handles HTTP URL routing, hence the HTTP bindings for WCF. Essentially you set up a new box (or an active/passive failover cluster for safety), install ARR, then create your web farm in IIS Manager, defining the incoming URL and the target instances in the farm. What's really impressive is the load balancing: you have some common options out of the box (such as round robin, weighted etc.) or fully customised algorithms. The examples on the IIS site talk about geographic datacenter routing (i.e. you get routed to your nearest available datacenter).
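
For anyone who hasn't met these algorithms before, here is a minimal sketch of the ideas in plain Python. This is not how ARR implements them, and the server names and weights are invented. The third picker (fewest calls in flight) is my own addition to show the sort of thing that matters when calls are long-running.

```python
from collections import Counter
from itertools import cycle, islice


def round_robin(servers):
    """Hand out servers in a fixed repeating order."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Hand out servers in proportion to their integer weights.

    `weights` maps server name to a positive integer. This naive version
    simply repeats each server `weight` times in the schedule.
    """
    schedule = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


def least_in_flight(in_flight):
    """Pick the server that currently has the fewest calls in progress."""
    return min(in_flight, key=in_flight.get)


# Hypothetical farm: one big box and one small box.
rr_order = list(islice(round_robin(["big", "small"]), 4))
wrr_counts = Counter(islice(weighted_round_robin({"big": 3, "small": 1}), 8))
quietest = least_in_flight({"big": 12, "small": 4})

print(rr_order)
print(dict(wrr_counts))
print(quietest)
```

Round robin ignores how busy a box is, weighting lets a bigger box take a bigger share, and the in-flight picker reacts to what is actually happening on each instance.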
All this true load balancing (thank you Microsoft, I was starting to bang my head against a brick wall) is wonderful and all, but it doesn't resolve our push requirements. I intend to rely heavily on MSMQ for this (using durable messaging for improved safety). I've looked at NServiceBus and a couple of others, but to be honest it's firstly another technology for me to guide my team through, and secondly I'm unsure how well proven it is at enterprise level. I don't really have particularly complex requirements (it's all in the high- and mid-level architecture), so I'm looking at remote routing. I've heard tell that this is only supported in the legacy MsmqIntegrationBinding; however, it looks like an AD addressing issue, which won't affect my project. (You would have to try very, very hard to convince me to use public queues with AD integration in MSMQ. It's a nice idea, but it overcomplicates what is essentially "pass this message to that server/farm over there". I always know where I'm pointing to, or someone has designed something badly.) So the plan is to host the queue on the routing server, and have the service instances (IIS 7 with WAS MSMQ services under AppFabric) pointing at the remote host. Now, I have to be honest, I'm not sure if this is going to work. I know MSMQ can handle remote hosts (with a small overhead while a message is locked for pickup, which briefly blocks other services from picking up a message); I just need to test this configuration (fingers crossed).
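
The pattern I'm hoping for is several service instances competing for messages on one shared queue. A toy version in plain Python looks like the sketch below. It uses an in-process queue and threads, so it says nothing about MSMQ remote reads, transactions or network overhead, and the worker names are hypothetical.

```python
import queue
import threading
from collections import Counter


def run_competing_consumers(message_count, worker_names):
    """Let several workers drain one shared queue.

    Returns a list of (worker, message) pairs. Each message is taken off
    the queue by exactly one worker.
    """
    shared = queue.Queue()
    for i in range(message_count):
        shared.put(f"msg-{i}")

    handled = []
    handled_lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Only one worker can receive any given message.
                message = shared.get_nowait()
            except queue.Empty:
                return  # queue drained, worker stops
            with handled_lock:
                handled.append((name, message))

    threads = [threading.Thread(target=worker, args=(name,)) for name in worker_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


# Stand-ins for the service instances in the farm.
handled = run_competing_consumers(100, ["iis-1", "iis-2", "iis-3"])
print(Counter(worker for worker, _ in handled))
```

How the messages split between the workers varies from run to run, which is rather the point: whichever instance is free picks up the next message. Whether MSMQ behaves this nicely with remote hosts is exactly what the prototype needs to prove.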
Finally, since I've mentioned it already: AppFabric (specifically the Dublin components for now). Amongst many things this provides improved WAS support and, importantly, instance monitoring and logging (stunning, configurable WCF logging can be used).
So I've gone on a lot, with nothing particularly concrete at this time. I will, however, post the results of my prototyping.
