When I say web service scale testing, I mean testing to figure out how many instances of your service you need. This type of testing is really easy to explain but typically hard to get right.
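To make the goal concrete, here is a minimal sketch of the arithmetic a scale test feeds into. It assumes the test has given you a sustainable requests-per-second figure for one instance and that you have an expected peak load. The function name, the parameters and the 30% headroom default are all illustrative assumptions, not a rule.

```python
import math


def instances_needed(peak_rps: float, per_instance_rps: float, headroom: float = 0.3) -> int:
    """Estimate the instance count from a measured per-instance throughput.

    peak_rps: expected peak load in requests per second (an estimate).
    per_instance_rps: throughput one instance sustained in the scale test.
    headroom: fraction of each instance's capacity kept in reserve (assumed value).
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_instance_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 req/s at peak, 200 req/s measured per instance.
    print(instances_needed(peak_rps=1000, per_instance_rps=200))
```

The result is only as good as the per-instance number, which is why the rest of this post is about measuring it with realistic traffic.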
People who are not used to this type of testing typically make one or both of two common mistakes. The most common one is that testing is done with a single test user (or very few). This is typically (but not always) a problem because the user data is the same on every request, so you might end up being served cached data from your database. Most databases keep recently read data in memory, so retrieving the same data over and over again is much faster than a realistic spread of reads.
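One way to avoid the single-user trap is to spread the simulated requests over a large pool of test users. The sketch below is only an illustration: the user ID format, the pool size and the uniform selection are assumptions, and real traffic is often skewed toward a few heavy users.

```python
import random


def make_user_pool(size: int) -> list[str]:
    """Create identifiers for `size` distinct test users (hypothetical ID format)."""
    return [f"test-user-{i:06d}" for i in range(size)]


def pick_users(pool: list[str], requests: int, rng: random.Random) -> list[str]:
    """Pick the user behind each simulated request, uniformly at random."""
    return [rng.choice(pool) for _ in range(requests)]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a test run can be repeated
    pool = make_user_pool(10_000)
    users = pick_users(pool, requests=1_000, rng=rng)
    # With a large pool most requests touch a different user's data,
    # so the database cannot answer everything from the same cached rows.
    print(f"{len(set(users))} distinct users in {len(users)} requests")
```

The test users still need to exist in the service with realistic data behind them. Generating the IDs is the easy part.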
The second common mistake is that the generated traffic does not represent expected usage patterns. For example, the test might read and write the same amount of data while expected usage is much more read heavy or write heavy. Depending on your implementation, your service might be optimized for one or the other.
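A traffic mix that follows an expected usage pattern can be sketched as a weighted random choice between operations. The operation names and the 90/10 split below are made-up assumptions for illustration. Use whatever ratio you expect for your own service.

```python
import random
from collections import Counter


def build_request_mix(weights: dict[str, float], total: int, rng: random.Random) -> list[str]:
    """Return `total` operation names drawn according to the given weights."""
    operations = list(weights)
    return rng.choices(operations, weights=[weights[op] for op in operations], k=total)


if __name__ == "__main__":
    # Assumed usage pattern: a read-heavy service with 90% reads and 10% writes.
    expected_usage = {"read": 0.9, "write": 0.1}
    mix = build_request_mix(expected_usage, total=10_000, rng=random.Random(1))
    print(Counter(mix))
```

Each entry in the resulting list would then be turned into a real request by whatever load tool you use.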
When you are testing a service for scale, so many parameters come into play that it is often hard to predict which ones actually matter and what hidden bottlenecks you have. That is why I always advocate doing scale testing based on real (or predicted) usage patterns.
For example, I once had a co-worker who made both mistakes and ended up with a result that indicated the service could handle a lot of load. Essentially, a single instance would cover our needs. However, when a real-life scenario was used, the service quickly ground to a halt, surfacing a bug that forced us to use several instances even though the expected load was much lower than in the original test.
So how do I know the usage pattern, you might ask. Well, if you are lucky you know based on older products. Otherwise you need to guess. The key is that you do your scale testing based on some scenario, and as your service gets real users you can adjust the scale test for the future. In my experience this type of testing also eliminates a lot of the debate about whether the results should be taken seriously, which often comes up when the result is bad but unrealistic usage patterns were used.
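Adjusting the scenario once real users arrive can be as simple as counting what they actually do. The sketch below assumes you can extract one operation name per request from your logs. The log format and the operation names are hypothetical.

```python
from collections import Counter


def observed_mix(operations: list[str]) -> dict[str, float]:
    """Turn a list of observed operation names into the fraction of traffic per operation."""
    counts = Counter(operations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no operations observed")
    return {op: count / total for op, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample taken from production request logs.
    sample = ["read", "read", "write", "read", "read", "read", "write", "read"]
    print(observed_mix(sample))
```

The resulting fractions can replace the guessed ratios in the next round of scale testing.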