You are using "capacity" to cover a number of non-functional qualities of a system, and are probably trying to encapsulate performance, capacity and scalability in one concept.
Let's start with performance. If you are dealing with a web-based architecture where you are serving resources, this is really quite straightforward and can be split into two KPIs: server response time and page load time (which should really be called resource load time, since not all resources on the web are web pages).
Server response time measures the time to last byte for a request on a given resource. Please note that this is not inclusive of things such as content negotiation. You (or the business) need to specify the expected server response time for given types of resources. This is based on a single request/response, e.g. a response to a request for any resource of the type 'Car Model' should take no more than 0.5 seconds, time to last byte.
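As a rough illustration, a per-resource-type budget check might look like the sketch below. The names `RESPONSE_BUDGETS`, `time_to_last_byte` and `within_budget` are made up for this example, and the `car_model` key simply mirrors the 0.5 second figure above; `fetch` is assumed to be any callable that issues one request and reads the whole response body.

```python
import time
from typing import Callable, Dict

# Hypothetical budgets in seconds, keyed by resource type (assumed names).
RESPONSE_BUDGETS: Dict[str, float] = {"car_model": 0.5}


def time_to_last_byte(fetch: Callable[[], bytes]) -> float:
    """Time one request; `fetch` must read the full response body."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def within_budget(resource_type: str, elapsed_seconds: float) -> bool:
    """True if a single response met the budget for its resource type."""
    return elapsed_seconds <= RESPONSE_BUDGETS[resource_type]
```

In practice `fetch` could wrap something like `urllib.request.urlopen(url).read()`, measured as close to the server as possible so that network noise does not end up in the number.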
Page load time takes things one step further: given a request for a resource, how long does it take to load that resource along with any dependent resources? It really has more meaning in the context of a web page. The Web being full of unknowns makes this a bit of a grey area, since all sorts of things come into play (the network, the client, content negotiation), so you need to specify this for a fixed/stabilised network and client (there are all sorts of tools to achieve this). It should also always be defined as an average, without introducing concurrency issues (we are still not thinking about capacity yet).
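To make "an average, without concurrency" concrete, here is a minimal sketch that times a number of loads one after another and takes the mean. `average_load_time` and `load_page` are hypothetical names; `load_page` is assumed to fetch the resource and all of its dependent resources on your fixed network and client, and the `clock` parameter exists only so the timing source can be swapped out.

```python
import time
from statistics import mean
from typing import Callable


def average_load_time(load_page: Callable[[], None], runs: int,
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Mean duration of `runs` sequential loads (one at a time, no concurrency)."""
    durations = []
    for _ in range(runs):
        start = clock()
        load_page()  # assumed to fetch the resource and all its dependents
        durations.append(clock() - start)
    return mean(durations)
```

Running the loads strictly in sequence keeps capacity effects out of the measurement, which is the point of defining this KPI separately.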
Once you have specified both, you can start to determine the immediate capacity of your system, i.e. how many requests per second for resources can it serve while still meeting the performance targets specified above? There are loads of tools to help you measure this. This will give you an immediate measure of capacity. You'll notice I use the term immediate, because often the business might turn around and say: great, but what happens if we need to increase this capacity?
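One way to read capacity off a load test is sketched below, assuming you already have the average response time observed at each request rate. The function name `immediate_capacity` and the sample figures are invented for illustration; real numbers would come from whatever load-testing tool you use.

```python
from typing import Dict, Optional


def immediate_capacity(avg_response_by_rate: Dict[int, float],
                       budget_seconds: float) -> Optional[int]:
    """Highest request rate reached before the response-time budget is first broken."""
    capacity = None
    for rate in sorted(avg_response_by_rate):
        if avg_response_by_rate[rate] > budget_seconds:
            break  # performance target missed; stop ramping up
        capacity = rate
    return capacity


# Made-up load-test results: requests per second -> average response time (s).
results = {10: 0.20, 50: 0.30, 100: 0.45, 200: 0.90}
print(immediate_capacity(results, 0.5))  # prints 100
```

The function returns `None` when even the lowest tested rate misses the budget, which is a sign the performance targets themselves are not yet being met.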
So we move on to the third non-functional quality, scalability (n.b. there are more than three non-functional qualities of a system, including availability, reliability, validity, usability, accessibility, extensibility and manageability). Given a certain capacity, by how much can I increase it while still meeting the performance targets? There are all sorts of ways to increase capacity, but most systems, by design, have a bottleneck somewhere that creates a constraint.
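The bottleneck idea can be sketched very simply, under the assumption that every request passes through every component in series, so the slowest component caps the whole system. The component names and capacities here are made up, and `bottleneck` is a hypothetical helper.

```python
from typing import Dict, Tuple


def bottleneck(component_capacity: Dict[str, int]) -> Tuple[str, int]:
    """Return (name, capacity) of the component that limits the whole system."""
    name = min(component_capacity, key=component_capacity.get)
    return name, component_capacity[name]


# Made-up capacities in requests per second.
tiers = {"load_balancer": 5000, "app_servers": 1200, "database": 400}
print(bottleneck(tiers))  # prints ('database', 400)
```

Under that assumption, adding app servers in this example would not raise overall capacity at all until the database constraint is dealt with.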