What is Non-blocking Concurrency and how is it different than Normal Concurrency using Threads.
Non-blocking concurrency is a different way to coordinate access between threads from blocking concurrency. There is a lot of background (theoretical) material out there, but the simplest explanation (as it seems that you're looking for a simple, hands-on answer), is that non-blocking concurrency does not make use of locks.
Why dont we use "non-blocking" Concurrency in all the scenarios where concurrency is required.
We do. I'll show you in a bit. But it is true that there aren't always efficient non-blocking algorithms for every concurrency problem.
are there any overheads for "non-blocking"
Well, there's overhead for any type of sharing information between threads that goes all the way down to how the CPU is structured, especially when you get what we call "contention", i.e. synchronizing more than one thread that are attempting to write to the same memory location at the same time. But in general, non-blocking is faster than blocking (lock-based) concurrency in many cases, especially all the cases where there are well known, simple, lock-free implementation of a given algorithm/data-structure. It is these good solutions that are provided with Java.
I have heard that this is available in Java.
Absolutely. For starters, all the classes in java.util.concurrent.atomic provide lock-free maintenance of shared variables. In addition, all the classes in java.util.concurrent whose names start with ConcurrentLinked or ConcurrentSkipList, provide lock-free implementation of lists, maps and sets.
Are there any particular scenarios we should use this feature.
You would want to use the lock-free queue and deque in all cases where you would otherwise (prior to JDK 1.5) use Collections.synchronizedlist, as they provide better performance under most conditions. i.e., you would use them whenever more than one thread is concurrently modifying the collection, or when one thread is modifying the collection and other threads are attempting to read it. Note that the very popular ConcurrentHashMap does actually use locks internally, but it is more popular than ConcurrentSkipListMap because I think it provides better performance in most scenarios. However, I think that Java 8 would include a lock-free implementation of ConcurrentHashMap.
Is there a difference/advantage of using one of these methods for a collection. What are the trade offs
Well, in this short example, they are exactly the same. Note, however, that when you have concurrent readers and writers, you must synchronize the reads as well as the writes, and Collections.synchronizedList() does that. You might want to try the lock-free ConcurrentLinkedQueue as an alternative. It might give you better performance in some scenarios.
General Note
While concurrency is a very important topic to learn, bear in mind that it is also a very tricky subject, where even very experienced developers often err. What's worse, you might discover concurrency bugs only when your system is under heavy load. So I would always recommend using as many ready-made concurrent classes and libraries as possible rather than rolling out your own.