I am writing a lock-free doubly linked list based on these papers:
"Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting", Anders Gidenstam, Marina Papatriantafilou, Håkan Sundell and Philippas Tsigas
"Lock-free deques and doubly linked lists" Håkan Sundell, Philippas Tsigas
For this question we can set the first paper aside.
In the second paper, they use a neat trick to store a deletion flag and a pointer in a single word. (More info here)
The pseudocode for this part of the paper:
union Link
    : word
    (p,d): {pointer to Node, boolean}
structure Node
    value: pointer to word
    prev: union Link
    next: union Link
And my code for the pseudocode above:
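#include <atomic>
#include <cstdint>

template< typename NodeT >
struct LockFreeLink
{
public:
    typedef NodeT NodeType;
protected:
    std::atomic< NodeT* > mPointer;
public:
    // The constructor and destructor must carry the class name.
    LockFreeLink() : mPointer(nullptr) {}
    ~LockFreeLink() {}
    inline NodeType* getNode() const throw()
    {
        // The overload that takes a memory order is atomic_load_explicit.
        return std::atomic_load_explicit(&mPointer, std::memory_order_relaxed);
    }
    // A const member function can only hand out a pointer to const.
    inline const std::atomic< NodeT* >* getAtomicNode() const throw()
    {
        return &mPointer;
    }
};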
struct Node : public LockFreeNode
{
    struct Link : protected LockFreeLink< Node >
    {
        // The deletion flag lives in the lowest bit of the pointer word, so
        // Node must be at least 2-byte aligned. std::uintptr_t is from <cstdint>.
        static const std::uintptr_t dMask = 1;
        static const std::uintptr_t ptrMask = ~dMask;
        Link() throw() { }
        Link(const Node* pPointer, bcBOOL pDel = bcFALSE) throw()
        {
            mPointer.store(reinterpret_cast<Node*>(
                reinterpret_cast<std::uintptr_t>(pPointer) | (pDel ? 1u : 0u)),
                std::memory_order_relaxed);
        }
        Node* pointer() const throw()
        {
            return reinterpret_cast<Node*>(reinterpret_cast<std::uintptr_t>(
                std::atomic_load_explicit(&mPointer, std::memory_order_relaxed)) & ptrMask);
        }
        bool del() const throw()
        {
            return (reinterpret_cast<std::uintptr_t>(
                std::atomic_load_explicit(&mPointer, std::memory_order_relaxed)) & dMask) != 0;
        }
        // std::atomic is not copyable, so copying a Link takes a snapshot of its word.
        Link(const Link& pOther) throw()
        {
            mPointer.store(
                std::atomic_load_explicit(&pOther.mPointer, std::memory_order_relaxed),
                std::memory_order_relaxed);
        }
        bool compareAndSwap(const Link& pExpected, const Link& pNew) throw()
        {
            Node* lExpected = std::atomic_load_explicit(&pExpected.mPointer, std::memory_order_relaxed);
            Node* lNew = std::atomic_load_explicit(&pNew.mPointer, std::memory_order_relaxed);
            return std::atomic_compare_exchange_strong_explicit(
                &mPointer,
                &lExpected,
                lNew,
                std::memory_order_relaxed,
                std::memory_order_relaxed);
        }
        bool operator==(const Link& pOther) const throw()
        {
            return std::atomic_load_explicit(&mPointer, std::memory_order_relaxed) ==
                std::atomic_load_explicit(&pOther.mPointer, std::memory_order_relaxed);
        }
        bool operator!=(const Link& pOther) const throw()
        {
            return !operator==(pOther);
        }
    };
    Link mPrev;
    Link mNext;
    Type mData;
    Node() {}
    Node(const Type& pValue) : mData(pValue) {}
};
The paper gives this procedure for setting a link's deletion mark to true:
procedure SetMark(link: pointer to pointer to Node)
    while true do
        node = *link;
        if node.d = true or CAS(link, node, (node.p, true)) then break;
And my code for this function:
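void _setMark(Link* pLink)
{
    while (bcTRUE)
    {
        // Take one snapshot and use it for both the test and the CAS,
        // as the pseudocode does with node.d and node.p.
        Link lOld = *pLink;
        if (lOld.del() || pLink->compareAndSwap(lOld, Link(lOld.pointer(), bcTRUE)))
            break;
    }
}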
But my problem is in the compareAndSwap function, where I must compare and swap three atomic variables. Information about the problem is here. (Actually, the new variable in the compare-and-swap function isn't important, because it is thread-local.)
Now my question: how can I write the compareAndSwap function to compare and swap three atomic variables, or where am I making a mistake?
(Excuse the long question.)
Edit:
A similar problem appears in the memory manager paper:
function CompareAndSwapRef(link: pointer to pointer to Node,
        old: pointer to Node, new: pointer to Node): boolean
    if CAS(link, old, new) then
        if new ≠ NULL then
            FAA(&new.mmref, 1);
            new.mmtrace := false;
        if old ≠ NULL then FAA(&old.mmref, -1);
        return true;
    return false;
Here again I must compare and swap three atomic variables.
(Note that my arguments are of type Link and I must compare and swap the mPointer of each Link.)
Unless you can make your three data items that you are comparing/swapping fit into two pointer-size elements, you can't do this with compare and swap (certainly not on x86, and I've not heard of any other machine architecture that has such a thing).
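As I read the SetMark pseudocode in the question, only the link itself is shared between threads; the expected and new values are snapshots held in local variables, so a single-word CAS should be enough for that procedure. Here is a minimal sketch of that reading, assuming the flag is packed into bit 0 of a std::atomic<std::uintptr_t>. The names DemoNode, TaggedLink, pack, pointer_of, is_deleted and set_mark are made up for illustration and come from neither paper, and the memory orders are a conservative guess rather than something the papers specify.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct DemoNode { int value; };  // hypothetical node type for this sketch

// One shared word: the pointer in the upper bits, the deletion flag in bit 0.
using TaggedLink = std::atomic<std::uintptr_t>;

constexpr std::uintptr_t kDelBit = 1;

inline std::uintptr_t pack(DemoNode* p, bool del) {
    const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & kDelBit) == 0 && "node must be at least 2-byte aligned");
    return raw | (del ? kDelBit : 0);
}

inline DemoNode* pointer_of(std::uintptr_t word) {
    return reinterpret_cast<DemoNode*>(word & ~kDelBit);
}

inline bool is_deleted(std::uintptr_t word) {
    return (word & kDelBit) != 0;
}

// SetMark: 'expected' and 'desired' are plain local values;
// only 'link' is read and written atomically.
inline void set_mark(TaggedLink& link) {
    std::uintptr_t expected = link.load(std::memory_order_acquire);
    while (!is_deleted(expected)) {
        const std::uintptr_t desired = expected | kDelBit;
        // On failure compare_exchange_weak stores the current value in 'expected'.
        if (link.compare_exchange_weak(expected, desired,
                                       std::memory_order_acq_rel,
                                       std::memory_order_acquire)) {
            break;
        }
    }
}
```

The compareAndSwap in the question could take the expected and new words by value in the same way, instead of treating them as further atomic variables.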
If you rely on the data being stored on an address that is (at least) aligned to an even byte-address, you could potentially use bitwise OR to set the lowest bit when deleting the element. In the past, people have been using the upper parts of the address to store extra data, but in x86-64 at least, this is not possible, as the upper part of the address must be "canonical", meaning that any address bits above the "usable limit" (defined by the processor architecture, currently this is 48 bits), must all be the same as the highest bit of the usable limit (so, same as bit 47).
Edit: This section of code does exactly what I describe:
static const std::uintptr_t dMask = 1;
static const std::uintptr_t ptrMask = ~dMask;
Link() throw() { }
Link(const Node* pPointer, bcBOOL pDel = bcFALSE) throw()
{
    mPointer.store(reinterpret_cast<Node*>(
        reinterpret_cast<std::uintptr_t>(pPointer) | (pDel ? 1u : 0u)),
        std::memory_order_relaxed);
}
Node* pointer() const throw()
{
    return reinterpret_cast<Node*>(reinterpret_cast<std::uintptr_t>(
        std::atomic_load_explicit(&mPointer, std::memory_order_relaxed)) & ptrMask);
}
It uses the lowest bit to store the pDel flag.
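The canonical-address rule mentioned above can be checked mechanically, which shows why the low bit is safe to borrow while the high bits are not. This is only a sketch, assuming 48 implemented virtual-address bits as stated above; kVirtualBits, is_canonical and tag_low_bit are names I made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 48 implemented virtual-address bits (bit 47 is the highest usable bit).
constexpr unsigned kVirtualBits = 48;

// True if bits 63..47 of 'addr' are all copies of bit 47.
inline bool is_canonical(std::uint64_t addr) {
    const std::uint64_t upper = addr >> (kVirtualBits - 1);  // 17 bits: 63..47
    return upper == 0 || upper == 0x1FFFF;
}

// Setting the low bit never affects canonicality; it only needs alignment.
inline std::uint64_t tag_low_bit(std::uint64_t addr) {
    assert((addr & 1) == 0 && "address must be at least 2-byte aligned");
    return addr | 1;
}
```

A value with extra data stuffed into the top byte fails is_canonical, so it would have to be masked off again before every dereference.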
You should be able to do this for a doubly linked list by using a form of cmpxchg16b (on x86). On Windows, that would be _InterlockedCompareExchange128. In gcc (for Unix-type OSes, such as Linux/macOS) you will first need to construct a 128-bit integer (__int128) from your two pointers. If you are compiling 32-bit code, you will probably need to build a 64-bit int instead, on both Windows and Unix OSes.
http://www.drdobbs.com/cpp/lock-free-code-a-false-sense-of-security/210600279
But replacing locks wholesale by writing your own lock-free code is not the answer. Lock-free code has two major drawbacks. First, it's not broadly useful for solving typical problems—lots of basic data structures, even doubly linked lists, still have no known lock-free implementations. Coming up with a new or improved lock-free data structure will still earn you at least a published paper in a refereed journal, and sometimes a degree.
I don't think it would be efficient enough to use it, but anyway it's interesting to read.
On x64, only 44 bits of address space are used. If your pointers are aligned to 8 bytes, then you are only using 41 bits. Two such pointers need 2 × 41 = 82 bits, which is still too large for 64 bits. There is a 128-bit compare-and-swap, although I can't vouch for its speed. I always try to use the 64-bit one.
Maybe you only need up to 2 billion nodes. So what you could do is preallocate a pool of nodes that the list pulls from. You create nodes by grabbing the next free pool index, using atomic ops of course. Then instead of next and prev being pointers, they could be 31-bit indexes into the node pool, and you have 2 bits left over for delete flags. If you need fewer than 2 billion nodes, you have even more bits left over. The only downside is that you have to know how many nodes you are going to need at startup, although you could realloc the nodes if you had to.
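To make the index idea concrete, here is a rough sketch of one way to pack two 31-bit pool indexes and two delete flags into a single 64-bit word, so that one ordinary 64-bit CAS covers both links. The bit layout and all names (Links, pack_links, unpack_links, PoolNode, allocate_index, try_set_links) are my own assumptions for illustration; the memory orders are likewise just a conservative choice.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout: bits 0..30 next index, bit 31 next-deleted flag,
//                 bits 32..62 prev index, bit 63 prev-deleted flag.
constexpr std::uint64_t kIndexMask = 0x7FFFFFFFull;

struct Links {
    std::uint32_t next; bool next_del;
    std::uint32_t prev; bool prev_del;
};

inline std::uint64_t pack_links(const Links& l) {
    assert(l.next <= kIndexMask && l.prev <= kIndexMask);
    return  std::uint64_t(l.next)
         | (std::uint64_t(l.next_del) << 31)
         | (std::uint64_t(l.prev) << 32)
         | (std::uint64_t(l.prev_del) << 63);
}

inline Links unpack_links(std::uint64_t w) {
    return Links{ std::uint32_t(w & kIndexMask), ((w >> 31) & 1) != 0,
                  std::uint32_t((w >> 32) & kIndexMask), ((w >> 63) & 1) != 0 };
}

struct PoolNode {
    std::atomic<std::uint64_t> links{0};  // both links in one CAS-able word
    int value = 0;
};

// Hand out the next free pool slot by bumping an atomic counter.
inline std::uint32_t allocate_index(std::atomic<std::uint32_t>& next_free,
                                    std::uint32_t capacity) {
    const std::uint32_t i = next_free.fetch_add(1, std::memory_order_relaxed);
    assert(i < capacity && "pool exhausted");
    (void)capacity;
    return i;
}

// Replace both links at once if the word still equals 'expected'.
inline bool try_set_links(PoolNode& node, const Links& expected, const Links& desired) {
    std::uint64_t exp = pack_links(expected);
    return node.links.compare_exchange_strong(exp, pack_links(desired),
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire);
}
```

Whether updating prev and next together actually fits the algorithm in the paper is a separate question; the sketch only shows that the packing and the 64-bit CAS are straightforward.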
What I have done is use virtual memory functions to reserve gigabytes of address space and then map physical RAM into that space as I need it, which extends my pool without having to reallocate.
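For anyone who wants to try the reserve-then-commit approach, a minimal sketch on Linux could look like the following. It assumes POSIX mmap/mprotect/munmap (on Windows the rough equivalent would be VirtualAlloc with MEM_RESERVE and later MEM_COMMIT); reserve_region, commit_prefix and release_region are hypothetical helper names, not an existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Reserve address space only: PROT_NONE pages are not backed by memory yet.
inline void* reserve_region(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Make the first 'bytes' of the region readable and writable.
// The base address never changes, so existing node indexes stay valid.
inline bool commit_prefix(void* base, std::size_t bytes) {
    return mprotect(base, bytes, PROT_READ | PROT_WRITE) == 0;
}

inline void release_region(void* base, std::size_t bytes) {
    const int rc = munmap(base, bytes);
    assert(rc == 0);
    (void)rc;
}
```

Growing the pool is then a matter of calling commit_prefix with a larger size; nothing is copied or moved.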
Source: https://stackoverflow.com/questions/19609417/atomic-operations-for-lock-free-doubly-linked-list