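Algorithm idea:
Heap sort is a sorting algorithm built on the heap data structure. A heap is a nearly complete binary tree that also satisfies the heap property: stored in an array k, a max-heap has k[i] >= k[2i+1] and k[i] >= k[2i+2] for every index i whose children exist.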
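The heap property above can be checked directly on a list. A minimal sketch (the helper name is_max_heap is made up for illustration, it is not part of the original code):

```python
def is_max_heap(k):
    """Return True if list k satisfies k[i] >= k[2i+1] and k[i] >= k[2i+2]."""
    n = len(k)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            # Only compare against children that actually exist.
            if child < n and k[i] < k[child]:
                return False
    return True

print(is_max_heap([9, 7, 8, 3, 5, 6]))   # True
print(is_max_heap([9, 7, 8, 3, 10, 6]))  # False: k[1]=7 < k[4]=10
```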
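Steps:
Turn the array into a max-heap.
Swap the root of the current heap with its last node, then turn the first n-1 elements back into a max-heap.
Repeat the previous step, shrinking the heap by one each time, until only one element is left.
Why a max-heap? Because the swap in step 2 then puts the current largest element at the end, so the final result is in ascending order.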
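For contrast with the in-place max-heap steps above, here is a minimal sketch using the standard-library heapq module, which provides a min-heap. This is not the author's method: popping the smallest element each time gives ascending order directly, but it needs a second list for the output, which is what the max-heap swap avoids.

```python
import heapq

def heapq_sort(items):
    """Sort ascending with a min-heap, using a separate output list."""
    heap = list(items)   # copy so the input is not modified
    heapq.heapify(heap)  # step 1: build a (min-)heap
    result = []
    while heap:
        # steps 2-3: remove the root, let heapq restore the heap, repeat
        result.append(heapq.heappop(heap))
    return result

print(heapq_sort([5, 1, 9, 2, 3]))  # [1, 2, 3, 5, 9]
```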
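Implementation:
The main complexity of heap sort lies in the algorithm that builds the max-heap, and that same routine must also be reusable later, when the max-heap is re-adjusted. Because re-adjustment starts at the root and works downward, the build algorithm is designed the same way: start from the last node that has one or two children (easy to find: index (n-2)//2) and move up one node at a time toward the root. The adjustment of each node itself runs top-down: if the node is swapped with a child, that child is adjusted next. In my comparisons this was also the fastest of the heap-building algorithms I tried.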
def heap_sort5(collection):
    length = len(collection)

    def adjustment(collection, index, heap_length):
        # index: position in collection of the element to sift down into the heap.
        # Once the list is a max-heap, the root is swapped with the last element and
        # the heap formed by the first n-1 elements is adjusted back into a max-heap.
        # To support that, heap_length says how many leading elements of collection
        # currently belong to the heap.
        left_index = index * 2 + 1
        right_index = index * 2 + 2
        temp = collection[index]  # hold the value and shift children up instead of swapping
        # if right_index>=length-1:  # an earlier, buggy condition
        while right_index <= heap_length - 1:  # both children exist
            if collection[left_index] > collection[right_index]:
                if temp < collection[left_index]:
                    collection[index] = collection[left_index]
                    index = left_index
                    left_index = index * 2 + 1
                    right_index = index * 2 + 2
                else:
                    break
            elif temp < collection[right_index]:
                collection[index] = collection[right_index]
                index = right_index
                right_index = index * 2 + 2
                left_index = index * 2 + 1
            else:
                break
        collection[index] = temp
        if left_index <= heap_length - 1:  # only a left child: this is the bottom
            if collection[index] < collection[left_index]:
                collection[index], collection[left_index] = collection[left_index], collection[index]

    # Build the max-heap, starting from the last node that has a child.
    for i in range((length - 2) // 2, -1, -1):
        adjustment(collection, i, length)
    # Heap sort.
    for i in range(length - 1):
        # Note: always swap with index 0 (the root), not with i.
        collection[0], collection[length - 1 - i] = collection[length - 1 - i], collection[0]
        adjustment(collection, 0, length - i - 1)  # this also starts from 0
    return collection
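The starting index (length-2)//2 used in the heap-building loop above is the parent of the last element. A small sketch to confirm that it is the last node with a child (has_child is a hypothetical helper, not from the original code):

```python
def has_child(i, n):
    """True if node i has at least a left child in a heap of n elements."""
    return 2 * i + 1 < n

for n in range(2, 50):
    last_parent = (n - 2) // 2
    assert has_child(last_parent, n)          # it is a parent
    assert not has_child(last_parent + 1, n)  # the next index is already a leaf

print((10 - 2) // 2)  # 4: in a 10-element heap, indices 5..9 are leaves
```

Starting the build loop there skips every leaf, since a leaf is already a valid one-element heap.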
Efficiency analysis:
Comparison
1000 random data items:
vs. bucket sort (average O(n+K)): slower than bucket sort, taking about 3.9 times as long
Detailed data: [0.0030105114, 0.00298571587, 0.00298523903, 0.00296974182, 0.00300073624, 0.00302100182, 0.0019993782, 0.00198626518, 0.00301527977, 0.00298166275, 0.0040140152, 0.00301671028, 0.00198078156, 0.00299835205, 0.00301647186, 0.00298070908, 0.00299263, 0.00300550461, 0.00298094749, 0.0029938221, 0.00299906731, 0.00301623344, 0.00298047066, 0.00200772285, 0.00198721886, 0.00299811363, 0.00301504135, 0.00399470329, 0.0029861927, 0.00301098824, 0.00198554993, 0.00301480293, 0.00198483467, 0.0029964447, 0.00299835205, 0.0020263195, 0.00401711464, 0.00200629234, 0.00299668312, 0.00302457809, 0.00199437141, 0.00300264359, 0.00302124023, 0.00298285484, 0.00301980972, 0.0029835701, 0.00301551819, 0.00299835205, 0.00199890137, 0.00499534607, 0.00301027298, 0.00401067734, 0.00299453735, 0.00300312042, 0.00299811363, 0.0019774437, 0.0029771328, 0.00199961662, 0.003002882, 0.00298476219, 0.0019993782, 0.00299906731, 0.0020108223, 0.00297951698, 0.00202202797, 0.00199866295, 0.00198745728, 0.00399780273, 0.00299715996, 0.00199818611, 0.00397968292, 0.00300979614, 0.00198554993, 0.00199079514, 0.00398445129, 0.00299572945, 0.00398278236, 0.00300073624, 0.00199818611, 0.00399780273, 0.00301170349, 0.00299835205, 0.00296235085, 0.00202345848, 0.00399947166, 0.00299429893, 0.00398445129, 0.00299811363, 0.00398445129, 0.00300931931, 0.00298428535, 0.00301766396, 0.0029976368, 0.00202298164, 0.00401210785, 0.00299596786, 0.00398468971, 0.00199770927, 0.00298476219, 0.00299715996]
Ran 100 times; mean run-time difference (heap_sort5 - bucket_sort3) (a positive number means heap_sort5 lost): 0.00290829897
heap_sort5 averaged 0.00392773628 and bucket_sort3 averaged 0.00101943731, so the former takes about 3.8528 times as long as the latter.
vs. quicksort (average O(n log n)): slower than quicksort, taking about 2.4 times as long
Detailed data: [0.00201654434, 0.00297951698, 0.00201129913, 0.00196933746, 0.00198483467, 0.00199818611, 0.0019993782, 0.00100493431, 0.00298690796, 0.00099802017, 0.00199866295, 0.00300002098, 0.00296378136, 0.00300335884, 0.00201702118, 0.00296258926, 0.00198817253, 0.00298595428, 0.00100684166, 0.00099444389, 0.00298118591, 0.00301027298, 0.0019865036, 0.00302529335, 0.00298523903, 0.00199794769, 0.00299739838, 0.00199866295, 0.00198602676, 0.0019865036, 0.00299787521, 0.00299954414, 0.0029976368, 0.00299859047, 0.00299811363, 0.00199818611, 0.0019993782, 0.00299739838, 0.00299835205, 0.00199794769, 0.00199818611, 0.0009996891, 0.00299787521, 0.00299692154, 0.0029976368, 0.00299930573, 0.0019993782, 0.00199961662, 0.0029976368, 0.0020005703, 0.00199818611, 0.00097441673, 0.00325942039, 0.00203251839, 0.00299954414, 0.0020236969, 0.00298595428, 0.00201702118, 0.00198149681, 0.00298118591, 0.00198149681, 0.00199890137, 0.0020236969, 0.00201678276, 0.00201177597, 0.00199770927, 0.00290203094, 0.00201177597, 0.00197386742, 0.00299882889, 0.00299882889, 0.00199460983, 0.00102114677, 0.00298571587, 0.00301146507, 0.0029861927, 0.00199913979, 0.00199842453, 0.00402283669, 0.0030105114, 0.0029976368, 0.00202226639, 0.00296854973, 0.0020160675, 0.00201106071, 0.00198984146, 0.00101685524, 0.00199794769, 0.00302910805, 0.00299811363, 0.0010368824, 0.00299811363, 0.00199842453, 0.00201964378, 0.00298213959, 0.00199866295, 0.0019993782, 0.00199818611, 0.0019865036, 0.00299906731]
Ran 100 times; mean run-time difference (heap_sort5 - quick_sort2) (a positive number means heap_sort5 lost): 0.00235058308
heap_sort5 averaged 0.00402917624 and quick_sort2 averaged 0.00167859316, so the former takes about 2.4003 times as long as the latter.
vs. merge sort (average O(n log n)): about the same as merge sort, within roughly 10% (about 9% slower in this run)
Detailed data: [1.764297e-05, 0.00099945068, 1.740456e-05, -0.00101423264, 0.00097084045, 0.0009701252, -0.00096297264, 1.811981e-05, 0.00096440315, 0.00198078156, -0.00097417831, 0.00101184845, -5.48363e-06, 0.00099921227, -4.7684e-07, 0.00101065636, 0.0, 0.00098347664, 0.00100016594, -0.00098752975, 0.00098752975, -1.144409e-05, 9.77516e-06, -1.573563e-05, 0.00099873543, -0.00099945068, 0.00101733208, -1.001358e-05, 9.5367e-07, 0.00097441673, 0.00098729134, -0.00097298622, 0.00099992752, 0.0, -0.00097846985, 1.645088e-05, 0.00097513199, 0.00097751617, -0.00098800659, -9.5367e-07, 0.00098729134, 0.0009868145, 0.00099849701, -0.0009624958, 1.764297e-05, -2.14577e-06, 0.00197386742, -0.00200939178, 0.00098705292, -1.072884e-05, 0.00099825859, -0.00097560883, 0.00101137161, 0.00097537041, 0.00096106529, -1.66893e-06, -7.1526e-07, 0.00100588799, -1.716614e-05, 0.00101280212, 0.00096774101, 0.00096273422, 1.40667e-05, -0.00098156929, 0.00098705292, 3.8147e-06, -0.00098156929, 0.0010008812, 9.5367e-07, -2.3842e-07, 3.671646e-05, 3.623962e-05, -4.7684e-07, 0.00097966194, -3.576279e-05, -1.66893e-05, -3.647804e-05, 7.86781e-06, 0.00100302696, 3.433228e-05, 4.05312e-06, 0.00099158287, 3.433228e-05, 4.05312e-06, 0.00098752975, 0.00198554993, 0.00102448463, 0.00101280212, -1.239777e-05, 2.098083e-05, 0.00099825859, -1.358986e-05, 0.00098633766, -0.00098776817, 0.00101685524, -1.311302e-05, -2.598763e-05, 1.144409e-05, 1.192093e-05, 0.00097417831]
Ran 100 times; mean run-time difference (heap_sort5 - merge_sort3) (a positive number means heap_sort5 lost): 0.00031897068
heap_sort5 averaged 0.00396824121 and merge_sort3 averaged 0.00364927053, so the former takes about 1.0874 times as long as the latter.
vs. Shell sort (complexity depends on the gap sequence): slower than Shell sort, taking about 1.2 times as long
Detailed data: [1.335144e-05, 0.00201821327, -5.24521e-06, 0.00099253654, -1.311302e-05, 0.0020070076, 0.00099611282, 0.00096917152, 2.43187e-05, -5.24521e-06, -1.43051e-06, 1.28746e-05, 0.00096964836, 0.00100708008, 0.00100922585, 0.00099921227, -0.00100970268, 5.24521e-06, 0.00099039078, 0.00099992752, -3.57628e-06, 0.0009829998, 0.00099992752, 0.00198245049, -1.478195e-05, 0.00100398064, 0.00098252296, -4.7684e-07, 0.00198626518, 0.00099468231, 0.0009970665, 0.00097560883, 0.00099992752, 0.00099849701, 0.00199985504, 0.00099897385, 0.00203323364, 0.00099825859, -7.1526e-07, -7.1526e-07, 0.00099921227, -2.3842e-07, 0.00099921227, 0.00099921227, -1.43051e-06, 0.00100111961, 2.3842e-07, 0.00100064278, -7.1526e-07, -2.3842e-07, 2.38419e-06, 0.0, 0.00099992752, -0.00100040436, 0.00099897385, 0.00199961662, 0.0010099411, 0.00102591515, 0.00099945068, 1.66893e-06, 0.00101184845, 0.00199985504, 0.00017309189, -2.31266e-05, 0.00097489357, 0.00198602676, 0.00102424622, -2.3842e-07, 0.00098109245, 0.00099468231, 1.788139e-05, -2.3842e-07, 0.00098013878, 1.43051e-06, 0.00096201897, 0.00100421906, -0.00098609924, 1.263618e-05, 0.00099945068, 0.00099945068, 0.00099897385, 0.00096940994, 1.788139e-05, -0.0010035038, 0.0009906292, 0.0019724369, -5.96046e-06, 0.00100493431, 6.19888e-06, -1.239777e-05, 9.77516e-06, 0.00098276138, -0.00100016594, 1.192093e-05, 0.00200486183, 7.86781e-06, 4.7684e-07, -3.09944e-06, 8.34465e-06, 0.00099611282]
Ran 100 times; mean run-time difference (heap_sort5 - shell_sort3) (a positive number means heap_sort5 lost): 0.00062998772
heap_sort5 averaged 0.00394880772 and shell_sort3 averaged 0.00331882000, so the former takes about 1.1898 times as long as the latter.
vs. selection sort (O(n^2)): faster than selection sort, taking about one ninth of its time
Detailed data: [-0.03098297119, -0.03199076653, -0.03198218346, -0.03198122978, -0.03098106384, -0.03198075294, -0.0309817791, -0.03097701073, -0.03196048737, -0.03194522858, -0.03298187256, -0.03096413612, -0.03097605705, -0.03295016289, -0.02898311615, -0.03000020981, -0.0329887867, -0.02998185158, -0.03198099136, -0.03098106384, -0.03298187256, -0.03098273277, -0.03100085258, -0.03200554848, -0.03200006485, -0.03198170662, -0.03196406364, -0.02997612953, -0.03097915649, -0.03096842766, -0.03198003769, -0.03099584579, -0.03197383881, -0.03098273277, -0.03098297119, -0.0319545269, -0.03097319603, -0.0320045948, -0.02999281883, -0.03101181984, -0.03199768066, -0.03097224236, -0.03094625473, -0.03100132942, -0.03198194504, -0.03099894524, -0.03197240829, -0.03098273277, -0.03099250793, -0.03298354149, -0.03197526932, -0.03198218346, -0.03200125694, -0.03199028969, -0.02902030945, -0.02996492386, -0.03098249435, -0.0299885273, -0.03198456764, -0.03000354767, -0.03098773956, -0.02900099754, -0.03199529648, -0.03196907043, -0.03195810318, -0.03095602989, -0.03194713593, -0.03198242188, -0.03098273277, -0.03198242188, -0.03098154068, -0.02998065948, -0.03198289871, -0.03298139572, -0.0309817791, -0.03099584579, -0.03101348877, -0.03197312355, -0.03200078011, -0.02999806404, -0.03199219704, -0.03098058701, -0.03095769882, -0.02999162674, -0.03095960617, -0.03098225594, -0.0319507122, -0.03199577332, -0.03299593925, -0.03000020981, -0.03098058701, -0.03100943565, -0.0309586525, -0.03098368645, -0.03195118904, -0.03098011017, -0.03100037575, -0.03000617027, -0.03198218346, -0.03198266029]
Ran 100 times; mean run-time difference (heap_sort5 - select_sort2) (a positive number means heap_sort5 lost): -0.03132260799
heap_sort5 averaged 0.00397776127 and select_sort2 averaged 0.03530036926, so the former takes about 0.1127 times as long as the latter.
vs. insertion sort (O(n^2)): faster than insertion sort, taking about one ninth of its time
Detailed data: [-0.02899861336, -0.02996730804, -0.03100061417, -0.02997756004, -0.03021883965, -0.02996373177, -0.03097391129, -0.03092384338, -0.03199481964, -0.02900743484, -0.0319890976, -0.02899360657, -0.02900671959, -0.02995681763, -0.02899456024, -0.0300078392, -0.02999472618, -0.03012704849, -0.03000092506, -0.03001236916, -0.02998304367, -0.0309548378, -0.03105306625, -0.03097820282, -0.02950978279, -0.03198122978, -0.02900028229, -0.02998232841, -0.03000259399, -0.02996039391, -0.02998280525, -0.03099131584, -0.02996897697, -0.03098130226, -0.02997732162, -0.03198027611, -0.0309650898, -0.03096795082, -0.03196811676, -0.03000283241, -0.02998304367, -0.03099298477, -0.03100728989, -0.02998328209, -0.02994346619, -0.03098249435, -0.0319852829, -0.02908825874, -0.03198170662, -0.03098082542, -0.02995967865, -0.03098034859, -0.02999520302, -0.03000998497, -0.02799153328, -0.02997493744, -0.03000164032, -0.02906346321, -0.03098011017, -0.02997016907, -0.03105521202, -0.02998423576, -0.03099560738, -0.02998161316, -0.03099656105, -0.02998256683, -0.02899646759, -0.02998304367, -0.03199863434, -0.02996826172, -0.03199887276, -0.03097033501, -0.02926707268, -0.03097510338, -0.03095555305, -0.03206801414, -0.02901434898, -0.03094601631, -0.02995657921, -0.0289683342, -0.02895879745, -0.02997756004, -0.02998709679, -0.02998375893, -0.03096938133, -0.0302259922, -0.02998423576, -0.03197407722, -0.02994465828, -0.02999520302, -0.02899694443, -0.03098726273, -0.02901959419, -0.03000187874, -0.02998280525, -0.02896547318, -0.02996993065, -0.02997756004, -0.02914905548, -0.03098154068]
Ran 100 times; mean run-time difference (heap_sort5 - insertion_sort4) (a positive number means heap_sort5 lost): -0.03028253078
heap_sort5 averaged 0.00394899845 and insertion_sort4 averaged 0.03423152924, so the former takes about 0.1154 times as long as the latter.
vs. bidirectional bubble sort (O(n^2)): faster than bidirectional bubble sort, taking about 0.06 times as long
Detailed data: [-0.06394600868, -0.06296277046, -0.06496214867, -0.06496310234, -0.06196260452, -0.06313633919, -0.06294035912, -0.06199359894, -0.06495571136, -0.06296420097, -0.06696176529, -0.06396317482, -0.06496310234, -0.06396389008, -0.06396532059, -0.06496357918, -0.06495523453, -0.06297588348, -0.06297802925, -0.06296396255, -0.06296420097, -0.06310391426, -0.06396174431, -0.06396341324, -0.0639629364, -0.06496334076, -0.06396412849, -0.06493639946, -0.06296396255, -0.06394505501, -0.06496310234, -0.06296372414, -0.06297707558, -0.06298160553, -0.06497168541, -0.06299304962, -0.06345295906, -0.06507635117, -0.06296348572, -0.06296467781, -0.0630466938, -0.06496119499, -0.06198263168, -0.06491446495, -0.06296181679, -0.06296348572, -0.0639629364, -0.06296348572, -0.06397724152, -0.06397509575, -0.06496334076, -0.06397604942, -0.06198978424, -0.06342983246, -0.06496357918, -0.06196522713, -0.06296300888, -0.06216788292, -0.06396389008, -0.06412982941, -0.06496143341, -0.0642516613, -0.06396579742, -0.06517672539, -0.06194090843, -0.06196451187, -0.06305217743, -0.06396389008, -0.06396245956, -0.06389951706, -0.06196451187, -0.06396150589, -0.06296372414, -0.06428790092, -0.06198167801, -0.06294608116, -0.06409454346, -0.06496357918, -0.0639629364, -0.06296539307, -0.06296277046, -0.06398153305, -0.06296467781, -0.06396436691, -0.06497693062, -0.06397914886, -0.06296396255, -0.06497597694, -0.06421637535, -0.06296396255, -0.06402182579, -0.06401872635, -0.06594610214, -0.06696248055, -0.06395411491, -0.0639629364, -0.06396508217, -0.06296205521, -0.0639629364, -0.06596231461]
Ran 100 times; mean run-time difference (heap_sort5 - bidirectional_bubble_sort3) (a positive number means heap_sort5 lost): -0.06377598286
heap_sort5 averaged 0.00407264948 and bidirectional_bubble_sort3 averaged 0.06784863234, so the former takes about 0.0600 times as long as the latter.
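Many of the raw values above sit close to multiples of 0.001, which may mean the clock used for timing was fairly coarse. The original timing code is not shown, so the following is only a sketch of how such a pairwise comparison could be run with time.perf_counter. The names compare and heapq_sort are hypothetical, and heapq_sort merely stands in for heap_sort5.

```python
import heapq
import random
import time

def heapq_sort(items):
    # Stand-in for heap_sort5 (assumption): any function that sorts a list works here.
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

def compare(sort_a, sort_b, size=1000, runs=100):
    """Return (mean_a, mean_b, diffs) for two sort functions on the same random data."""
    total_a = total_b = 0.0
    diffs = []
    for _ in range(runs):
        data = [random.random() for _ in range(size)]
        start = time.perf_counter()
        sort_a(list(data))  # each function gets its own copy
        elapsed_a = time.perf_counter() - start
        start = time.perf_counter()
        sort_b(list(data))
        elapsed_b = time.perf_counter() - start
        total_a += elapsed_a
        total_b += elapsed_b
        diffs.append(elapsed_a - elapsed_b)  # positive: sort_a was slower
    return total_a / runs, total_b / runs, diffs

mean_a, mean_b, diffs = compare(heapq_sort, sorted)
print(f"mean A {mean_a:.8f}, mean B {mean_b:.8f}, ratio {mean_a / mean_b:.4f}")
```

Timing both functions on the same data within each run keeps the per-run difference meaningful even when the input changes from run to run.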
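Other versions:
These differ mainly in the heap-building algorithm.
Version 1:
This still works by determining the value for each node, but it starts from the root and goes downward. To determine the root you have to compare it with its left and right subtrees, so you first have to determine the left and right nodes, and so on recursively. As a result, every node ends up being compared with all of its descendants.
Moreover, once the top changes, that is, when the root is taken out, it is not enough to adjust only the node at index=0, because adjustment(0) performs just one round of adjustment on the children under each node: if a swap happens after the recursion returns, the nodes below are not adjusted again. In other words, it cannot take advantage of the heap that was already in order. Sorting this way takes more than 4000 times as long as the fastest version.
Building the heap takes 8 times as long as the fastest build (though I'm puzzled about where exactly it is slower than the fastest one, since the fastest also works top-down).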
def adjustment(root):
    # collection comes from the enclosing scope
    length = len(collection)
    left_index = root * 2 + 1
    right_index = root * 2 + 2
    # Find the minimum of the left and right subtrees in turn.
    # Both subtrees exist.
    if right_index < length:
        adjustment(left_index)
        adjustment(right_index)
        # Searching from the bottom and swapping there avoids some extra swaps.
        if collection[left_index] > collection[right_index]:
            if collection[root] > collection[right_index]:
                collection[root], collection[right_index] = collection[right_index], collection[root]
            # Should this return here?
        elif collection[root] > collection[left_index]:
            collection[root], collection[left_index] = collection[left_index], collection[root]
        # collection[left_index],collection[right_index]=collection[right_index],collection[left_index]  # no need to swap left and right
        # return min(collection[root],adjustment(right_index),adjustment(left_index))
    # Only the left subtree exists.
    elif left_index < length:
        adjustment(left_index)
        if collection[left_index] < collection[root]:
            collection[left_index], collection[root] = collection[root], collection[left_index]
        # return min(collection[root],adjustment(left_index))
    return collection[root]
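The limitation described for version 1 can be seen on a small input. The sketch below repeats the version 1 logic with collection passed as a parameter (an adaptation for a standalone demo; is_min_heap is a hypothetical helper). One call moves the minimum to the root, but the result is not a heap, so the next extraction has to walk the whole tree again.

```python
def adjustment_v1(collection, root=0):
    """Version 1 logic with collection passed explicitly."""
    length = len(collection)
    left, right = 2 * root + 1, 2 * root + 2
    if right < length:
        adjustment_v1(collection, left)
        adjustment_v1(collection, right)
        if collection[left] > collection[right]:
            if collection[root] > collection[right]:
                collection[root], collection[right] = collection[right], collection[root]
        elif collection[root] > collection[left]:
            collection[root], collection[left] = collection[left], collection[root]
    elif left < length:
        adjustment_v1(collection, left)
        if collection[left] < collection[root]:
            collection[left], collection[root] = collection[root], collection[left]
    return collection[root]

def is_min_heap(k):
    # Every non-root element must be >= its parent.
    return all(k[(i - 1) // 2] <= k[i] for i in range(1, len(k)))

data = [5, 1, 9, 2, 3]
adjustment_v1(data)
print(data)               # [1, 5, 9, 2, 3]: the minimum reached the root...
print(is_min_heap(data))  # False: ...but 5 now sits above 2 and 3
```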
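Version 2:
Builds the heap by repeatedly inserting an element at the end of the heap and adjusting it by comparing it with the nodes above.
Takes 1.4 times as long as the fastest heap-building algorithm.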
def adjustment(collection, add_index):
    '''collection is the whole list; add_index is the index in collection of the element being added to the min-heap'''
    # Walk up through the parents.
    temp = collection[add_index]
    while (add_index - 1) // 2 >= 0 and temp < collection[(add_index - 1) // 2]:
        collection[add_index] = collection[(add_index - 1) // 2]
        add_index = (add_index - 1) // 2
    collection[add_index] = temp
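Version 2 shows only the per-element adjustment. A sketch of a driver loop that could build the whole heap with it follows; build_min_heap is a hypothetical name, and the loop condition add_index > 0 is equivalent to the original (add_index-1)//2 >= 0 for non-negative indices.

```python
def sift_up(collection, add_index):
    """Version 2's adjustment: move collection[add_index] up while it is smaller than its parent."""
    temp = collection[add_index]
    while add_index > 0 and temp < collection[(add_index - 1) // 2]:
        collection[add_index] = collection[(add_index - 1) // 2]
        add_index = (add_index - 1) // 2
    collection[add_index] = temp

def build_min_heap(collection):
    # Hypothetical driver: treat collection[:i] as the heap and insert element i.
    for i in range(1, len(collection)):
        sift_up(collection, i)
    return collection

print(build_min_heap([5, 3, 8, 1, 9, 2]))  # [1, 3, 2, 5, 9, 8]
```

Because sift-up only moves an element toward the root, it cannot repair the heap after the root has been replaced, which matches the remark that version 2 cannot be reused for the later re-adjustment.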
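Version 3:
I did not write this because version 2 seemed inefficient; honestly, I don't know why this one is faster than that one. But version 2 cannot be used for the later re-adjustment of the max-heap, so I borrowed an idea from Shell sort.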
def adjustment(collection, index, heap_length):
    left_index = index * 2 + 1
    right_index = index * 2 + 2
    # if right_index>=length-1:  # an earlier, buggy condition
    while right_index <= heap_length - 1:  # both children exist
        if collection[left_index] > collection[right_index]:
            if collection[index] < collection[left_index]:
                collection[index], collection[left_index] = collection[left_index], collection[index]
                index = left_index
                left_index = index * 2 + 1
                right_index = index * 2 + 2
            else:
                break
        elif collection[index] < collection[right_index]:
            collection[index], collection[right_index] = collection[right_index], collection[index]
            index = right_index
            right_index = index * 2 + 2
            left_index = index * 2 + 1
        else:
            break
    if left_index <= heap_length - 1:  # only a left child: this is the bottom
        if collection[index] < collection[left_index]:
            collection[index], collection[left_index] = collection[left_index], collection[index]
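A plausible explanation for the speed difference between the sift-down build and the insertion build of version 2 is the standard cost bound: building bottom-up with sift-down needs O(n) comparisons in the worst case, while building by repeated insertion can need O(n log n). The sketch below counts comparisons for max-heap variants of both on ascending input, the worst case for insertion. The function names are hypothetical and the code is not taken from the original.

```python
def build_bottom_up(a):
    """Max-heap by sifting down from the last parent; returns the comparison count."""
    n, count = len(a), 0
    for start in range((n - 2) // 2, -1, -1):
        i = start
        while 2 * i + 1 < n:
            child = 2 * i + 1
            if child + 1 < n:
                count += 1
                if a[child + 1] > a[child]:
                    child += 1  # pick the larger child
            count += 1
            if a[i] >= a[child]:
                break
            a[i], a[child] = a[child], a[i]
            i = child
    return count

def build_by_insertion(a):
    """Max-heap by sifting each new element up; returns the comparison count."""
    count = 0
    for end in range(1, len(a)):
        i = end
        while i > 0:
            parent = (i - 1) // 2
            count += 1
            if a[i] <= a[parent]:
                break
            a[i], a[parent] = a[parent], a[i]
            i = parent
    return count

n = 1 << 12
ascending = list(range(n))  # every inserted element climbs all the way to the root
print(build_bottom_up(list(ascending)), build_by_insertion(list(ascending)))
```

On random input the gap is much smaller, since an inserted element usually climbs only a level or two, which would be consistent with the modest factor of 1.4 reported for version 2.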
Source: oschina
Link: https://my.oschina.net/u/4383341/blog/3462395