TheRiver | blog

You have reached the world's edge, none but devils play past here

0%

c++ stl 排序总结

条款31 中提到了这几种排序的性能优先级(时间和空间消耗),前面的更优
1 partition
2 stable_partition
3 nth_element
4 partial_sort
5 sort
6 stable_sort
? priority_queue(快排比堆排序好点,具体看上一篇文章)

站在手写堆排序,快速选择的角度,来看看这些函数的源码是怎么实现的。

nth_element

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
template <class _RandomAccessIter>
inline void nth_element(_RandomAccessIter __first, _RandomAccessIter __nth,
_RandomAccessIter __last) {
__nth_element(__first, __nth, __last, __VALUE_TYPE(__first));
}

template <class _RandomAccessIter, class _Tp>
void __nth_element(_RandomAccessIter __first, _RandomAccessIter __nth,
_RandomAccessIter __last, _Tp*) {
while (__last - __first > 3) {
_RandomAccessIter __cut =
__unguarded_partition(__first, __last,
_Tp(__median(*__first,
*(__first + (__last - __first)/2),
*(__last - 1))));
if (__cut <= __nth)
__first = __cut; // 缩小范围,更靠近结果
else
__last = __cut; // 缩小范围,更靠近结果
}
__insertion_sort(__first, __last);
}

为了看起来方便,用的gcc 2.95.1的源码。这里有nth_element函数的重载,支持自定义比较函数,核心逻辑是一致的。就拿简单的没有自定义函数的来看。__last - __first > 3的时候用快速选择,__nth_element里面while循环在重复调用__unguarded_partition,没有使用递归。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// __median (an extension, not present in the C++ standard).

template <class _Tp>
inline const _Tp& __median(const _Tp& __a, const _Tp& __b, const _Tp& __c) {
if (__a < __b)
if (__b < __c)
return __b;
else if (__a < __c)
return __c;
else
return __a;
else if (__a < __c)
return __a;
else if (__b < __c)
return __c;
else
return __b;
}

template <class _RandomAccessIter, class _Tp>
_RandomAccessIter __unguarded_partition(_RandomAccessIter __first,
_RandomAccessIter __last,
_Tp __pivot)
{
while (true) {
while (*__first < __pivot)
++__first;
--__last; // 第一次进来是end() - 1, 后面的情况和++__first是对应的
while (__pivot < *__last)
--__last;
if (!(__first < __last))
return __first;
iter_swap(__first, __last); // 大的放右边 小的放左边
++__first;
}
}

__median这里取的是中位数,很多地方快速排序会拿第一个或最后一个作为__pivot,无关紧要。__unguarded_partition的__pivot是值传递,因为这里会swap。然后最终都要走到插入排序__insertion_sort来收尾。因为看的是不带自定义比较函数的版本,所以这块的逻辑都是按默认的升序来的。看了下gcc 9.2.0逻辑也是差不多的,多了一种__heap_select的情况,不深究了。

这里的命名可以看出来需要传入random的iterator,这是要注意的。

新版本的带比较函数的版本有这样的注释:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
/**
* @brief Sort a sequence just enough to find a particular position
* using a predicate for comparison.
* @ingroup sorting_algorithms
* @param __first An iterator.
* @param __nth Another iterator.
* @param __last Another iterator.
* @param __comp A comparison functor.
* @return Nothing.
*
* Rearranges the elements in the range @p [__first,__last) so that @p *__nth
* is the same element that would have been in that position had the
* whole sequence been sorted. The elements either side of @p *__nth are
* not completely sorted, but for any iterator @e i in the range
* @p [__first,__nth) and any iterator @e j in the range @p [__nth,__last) it
* holds that @p __comp(*j,*i) is false.
*/

comp ([fisrt, nth), [nth, last)) = false;

partition

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
template <class _ForwardIter, class _Predicate>
_ForwardIter __partition(_ForwardIter __first,
_ForwardIter __last,
_Predicate __pred,
forward_iterator_tag) {
if (__first == __last) return __first;

while (__pred(*__first))
if (++__first == __last) return __first;

_ForwardIter __next = __first;

while (++__next != __last)
if (__pred(*__next)) {
swap(*__first, *__next);
++__first;
}

return __first;
}

template <class _BidirectionalIter, class _Predicate>
_BidirectionalIter __partition(_BidirectionalIter __first,
_BidirectionalIter __last,
_Predicate __pred,
bidirectional_iterator_tag) {
while (true) {
while (true)
if (__first == __last)
return __first;
else if (__pred(*__first))
++__first;
else
break;
--__last;
while (true)
if (__first == __last)
return __first;
else if (!__pred(*__last))
--__last;
else
break;
iter_swap(__first, __last);
++__first;
}
}

template <class _ForwardIter, class _Predicate>
inline _ForwardIter partition(_ForwardIter __first,
_ForwardIter __last,
_Predicate __pred) {
return __partition(__first, __last, __pred, __ITERATOR_CATEGORY(__first));
}

这里对forward_iterator_tag和bidirectional_iterator_tag做了特化,因为is a的关系,random_access_iterator_tag也可以使用。这里特殊处理应该是有其他考虑。这里只看看单向的实现。

看了下和前面的nth_elements几乎一样的逻辑,单向的有点不同,但也好理解。nth_element强调的是在排序中的位置,partition更注重按功能分区,一次就完成了不需要多次执行缩小范围到具体的位置,并且支持的iterator的种类更多。

partial_sort

同样是2个函数,一个多个自定义函数的参数,只看下默认的实现逻辑。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// partial_sort, partial_sort_copy, and auxiliary functions.

template <class _RandomAccessIter, class _Tp>
void __partial_sort(_RandomAccessIter __first, _RandomAccessIter __middle,
_RandomAccessIter __last, _Tp*) {
make_heap(__first, __middle);
for (_RandomAccessIter __i = __middle; __i < __last; ++__i)
if (*__i < *__first)
__pop_heap(__first, __middle, __i, _Tp(*__i),
__DISTANCE_TYPE(__first));
sort_heap(__first, __middle);
}

template <class _RandomAccessIter>
inline void partial_sort(_RandomAccessIter __first,
_RandomAccessIter __middle,
_RandomAccessIter __last) {
__partial_sort(__first, __middle, __last, __VALUE_TYPE(__first));
}

// partial_sort, partial_sort_copy, and auxiliary functions.

template <class _RandomAccessIter, class _Tp>
void __partial_sort(_RandomAccessIter __first, _RandomAccessIter __middle,
_RandomAccessIter __last, _Tp*) {
make_heap(__first, __middle);
for (_RandomAccessIter __i = __middle; __i < __last; ++__i)
if (*__i < *__first)
__pop_heap(__first, __middle, __i, _Tp(*__i),
__DISTANCE_TYPE(__first));
sort_heap(__first, __middle);
}

sort

priority_queue

1
2
3
4
5
6
#ifndef __STL_LIMITED_DEFAULT_TEMPLATES
template <class _Tp, class _Sequence = vector<_Tp>,
class _Compare = less<typename _Sequence::value_type> >
#else
template <class _Tp, class _Sequence, class _Compare>
#endif

前面都是函数模板,而priority_queue是一个单独的类。
单独整理了一篇: stl priority_queue源码分析

----------- ending -----------