二. Analysis

1
对应本书第二章。

2.Analysis

笔记

这章主要讲了算法复杂度的分析。

一. 大O

名称源自于Order of magnitude function $O(f(n))$ 描述了时间复杂度**$T(n)$当中增长速度最快**的部分。

例如:$T(n)=5n^2+27n+1005$的阶是$O(n^2)$。

因为不同的机器计算速度的区别,和不同的编程语言的效率的不同,所以分析算法的时间复杂度使用这样的粗粒度就足够了。

不同函数的增长速度:

https://raw.githubusercontent.com/applenob/algorithm_note/master/res/plot.png

python自带的数据结构:

二. List

Operation Big-O Efficiency
index [] O(1)
index assignment O(1)
append O(1)
pop() O(1)
pop(i) O(n)
insert(i,item) O(n)
del operator
O(n)
iteration O(n)
contains (in) O(n)
get slice [x:y] O(k)
del
slice O(n)
set slice O(n+k)
reverse O(n)
concatenate O(k)
sort O(n
log n)
multiply O(nk)

其中,需要注意的是concatenate还有pop和pop(i)的区别。

python中list的实现

https://raw.github.com/acmerfight/insight_python/master/images/list_insert.png

三. Dictionary

operation Big-O Efficiency
copy O(n)
get item O(1)
set item
O(1)
delete item O(1)
contains (in) O(1)
iteration O(n)

dict的in操作是O(1),list是O(n)

python的dict实现

作业

作业链接

q1

是证明list的index操作是O(1)。暂且认为list的pop(0)是O(n)是已知条件。

1
2
3
4
5
6
7
8
9
10
11
from timeit import Timer
# q1
def test5():
l = list(range(1000))
for i in range(1000):
l[i] = 1

def test6():
l = list(range(1000))
for i in range(1000):
l.pop(0)
1
2
3
4
t5 = Timer("test5()", "from __main__ import test5")
print("list index operation:",t5.timeit(number=10), "milliseconds")
t6 = Timer("test6()", "from __main__ import test6")
print("list pop(0):",t6.timeit(number=10), "milliseconds")
1
2
list index operation: 0.0006067330032237805 milliseconds
list pop(0): 0.0025059110048459843 milliseconds

q2

是证明dict的get和set是O(1),先认为dict的遍历是O(n)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def test7():
d = dict(zip(list(range(1000)), list(range(1000,2000))))
for i in range(1000):
d[i] = 1

def test8():
d = dict(zip(list(range(1000)), list(range(1000,2000))))
for i in range(1000):
k = d[i]

def test9():
d = dict(zip(list(range(1000)), list(range(1000,2000))))
for i in range(1000):
for one in d:
pass
1
2
3
4
5
6
t7 = Timer("test7()", "from __main__ import test7")
print("dict set item:",t7.timeit(number=1000), "milliseconds")
t8 = Timer("test8()", "from __main__ import test8")
print("dict get item:",t8.timeit(number=1000), "milliseconds")
t9 = Timer("test9()", "from __main__ import test9")
print("dict iteration:",t9.timeit(number=1000), "milliseconds")
1
2
3
dict set item: 0.13327282800310059 milliseconds
dict get item: 0.1195711559994379 milliseconds
dict iteration: 11.517200201000378 milliseconds

q3

对比list和dict的del操作

1
2
3
4
5
6
7
8
9
def test10():
l = list(range(1000))
for i in range(1000):
del l[0]

def test11():
d = dict(zip(list(range(1000)), list(range(1000,2000))))
for i in range(1000):
del d[i]
1
2
3
4
5
6
7
8
t10 = Timer("test10()", "from __main__ import test10")
print("list del item:",t10.timeit(number=1000), "milliseconds")
t11 = Timer("test11()", "from __main__ import test11")
print("dict del item:",t11.timeit(number=1000), "milliseconds")
t10 = Timer("test10()", "from __main__ import test10")
print("list del item:",t10.timeit(number=1000), "milliseconds")
t11 = Timer("test11()", "from __main__ import test11")
print("dict del item:",t11.timeit(number=1000), "milliseconds")
1
2
3
4
list del item: 0.14648991199646844 milliseconds
dict del item: 0.13190877300075954 milliseconds
list del item: 0.14080390099843498 milliseconds
dict del item: 0.13055396499839844 milliseconds

q4

给定一个list,返回第k小的数。先给出时间复杂度是O(nlogn)的简单版:

1
2
3
4
5
6
7
8
9
def kth_smallest_1(nums, k):
"""
find kth smallest of a list, O(nlogn)
:type nums: list
:type k: int
:rtype: num
"""
nums.sort()
return nums[k-1]
1
kth_smallest_1([1, 5,9,10,2,3], 4)
1
5

q5

q4时间复杂度是O(n)的解法。利用快排的思路。平均时间复杂度是O(n),最坏情况时间复杂度是$O(n^2)$

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
def kth_smallest_2(nums, k):
"""
find kth smallest of a list, O(n)
:type nums: list
:type k: int
:rtype: num
"""
def partition(nums, start, end):
"""
Pick last element as pivot
Place all smaller elements before pivot
Place all bigger elements after pivot
"""
pivot = nums[end]
# currentSmaller 记录当前有几个数比pivot小
currentSmaller = start
for i in range(start, end):
# If current element <= pivot, put it to right position
if nums[i] <= pivot:
nums[i], nums[currentSmaller] = nums[currentSmaller], nums[i]
currentSmaller += 1
# Put pivot to right position
nums[end], nums[currentSmaller] = nums[currentSmaller], nums[end]
return currentSmaller

def quickSelect(nums, start, end, k):
pos = partition(nums, start, end)
# print(nums)
if pos == k - 1:
return nums[pos]
if pos < k - 1:
return quickSelect(nums, pos + 1, end, k)
else:
return quickSelect(nums, start, pos - 1, k)
return quickSelect(nums, 0, len(nums)-1, k)
1
kth_smallest_2([1,5,9,10,2,3], 4)
1
5
1
2
3
4
t12 = Timer("kth_smallest_1([1, 5,9,10,2,3], 4)", "from __main__ import kth_smallest_1")
print("kth_smallest_1:",t12.timeit(number=10), "milliseconds")
t13 = Timer("kth_smallest_2([1,5,9,10,2,3], 4)", "from __main__ import kth_smallest_2")
print("kth_smallest_2:",t13.timeit(number=10), "milliseconds")
1
2
kth_smallest_1: 7.706003088969737e-06 milliseconds
kth_smallest_2: 4.322700260672718e-05 milliseconds