Searching and Sorting查找与排序
Searching and sorting are the workhorses of computer science: almost every program either looks something up or arranges data in order. This guide builds the two core search algorithms (linear search and binary search), explains why one is dramatically faster on sorted data, introduces algorithm efficiency concepts, then covers the three classic sorts (bubble, selection, insertion) with step-by-step traces and Python code. The final section compares all algorithms and shows how to use Python's built-in sort in practice. Code blocks are in Python/pseudocode throughout; Chinese annotations follow each block.查找(search,查找)与排序(sort,排序)是计算机科学的主力算法:几乎每个程序都需要查找数据或将数据排成顺序。本指南系统讲解两大查找算法(线性查找与二分查找),说明为何二分查找在有序数据上效率显著更高,引入算法效率(algorithm efficiency,算法效率)概念,再逐步演示三种经典排序(冒泡、选择、插入),并附 Python 实现。最后一节比较所有算法并展示如何在实践中使用 Python 内置排序。全部代码使用 Python/伪代码,每段后附中文注解。
How to use this guide如何使用本指南
Searching and sorting are grade-12/Advanced-level topics in all four curricula. Ontario places them exclusively in ICS4U (Grade 12). Alberta names them in CSE3110 (Advanced, 3xxx). CSTA puts efficiency evaluation in Level 3B (the specialist elective stream). BC implies them under Computer Programming 12 "Advanced programming structures." All four use the same core algorithms, so this guide is written for any student who reaches this topic regardless of course code. The Honors flag marks content that maps to formal complexity analysis (ICS4U C2, CSE3110 outcome 1.5.4/1.6.5).查找与排序在四套大纲中均为 12 年级或高级水平内容。安大略将其完全置于 ICS4U(12 年级);阿尔伯塔将其列入 CSE3110(高级模块,3xxx 系列);CSTA 将效率评估置于 3B 级(专业选修流);BC 将其隐含于 Computer Programming 12"高级编程结构"之下。四套大纲的核心算法相同,因此本指南适用于任何到达此主题的学生,无论课程代码如何。荣誉标记(Honors)标注正式复杂度分析内容(对应 ICS4U C2 和 CSE3110 结果 1.5.4/1.6.5)。
| If you are in…如果你在… | Focus on these sections重点学习 | Defer / lighter可推迟 / 减负 | Source依据 |
|---|---|---|---|
| 🇺🇸 US CSTA 3B / AP CSP美国 CSTA 3B / AP CSP | §1–§3 core (search + efficiency). AP CSP topic 3.11 tests binary search directly; topic 3.17 tests efficiency. CSTA 3B-AP-10 (classic algorithms) and 3B-AP-11 (evaluate efficiency) are the specialist standards.§1–§3 核心(查找 + 效率)。AP CSP 主题 3.11 直接考查二分查找;主题 3.17 考查效率。CSTA 3B-AP-10(经典算法)和 3B-AP-11(评估效率)是专业标准。 | Sort implementation details (§4–§6) are enrichment for AP CSP; AP CSA (Java) covers them formally.排序实现细节(§4–§6)对 AP CSP 为拓展;AP CSA(Java)正式涵盖它们。 | CSTA K-12 and AP CSP — CSTA 3B-AP-10, 3B-AP-11; AP CSP 3.11, 3.17; AAP-4.A— CSTA 3B-AP-10、3B-AP-11;AP CSP 3.11、3.17;AAP-4.A |
| 🇨🇦 ON Grade 12 — ICS4U安大略 12 年级 — ICS4U | §1 through §7 in full. ICS4U A3.2 (linear + binary search), A3.4 (bubble, insertion, selection sorts), C2.2 (compare search efficiency), C2.3 (compare sort efficiency) all map directly.§1 至 §7 完整学习。ICS4U A3.2(线性+二分查找)、A3.4(冒泡、插入、选择排序)、C2.2(比较查找效率)、C2.3(比较排序效率)均直接对应。 | Recursion/merge-sort (A3.6) and formal Big-O derivations (C2.3 full) are ICS4U-only going-deeper material — flagged Honors.递归/归并排序(A3.6)和正式大O推导(C2.3 完整)属 ICS4U-only 深入内容,标注荣誉标记。 | ON/BC Computer Studies 11-12 — ICS4U A3.2, A3.4, A3.6, C2.2, C2.3— ICS4U A3.2、A3.4、A3.6、C2.2、C2.3 |
| 🇨🇦 BC — CP12BC — CP12 | §1 through §7. Computer Programming 12 "Advanced programming structures" and "Management of complexity" are the closest BC anchors; no explicit sorting/searching standard exists.§1 至 §7。BC Computer Programming 12"高级编程结构"和"复杂度管理"是最接近的 BC 依据;没有明确的排序/查找标准。 | Formal Big-O analysis is enrichment for BC CP12 students.正式大O分析对 BC CP12 学生为拓展内容。 | ON/BC Computer Studies 11-12 — BC CP12 "Advanced programming structures"; "Management of complexity"— BC CP12"高级编程结构";"复杂度管理" |
| 🇨🇦 AB — CSE3110阿尔伯塔 — CSE3110 | §1 through §7 in full. CSE3110 outcomes 1.5 (linear + binary search) and 1.6 (bubble, selection, insertion sort) with efficiency comparison (1.5.4, 1.6.5) are a near-perfect match for this guide.§1 至 §7 完整学习。CSE3110 结果 1.5(线性+二分查找)和 1.6(冒泡、选择、插入排序)以及效率比较(1.5.4、1.6.5)与本指南高度契合。 | Recursive sorts (quicksort, merge sort) are CSE3310 — outside scope of this guide.递归排序(快速排序、归并排序)属 CSE3310,超出本指南范围。 | Alberta CTS Computing Science — CSE3110 outcomes 1.5.1–1.5.4, 1.6.1–1.6.5— CSE3110 结果 1.5.1–1.5.4、1.6.1–1.6.5 |
Once you have located your row, use the two cards below for the pace at which you should work through the sections.找到所在行后,用下面两张卡片决定推进速度。
Memorise five things: linear search scans one-by-one (worst case n steps); binary search halves each time (worst case log₂n steps, sorted list required); bubble sort compares adjacent pairs; selection sort finds the minimum each pass; insertion sort builds a sorted left portion one element at a time. Read every cram-cheat box. Skip the Honors Big-O going-deeper box.背熟五件事:线性查找逐个扫描(最坏 n 步);二分查找每次减半(最坏 log₂n 步,需已排序);冒泡排序比较相邻元素;选择排序每轮找最小值;插入排序逐一将元素插入有序左段。读每个速记框,跳过荣誉大O深入框。
Trace every algorithm by hand with a concrete 6–8 element list before moving on. Know the pre-condition, worst-case step count, and best-use scenario for every algorithm. AP CSP Big Idea 3 topic 3.17 (efficiency) and Ontario ICS4U C2.2/C2.3 require you to compare algorithms by step count, not just name them. The Honors going-deeper boxes introduce Big-O notation at the formal level assessed in ICS4U and CSE3110.在继续学习之前,用具体的 6–8 个元素列表手工追踪每个算法。掌握每个算法的前提条件、最坏情况步骤数和最佳使用场景。AP CSP 大概念 3 主题 3.17(效率)和安大略 ICS4U C2.2/C2.3 要求你按步骤数比较算法,而不仅仅是叫出名字。荣誉深入框在 ICS4U 和 CSE3110 正式评估的级别引入大O表示法。
Linear Search线性查找
- Works on适用于 — any list, sorted or unsorted. No pre-condition required.— 任意列表,有序或无序均可,无前提条件。
- Worst case最坏情况 — n steps (item is at the end or not in the list at all).— n 步(目标在末尾或根本不存在)。
- Best case最好情况 — 1 step (target is at index 0).— 1 步(目标在索引 0 处)。
- Return value返回值 — the index where the target was found, or −1 (not found).— 找到目标的索引,或 −1(未找到)。
List: [14, 22, 8, 31, 19, 27, 5]. Search for target = 27. Trace the algorithm step by step.列表:[14, 22, 8, 31, 19, 27, 5]。查找目标 = 27。逐步追踪算法。
Pseudocode.伪代码。
INPUT data, target
FOR i FROM 0 TO length(data) - 1:
IF data[i] == target THEN
OUTPUT i
STOP
OUTPUT -1 # not found
Code keywords above: FOR loop, IF condition, OUTPUT result, early STOP.以上代码关键字:FOR 循环,IF 条件,OUTPUT 输出,提前 STOP 停止。
Python implementation.Python 实现。
def linear_search(data, target):
for i in range(len(data)):
if data[i] == target:
return i
return -1
temps = [14, 22, 8, 31, 19, 27, 5]
print(linear_search(temps, 27)) # Output: 5
Trace: index 0 (14≠27), index 1 (22≠27), index 2 (8≠27), index 3 (31≠27), index 4 (19≠27), index 5 (27==27) FOUND. Total: 6 comparisons. Worst case for 7 items would be 7.追踪:索引 0(14≠27),索引 1(22≠27),索引 2(8≠27),索引 3(31≠27),索引 4(19≠27),索引 5(27==27)找到。共 6 次比较。7 个元素的最坏情况为 7 次。
Going deeper — when linear search is the right choice深入 — 何时线性查找是正确选择
Linear search is optimal for unsorted lists because no faster algorithm exists without pre-sorting. It is also correct for small lists (under ~20 items) where the overhead of sorting to enable binary search is not worth it. CSTA 3B-AP-11 says "Evaluate algorithms in terms of their efficiency" — the answer here is: linear search is O(n), simple to implement, and the right tool when you cannot guarantee the list is sorted or when you search only once. If you search the same large sorted list many times, sort once and switch to binary search.对于未排序列表,线性查找是最优选择,因为不预先排序就不存在更快的算法。对于小型列表(约 20 个元素以下),为启用二分查找而排序的开销也不值得。CSTA 3B-AP-11 要求"从效率方面评估算法"——此处的答案是:线性查找是 O(n),实现简单,是列表无法保证有序或只查找一次时的正确工具。若需对同一大型有序列表多次查找,则一次性排序后改用二分查找。
Binary Search二分查找
- Pre-condition前提条件 — list must be sorted in ascending order.— 列表必须按升序排列。
- Worst case最坏情况 — log₂(n) steps. For 1,024 items: 10 steps. For 1,000,000 items: 20 steps.— log₂(n) 步。1024 个元素:10 步。100 万个元素:20 步。
- Strategy策略 — compare target to the middle element. If target is smaller, search left half; if larger, search right half; if equal, found.— 将目标与中间元素比较。若目标较小,搜索左半部分;若较大,搜索右半部分;若相等,找到。
Sorted list: [3, 8, 12, 17, 25, 34, 41, 55, 68, 77]. Target = 34. Trace step by step (indices 0–9).有序列表:[3, 8, 12, 17, 25, 34, 41, 55, 68, 77]。目标 = 34。逐步追踪(索引 0–9)。
Pseudocode.伪代码。
INPUT data, target
SET low TO 0
SET high TO length(data) - 1
WHILE low <= high:
SET mid TO (low + high) // 2
IF data[mid] == target THEN
OUTPUT mid
STOP
ELSE IF data[mid] < target THEN
SET low TO mid + 1
ELSE:
SET high TO mid - 1
OUTPUT -1 # not found
Code keywords: // is integer division; low/high track the current search window.代码关键字:// 为整除;low/high 跟踪当前搜索窗口。
Step-by-step trace.逐步追踪。
| Step步骤 | low | high | mid | data[mid]data[mid] | Action操作 |
|---|---|---|---|---|---|
| 1 | 0 | 9 | 4 | 25 | 25 < 34 → low = 525 < 34 → low = 5 |
| 2 | 5 | 9 | 7 | 55 | 55 > 34 → high = 655 > 34 → high = 6 |
| 3 | 5 | 6 | 5 | 34 | 34 == 34 → FOUND at index 534 == 34 → 在索引 5 找到 |
Python implementation.Python 实现。
def binary_search(data, target):
low, high = 0, len(data) - 1
while low <= high:
mid = (low + high) // 2
if data[mid] == target:
return mid
elif data[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1
nums = [3, 8, 12, 17, 25, 34, 41, 55, 68, 77]
print(binary_search(nums, 34)) # Output: 5
3 comparisons to find 34 in a 10-element list. Linear search would need 6. The advantage grows dramatically for larger lists: a million items takes at most 20 binary-search comparisons vs up to 1,000,000 linear comparisons.在 10 个元素的列表中找到 34 只需 3 次比较;线性查找需要 6 次。对于更大的列表,优势急剧增加:100 万个元素最多需要 20 次二分比较,而线性查找最多需要 100 万次。
Going deeper — AP CSP exam pseudocode for binary search深入 — AP CSP 考试二分查找伪代码
AP CSP topic 3.11 tests binary search using the College Board's own pseudocode notation. Key differences from the pseudocode in this guide: assignment uses ← (arrow), lists are 1-indexed (first element is index 1, not 0), and display uses DISPLAY(). The logic is identical; only the surface notation changes. Practice translating the algorithm above into College Board pseudocode as an exam prep exercise. ICS4U C2.2 and CSE3110 outcome 1.5.4 go further: both require formal comparison of computational efficiency (step-count analysis), which maps to the Big-O section (§3) of this guide.AP CSP 主题 3.11 使用 College Board 自己的伪代码表示法考查二分查找。与本指南伪代码的主要区别:赋值用 ←(箭头),列表从 1 开始索引(第一个元素索引为 1,不是 0),显示用 DISPLAY()。逻辑相同,仅表面符号不同。将上述算法翻译为 College Board 伪代码作为考试准备练习。ICS4U C2.2 和 CSE3110 结果 1.5.4 更进一步:两者均要求对计算效率进行正式比较(步骤数分析),对应本指南第 3 节(大O)。
Introduction to Algorithm Efficiency (Big-O Concept)算法效率初步(大O概念)
- Linear search step count线性查找步骤数 — proportional to n (the list size). Double the list → double the steps. We say it is O(n).— 与 n(列表大小)成正比。列表翻倍 → 步骤翻倍。称为 O(n)。
- Binary search step count二分查找步骤数 — proportional to log₂(n). Double the list → only one extra step. We say it is O(log n).— 与 log₂(n) 成正比。列表翻倍 → 只多一步。称为 O(log n)。
- Concrete comparison具体比较 — for 1,000,000 items: linear search up to 1,000,000 steps; binary search at most 20 steps. O(log n) is dramatically better for large n.— 100 万个元素:线性查找最多 100 万步;二分查找最多 20 步。对于大 n,O(log n) 显著更优。
For list sizes n = 8, 16, 64, 1024, 1,000,000, calculate the worst-case step count for linear search and binary search side by side.对于列表大小 n = 8、16、64、1024、1,000,000,分别计算线性查找和二分查找的最坏情况步骤数。
| n | Linear (worst case)线性(最坏) | Binary (worst case)二分(最坏) | Ratio比值 |
|---|---|---|---|
| 8 | 8 | 3 | 2.7× |
| 16 | 16 | 4 | 4× |
| 64 | 64 | 6 | 10.7× |
| 1,024 | 1,024 | 10 | 102× |
| 1,000,000 | 1,000,000 | 20 | 50,000× |
As n grows, the ratio between linear and binary search becomes enormous. This is the power of O(log n) vs O(n). The cost: binary search requires the list to be sorted first. Sorting itself takes time (O(n²) for simple sorts, O(n log n) for advanced sorts — see §7).随着 n 增大,线性与二分查找之间的比值变得极大。这就是 O(log n) 对比 O(n) 的威力。代价是:二分查找要求列表必须先排序。排序本身需要时间(简单排序 O(n²),高级排序 O(n log n)——见 §7)。
Going deeper — Big-O notation formal Honors — ICS4U C2 / CSE3110深入 — 大O表示法正式版 荣誉 — ICS4U C2 / CSE3110
Computer scientists use Big-O notation to describe how an algorithm's step count grows with input size n, ignoring constant factors and lower-order terms. Common classes: O(1) constant (same steps regardless of n — e.g., read index 0); O(log n) logarithmic (binary search); O(n) linear (linear search); O(n log n) linearithmic (efficient sorts — merge sort, quicksort); O(n²) quadratic (simple comparison sorts — bubble, selection, insertion). Ontario ICS4U C2.2/C2.3 and Alberta CSE3110 outcomes 1.5.4/1.6.5 require formal comparison of search and sort efficiencies using this framework. For ICS4U/CSE3110 students: classify each algorithm in this guide by its Big-O, then verify the classification with the step-count table above.计算机科学家使用大O表示法描述算法步骤数随输入大小 n 增长的方式,忽略常数因子和低阶项。常见类别:O(1) 常数(步骤数与 n 无关——如读取索引 0);O(log n) 对数(二分查找);O(n) 线性(线性查找);O(n log n) 线性对数(高效排序——归并排序、快速排序);O(n²) 二次(简单比较排序——冒泡、选择、插入)。安大略 ICS4U C2.2/C2.3 和阿尔伯塔 CSE3110 结果 1.5.4/1.6.5 要求使用此框架对查找和排序效率进行正式比较。ICS4U/CSE3110 学生:为本指南中每个算法分类其大O,然后用上表的步骤数验证分类。
Bubble Sort冒泡排序
- Passes遍数 — after each pass, the largest unsorted element “bubbles up” to its correct position at the right end.— 每遍后,最大的未排序元素"冒泡"到右端的正确位置。
- Worst case最坏情况 — O(n²) comparisons — for n=10: up to 45 comparisons; for n=100: up to 4,950.— O(n²) 次比较——n=10 时最多 45 次;n=100 时最多 4950 次。
- Swap in PythonPython 中的交换 —
a, b = b, aswaps two variables. No temporary variable needed.—a, b = b, a交换两个变量,无需临时变量。
Sort [5, 3, 8, 1, 4] into ascending order using bubble sort. Trace each pass.用冒泡排序将 [5, 3, 8, 1, 4] 排成升序,追踪每遍。
Pseudocode.伪代码。
FOR pass FROM 0 TO length(data) - 2:
FOR i FROM 0 TO length(data) - 2 - pass:
IF data[i] > data[i+1] THEN
SWAP data[i] AND data[i+1]
Inner loop compares adjacent pairs; outer loop repeats for each pass. After each pass, the boundary shrinks by 1 (the rightmost element is now sorted).内层循环比较相邻元素;外层循环对每遍重复。每遍后边界缩小 1(最右侧元素已排好)。
Pass-by-pass trace.逐遍追踪。
| Pass遍 | List after pass该遍后列表 | Swaps交换次数 |
|---|---|---|
| Start初始 | [5, 3, 8, 1, 4] | — |
| 1 | [3, 5, 1, 4, 8] | 3 |
| 2 | [3, 1, 4, 5, 8] | 2 |
| 3 | [1, 3, 4, 5, 8] | 1 |
| 4 | [1, 3, 4, 5, 8] | 0 → sorted |
Python implementation.Python 实现。
def bubble_sort(data):
n = len(data)
for pass_num in range(n - 1):
for i in range(n - 1 - pass_num):
if data[i] > data[i + 1]:
data[i], data[i + 1] = data[i + 1], data[i] # swap
return data
print(bubble_sort([5, 3, 8, 1, 4])) # Output: [1, 3, 4, 5, 8]
The data[i], data[i+1] = data[i+1], data[i] is the Python idiom for swapping two elements.data[i], data[i+1] = data[i+1], data[i] 是 Python 交换两个元素的惯用写法。
Going deeper — optimised bubble sort with early-exit flag深入 — 带提前退出标志的优化冒泡排序
Standard bubble sort always runs all n-1 passes even if the list becomes sorted early. An optimised version adds a swapped flag: if a whole pass makes zero swaps, the list is sorted and the outer loop can exit early. Best case for optimised bubble sort is O(n) — one pass on an already-sorted list. This is an example of the CSTA 3B-AP-11 principle: "Evaluate algorithms in terms of their efficiency" — and then improve them. CSE3110 outcome 1.6.5 requires comparing sorts by computational efficiency, which includes this kind of optimisation analysis.标准冒泡排序总是运行全部 n-1 遍,即使列表提前已排好。优化版添加 swapped(已交换)标志:若某遍未发生任何交换,列表已排好,外层循环可提前退出。优化冒泡排序最好情况为 O(n)——对已排序列表只需一遍。这是 CSTA 3B-AP-11 原则"从效率方面评估算法"的体现,然后加以改进。CSE3110 结果 1.6.5 要求按计算效率比较排序,包括此类优化分析。
def bubble_sort_optimised(data):
n = len(data)
for pass_num in range(n - 1):
swapped = False
for i in range(n - 1 - pass_num):
if data[i] > data[i + 1]:
data[i], data[i + 1] = data[i + 1], data[i]
swapped = True
if not swapped:
break # list is already sorted
return data
Selection Sort选择排序
- Swaps交换次数 — exactly n-1 swaps in the worst case (one per pass) — far fewer swaps than bubble sort.— 最坏情况下恰好 n-1 次交换(每遍一次)——远少于冒泡排序。
- Comparisons比较次数 — still O(n²) comparisons in all cases (always scans the full remaining portion to find the minimum).— 所有情况下仍为 O(n²) 次比较(总是扫描完整的剩余部分以找最小值)。
- Position定位 — after pass k, the first k elements are in their final sorted positions.— 第 k 遍后,前 k 个元素处于最终排序位置。
Sort [5, 3, 8, 1, 4] using selection sort. Trace each pass.用选择排序将 [5, 3, 8, 1, 4] 排序,追踪每遍。
Pseudocode.伪代码。
FOR i FROM 0 TO length(data) - 2:
SET min_idx TO i
FOR j FROM i+1 TO length(data) - 1:
IF data[j] < data[min_idx] THEN
SET min_idx TO j
SWAP data[i] AND data[min_idx]
Inner loop finds the minimum in the unsorted portion (indices i to end). Outer loop places it at position i.内层循环在未排序部分(索引 i 到末尾)中找最小值;外层循环将其放到位置 i。
Pass-by-pass trace.逐遍追踪。
| Pass遍 | Min found找到最小值 | List after pass该遍后列表 |
|---|---|---|
| Start初始 | — | [5, 3, 8, 1, 4] |
| 1 (i=0) | 1 (idx 3) | [1, 3, 8, 5, 4] |
| 2 (i=1) | 3 (idx 1) | [1, 3, 8, 5, 4] |
| 3 (i=2) | 4 (idx 4) | [1, 3, 4, 5, 8] |
| 4 (i=3) | 5 (idx 3) | [1, 3, 4, 5, 8] |
Python implementation.Python 实现。
def selection_sort(data):
n = len(data)
for i in range(n - 1):
min_idx = i
for j in range(i + 1, n):
if data[j] < data[min_idx]:
min_idx = j
data[i], data[min_idx] = data[min_idx], data[i] # swap
return data
print(selection_sort([5, 3, 8, 1, 4])) # Output: [1, 3, 4, 5, 8]
Selection sort made exactly 4 swaps for 5 elements (n-1 = 4). Bubble sort on the same list made 3+2+1+0 = 6 swaps. Fewer swaps is better when writes to memory are expensive.选择排序对 5 个元素恰好进行了 4 次交换(n-1 = 4)。相同列表的冒泡排序进行了 3+2+1+0 = 6 次交换。当内存写入代价高时,更少交换更好。
Insertion Sort插入排序
- Analogy类比 — like sorting a hand of playing cards: pick up one card at a time and slide it into the right position among already-sorted cards.— 如同整理一手扑克牌:一次拿起一张,插入已排好牌的正确位置。
- Best case最好情况 — O(n) when the list is already sorted (no shifts needed — just one comparison per element).— O(n),当列表已排好时(无需移位——每个元素只需一次比较)。
- Worst case最坏情况 — O(n²) when the list is in reverse order (every element must shift all the way left).— O(n²),当列表完全逆序时(每个元素都需一路向左移位)。
Sort [5, 3, 8, 1, 4] using insertion sort. Trace each insertion step.用插入排序将 [5, 3, 8, 1, 4] 排序,追踪每次插入。
Pseudocode.伪代码。
FOR i FROM 1 TO length(data) - 1:
SET key TO data[i]
SET j TO i - 1
WHILE j >= 0 AND data[j] > key:
SET data[j+1] TO data[j] # shift right
SET j TO j - 1
SET data[j+1] TO key # insert key
key holds the element being inserted; the WHILE loop shifts larger elements one position right to make room.key 保存待插入元素;WHILE 循环将较大元素向右移一位腾出空间。
Step-by-step trace.逐步追踪。
| i | key | List after insertion插入后列表 |
|---|---|---|
| Start初始 | — | [5, 3, 8, 1, 4] |
| 1 | 3 | [3, 5, 8, 1, 4] |
| 2 | 8 | [3, 5, 8, 1, 4] |
| 3 | 1 | [1, 3, 5, 8, 4] |
| 4 | 4 | [1, 3, 4, 5, 8] |
Python implementation.Python 实现。
def insertion_sort(data):
for i in range(1, len(data)):
key = data[i]
j = i - 1
while j >= 0 and data[j] > key:
data[j + 1] = data[j] # shift element right
j -= 1
data[j + 1] = key # place key in correct position
return data
print(insertion_sort([5, 3, 8, 1, 4])) # Output: [1, 3, 4, 5, 8]
When key=8 (i=2), no shifts were needed because 8 > 5. This means nearly-sorted data has very few shifts, giving O(n) best-case performance — a property bubble sort and selection sort do not have.当 key=8(i=2)时,无需移位,因为 8 > 5。这意味着近乎有序数据的移位极少,达到 O(n) 最好情况性能——这是冒泡排序和选择排序所不具备的特性。
Comparing Sorts and Using Built-in Sort排序比较与内置排序
| Algorithm算法 | Best最好 | Worst最坏 | Pre-condition前提 | Notes备注 |
|---|---|---|---|---|
| Linear search线性查找 | O(1) | O(n) | None无 | Simple; works on any list简单;适用任意列表 |
| Binary search二分查找 | O(1) | O(log n) | Sorted list已排序 | Fast on large sorted data大型有序数据很快 |
| Bubble sort冒泡排序 | O(n)* | O(n²) | None无 | *with early-exit optimisation*带提前退出优化 |
| Selection sort选择排序 | O(n²) | O(n²) | None无 | Fewest swaps; good for low-write hardware交换最少;适合少写硬件 |
| Insertion sort插入排序 | O(n) | O(n²) | None无 | Best for nearly-sorted data近乎有序数据最佳 |
In practice, never use bubble/selection/insertion sort on real data — use Python's built-in sorted() or .sort(). They use Timsort (O(n log n) worst case), which is dramatically faster. Know when to use each method.实践中,切勿对真实数据使用冒泡/选择/插入排序——使用 Python 内置的 sorted() 或 .sort()。它们使用 Timsort(最坏情况 O(n log n)),速度显著更快。了解何时使用各种方法。
Python built-in sort examples.Python 内置排序示例。
# sorted() returns a NEW sorted list; original unchanged
nums = [5, 3, 8, 1, 4]
sorted_nums = sorted(nums)
print(sorted_nums) # [1, 3, 4, 5, 8]
print(nums) # [5, 3, 8, 1, 4] -- unchanged
# .sort() sorts IN PLACE; modifies the original list
nums.sort()
print(nums) # [1, 3, 4, 5, 8]
# Sort descending
nums.sort(reverse=True)
print(nums) # [8, 5, 4, 3, 1]
# Sort strings or tuples by a key
students = [("Ana", 85), ("Bob", 72), ("Cara", 91)]
students.sort(key=lambda s: s[1]) # sort by score
print(students) # [('Bob', 72), ('Ana', 85), ('Cara', 91)]
The key=lambda s: s[1] tells Python to sort by the second element (the score). Use sorted() when you need to keep the original; use .sort() when you want to modify in place.key=lambda s: s[1] 告知 Python 按第二个元素(分数)排序。需要保留原列表时用 sorted();想原地修改时用 .sort()。
Combining sort + binary search in practice.实践中结合排序与二分查找。
import bisect # Python standard library for binary search on sorted lists
data = [14, 2, 31, 8, 25, 17]
data.sort() # sort first: [2, 8, 14, 17, 25, 31]
pos = bisect.bisect_left(data, 17) # binary search for 17
print(data[pos] == 17) # True (found at index 3)
The pattern: sort once with O(n log n), then use binary search O(log n) for every lookup. Far faster than repeated linear searches on large data.模式:用 O(n log n) 排序一次,再用 O(log n) 的二分查找进行每次查找。对大数据远快于重复线性查找。
sorted(data) and data.sort()?在 Python 中,sorted(data) 和 data.sort() 有什么区别?sorted(data) is a built-in function that returns a new sorted list; the original data is unchanged. data.sort() is a list method that sorts the list in place (modifies data) and returns None. Use sorted() when you need both the original and the sorted version; use .sort() when you only need the sorted result.sorted(data) 是返回新排序列表的内置函数;原始 data 不变。data.sort() 是原地排序列表(修改 data)并返回 None 的列表方法。同时需要原列表和排序版本时用 sorted();只需排序结果时用 .sort()。Going deeper — why Python's sort is O(n log n) not O(n²) Honors深入 — 为何 Python 内置排序是 O(n log n) 而非 O(n²) 荣誉
Python uses Timsort (invented by Tim Peters, 2002), a hybrid of merge sort and insertion sort. Merge sort divides the list in half, recursively sorts each half, then merges them (O(n log n) in all cases). Insertion sort is used on small runs (<64 elements) because its O(n) best case makes it faster than merge sort for nearly-sorted small arrays. The result: Timsort is O(n) on already-sorted data and O(n log n) in the worst case — far better than the O(n²) algorithms in this guide. ICS4U A3.6 mentions merge sort as a recursive algorithm option. CSE3310 (Recursive Algorithms 1) covers quicksort and merge sort at the Advanced level. For exam purposes, knowing that Python's sort is O(n log n) and why it beats bubble/selection/insertion is sufficient at the ICS4U/CSE3110 level.Python 使用 Timsort(Tim Peters,2002 年发明),是归并排序与插入排序的混合体。归并排序将列表分成两半,递归排序各半部分,再合并(所有情况均为 O(n log n))。对小片段(<64 个元素)使用插入排序,因为其 O(n) 最好情况对近乎有序的小数组比归并排序更快。结果:Timsort 对已排序数据为 O(n),最坏情况为 O(n log n)——远优于本指南的 O(n²) 算法。ICS4U A3.6 将归并排序列为递归算法选项。CSE3310(递归算法 1)在高级模块涵盖快速排序和归并排序。为考试目的,知道 Python 内置排序是 O(n log n) 以及为何它胜过冒泡/选择/插入排序,在 ICS4U/CSE3110 级别已足够。
Exam Strategy and Common Pitfalls考试策略与常见陷阱
- State the pre-condition.陈述前提条件。 Binary search requires a sorted list. Forgetting to state this is the most common exam error. CSE3110 outcome 1.5.4 and ICS4U C2.2 both test whether you know the pre-condition.二分查找需要已排序列表。忘记陈述这一点是最常见的考试错误。CSE3110 结果 1.5.4 和 ICS4U C2.2 均考查你是否知道前提条件。
- Trace the algorithm with a concrete example.用具体例子追踪算法。 Always run through your answer with a 5–8 element list before submitting. A trace table catches most off-by-one boundary errors in sort algorithms.提交前始终用 5–8 个元素的列表演算答案。追踪表能发现排序算法中大多数"差一"边界错误。
- Use the correct algorithm name in Chinese and English.用中英文写出正确算法名称。 Examiners expect precise terminology: linear search, binary search, bubble sort, selection sort, insertion sort, algorithm efficiency, time complexity.考官期望精确术语:linear search(线性查找),binary search(二分查找),bubble sort(冒泡排序),selection sort(选择排序),insertion sort(插入排序),algorithm efficiency(算法效率),time complexity(时间复杂度)。
- State pre-condition, step count, and a number example.陈述前提条件、步骤数和数字例子。 "Binary search requires a sorted list. For n=1,024 items, binary search needs at most log₂(1,024)=10 comparisons, versus 1,024 for linear search." That is two sentences and it scores full marks on a typical efficiency question."二分查找需要已排序列表。对于 n=1024 个元素,二分查找最多需要 log₂(1024)=10 次比较,而线性查找最多需要 1024 次。"两句话,典型效率题满分。
- Trace binary search step by step when asked.被要求时逐步追踪二分查找。 Show the low, high, mid values at each step. Never skip steps on an exam trace question.每步展示 low、high、mid 的值。考试追踪题切勿跳步。
- Know what changes after each pass.知道每遍后什么发生了变化。 Bubble sort: largest unsorted element moves to its final position. Selection sort: one minimum is placed in final position. Insertion sort: one more element joins the sorted left portion.冒泡排序:最大未排序元素移至最终位置。选择排序:一个最小值放到最终位置。插入排序:多一个元素加入有序左段。
- Compare using Big-O when asked for efficiency.被问效率时用大O比较。 All three simple sorts are O(n²) worst case for comparisons. Insertion sort is O(n) best case. Python's built-in is O(n log n) worst case. ICS4U C2.3 and CSE3110 outcome 1.6.5 test this comparison explicitly.三种简单排序比较次数最坏情况均为 O(n²)。插入排序最好情况为 O(n)。Python 内置排序最坏情况为 O(n log n)。ICS4U C2.3 和 CSE3110 结果 1.6.5 明确考查此比较。
- Show all intermediate states in a sort trace.排序追踪中展示所有中间状态。 Draw the list after every pass, not just the final result. Marks are typically awarded per pass on sort-trace questions.每遍后画出列表,不只是最终结果。排序追踪题通常按每遍给分。
- For AP CSP: use College Board pseudocode.AP CSP 使用 College Board 伪代码。 Assignment uses
←, lists are 1-indexed, useDISPLAY(). Topic 3.11 binary search and 3.17 efficiency use this notation on the exam.赋值用←,列表从 1 索引,用DISPLAY()。主题 3.11 二分查找和 3.17 效率在考试中使用此表示法。
Flashcards闪卡
Practice Quiz综合测验
Readiness Checklist准备就绪清单
Tick each item when you can do it cold, without notes, on a first attempt.能在无笔记、首次尝试下完成,再勾选每一项。
- Write pseudocode for linear search: loop through every element, return the index when found, return −1 if not found. Trace it on a 6-element unsorted list. 🇨🇦 ON ICS4U A3.2 / AB CSE3110 1.5.1编写线性查找的伪代码:遍历每个元素,找到时返回索引,未找到返回 −1。在 6 个元素的未排序列表上追踪。🇨🇦 ON ICS4U A3.2 / AB CSE3110 1.5.1
- Write pseudocode for binary search: use low/mid/high pointers, halve the search window each step, handle the not-found case. State the pre-condition (sorted list). 🇨🇦 ON ICS4U A3.2 / AB CSE3110 1.5.2 / AP CSP 3.11编写二分查找的伪代码:使用 low/mid/high 指针,每步缩小一半搜索窗口,处理未找到情况。陈述前提条件(已排序列表)。🇨🇦 ON ICS4U A3.2 / AB CSE3110 1.5.2 / AP CSP 3.11
- Compare linear and binary search by step count: state worst-case n vs log₂n steps, give a concrete example (e.g., 1024 items: 1024 vs 10), and state when each is preferred. 🇺🇸 AP CSP 3.17 / 🇨🇦 ICS4U C2.2 / CSE3110 1.5.4按步骤数比较线性查找和二分查找:陈述最坏情况 n vs log₂n 步,举具体例子(如 1024 个元素:1024 vs 10),说明各自首选场景。🇺🇸 AP CSP 3.17 / 🇨🇦 ICS4U C2.2 / CSE3110 1.5.4
- Write and trace bubble sort on a 5-element list. State what is guaranteed after each pass (largest element moves to its final position). Give worst-case comparison count n(n−1)/2. 🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.1编写并追踪 5 个元素列表的冒泡排序。陈述每遍后的保证(最大元素移至最终位置)。给出最坏情况比较次数 n(n−1)/2。🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.1
- Write and trace selection sort on a 5-element list. State that it makes exactly n−1 swaps (one minimum placed per pass) and is always O(n²) comparisons. 🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.2编写并追踪 5 个元素列表的选择排序。陈述其恰好进行 n−1 次交换(每遍放置一个最小值),且比较次数始终为 O(n²)。🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.2
- Write and trace insertion sort on a 5-element list. Explain the shifting operation and state that its best case is O(n) on already-sorted data. 🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.3编写并追踪 5 个元素列表的插入排序。解释移位操作,说明其在已排序数据上的最好情况为 O(n)。🇨🇦 ON ICS4U A3.4 / AB CSE3110 1.6.3
- Explain the difference between Python's sorted() and .sort(). Write a one-line call that sorts a list in descending order using .sort(reverse=True). 🇺🇸 CSTA 3B-AP-10解释 Python 的 sorted() 与 .sort() 的区别。编写一行代码用 .sort(reverse=True) 将列表降序排列。🇺🇸 CSTA 3B-AP-10
- Explain why Python's built-in sort is faster than bubble/selection/insertion sort for large n. State its worst-case complexity (O(n log n)) and name the algorithm it uses (Timsort). 🇨🇦 AB CSE3110 1.6.5 / ON ICS4U C2.3解释为何 Python 内置排序对大 n 比冒泡/选择/插入排序更快。陈述其最坏情况复杂度(O(n log n))并说出它使用的算法名称(Timsort)。🇨🇦 AB CSE3110 1.6.5 / ON ICS4U C2.3
- State verbatim (or near-verbatim) the CSTA standard 3B-AP-11 and identify which searching/sorting concept it directly maps to in this guide. 🇺🇸 CSTA 3B-AP-11逐字(或近逐字)陈述 CSTA 标准 3B-AP-11,并指出它直接映射到本指南中的哪个查找/排序概念。🇺🇸 CSTA 3B-AP-11
- State the ICS4U expectations for this unit: A3.2 (linear + binary search), A3.4 (bubble, insertion, selection sort), C2.2 (compare search efficiency), C2.3 (compare sort efficiency). 🇨🇦 ON ICS4U陈述本单元的 ICS4U 期望:A3.2(线性+二分查找)、A3.4(冒泡、插入、选择排序)、C2.2(比较查找效率)、C2.3(比较排序效率)。🇨🇦 ON ICS4U
- Honors — ICS4U C2 / CSE3110 Classify all 5 algorithms in this guide by Big-O: linear search O(n), binary search O(log n), bubble/selection/insertion sort O(n²) worst. Explain why insertion sort has O(n) best. 🇨🇦 ICS4U C2.2/C2.3 / AB CSE3110 1.5.4/1.6.5荣誉 — ICS4U C2 / CSE3110 用大O对本指南中全部 5 个算法分类:线性查找 O(n),二分查找 O(log n),冒泡/选择/插入排序最坏 O(n²)。解释插入排序为何有 O(n) 最好情况。🇨🇦 ICS4U C2.2/C2.3 / AB CSE3110 1.5.4/1.6.5
What This Feeds Into本单元的去向
Searching and sorting are foundational to more advanced algorithm study. The linear and binary search algorithms you learned here feed directly into AP CSA and ICS4U, where you implement them in Java. The sorting algorithm comparison (Big-O efficiency analysis) you practiced here is exactly the ICS4U C2.2/C2.3 and CSE3110 1.5.4/1.6.5 assessed content. The link below points to an existing AP CSA study guide confirmed to exist in this repository.查找与排序是更高级算法学习的基础。你在这里学到的线性查找和二分查找算法直接输入 AP CSA 和 ICS4U,在那里用 Java 实现。你在这里练习的排序算法比较(大O效率分析)正是 ICS4U C2.2/C2.3 和 CSE3110 1.5.4/1.6.5 的评估内容。以下链接指向已确认存在于本仓库的 AP CSA 学习指南。
AP feeder link (existing in this repo).AP 衔接链接(本仓库中已有)。
AP CSP Big Idea 3 topic 3.11 (Binary Search) and 3.17 (Algorithmic Efficiency) test the same binary search + efficiency concepts from this guide on the AP CSP exam. AP CSA Unit 4 covers searching and sorting formally in Java including recursive merge sort — this HS guide is the conceptual prerequisite. ICS4U and CSE3110 students: the Big-O analysis in the Honors going-deeper boxes is assessed at your level.AP CSP 大概念 3 主题 3.11(二分查找)和 3.17(算法效率)在 AP CSP 考试中考查与本指南相同的二分查找和效率概念。AP CSA Unit 4 在 Java 中正式涵盖查找和排序,包括递归归并排序——本 HS 指南是概念性先修课。ICS4U 和 CSE3110 学生:荣誉深入框中的大O分析在你的水平被评估。