T27384: 候选人追踪

懒更新，http://cs101.openjudge.cn/practice/27384/

超大型偶像团体HIHO314159总选举刚刚结束了。制作人小Hi正在复盘分析投票过程。

小Hi获得了N条投票记录，每条记录都包含一个时间戳Ti以及候选人编号Ci，代表有一位粉丝在Ti时刻投了Ci一票。

给定一个包含K名候选人集合S={S1, S2, ... SK}，小Hi想知道从投票开始(0时刻)，到最后一张票投出的时刻(max{Ti})，期间有多少时间得票最多的前K名候选人恰好是S中的K名候选人。

注意这里对前K名的要求是"严格"的，换句话说，S中的每一名候选人得票都要大于任何一名S之外的候选人。S集合内名次先后不作要求。

注：HIHO314159这个团体有314159名团员，编号是1~314159。

输入

第一行包含两个整数N和K。

第二行包含2N个整数：T1, C1, T2, C2, ... TN, CN。

第三行包含K个整数：S1, S2, ... SK。

对于30%的数据，1 ≤ N, K ≤ 100

对于60%的数据，1 ≤ N, K ≤ 1000

对于100%的数据, 1 ≤ N, K ≤ 314159 1 ≤ Ti ≤ 1000000 1 ≤ Ci, SK ≤ 314159

输出

一个整数，表示前K名恰好是S一共持续了多少时间。

样例输入

10 2  
3 1 4 1 5 1 4 3 6 5 8 3 7 5 8 5 9 1 10 5  
1 5

样例输出

来源：HC

python

import heapq

maxn = 320000
cnt = [0] * maxn
n, k = 0, 0
vis = [False] * maxn

n, k = map(int, input().split())
*records, = map(int, input().split())
arr = [(records[i], records[i+1]) for i in range(0, 2*n, 2)]

Q = []
candidates = list(map(int, input().split()))
for i in range(k):
    heapq.heappush(Q, (0, candidates[i]))
    vis[candidates[i]] = True

arr = sorted(arr[:n])
if k == 314159:
    print(arr[n-1][0])
    exit()

rmx = 0
rs = 0
for i in range(n):
    c = arr[i][1]
    cnt[c] += 1
    if vis[c]:
        while cnt[Q[0][1]]: # 懒更新，每次只更新到堆中的最小值是实际的最小值
            f = heapq.heappop(Q)
            f = (f[0] + cnt[f[1]], f[1])
            heapq.heappush(Q, f)
            cnt[f[1]] = 0
    else:
        rmx = max(rmx, cnt[c])
    if i != n-1 and arr[i+1][0] != arr[i][0] and Q[0][0] > rmx:
        rs += arr[i+1][0] - arr[i][0]

print(rs)

python

# 23n2300011335 邓锦文
n,k = map(int,input().split())
lst = list(map(int,input().split()))
arr = sorted([[lst[2*i],lst[2*i+1]] for i in range(n)])
vote = [0 for _ in range(314160)]
s = list(map(int,input().split()))
mark_dict = {}
for i in range(k):
    mark_dict[s[i]] = 0
if k == 314159:
    print(arr[-1][0])
    exit()
most,least = 0,0
ans = 0
for j in range(n):
    v = arr[j][1]
    if v in mark_dict:
        mark_dict[v] += 1
        if least == mark_dict[v]-1:
            least = min(mark_dict.values())
    else:
        vote[v] += 1
        most = max(most,vote[v])
    if j < n-1 and arr[j+1][0] != arr[j][0] and least > most:
        ans += arr[j+1][0]-arr[j][0]
print(ans)

python

#文敉浩 2300011027
import bisect
n, k = map(int, input().split())
votes = []
entry = input().split()
S = [int(x) for x in input().split()]
rst=0
for _ in range(n):
    t, c = int(entry[2*_]), int(entry[2*_+1])
    votes.append((t, c))
votes.sort()
if len(S)==314159:
    print(votes[-1][0])
    exit()
vote_num = [0]*314160
score = [0]*(len(S)+1)
for i in range(len(votes)-1):
    vote_num[votes[i][1]]-=1
    if vote_num[votes[i][1]]<score[-1]:
        position = bisect.bisect_left(score,vote_num[votes[i][1]]+1)
        score[position]-=1
    if votes[i+1][0]>votes[i][0]:
        
        if score[-2]==score[-1]:
            continue
        ok = True
        for j in S:
            position = bisect.bisect_left(score,vote_num[j])
            if not (position<len(score)-1 and score[position]==vote_num[j]):
                ok = False
                break
        if ok:
            rst+=votes[i+1][0]-votes[i][0]
print(rst)

python

# 曾睿豪 25物院
# 追踪最小值也可以用字典和集合实现
n, k = map(int, input().split())
vl = list(map(int, input().split()))
votes = sorted([(vl[2 * i], vl[2 * i + 1]) for i in range(n)])
s = set(map(int, input().split()))

if k == 314159:
    print(votes[-1][0])
    exit()
c = [0] * 314160
scores = {0 : set(s)}
if votes[0][1] in s:
    scores[0].discard(votes[0][1])
    scores[1] = {votes[0][1]}
minI = 1 if len(scores[0]) == 0 else 0
maxO = 0 if votes[0][1] in s else 1
t = votes[0][0]
tot = 0
c[votes[0][1]] += 1
for v in votes[1:]:
    if v[0] > t:
        if minI > maxO:
            tot += v[0] - t
        t = v[0]
    c[v[1]] += 1
    if v[1] in s:
        scores[c[v[1]] - 1].discard(v[1])
        if c[v[1]] in scores:
            scores[c[v[1]]].add(v[1])
        else:
            scores[c[v[1]]] = {v[1]}
        if len(scores[minI]) == 0:
            scores.pop(minI)
            minI += 1
    else:
        maxO = max(maxO, c[v[1]])
print(tot)

双堆+懒删除

python

import sys
import heapq

# 增加递归深度，虽然本解法主要依靠迭代
sys.setrecursionlimit(5000)

def solve():
    # 使用 sys.stdin.read 快速读取所有输入
    input_data = sys.stdin.read().split()
    
    if not input_data:
        return

    iterator = iter(input_data)
    
    try:
        N = int(next(iterator))
        K = int(next(iterator))
    except StopIteration:
        return

    # 读取投票记录 (2*N 个整数)
    votes = []
    for _ in range(N):
        t = int(next(iterator))
        c = int(next(iterator))
        votes.append((t, c))
    
    # 读取集合 S (K 个整数)
    target_s = set()
    for _ in range(K):
        target_s.add(int(next(iterator)))

    # 关键步骤：按时间戳排序
    votes.sort(key=lambda x: x[0])

    # 题目给定总人数
    TOTAL_MEMBERS = 314159
    max_id = TOTAL_MEMBERS + 1
    
    vote_counts = [0] * max_id
    
    # 两个堆：
    # min_heap_s: 维护 S 集合内的票数 (票数, ID)，堆顶是最小值
    # max_heap_other: 维护 S 集合外的票数 (-票数, ID)，堆顶是最大值
    min_heap_s = []
    max_heap_other = []

    # 初始化 S 集合堆，所有人初始票数为 0
    for cid in target_s:
        heapq.heappush(min_heap_s, (0, cid))
    
    # 【特判核心逻辑】
    # 如果 K == 314159，说明所有人都在 S 里，S 之外为空集。空集的最大值设为 -1。
    # 否则，S 之外至少有一人，初始票数为 0，最大值底线为 0。
    default_max_other = -1 if K == TOTAL_MEMBERS else 0

    # 获取 S 中最小票数（带懒删除）
    def get_min_s_votes():
        while min_heap_s:
            cnt, cid = min_heap_s[0]
            # 检查堆顶元素是否过期
            if vote_counts[cid] == cnt:
                return cnt
            heapq.heappop(min_heap_s)
        return 0

    # 获取 S 之外最大票数（带懒删除）
    def get_max_other_votes():
        while max_heap_other:
            neg_cnt, cid = max_heap_other[0]
            if vote_counts[cid] == -neg_cnt:
                return -neg_cnt
            heapq.heappop(max_heap_other)
        # 如果堆空了，返回设定好的默认值 (-1 或 0)
        return default_max_other

    total_valid_time = 0
    last_time = 0
    idx = 0
    
    # 遍历所有投票事件
    while idx < N:
        curr_time = votes[idx][0]
        
        # 1. 结算上一段时间段
        if curr_time > last_time:
            min_s = get_min_s_votes()
            max_other = get_max_other_votes()
            
            # 核心条件：S 最小值 > Other 最大值
            if min_s > max_other:
                total_valid_time += (curr_time - last_time)
        
        # 2. 处理当前时刻所有的票数变动
        while idx < N and votes[idx][0] == curr_time:
            _, cid = votes[idx]
            
            vote_counts[cid] += 1
            new_cnt = vote_counts[cid]
            
            if cid in target_s:
                heapq.heappush(min_heap_s, (new_cnt, cid))
            else:
                # Python 默认小顶堆，存负数来实现大顶堆
                heapq.heappush(max_heap_other, (-new_cnt, cid))
            
            idx += 1
        
        last_time = curr_time

    # 注意：题目要求统计到 max{Ti}，循环结束后即到达最后时刻，不需要处理后续时间
    print(total_valid_time)

if __name__ == '__main__':
    solve()

算法详细解释

这道题目本质上是一个时间轴模拟（Simulate）问题，结合了数据结构的优化。

1. 核心思路

我们需要在时间轴上从 $0$ 走到 $m a x (T_{i})$ 。在任意两个投票事件之间，票数是保持不变的。因此，我们只需要处理每个投票发生的时间点。算法流程如下：

排序：将所有投票记录按时间 $T$ 排序。
遍历：按时间顺序处理投票。对于每一个新的时间点 $T_{n o w}$ ，我们计算从上一个时间点 $T_{l a s t}$ 到 $T_{n o w}$ 的时间差。
判断状态：如果在这段空隙时间内，满足“S集合所有人的票数都严格大于非S集合所有人的票数”，则将时间差累加到总结果中。
更新数据：处理 $T_{n o w}$ 时刻发生的所有投票，更新候选人的票数。

2. 条件判断的优化

题目要求的条件是：S 中票数最少的人 > S 之外票数最多的人。如果每次都遍历所有候选人来找最大最小值，复杂度是 $O (N \times K)$ 或 $O (N \times 总人数)$ ，这在 $N = 300, 000$ 时会超时。

我们需要高效地获取两个值：

$M i n_S$ ：集合 $S$ 内候选人的最低票数。
$M a x_O t h e r$ ：集合 $S$ 外候选人的最高票数。

解决方案：使用两个堆（优先队列）

Min-Heap (min_heap_s): 存储集合 $S$ 中所有人的 (票数, ID)。堆顶即为 $M i n_S$ 。
Max-Heap (max_heap_other): 存储集合 $S$ 之外所有得过票的人的 (票数, ID)。堆顶即为 $M a x_O t h e r$ 。如果堆为空，则 $M a x_O t h e r = 0$ （因为还有无数没得票的人）。

3. 懒删除（Lazy Deletion）技巧

当一个候选人的票数增加时（例如从 2 票变为 3 票），他在堆中原来的记录 (2, ID) 就失效了。由于 Python 的 heapq 不支持直接修改堆中间的元素，我们采用懒删除策略：

更新时，直接将新的记录 (3, ID) 压入堆中。
取值时（查看堆顶），检查堆顶元素的票数是否等于该 ID 当前的实际票数（在 vote_counts 数组中查询）。
- 如果不等，说明这是个旧记录，直接弹出（heappop）并丢弃。
- 如果相等，说明是有效记录，直接使用。

4. 复杂度分析

时间复杂度：
- 排序： $O (N \log N)$ 。
- 主循环：遍历 $N$ 次。每次投票操作会进行堆的 push，每次状态查询可能会进行多次 pop（由于懒删除）。但每个元素最多进堆一次、出堆一次。因此总的堆操作次数与 $N$ 成正比。
- 总复杂度： $O (N \log N)$ 。对于 $N = 300, 000$ ，这完全可以在 1 秒内完成。
空间复杂度：需要存储所有候选人的票数和堆，约为 $O (N + 候选人ID范围)$ ，内存限制 256MB 足够使用。

5. 注意事项

输入读取：Python 的 input() 在数据量巨大时较慢，推荐使用 sys.stdin.read().split() 一次性读取。
同一时刻多票：题目中可能有多个记录发生在同一时刻 $T$ 。必须先处理完该时刻所有的票数变化，再去计算随后的时间段，或者先计算上一段时间段，再处理当前时刻的所有票。代码中使用的是“先结算上一段 $l a s t \to c u r r$ ，再更新当前 $c u r r$ ”的逻辑。
初始状态：S 集合初始票数全为 0，S 之外初始全为 0。初始时刻 $M i n_S = 0, M a x_O t h e r = 0$ ，条件 $0 > 0$ 不成立。

T27384: 候选人追踪 ​

T27384: 候选人追踪