连续登录
连续登录 3 天
user_id login_date
0001 2025-01-01
0001 2025-01-02
0001 2025-01-03
0001 2025-01-04
0001 2025-01-05
0001 2025-01-07
0001 2025-01-08
0001 2025-01-09
0002 2025-01-01
0002 2025-01-02
0002 2025-01-03
0002 2025-01-07
0002 2025-01-08
0003 2025-01-07
0003 2025-01-08
0003 2025-01-09
准备数据
create table login_log
(
user_id string comment '用户ID',
login_date date comment '登录日期'
);
INSERT INTO login_log VALUES
('0001','2025-01-01'),
('0001','2025-01-02'),
('0001','2025-01-03'),
('0001','2025-01-04'),
('0001','2025-01-05'),
('0001','2025-01-07'),
('0001','2025-01-08'),
('0001','2025-01-09'),
('0002','2025-01-01'),
('0002','2025-01-02'),
('0002','2025-01-03'),
('0002','2025-01-07'),
('0002','2025-01-08'),
('0003','2025-01-07'),
('0003','2025-01-08'),
('0003','2025-01-09');
答案
-- 方案一:查找连续3天登录的用户
WITH tmp AS (
SELECT
user_id,
login_date,
LEAD(login_date, 1) OVER (PARTITION BY user_id ORDER BY login_date) AS next_day,
LEAD(login_date, 2) OVER (PARTITION BY user_id ORDER BY login_date) AS next_next_day
FROM login_log
)
SELECT DISTINCT user_id
FROM tmp
WHERE
DATEDIFF(next_day, login_date) = 1 -- 明天登录
AND DATEDIFF(next_next_day, login_date) = 2; -- 后天也登录
思路
1.使用LEAD
函数,获取用户登录日期的后一天和后两天
2.使用DATEDIFF
函数,计算用户登录日期的后一天和后两天之间的天数
3.使用WHERE
条件,过滤出用户登录日期的后一天和后两天之间的天数为1和2的用户
4.使用SELECT DISTINCT
函数,去重用户ID
连续登录 5 天
WITH ranked_logins AS (
SELECT
user_id,
login_date,
DATE_SUB(login_date, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date)) AS grp
FROM login_log
),
consecutive_streaks AS (
SELECT
user_id,
grp,
COUNT(*) AS streak_days
FROM ranked_logins
GROUP BY user_id, grp
)
SELECT
user_id,
MAX(streak_days) AS max_consecutive_days
FROM consecutive_streaks
GROUP BY user_id
HAVING MAX(streak_days) >= 5;
思路一、核心原理
当日期连续时,login_date - row_number
的结果始终相同。
假设用户 0001
连续 3 天登录:
登录日期 | 行号 (row_number ) | login_date - row_number |
---|---|---|
2025-01-01 | 1 | 2024-12-31 (固定值) |
2025-01-02 | 2 | 2024-12-31 (固定值) |
2025-01-03 | 3 | 2024-12-31 (固定值) |
-
连续日期:每天增加 1 天,同时行号也增加 1,差值保持不变。
-
非连续日期:如中间断了一天(2025-01-05),则差值会变化:
登录日期 行号 login_date - row_number
2025-01-05 4 2025-01-01 (新的固定值)