面试题- sql汇总

Posted by lichao modified on June 22, 2022

用户行为分析 (SQL)

表A:用户浏览视频日志 user_behavior: date, user_id, video_id, start_time, end_time 表B:视频信息 video_info: video_id, video_duration 表C:用户信息 user_info: user_id, gender

问题: (1)某一天(如 20200310),观看不同视频个数的前 5 名 user_id ; (2)观看超过 50 个不同视频的女性用户中,完整观看率最高的 10 个 user_id。

五个知识点:group by; 排序; join 的用法 ;条件筛选 ;业务理解:如何定义「完整观看率」

答案:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
select
    user_id,
    count(*) as video_cnt
from
    (
        select
            user_id,
            video_id
        from
            user_behavior
        where
            date = '20200310'
        group by
            user_id,
            video_id
    ) a
group by
    user_id
order by
    video_cnt desc
limit 5;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
select
    user_id,
    count(*) as video_cnt,
    sum(is_full) * 1.0/count(*) as full_ratio
from
    (
        select
            a.user_id,
            a.video_id,
            case
                when a.max_time >= c.duration then 1
                else 0
            end as is_full
        from
            (
                select
                    user_id,
                    video_id,
                    max(end_time -start_time) as max_time 
                from
                    user_behavior
                where
                    date = '20200310'
                group by
                    user_id,
                    video_id
            ) a
            inner join (
                select
                    user_id,
                    gender
                from
                    user_info
                where
                    gender = 'female'
            ) b on a.user_id = b.user_id
            left outer join video_info c on a.video_id = c.video_id
    ) a
group by
    user_id
having
    video_cnt >= 50
order by
    full_ratio desc
limit
    10

统计数据表中每个小时记录数

例如:数据表

1
2
3
4
5
6
7
id   p_date
1    2017/08/29 00:10:10
2    2017/08/29 01:10:10
3    2017/08/29 01:10:10
4    2017/08/29 02:10:10
5    2017/08/29 01:10:10
……

答案:

1
2
3
4
5
6
7
SELECT 
 DATE_FORMAT(p_date, "%Y-%m-%d %H" ),
 count(*)  
FROM 
 video.cdn_quality_log 
group by 
 DATE_FORMAT(p_date, "%Y-%m-%d %H" ) ;

如何建立联合索引

1
2
SELECT * FROM `table` WHERE a IN (1,2,3) and b > 1;
SELECT * FROM `table` WHERE a > 1 ORDER BY b;

https://www.cnblogs.com/rjzheng/p/12557314.html