應用SQL分析上海租房數據
在上一階段的學習中通過Gooseeker瀏覽器爬取了上海租房數據,並經由EXCEL進行了初步的描述統計分析。此次將應用SQL工具進行類似分析,具體步驟如下:
將此前已完成數據清洗的EXCEL表「上海租房0910」轉為CSV文件,然後在可視化的SQL工具Navicat Premium中將其導入,注意導入的時候選擇簡體中文編碼,否則導入後中文欄位會呈現亂碼。
在查詢編輯器中進行查詢,此次希望查詢的問題如下:
1. 總數據量
輸入命令:
SELECT COUNT(*) FROM 上海租房0910
得到結果:
2. 上海各行政區的租房價格、每平米單價及面積排名。
輸入命令:
SELECT DISTRICT, ROUND(AVG(PRICE),2) AS AVERAGE PRICE, ROUND(AVG(UNIT_PRICE),2) AS AVERAGE UNIT PRICE, ROUND(AVG(AREA),2) AS AVERAGE AREA
FROM 上海租房0910
GROUP BY DISTRICT
ORDER BY AVG(PRICE) DESC
得到結果:
3. 上海各板塊的租房每平米單價排名。
輸入命令:
SELECT SUB_DISTRICT, ROUND(AVG(PRICE),2) AS AVERAGE PRICE, ROUND(AVG(UNIT_PRICE),2) AS AVERAGE UNIT PRICE, ROUND(AVG(AREA),2) AS AVERAGE AREA
FROM 上海租房0910
GROUP BY SUB_DISTRICT
ORDER BY AVG(PRICE) DESC
結果(僅截取部分)如下:
4. 各種戶型的出租數量和佔比排序
輸入命令:
SELECT TYPE, COUNT(TYPE) AS QUANTITY, CONCAT(ROUND(100*COUNT(TYPE)/2760,2),%) AS PERCENTAGE OF QUANTITY FROM 上海租房0910
GROUP BY TYPE
ORDER BY COUNT(TYPE) DESC
得到結果:
5. 各種樓層類型的出租數量及佔比排序
輸入命令:
SELECT
(CASE WHEN FLOOR BETWEEN 1 AND 6 THEN 低層
WHEN FLOOR BETWEEN 7 AND 10 THEN 小高層
WHEN FLOOR BETWEEN 11 AND 24 THEN 中高層
ELSE 高層
END) 樓層,
COUNT(*) AS 數量, CONCAT(ROUND(100*COUNT(*)/2760,2),%) AS 佔比
FROM 上海租房0910
GROUP BY 樓層
ORDER BY 數量 DESC
推薦閱讀: