Friday, March 20, 2015

Working with CSV in GNU R

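For various reasons, it looks like I will soon be wrestling with a large amount of CSV data.
To prepare, I have been looking into R, the language for statistical analysis.
As a trial, I pulled some historical data from the Japan Meteorological Agency and processed it in R.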

1. Downloading the data

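First, choose the stations to download data for.
This time I picked Tokyo, Osaka, Sendai, Nagoya, and Fukuoka.
Next, choose the items.
For the data type I chose "daily values", with no comparison against historical averages, and picked these items: daily mean temperature, daily maximum temperature, daily minimum temperature, daily total precipitation, daily maximum 10-minute precipitation, sunshine duration, daily mean wind speed, daily maximum wind speed (with direction), daily maximum instantaneous wind speed (with direction), daily mean relative humidity, and daily mean sea-level pressure.
Next, choose the period.
I chose January 1 to December 31, 2014, but that turned out to be too much data for a single download.
So I removed sunshine duration, daily mean wind speed, daily maximum wind speed (with direction), daily maximum instantaneous wind speed (with direction), and daily mean relative humidity from the items.
With that, I downloaded the result as CSV.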

2. Converting the character encoding

The downloaded file is encoded in Shift_JIS (SJIS), so convert it to UTF-8.
$ iconv -f SJIS -t UTF-8 kishou2014.csv > kishou2014-utf.csv
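As an aside, R can probably do this conversion itself while reading, which would make the iconv step optional. Below is a minimal sketch, assuming a UTF-8 session and a platform whose iconv knows the name "CP932" (the Windows flavor of Shift_JIS). The temporary file and its rows are made up for illustration and are not the real download.

```r
# Hypothetical stand-in for kishou2014.csv: a tiny file written in Shift_JIS.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "w", encoding = "CP932")
writeLines(c("date,\u6771\u4eac", "2014/1/1,9.6", "2014/1/2,7.3"), con)
close(con)

# fileEncoding asks read.csv() to re-encode the input while reading it.
sjis <- read.csv(tmp, fileEncoding = "CP932", stringsAsFactors = FALSE)
print(sjis)
```

If the encoding name is not recognized on a given system, falling back to the iconv command above is the safer route.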

3. Loading the data and viewing it raw

Move to the directory containing the data file, start R, and run
> dataset <- read.csv("kishou2014-utf.csv",stringsAsFactors=FALSE,skip=2)
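to load the data into the dataset object. skip=2 discards the first two lines (the download timestamp and a blank line). Without stringsAsFactors=FALSE, R versions before 4.0 read text columns as factors, and as.numeric() on a factor returns the internal level codes rather than the values, so the later conversion to numbers goes wrong.
(Reference: http://detail.chiebukuro.yahoo.co.jp/qa/question_detail/q1474978024)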
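To see what that "wrong conversion" looks like, here is a small sketch. The four values are taken from the Tokyo column further down, and the variable names are just mine for illustration.

```r
# A few temperature strings, as they come out of the CSV.
temps <- c("9.6", "7.3", "5.9", "10.3")

# What older R versions would produce by default for a text column.
as_factor <- factor(temps)

# Converting the factor directly yields level codes, not temperatures.
codes <- as.numeric(as_factor)
print(codes)

# Going through character first, or never creating a factor, gives the values.
via_char <- as.numeric(as.character(as_factor))
direct   <- as.numeric(temps)
print(via_char)
print(direct)
```

Keeping the columns as plain character vectors from the start avoids the detour through as.character().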
Let's take a quick look at the loaded data.
> dataset[1:6,]
         X         東京       東京.1       東京.2       東京.3       東京.4
1   年月日 平均気温(℃) 平均気温(℃) 平均気温(℃) 最高気温(℃) 最高気温(℃)
2                                                                          
3                           品質情報     均質番号                  品質情報
4 2014/1/1          9.6            8            1         15.5            8
5 2014/1/2          7.3            8            1         12.1            8
6 2014/1/3          5.9            8            1          8.5            8
        東京.5       東京.6       東京.7       東京.8           東京.9
1 最高気温(℃) 最低気温(℃) 最低気温(℃) 最低気温(℃) 降水量の合計(mm)
2                                                                     
3     均質番号                  品質情報     均質番号                 
4            1          3.1            8            1                0
5            1          3.1            8            1                0
6            1          3.8            8            1                0
           東京.10          東京.11          東京.12                東京.13
1 降水量の合計(mm) 降水量の合計(mm) 降水量の合計(mm) 10分間降水量の最大(mm)
2                                                                          
3     現象なし情報         品質情報         均質番号                       
4                1                8                1                      0
5                1                8                1                      0
6                1                8                1                      0
The real output continues for many more columns.
Let's display only column 2, Tokyo's mean temperature.
> dataset[,2]
 [1] "平均気温(℃)" "" "" "9.6" "7.3" 
 [6] "5.9" "6.5" "5.4" "5.3" "5.5" 
 [11] "7.3" "7.2" "3.6" "4.3" "5.3" 
 [16] "5.1" "4.2" "3.0" "4.8" "5.8" 
 [21] "5.2" "4.2" "4.8" "6.3" "5.5" 
 [26] "6.5" "7.2" "8.4" "9.5" "4.6" 
 [31] "8.8" "8.9" "10.3" "9.9" "7.9" 
 [36] "9.3" "12.8" "5.0" "1.6" "2.7" 
 [41] "4.0" "0.3" "5.4" "6.0" "3.3" 
 [46] "4.8" "5.3" "1.4" "3.9" "6.9" 
 [51] "7.3" "4.7" "4.7" "4.6" "5.5" 
 [56] "5.1" "5.6" "6.3" "8.2" "10.0" 
 [61] "10.0" "13.8" "8.9" "5.6" "6.6" 
 [66] "6.9" "6.5" "5.4" "4.4" "5.9" 
 [71] "6.4" "4.9" "6.1" "11.0" "12.3" 
 [76] "10.1" "8.3" "10.8" "12.1" "15.4" 
 [81] "11.2" "8.1" "10.1" "9.3" "11.6" 
 [86] "13.6" "16.6" "16.4" "12.6" "14.8" 
 [91] "17.7" "16.5" "15.7" "13.9" "15.2" 
 [96] "13.8" "15.3" "11.4" "9.0" "11.4" 
[101] "15.8" "15.5" "16.1" "12.9" "14.6" 
[106] "14.4" "14.0" "16.3" "18.7" "17.7" 
[111] "11.8" "11.7" "11.0" "13.4" "15.2" 
[116] "16.2" "16.6" "17.8" "18.2" "18.1" 
[121] "18.6" "17.8" "16.7" "20.6" "21.0" 
[126] "21.3" "18.6" "18.1" "14.3" "16.5" 
[131] "18.7" "19.3" "19.8" "20.6" "19.6" 
[136] "19.3" "21.9" "19.9" "21.5" "21.1" 
[141] "20.5" "21.1" "21.7" "17.5" "18.6" 
[146] "17.8" "20.6" "22.7" "21.3" "21.4" 
[151] "22.7" "23.1" "24.1" "25.6" "26.6" 
[156] "25.5" "24.0" "24.3" "21.2" "19.9" 
[161] "18.4" "20.5" "22.9" "23.5" "22.2" 
[166] "22.0" "24.9" "24.7" "24.4" "25.0" 
[171] "24.3" "23.2" "24.8" "24.6" "24.6" 
[176] "22.5" "24.2" "22.6" "22.6" "23.6" 
[181] "23.9" "22.8" "23.3" "23.6" "25.3" 
[186] "25.8" "24.8" "21.6" "22.0" "24.2" 
[191] "23.4" "26.8" "24.2" "26.6" "28.5" 
[196] "29.2" "26.9" "28.5" "28.2" "28.0" 
[201] "27.1" "24.9" "23.8" "24.5" "25.0" 
[206] "27.7" "29.4" "29.9" "30.5" "30.5" 
[211] "30.1" "27.7" "27.6" "27.9" "29.5" 
[216] "29.8" "30.9" "31.0" "30.5" "31.1" 
[221] "30.8" "29.7" "29.0" "25.4" "26.3" 
[226] "29.4" "26.3" "26.7" "27.1" "29.5" 
[231] "27.0" "26.2" "29.1" "30.3" "31.1" 
[236] "30.5" "30.2" "27.6" "28.4" "27.0" 
[241] "25.2" "21.2" "21.1" "23.3" "22.6" 
[246] "23.2" "21.8" "24.5" "24.1" "25.2" 
[251] "27.8" "27.3" "21.7" "22.1" "23.5" 
[256] "23.4" "21.8" "23.5" "23.1" "23.8" 
[261] "24.1" "25.1" "23.3" "22.3" "22.3" 
[266] "20.5" "21.3" "22.6" "22.1" "22.4" 
[271] "24.4" "22.9" "20.2" "21.7" "23.5" 
[276] "24.7" "21.2" "21.7" "25.7" "24.2" 
[281] "17.2" "21.3" "20.0" "20.3" "21.0" 
[286] "22.4" "20.3" "18.6" "18.1" "23.1" 
[291] "15.9" "16.8" "18.4" "17.5" "18.3" 
[296] "19.4" "19.0" "16.1" "14.3" "16.7" 
[301] "18.3" "20.1" "19.3" "15.8" "16.4" 
[306] "16.8" "18.4" "16.8" "19.5" "18.0" 
[311] "16.2" "15.2" "17.7" "17.7" "14.1" 
[316] "14.9" "17.8" "14.5" "14.1" "15.0" 
[321] "12.8" "12.7" "12.5" "12.1" "12.9" 
[326] "12.1" "9.1" "12.3" "13.8" "14.1" 
[331] "13.2" "11.1" "9.2" "12.9" "14.1" 
[336] "13.8" "15.1" "13.5" "10.8" "8.2" 
[341] "8.8" "8.3" "5.1" "6.2" "6.6" 
[346] "7.3" "7.3" "9.6" "9.3" "7.0" 
[351] "4.8" "5.3" "3.6" "4.3" "4.0" 
[356] "5.4" "5.3" "8.6" "7.1" "6.3" 
[361] "5.9" "6.3" "5.1" "4.6" "3.8" 
[366] "4.2" "6.6" "8.0"
Alternatively, the column name shown in the header row can be used, so "dataset$東京" gives the same column.
> dataset$東京
 [1] "平均気温(℃)" "" "" "9.6" "7.3" 
 [6] "5.9" "6.5" "5.4" "5.3" "5.5" 
 [11] "7.3" "7.2" "3.6" "4.3" "5.3" 
 [16] "5.1" "4.2" "3.0" "4.8" "5.8" 
 [21] "5.2" "4.2" "4.8" "6.3" "5.5" 
 [26] "6.5" "7.2" "8.4" "9.5" "4.6" 
 [31] "8.8" "8.9" "10.3" "9.9" "7.9" 
 [36] "9.3" "12.8" "5.0" "1.6" "2.7" 
 [41] "4.0" "0.3" "5.4" "6.0" "3.3" 
 [46] "4.8" "5.3" "1.4" "3.9" "6.9" 
 [51] "7.3" "4.7" "4.7" "4.6" "5.5" 
 [56] "5.1" "5.6" "6.3" "8.2" "10.0" 
 [61] "10.0" "13.8" "8.9" "5.6" "6.6" 
 [66] "6.9" "6.5" "5.4" "4.4" "5.9" 
 [71] "6.4" "4.9" "6.1" "11.0" "12.3" 
 [76] "10.1" "8.3" "10.8" "12.1" "15.4" 
 [81] "11.2" "8.1" "10.1" "9.3" "11.6" 
 [86] "13.6" "16.6" "16.4" "12.6" "14.8" 
 [91] "17.7" "16.5" "15.7" "13.9" "15.2" 
 [96] "13.8" "15.3" "11.4" "9.0" "11.4" 
[101] "15.8" "15.5" "16.1" "12.9" "14.6" 
[106] "14.4" "14.0" "16.3" "18.7" "17.7" 
[111] "11.8" "11.7" "11.0" "13.4" "15.2" 
[116] "16.2" "16.6" "17.8" "18.2" "18.1" 
[121] "18.6" "17.8" "16.7" "20.6" "21.0" 
[126] "21.3" "18.6" "18.1" "14.3" "16.5" 
[131] "18.7" "19.3" "19.8" "20.6" "19.6" 
[136] "19.3" "21.9" "19.9" "21.5" "21.1" 
[141] "20.5" "21.1" "21.7" "17.5" "18.6" 
[146] "17.8" "20.6" "22.7" "21.3" "21.4" 
[151] "22.7" "23.1" "24.1" "25.6" "26.6" 
[156] "25.5" "24.0" "24.3" "21.2" "19.9" 
[161] "18.4" "20.5" "22.9" "23.5" "22.2" 
[166] "22.0" "24.9" "24.7" "24.4" "25.0" 
[171] "24.3" "23.2" "24.8" "24.6" "24.6" 
[176] "22.5" "24.2" "22.6" "22.6" "23.6" 
[181] "23.9" "22.8" "23.3" "23.6" "25.3" 
[186] "25.8" "24.8" "21.6" "22.0" "24.2" 
[191] "23.4" "26.8" "24.2" "26.6" "28.5" 
[196] "29.2" "26.9" "28.5" "28.2" "28.0" 
[201] "27.1" "24.9" "23.8" "24.5" "25.0" 
[206] "27.7" "29.4" "29.9" "30.5" "30.5" 
[211] "30.1" "27.7" "27.6" "27.9" "29.5" 
[216] "29.8" "30.9" "31.0" "30.5" "31.1" 
[221] "30.8" "29.7" "29.0" "25.4" "26.3" 
[226] "29.4" "26.3" "26.7" "27.1" "29.5" 
[231] "27.0" "26.2" "29.1" "30.3" "31.1" 
[236] "30.5" "30.2" "27.6" "28.4" "27.0" 
[241] "25.2" "21.2" "21.1" "23.3" "22.6" 
[246] "23.2" "21.8" "24.5" "24.1" "25.2" 
[251] "27.8" "27.3" "21.7" "22.1" "23.5" 
[256] "23.4" "21.8" "23.5" "23.1" "23.8" 
[261] "24.1" "25.1" "23.3" "22.3" "22.3" 
[266] "20.5" "21.3" "22.6" "22.1" "22.4" 
[271] "24.4" "22.9" "20.2" "21.7" "23.5" 
[276] "24.7" "21.2" "21.7" "25.7" "24.2" 
[281] "17.2" "21.3" "20.0" "20.3" "21.0" 
[286] "22.4" "20.3" "18.6" "18.1" "23.1" 
[291] "15.9" "16.8" "18.4" "17.5" "18.3" 
[296] "19.4" "19.0" "16.1" "14.3" "16.7" 
[301] "18.3" "20.1" "19.3" "15.8" "16.4" 
[306] "16.8" "18.4" "16.8" "19.5" "18.0" 
[311] "16.2" "15.2" "17.7" "17.7" "14.1" 
[316] "14.9" "17.8" "14.5" "14.1" "15.0" 
[321] "12.8" "12.7" "12.5" "12.1" "12.9" 
[326] "12.1" "9.1" "12.3" "13.8" "14.1" 
[331] "13.2" "11.1" "9.2" "12.9" "14.1" 
[336] "13.8" "15.1" "13.5" "10.8" "8.2" 
[341] "8.8" "8.3" "5.1" "6.2" "6.6" 
[346] "7.3" "7.3" "9.6" "9.3" "7.0" 
[351] "4.8" "5.3" "3.6" "4.3" "4.0" 
[356] "5.4" "5.3" "8.6" "7.1" "6.3" 
[361] "5.9" "6.3" "5.1" "4.6" "3.8" 
[366] "4.2" "6.6" "8.0"
str() shows which name selects which item.
> str(dataset)
'data.frame': 368 obs. of  101 variables:
 $ X        : chr  "年月日" "" "" "2014/1/1" ...
 $ 東京     : chr  "平均気温(℃)" "" "" "9.6" ...
 $ 東京.1   : chr  "平均気温(℃)" "" "品質情報" "8" ...
 $ 東京.2   : chr  "平均気温(℃)" "" "均質番号" "1" ...
 $ 東京.3   : chr  "最高気温(℃)" "" "" "15.5" ...
 $ 東京.4   : chr  "最高気温(℃)" "" "品質情報" "8" ...
 $ 東京.5   : chr  "最高気温(℃)" "" "均質番号" "1" ...
 $ 東京.6   : chr  "最低気温(℃)" "" "" "3.1" ...
 $ 東京.7   : chr  "最低気温(℃)" "" "品質情報" "8" ...
 $ 東京.8   : chr  "最低気温(℃)" "" "均質番号" "1" ...
 $ 東京.9   : chr  "降水量の合計(mm)" "" "" "0" ...
 $ 東京.10  : chr  "降水量の合計(mm)" "" "現象なし情報" "1" ...
 $ 東京.11  : chr  "降水量の合計(mm)" "" "品質情報" "8" ...
 $ 東京.12  : chr  "降水量の合計(mm)" "" "均質番号" "1" ...
 $ 東京.13  : chr  "10分間降水量の最大(mm)" "" "" "0" ...
 $ 東京.14  : chr  "10分間降水量の最大(mm)" "" "現象なし情報" "1" ...
 $ 東京.15  : chr  "10分間降水量の最大(mm)" "" "品質情報" "8" ...
 $ 東京.16  : chr  "10分間降水量の最大(mm)" "" "均質番号" "1" ...
 $ 東京.17  : chr  "平均海面気圧(hPa)" "" "" "1005.7" ...
 $ 東京.18  : chr  "平均海面気圧(hPa)" "" "品質情報" "8" ...
 $ 東京.19  : chr  "平均海面気圧(hPa)" "" "均質番号" "1" ...
 $ 大阪     : chr  "平均気温(℃)" "" "" "9.9" ...
 $ 大阪.1   : chr  "平均気温(℃)" "" "品質情報" "8" ...
 $ 大阪.2   : chr  "平均気温(℃)" "" "均質番号" "1" ...
 $ 大阪.3   : chr  "最高気温(℃)" "" "" "13.0" ...
 $ 大阪.4   : chr  "最高気温(℃)" "" "品質情報" "8" ...
 $ 大阪.5   : chr  "最高気温(℃)" "" "均質番号" "1" ...
 $ 大阪.6   : chr  "最低気温(℃)" "" "" "6.2" ...
 $ 大阪.7   : chr  "最低気温(℃)" "" "品質情報" "8" ...
 $ 大阪.8   : chr  "最低気温(℃)" "" "均質番号" "1" ...
 $ 大阪.9   : chr  "降水量の合計(mm)" "" "" "0.0" ...
 $ 大阪.10  : chr  "降水量の合計(mm)" "" "現象なし情報" "0" ...
 $ 大阪.11  : chr  "降水量の合計(mm)" "" "品質情報" "8" ...
 $ 大阪.12  : chr  "降水量の合計(mm)" "" "均質番号" "1" ...
 $ 大阪.13  : chr  "10分間降水量の最大(mm)" "" "" "0.0" ...
 $ 大阪.14  : chr  "10分間降水量の最大(mm)" "" "現象なし情報" "0" ...
 $ 大阪.15  : chr  "10分間降水量の最大(mm)" "" "品質情報" "8" ...
 $ 大阪.16  : chr  "10分間降水量の最大(mm)" "" "均質番号" "1" ...
 $ 大阪.17  : chr  "平均海面気圧(hPa)" "" "" "1012.2" ...
 $ 大阪.18  : chr  "平均海面気圧(hPa)" "" "品質情報" "8" ...
 $ 大阪.19  : chr  "平均海面気圧(hPa)" "" "均質番号" "1" ...
 $ 仙台     : chr  "平均気温(℃)" "" "" "5.4" ...
 $ 仙台.1   : chr  "平均気温(℃)" "" "品質情報" "8" ...
 $ 仙台.2   : chr  "平均気温(℃)" "" "均質番号" "1" ...
 $ 仙台.3   : chr  "最高気温(℃)" "" "" "9.9" ...
 $ 仙台.4   : chr  "最高気温(℃)" "" "品質情報" "8" ...
 $ 仙台.5   : chr  "最高気温(℃)" "" "均質番号" "1" ...
 $ 仙台.6   : chr  "最低気温(℃)" "" "" "1.2" ...
 $ 仙台.7   : chr  "最低気温(℃)" "" "品質情報" "8" ...
 $ 仙台.8   : chr  "最低気温(℃)" "" "均質番号" "1" ...
 $ 仙台.9   : chr  "降水量の合計(mm)" "" "" "5.0" ...
 $ 仙台.10  : chr  "降水量の合計(mm)" "" "現象なし情報" "0" ...
 $ 仙台.11  : chr  "降水量の合計(mm)" "" "品質情報" "8" ...
 $ 仙台.12  : chr  "降水量の合計(mm)" "" "均質番号" "1" ...
 $ 仙台.13  : chr  "10分間降水量の最大(mm)" "" "" "1.5" ...
 $ 仙台.14  : chr  "10分間降水量の最大(mm)" "" "現象なし情報" "0" ...
 $ 仙台.15  : chr  "10分間降水量の最大(mm)" "" "品質情報" "8" ...
 $ 仙台.16  : chr  "10分間降水量の最大(mm)" "" "均質番号" "1" ...
 $ 仙台.17  : chr  "平均海面気圧(hPa)" "" "" "1003.7" ...
 $ 仙台.18  : chr  "平均海面気圧(hPa)" "" "品質情報" "8" ...
 $ 仙台.19  : chr  "平均海面気圧(hPa)" "" "均質番号" "1" ...
 $ 名古屋   : chr  "平均気温(℃)" "" "" "6.1" ...
 $ 名古屋.1 : chr  "平均気温(℃)" "" "品質情報" "8" ...
 $ 名古屋.2 : chr  "平均気温(℃)" "" "均質番号" "1" ...
 $ 名古屋.3 : chr  "最高気温(℃)" "" "" "12.1" ...
 $ 名古屋.4 : chr  "最高気温(℃)" "" "品質情報" "8" ...
 $ 名古屋.5 : chr  "最高気温(℃)" "" "均質番号" "1" ...
 $ 名古屋.6 : chr  "最低気温(℃)" "" "" "1.8" ...
 $ 名古屋.7 : chr  "最低気温(℃)" "" "品質情報" "8" ...
 $ 名古屋.8 : chr  "最低気温(℃)" "" "均質番号" "1" ...
 $ 名古屋.9 : chr  "降水量の合計(mm)" "" "" "0.0" ...
 $ 名古屋.10: chr  "降水量の合計(mm)" "" "現象なし情報" "0" ...
 $ 名古屋.11: chr  "降水量の合計(mm)" "" "品質情報" "8" ...
 $ 名古屋.12: chr  "降水量の合計(mm)" "" "均質番号" "1" ...
 $ 名古屋.13: chr  "10分間降水量の最大(mm)" "" "" "0.0" ...
 $ 名古屋.14: chr  "10分間降水量の最大(mm)" "" "現象なし情報" "0" ...
 $ 名古屋.15: chr  "10分間降水量の最大(mm)" "" "品質情報" "8" ...
 $ 名古屋.16: chr  "10分間降水量の最大(mm)" "" "均質番号" "1" ...
 $ 名古屋.17: chr  "平均海面気圧(hPa)" "" "" "1011.4" ...
 $ 名古屋.18: chr  "平均海面気圧(hPa)" "" "品質情報" "8" ...
 $ 名古屋.19: chr  "平均海面気圧(hPa)" "" "均質番号" "1" ...
 $ 福岡     : chr  "平均気温(℃)" "" "" "10.5" ...
 $ 福岡.1   : chr  "平均気温(℃)" "" "品質情報" "8" ...
 $ 福岡.2   : chr  "平均気温(℃)" "" "均質番号" "1" ...
 $ 福岡.3   : chr  "最高気温(℃)" "" "" "12.6" ...
 $ 福岡.4   : chr  "最高気温(℃)" "" "品質情報" "8" ...
 $ 福岡.5   : chr  "最高気温(℃)" "" "均質番号" "1" ...
 $ 福岡.6   : chr  "最低気温(℃)" "" "" "6.5" ...
 $ 福岡.7   : chr  "最低気温(℃)" "" "品質情報" "8" ...
 $ 福岡.8   : chr  "最低気温(℃)" "" "均質番号" "1" ...
 $ 福岡.9   : chr  "降水量の合計(mm)" "" "" "0.0" ...
 $ 福岡.10  : chr  "降水量の合計(mm)" "" "現象なし情報" "0" ...
 $ 福岡.11  : chr  "降水量の合計(mm)" "" "品質情報" "8" ...
 $ 福岡.12  : chr  "降水量の合計(mm)" "" "均質番号" "1" ...
 $ 福岡.13  : chr  "10分間降水量の最大(mm)" "" "" "0.0" ...
 $ 福岡.14  : chr  "10分間降水量の最大(mm)" "" "現象なし情報" "0" ...
 $ 福岡.15  : chr  "10分間降水量の最大(mm)" "" "品質情報" "8" ...
 $ 福岡.16  : chr  "10分間降水量の最大(mm)" "" "均質番号" "1" ...
 $ 福岡.17  : chr  "平均海面気圧(hPa)" "" "" "1016.2" ...
  [list output truncated]
>
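Reading the str() listing by eye works, but the same mapping can probably be built programmatically from the first and third rows. This is a rough sketch on a miniature frame with the same layout. The ASCII station and item names (Tokyo, mean_temp, quality) are stand-ins I made up so the example runs anywhere. The real file uses the Japanese labels shown above.

```r
# Miniature copy of the layout: station header, item row, empty row, flag row, data.
raw <- read.csv(text = paste(
  ",Tokyo,Tokyo,Tokyo,Tokyo,Osaka,Osaka",
  "date,mean_temp,mean_temp,max_temp,max_temp,mean_temp,mean_temp",
  ",,,,,,",
  ",,quality,,quality,,quality",
  "2014/1/1,9.6,8,15.5,8,9.9,8",
  sep = "\n"), stringsAsFactors = FALSE)

items   <- unlist(raw[1, ], use.names = FALSE)  # item label for each column
flags   <- unlist(raw[3, ], use.names = FALSE)  # "" marks a measured-value column
station <- sub("\\.[0-9]+$", "", names(raw))    # drop the .1, .2 suffixes R added

lookup <- data.frame(column = names(raw), station = station, item = items,
                     is_value = flags == "", stringsAsFactors = FALSE)

# Keep only the measured-value columns, leaving out the date column X.
value_cols <- lookup[lookup$is_value & lookup$column != "X", ]
print(value_cols)
```

With a table like value_cols, picking "mean temperature for Osaka" becomes a filter on station and item instead of a search through the listing.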
The actual data starts at the fourth element, 9.6, so let's display from there.
> dataset$東京[4:368]
 [1] "9.6" "7.3" "5.9" "6.5" "5.4" "5.3" "5.5" "7.3" "7.2" "3.6" 
 [11] "4.3" "5.3" "5.1" "4.2" "3.0" "4.8" "5.8" "5.2" "4.2" "4.8" 
 [21] "6.3" "5.5" "6.5" "7.2" "8.4" "9.5" "4.6" "8.8" "8.9" "10.3"
 [31] "9.9" "7.9" "9.3" "12.8" "5.0" "1.6" "2.7" "4.0" "0.3" "5.4" 
 [41] "6.0" "3.3" "4.8" "5.3" "1.4" "3.9" "6.9" "7.3" "4.7" "4.7" 
 [51] "4.6" "5.5" "5.1" "5.6" "6.3" "8.2" "10.0" "10.0" "13.8" "8.9" 
 [61] "5.6" "6.6" "6.9" "6.5" "5.4" "4.4" "5.9" "6.4" "4.9" "6.1" 
 [71] "11.0" "12.3" "10.1" "8.3" "10.8" "12.1" "15.4" "11.2" "8.1" "10.1"
 [81] "9.3" "11.6" "13.6" "16.6" "16.4" "12.6" "14.8" "17.7" "16.5" "15.7"
 [91] "13.9" "15.2" "13.8" "15.3" "11.4" "9.0" "11.4" "15.8" "15.5" "16.1"
[101] "12.9" "14.6" "14.4" "14.0" "16.3" "18.7" "17.7" "11.8" "11.7" "11.0"
[111] "13.4" "15.2" "16.2" "16.6" "17.8" "18.2" "18.1" "18.6" "17.8" "16.7"
[121] "20.6" "21.0" "21.3" "18.6" "18.1" "14.3" "16.5" "18.7" "19.3" "19.8"
[131] "20.6" "19.6" "19.3" "21.9" "19.9" "21.5" "21.1" "20.5" "21.1" "21.7"
[141] "17.5" "18.6" "17.8" "20.6" "22.7" "21.3" "21.4" "22.7" "23.1" "24.1"
[151] "25.6" "26.6" "25.5" "24.0" "24.3" "21.2" "19.9" "18.4" "20.5" "22.9"
[161] "23.5" "22.2" "22.0" "24.9" "24.7" "24.4" "25.0" "24.3" "23.2" "24.8"
[171] "24.6" "24.6" "22.5" "24.2" "22.6" "22.6" "23.6" "23.9" "22.8" "23.3"
[181] "23.6" "25.3" "25.8" "24.8" "21.6" "22.0" "24.2" "23.4" "26.8" "24.2"
[191] "26.6" "28.5" "29.2" "26.9" "28.5" "28.2" "28.0" "27.1" "24.9" "23.8"
[201] "24.5" "25.0" "27.7" "29.4" "29.9" "30.5" "30.5" "30.1" "27.7" "27.6"
[211] "27.9" "29.5" "29.8" "30.9" "31.0" "30.5" "31.1" "30.8" "29.7" "29.0"
[221] "25.4" "26.3" "29.4" "26.3" "26.7" "27.1" "29.5" "27.0" "26.2" "29.1"
[231] "30.3" "31.1" "30.5" "30.2" "27.6" "28.4" "27.0" "25.2" "21.2" "21.1"
[241] "23.3" "22.6" "23.2" "21.8" "24.5" "24.1" "25.2" "27.8" "27.3" "21.7"
[251] "22.1" "23.5" "23.4" "21.8" "23.5" "23.1" "23.8" "24.1" "25.1" "23.3"
[261] "22.3" "22.3" "20.5" "21.3" "22.6" "22.1" "22.4" "24.4" "22.9" "20.2"
[271] "21.7" "23.5" "24.7" "21.2" "21.7" "25.7" "24.2" "17.2" "21.3" "20.0"
[281] "20.3" "21.0" "22.4" "20.3" "18.6" "18.1" "23.1" "15.9" "16.8" "18.4"
[291] "17.5" "18.3" "19.4" "19.0" "16.1" "14.3" "16.7" "18.3" "20.1" "19.3"
[301] "15.8" "16.4" "16.8" "18.4" "16.8" "19.5" "18.0" "16.2" "15.2" "17.7"
[311] "17.7" "14.1" "14.9" "17.8" "14.5" "14.1" "15.0" "12.8" "12.7" "12.5"
[321] "12.1" "12.9" "12.1" "9.1" "12.3" "13.8" "14.1" "13.2" "11.1" "9.2" 
[331] "12.9" "14.1" "13.8" "15.1" "13.5" "10.8" "8.2" "8.8" "8.3" "5.1" 
[341] "6.2" "6.6" "7.3" "7.3" "9.6" "9.3" "7.0" "4.8" "5.3" "3.6" 
[351] "4.3" "4.0" "5.4" "5.3" "8.6" "7.1" "6.3" "5.9" "6.3" "5.1" 
[361] "4.6" "3.8" "4.2" "6.6" "8.0" 
>
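Typing 4:368 every time is a little fragile, since the upper bound depends on the number of days downloaded. One alternative, sketched below on a made-up three-day frame with ASCII stand-in names, is to drop the three metadata rows with negative indexing and build a small, properly typed data frame once.

```r
# Hypothetical miniature of the loaded frame (three metadata rows, then data).
raw <- read.csv(text = paste(
  ",Tokyo,Tokyo,Tokyo",
  "date,mean_temp,mean_temp,mean_temp",
  ",,,",
  ",,quality,homogeneity",
  "2014/1/1,9.6,8,1",
  "2014/1/2,7.3,8,1",
  "2014/1/3,5.9,8,1",
  sep = "\n"), stringsAsFactors = FALSE)

# Negative indexing removes rows 1 to 3 without needing to know the last row.
body <- raw[-(1:3), ]

# Convert each column to its real type once, up front.
clean <- data.frame(
  date      = as.Date(body$X),        # as.Date() accepts the 2014/1/1 form
  mean_temp = as.numeric(body$Tokyo)
)
str(clean)
```

After this, clean$mean_temp can be passed to summary(), plot(), or hist() without further conversion.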
Next, put this into another object and plot it.
> TokyoDegree <- as.numeric(dataset$東京[4:368])
> summary(TokyoDegree)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.30    9.10   17.70   16.64   23.30   31.10 
> plot(TokyoDegree,type="l")
This draws the following line graph:
Screenshot-3
With a bit more effort, let's add dates and a title.
> dataset <- read.csv("kishou2014-utf.csv",stringsAsFactors=FALSE,skip=2)
> day <- as.Date(dataset$X[4:368])
> deg <- dataset$東京[4:368]
> plot(day,deg,main="2014年 東京",xlab="日付",ylab="平均気温",type="l")
This gives:
Screenshot-4
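The same idea should extend to comparing two stations on one graph with lines() and legend(). The sketch below uses synthetic sine-wave temperatures, not the downloaded values, so that it runs on its own. It also writes to a temporary PDF rather than the screen, which is my choice for a non-interactive run.

```r
# One date per day of 2014.
day <- seq(as.Date("2014-01-01"), as.Date("2014-12-31"), by = "day")
doy <- as.numeric(format(day, "%j"))  # day of year, 1 to 365

# Synthetic stand-ins for two stations (NOT real measurements).
tokyo  <- 16 + 11 * sin(2 * pi * (doy - 124) / 365)
sendai <- 12 + 11 * sin(2 * pi * (doy - 124) / 365)

out <- tempfile(fileext = ".pdf")
pdf(out)  # in an interactive session this line and dev.off() can be dropped
plot(day, tokyo, type = "l", ylim = range(c(tokyo, sendai)),
     main = "2014 (synthetic data)", xlab = "Date", ylab = "Mean temperature")
lines(day, sendai, lty = 2)  # add the second series to the existing plot
legend("topleft", legend = c("Tokyo", "Sendai"), lty = 1:2)
dev.off()
```

With the real data, tokyo and sendai would be replaced by the numeric versions of the corresponding columns.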
Continuing in the same session, things like this are easy too.
> hist(as.numeric(deg),breaks=seq(-2,35,1),xlab="平均気温",ylab="日数",main="2014年東京の平均気温別日数")
Screenshot-5
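The numbers behind a histogram are also available without drawing it. As a small sketch, here are the first ten Tokyo values from the listing above, binned with the same one-degree breaks. plot = FALSE makes hist() return the counts instead of plotting.

```r
# First ten daily mean temperatures for Tokyo, copied from the output above.
first10 <- c(9.6, 7.3, 5.9, 6.5, 5.4, 5.3, 5.5, 7.3, 7.2, 3.6)

# Same breaks as the plot: one-degree bins from -2 to 35.
h <- hist(first10, breaks = seq(-2, 35, 1), plot = FALSE)

# Show only the non-empty bins; each bin is (lower, upper].
nonempty <- h$counts > 0
print(data.frame(lower = h$breaks[-length(h$breaks)][nonempty],
                 upper = h$breaks[-1][nonempty],
                 days  = h$counts[nonempty]))
```

Note that values outside the range of breaks make hist() stop with an error, so the -2 to 35 range has to cover every value in the data.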
In the same way,
> plot(as.Date(dataset$X[4:368]),dataset$仙台.17[4:368],main="2014年 仙台",xlab="日付",ylab="平均海面気圧(hPa)",type="l")
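gives
Screenshot-6
which is a graph of Sendai's mean sea-level pressure for 2014.
There is little doubt this becomes a powerful tool once mastered... but getting there may take some work.
(Still, once learned, it should be much faster than clicking through Excel every time...)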
