1. Government Open Data Platform
2. Taipei City Government Open Data Platform
3. Kaohsiung City Government Open Data Platform
4. Yilan County Government Open Data Platform
5. Tainan City Government Open Data Platform
6. New Taipei City Government Open Data Platform
First-Generation UberOS Data Fighter, Model C (Ubuntu Server Edition)
This model is the first to carry the Spark active phased-array radar and the Apache Tez turbocharger, lifting runtime performance a full class; the fighter's electronic-warfare suite has also been upgraded to Hadoop 2.7.2. The fighter's structural diagram is shown below:
System requirements:
1. A dual-core CPU
2. At least 8 GB of RAM
3. A 64-bit Microsoft operating system (Windows 7/8/10)
4. VMware Workstation 12.1.1 Player (free software)
The UberOS Data Fighter download URL is:
https://docs.google.com/uc?id=0ByAESZ_C1fg-bFZjdGVQQnkyQ0E&export=download
Operation manual:
1. Getting to know the cloud UberOS Data Fighter (UberOS271.zip)
2. The Docker engine (containers)
3. The Data Fighter's maiden flight (the Hadoop cluster)
4. The Hadoop electronic-warfare suite (HDFS, YARN)
5. The ETL radar (Pig)
First-Generation UberDL Aircraft Carrier (Ubuntu 16.04 Desktop + Docker)
Download URL (updated 2017/03/01):
https://docs.google.com/uc?id=0ByAESZ_C1fg-SDc4eXZhMWNKVTg&export=download
1. 2017/03/01: fixed the PuTTY connection failure (the outbound network interface was misconfigured)
Building and Planning the Hadoop Core System
Open a terminal and run the following steps:
1. Create all the container hosts of Hadoop cluster A
$ dkcreate a
yes/no : yes
cla00 created : dsa100 dsb100 dsc100
cla01 created : dsa101 dsb101 dsc101
hbma created
nginx created
nna created
rma created
spkma created
wka01 created
wka02 created
zka01 created
2. Inspect the architecture of Hadoop cluster A
$ dkls a
Docker Utility 0.6.0 (2017/02/01)
[A Cluster]
--------------------------------------------------------------------------------------------
zka01(a29d40f5af01) 172.17.6.30 Exited ()
wka02(8c441c83f381) 172.17.8.11 Exited ()
wka01(a7ad549704d2) 172.17.8.10 Exited ()
spkma(4d9d1e69181d) 172.17.6.20 Exited ()
rma(8dff19f0d4fe) 172.17.6.12 Exited ()
nna(968ee9420d20) 172.17.6.10 Exited ()
nginx(dc35fb5f597a) 172.17.7.20 Exited (CVBG:80->nginx:80)
hbma(90effcbad3cf) 172.17.6.30 Exited ()
cla01(17c404486d66) 172.17.2.11 Exited (CVBG:22101->cla01:22, user:dsa101 dsb101 dsc101)
cla00(08fb31ee111b) 172.17.2.10 Exited (CVBG:22100->cla00:22, user:dsa100 dsb100 dsc100)
[Docker Images]
--------------------------------------------------------------------------------------------
dafu/worker 16.04 523546e962f0 3 weeks ago 1.92 GB
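dkls is a utility bundled with the image; to cross-check its output against Docker itself, the same containers and images can be listed directly (a sketch, assuming the docker CLI is available to your host user):
$ docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}'
$ docker images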
3. Format the HDFS of Hadoop cluster A
$ formathdfs a jedi
format (yes/no) yes
Name Node (nna) format ok
Secondary NameNode (nna) ok
DataNode (wka01) ok
DataNode (wka02) ok
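formathdfs is likewise a wrapper; a reasonable guess is that it runs the stock NameNode format on nna, with the second argument (jedi) used as the cluster id. A sketch of the assumed equivalent:
$ hdfs namenode -format -clusterid jedi -nonInteractive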
4. Create the Data Lake of Hadoop cluster A
$ createdlka
start HDFS ok
[create HDFS schema]
/elt (bigred:bigdata,750)
/dataset (bigred:bigdata,750)
/app (bigred:bigdata,750)
/metadata (bigred:bigdata,750)
/tmp (777)
[create Hadoop users]
(HDFS) /user dir created
(nna) bigdata group created
(nna) dsa100 -> /user/dsa100 created
(nna) dsb100 -> /user/dsb100 created
(nna) dsc100 -> /user/dsc100 created
(nna) dsa101 -> /user/dsa101 created
(nna) dsb101 -> /user/dsb101 created
(nna) dsc101 -> /user/dsc101 created
(nna) dsa150 -> /user/dsa150 created
(nna) dsa151 -> /user/dsa151 created
stop HDFS ok
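The schema that createdlka builds can be reproduced with stock HDFS commands; a minimal sketch for one directory, assuming HDFS is up and you act as the HDFS superuser:
$ hdfs dfs -mkdir -p /dataset
$ hdfs dfs -chown bigred:bigdata /dataset   # owner:group as listed above
$ hdfs dfs -chmod 750 /dataset              # permissions as listed above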
Starting Big Data Analysis
Open a terminal and run the following steps:
1. Start the Hadoop core system
$ starthd a
[Cluster A]
start Application Container ok
start HDFS ok
start YARN ok
2. View the Hadoop core system status
$ dkls a
Docker Utility 0.6.0 (2017/02/01)
[A Cluster]
--------------------------------------------------------------------------------------------
zka01(e335072afdbf) 172.17.6.30 Running ()
wka02(ecc0eb4a695b) 172.17.8.11 Running ( NodeManager DataNode )
wka01(7d1c16a590a6) 172.17.8.10 Running ( NodeManager DataNode )
spkma(92b188924db1) 172.17.6.20 Running ()
rma(3e2dec0dd023) 172.17.6.12 Running ( ResourceManager JobHistoryServer )
nna(560485ebef6f) 172.17.6.10 Running ( NameNode SecondaryNameNode )
nginx(0d07f7419af6) 172.17.7.20 Running (CVN79:80->nginx:80)
hbma(76eaa30d57c6) 172.17.6.30 Running ()
cla01(2e2416304852) 172.17.2.11 Running (CVN79:22101->cla01:22, user:dsa101 dsb101 dsc101)
cla00(cfe443a20ab7) 172.17.2.10 Running (CVN79:22100->cla00:22, user:dsa100 dsb100 dsc100)
[Docker Images]
--------------------------------------------------------------------------------------------
dafu/worker 16.04 8a5745347502 12 days ago 1.93 GB
3. Verify that HDFS is running
$ hdfs dfsadmin -printTopology
Rack: /default-rack
172.17.8.10:50010 (wka01)
172.17.8.11:50010 (wka02)
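For a fuller picture than the rack topology, the stock admin report adds per-DataNode capacity and usage:
$ hdfs dfsadmin -report
$ hdfs dfs -df -h /    # one-line summary of total, used, and free space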
4. Verify that YARN is running
$ yarn node -list -all
17/01/01 19:08:26 INFO client.RMProxy: Connecting to ResourceManager at rma/172.17.6.12:8032
17/01/01 19:08:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Total Nodes:2
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
wka01:44971 RUNNING wka01:8042 0
wka02:36045 RUNNING wka02:8042 0
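Two related stock commands are useful here: drilling into one node (node id taken from the listing above) and listing the cluster's applications:
$ yarn node -status wka01:44971
$ yarn application -list -appStates ALL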
5. Log in to the Hadoop client host
$ dslogin dsa101
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-62-generic x86_64)
* Documentation: https://help.ubuntu.com/
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
build derby database ... ok
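dslogin is another bundled wrapper; given the port mapping shown by dkls (CVN79:22101->cla01:22), it presumably opens SSH through the published port. A hypothetical equivalent, with uberos.local standing in for the VM's address:
$ ssh -p 22101 dsa101@uberos.local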
6. Run a MapReduce job
$ hadoop jar /opt/hadoop-2.8.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar pi 2 10000
:::
Job Finished in 85.291 seconds
Estimated value of Pi is 3.14280000000000000000
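The same examples jar ships other jobs; wordcount is the classic second test. A sketch that borrows the customer.csv sample used in step 7 as input:
$ hdfs dfs -put /opt/dataset/customer.csv wcin.csv
$ hadoop jar /opt/hadoop-2.8.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount wcin.csv wcout
$ hdfs dfs -cat wcout/part-r-00000   # each line: word <TAB> count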
7. Use the Pig data analysis tool
$ hdfs dfs -put /opt/dataset/customer.csv
$ pig -e 'ls' 2>/dev/null
hdfs://nna:8020/user/dsa101/customer.csv<r 2> 695
$ pig 2>/dev/null
grunt> a = load 'customer.csv' using PigStorage(',');
grunt> store a into 'customer' using PigStorage(',');
grunt> ls
hdfs://nna:8020/user/dsa101/customer <dir>
hdfs://nna:8020/user/dsa101/customer.csv<r 2> 695
grunt> quit
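The load above leaves every field untyped; declaring a schema lets you filter and group. A short sketch, assuming the column layout that the Hive table in step 8 declares:
$ pig 2>/dev/null
grunt> c = load 'customer.csv' using PigStorage(',') as (cid:chararray, name:chararray, fname:chararray, age:int, occupation:chararray);
grunt> teachers = filter c by occupation == 'Teacher';
grunt> dump teachers;
grunt> quit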
8. Use the Hive data warehouse tool
$ nano customer.sql
CREATE EXTERNAL TABLE CUSTOMER (
cid string,
name string,
fname string,
age int,
occupation string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE LOCATION '/user/dsa101/customer';
$ hive -S -f customer.sql 2>/dev/null
$ hive -S -e 'select * from CUSTOMER limit 2' 2>/dev/null
4000001 Kristina Chung 55 Pilot
4000002 Paige Chen 74 Teacher
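Because CUSTOMER is an external table over the directory Pig wrote, any HiveQL works against it; for example, an aggregate (output omitted):
$ hive -S -e 'SELECT occupation, COUNT(*) AS n FROM CUSTOMER GROUP BY occupation' 2>/dev/null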
9. Leave the Hadoop client host
$ exit
logout
Connection to cla01 closed.
Getting Started with the Big Data Database HBase
The Hadoop core system must be running first (starthd a).
1. Start the HBase system
$ starthba
starting master, logging to /tmp/hbase-bigred-master-zka01.out
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
starting regionserver, logging to /tmp/hbase-bigred-regionserver-wka01.out
starting regionserver, logging to /tmp/hbase-bigred-regionserver-wka02.out
2. Check the HBase system
$ hbase hbck -metaonly 2>/dev/null
HBaseFsck command line options: -metaonly
Version: 1.2.4
Number of live region servers: 2
Number of dead region servers: 0
Master: zka01,16000,1486623999995
Number of backup masters: 0
Average load: 1.0
Number of requests: 0
Number of regions: 2
Number of regions in transition: 0
Number of empty REGIONINFO_QUALIFIER rows in hbase:meta: 0
Summary:
Table hbase:meta is okay.
Number of regions: 1
Deployed on: wka02,16020,1486624008071
0 inconsistencies detected.
Status: OK
3. Create the customers table
$ echo "create 'customers', 'customers_data'" | hbase shell -n 2>/dev/null
0 row(s) in 5.7480 seconds
Hbase::Table - customers
4. Count the rows in the customers table
$ echo "count 'customers'" | hbase shell -n 2>/dev/null
0 row(s) in 1.3280 seconds
0
5. Drop the customers table
$ echo "disable 'customers'" | hbase shell -n 2>/dev/null
0 row(s) in 3.6750 seconds
nil
$ echo "drop 'customers'" | hbase shell -n 2>/dev/null
0 row(s) in 2.8300 seconds
nil
$ echo "list" | hbase shell -n 2>/dev/null
TABLE
0 row(s) in 1.2070 seconds
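For completeness, individual cells are read and written with put, get, and scan; a sketch that assumes the customers table from step 3 still exists (run it before step 5):
$ echo "put 'customers', '4000001', 'customers_data:name', 'Kristina'" | hbase shell -n 2>/dev/null
$ echo "get 'customers', '4000001'" | hbase shell -n 2>/dev/null
$ echo "scan 'customers'" | hbase shell -n 2>/dev/null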
6. Shut down the HBase system
$ stophba
stopping master.
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
stopping regionserver.............................................
stopping regionserver...................
7. Shut down the Hadoop core system
$ stophd a
[Cluster A]
stop YARN ok
stop HDFS ok
stop Application Container ok
Hadoop Development Special Ops Pack (Windows Edition) Manual
The download URL is:
https://docs.google.com/uc?id=0ByAESZ_C1fg-T05xd1ZZbXdjZGc&export=download
1. Installing and configuring the tools
2. Connecting to the UberOS Data Fighter, Model C
3. Using the Eclipse development tool