Showing posts with the label Opendata.

Thursday, December 28, 2017

Pig Latin, Part One


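Pig Latin is an English language game in which words are altered by a few simple rules so that they sound different. It is said to have been invented by British prisoners of war in Germany to confuse their German guards. It reportedly peaked in Liverpool, England, in the 1950s and 1960s, when people of all ages and occupations used it. Pig Latin is mostly used by children to talk in secret in front of adults, and sometimes just for fun. Although the game originated in English, its rules can be applied to many other languages.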


Pig Latin Basic Data Types
Int: An integer. Ints are represented in interfaces by java.lang.Integer. They store a four-byte signed integer. Constant integers are expressed as integer numbers, for example 12.
Long: A long integer. Longs are represented in interfaces by java.lang.Long. They store an eight-byte signed integer. Constants are expressed as integer numbers with an L appended, for example 34L.
Float: A floating point number. Floats are represented in interfaces by java.lang.Float. They store a four-byte floating point number. Constants are represented as floating point numbers with f appended, for example 2.18f.
Double: A double precision floating point number. Doubles are represented in interfaces by java.lang.Double. They store an eight-byte floating point number. Constants are represented either as floating point numbers or in exponent notation, for example 32.12567 or 3e-17.
Chararray: A string or array of characters. Represented in interfaces by java.lang.String. Constant chararrays are written in single quotes, for example 'constant chararray'.
Bytearray: A blob or array of bytes. Represented by the Java class DataByteArray, which wraps a Java byte[]. There is no way to specify a bytearray constant.
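
The byte widths listed above can be checked with Python's standard struct module, which uses the same fixed sizes when a byte-order prefix is given. This is only an illustrative sketch in Python, not Pig code; the PIG_SCALAR_FORMATS table and the byte_width helper are names made up for this example.

```python
import struct

# Hypothetical lookup table: Pig scalar type -> struct format code.
# The ">" prefix selects standard (platform-independent) sizes.
PIG_SCALAR_FORMATS = {
    "int": ">i",     # four-byte signed integer
    "long": ">q",    # eight-byte signed integer
    "float": ">f",   # four-byte floating point
    "double": ">d",  # eight-byte floating point
}


def byte_width(pig_type):
    """Return the storage width in bytes for a Pig scalar type name."""
    return struct.calcsize(PIG_SCALAR_FORMATS[pig_type])


for name in ("int", "long", "float", "double"):
    print(name, byte_width(name))
```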


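Pig Command Types
Pig commands are called Pig Latin statements. Execution can be broken down into three simple steps:
1. Read the data with LOAD.
2. Apply a series of statements that manipulate the data.
3. Use DUMP to view the result or STORE to save it. Unless DUMP or STORE is executed, no MapReduce job is generated.
The statements can be further divided by type:
Loading : LOAD
Storing : STORE
Data processing : FILTER, FOREACH, GROUP, COGROUP, inner JOIN, outer JOIN, UNION, SPLIT, …
Aggregation : AVG, COUNT, MAX, MIN, SIZE, …
Math : ABS, RANDOM, ROUND, …
String handling : INDEXOF, SUBSTRING, REGEX_EXTRACT, …
Debug : DUMP, DESCRIBE, EXPLAIN, ILLUSTRATE
HDFS or local file operations : cat, ls, cp, mkdir, copyFromLocal, copyToLocal, …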
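
The point in step 3, that nothing runs until DUMP or STORE, can be imitated with Python generators. The sketch below is an analogy only and does not reflect how Pig is implemented; load, filter_by, dump and rows_read are hypothetical names, and the three sample rows are copied from the movie output shown in this post.

```python
rows_read = []  # records every row that the stand-in LOAD actually touches

SAMPLE = [
    ("1", "The Nightmare Before Christmas", "1993", "3.9", "4568"),
    ("139", "Pulp Fiction", "1994", "4.1", "9265"),
    ("465", "Forrest Gump", "1994", "4.3", "8525"),
]


def load(rows):
    """Stand-in for LOAD: a generator, so no row is read until requested."""
    for row in rows:
        rows_read.append(row)
        yield row


def filter_by(relation, predicate):
    """Stand-in for FILTER: also lazy."""
    return (row for row in relation if predicate(row))


def dump(relation):
    """Stand-in for DUMP: forces the whole pipeline to run."""
    return list(relation)


movies = load(SAMPLE)
high_rated = filter_by(movies, lambda row: float(row[3]) > 4.0)
read_before_dump = len(rows_read)  # nothing has been read at this point
result = dump(high_rated)
print(read_before_dump, len(rows_read), [row[1] for row in result])
```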


grunt> movies = LOAD 'movies_data.csv' USING PigStorage(',') as (id,name,year,rating,duration);
grunt> describe movies;
movies: {id: bytearray,name: bytearray,year: bytearray,rating: bytearray,duration: bytearray}
grunt> movies_greater_than_four = FILTER movies BY (float)rating>4.0;
2017-12-29 06:26:14,456 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_DOUBLE 1 time(s).
grunt> dump movies_greater_than_four;
2017-12-29 06:26:53,824 [main] WARN  org.apa...
:::

(48867,Alaska: The Last Frontier,2011,4.1,)
(48875,Brew Masters,2010,4.1,)
(49026,Cake Boss: Next Great Baker,2010,4.1,)
(49154,Gator Boys: Season 2,2012,4.1,)
(49194,Stephen Hawking's Grand Design: Season 2,2012,4.1,)
(49316,Aziz Ansari: Buried Alive (Trailer),2013,4.1,105)
(49327,Top Gear: Series 19,2013,4.2,)
(49383,Stephen Hawking's Grand Design,2012,4.1,)
(49486,Max Steel: Season 1,2013,4.1,)
(49504,Lilyhammer: Season 2 (Trailer),2013,4.5,106)
(49505,Life With Boys,2011,4.1,)
(49546,Bo Burnham: what.,2013,4.1,3614)
(49549,Life With Boys: Season 1,2011,4.1,)
(49554,Max Steel,2013,4.1,)
(49556,Lilyhammer: Season 1 (Recap),2013,4.2,194)
(49571,The Short Game (Trailer),2013,4.1,156)
(49579,Transformers Prime Beast Hunters: Predacons Rising,2013,4.2,3950)

grunt> store movies_greater_than_four into 'movies_greater_than_four.csv';

:::

Input(s):
Successfully read 49590 records (17341170 bytes) from: "hdfs://nn:8020/user/ubuntu/movies_data.csv"

Output(s):
Successfully stored 897 records (14483846 bytes) in: "hdfs://nn:8020/user/ubuntu/movies_greater_than_four.csv"

Counters:
Total records written : 897
Total bytes written : 14483846
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1980029370_0002


grunt> ls
:::
hdfs://nn:8020/user/ubuntu/QuasiMonteCarlo_1513152287029_119532194      <dir>
hdfs://nn:8020/user/ubuntu/movies_data.csv<r 3> 2893177
hdfs://nn:8020/user/ubuntu/movies_greater_than_four.csv <dir>
hdfs://nn:8020/user/ubuntu/school.txt<r 3>      20609
hdfs://nn:8020/user/ubuntu/sr   <dir>
hdfs://nn:8020/user/ubuntu/student<r 3> 105569


grunt> cat movies_greater_than_four.csv
139     Pulp Fiction    1994    4.1     9265
288     Life Is Beautiful       1997    4.2     6973
303     Mulan: Special Edition  1998    4.2     5270
465     Forrest Gump    1994    4.3     8525
491     Braveheart      1995    4.2     10658
591     White Christmas 1954    4.3     7201
673     Roman Holiday   1953    4.1     7087
690     The African Queen       1951    4.1     6312
955     The Boondock Saints     1999    4.1     6507
:::
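
The session above reads comma-separated input, keeps rows whose rating is above 4.0, and stores tab-separated output (PigStorage writes tab-delimited text by default, and the stored "file" is really a directory, as the ls output shows). A rough Python equivalent of the filter and the delimiter change might look like this; filter_by_rating and to_tsv are hypothetical helper names, and the sample rows are taken from the output shown in this post.

```python
import csv
import io

SAMPLE_CSV = """\
1,The Nightmare Before Christmas,1993,3.9,4568
139,Pulp Fiction,1994,4.1,9265
465,Forrest Gump,1994,4.3,8525
49327,Top Gear: Series 19,2013,4.2,
"""


def filter_by_rating(csv_text, threshold=4.0):
    """Keep rows whose rating (fourth field) is greater than threshold."""
    kept = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Every field arrives as text, so cast before comparing,
        # much as the (float) cast does in the FILTER statement.
        if float(row[3]) > threshold:
            kept.append(row)
    return kept


def to_tsv(rows):
    """Serialize rows as tab-separated lines."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


high_rated = filter_by_rating(SAMPLE_CSV)
print(to_tsv(high_rated), end="")
```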


Pig Latin Complex Data Types

Map: A map is a chararray-to-data-element mapping expressed as key-value pairs. The key must always be of type chararray and can be used as an index to access the associated value. The values in a map need not all be of the same type.

 ['Name'#'John', 'Age'#22]
Tuple: A tuple is a fixed-length, ordered collection of Pig data elements. Tuples contain fields, which may be of different Pig types. A tuple is analogous to a row in SQL, with the fields as columns.
('John', 25)

Bag: A bag is an unordered collection of tuples. Since bags are unordered, a tuple in a bag cannot be referenced by its position. A bag is not required to declare a schema; when one is declared, it describes all the tuples in the bag.
 {('John', 25), ('Nathan', 30)}
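
For readers more used to Python, the three complex types have rough counterparts in the built-in containers. This is an analogy rather than an exact mapping (a Python list is ordered, while a bag is not), and same_bag is a hypothetical helper that compares two bags while ignoring order.

```python
from collections import Counter

# Map: chararray keys to values of any type, like a dict with str keys
person_map = {"Name": "John", "Age": 22}

# Tuple: fixed-length, ordered fields of possibly different types
person_tuple = ("John", 25)

# Bag: a collection of tuples; a list is used here, but order carries no meaning
people_bag = [("John", 25), ("Nathan", 30)]


def same_bag(left, right):
    """Compare two bags while ignoring the order of their tuples."""
    return Counter(left) == Counter(right)


print(person_map["Name"])
print(person_tuple[1])
print(same_bag(people_bag, [("Nathan", 30), ("John", 25)]))
```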

Take 9 tuples
grunt> ten = limit movies 9;
grunt> dump ten;
:::
2017-12-29 06:39:35,375 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2017-12-29 06:39:35,376 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1,The Nightmare Before Christmas,1993,3.9,4568)
(2,The Mummy,1932,3.5,4388)
(3,Orphans of the Storm,1921,3.2,9062)
(4,The Object of Beauty,1991,2.8,6150)
(5,Night Tide,1963,2.8,5126)
(6,One Magic Christmas,1985,3.8,5333)
(7,Muriel's Wedding,1994,3.5,6323)
(8,Mother's Boys,1994,3.4,5733)
(9,Nosferatu: Original Version,1929,3.5,5651)

Transform the tuple format
grunt> ten_trans = foreach ten generate name,year,duration;
grunt> dump ten_trans;
:::
2017-12-29 06:42:27,933 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(The Nightmare Before Christmas,1993,4568)
(The Mummy,1932,4388)
(Orphans of the Storm,1921,9062)
(The Object of Beauty,1991,6150)
(Night Tide,1963,5126)
(One Magic Christmas,1985,5333)
(Muriel's Wedding,1994,6323)
(Mother's Boys,1994,5733)
(Nosferatu: Original Version,1929,5651)

Group tuples into bags
grunt> ten_group = group ten_trans by year;
grunt> dump ten_group;
:::
ine.util.MapRedUtil - Total input paths to process : 1
(1921,{(Orphans of the Storm,1921,9062)})
(1929,{(Nosferatu: Original Version,1929,5651)})
(1932,{(The Mummy,1932,4388)})
(1963,{(Night Tide,1963,5126)})
(1985,{(One Magic Christmas,1985,5333)})
(1991,{(The Object of Beauty,1991,6150)})
(1993,{(The Nightmare Before Christmas,1993,4568)})
(1994,{(Muriel's Wedding,1994,6323),(Mother's Boys,1994,5733)})   <- this bag holds two tuples
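
The effect of GROUP ... BY year can be sketched in Python with a dictionary of lists, where each list plays the role of a bag. The group_by helper is a hypothetical name, and the rows are the nine tuples from the ten_trans output above, with the year and duration written as integers for convenience.

```python
from collections import defaultdict

TEN_TRANS = [
    ("The Nightmare Before Christmas", 1993, 4568),
    ("The Mummy", 1932, 4388),
    ("Orphans of the Storm", 1921, 9062),
    ("The Object of Beauty", 1991, 6150),
    ("Night Tide", 1963, 5126),
    ("One Magic Christmas", 1985, 5333),
    ("Muriel's Wedding", 1994, 6323),
    ("Mother's Boys", 1994, 5733),
    ("Nosferatu: Original Version", 1929, 5651),
]


def group_by(rows, key_index):
    """Collect rows into one list (bag) per distinct key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return dict(groups)


ten_group = group_by(TEN_TRANS, 1)
for year in sorted(ten_group):
    print(year, ten_group[year])
```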


Sorting the data inside a bag
grunt> a = LOAD 'movies_data.csv' USING PigStorage(',');
grunt> b = limit a 20;
grunt> dump b;
:::
ne.util.MapRedUtil - Total input paths to process : 1
(1,The Nightmare Before Christmas,1993,3.9,4568)
(2,The Mummy,1932,3.5,4388)
(3,Orphans of the Storm,1921,3.2,9062)
(4,The Object of Beauty,1991,2.8,6150)
(5,Night Tide,1963,2.8,5126)
(6,One Magic Christmas,1985,3.8,5333)
(7,Muriel's Wedding,1994,3.5,6323)
(8,Mother's Boys,1994,3.4,5733)
(9,Nosferatu: Original Version,1929,3.5,5651)
(10,Nick of Time,1995,3.4,5333)
(11,Broken Blossoms,1919,3.3,5367)
(12,Big Night,1996,3.6,6561)
(13,The Birth of a Nation,1915,2.9,12118)
(14,The Boys from Brazil,1978,3.6,7417)
(15,Big Doll House,1971,2.9,5696)
(16,The Breakfast Club,1985,4.0,5823)
(17,The Bride of Frankenstein,1935,3.7,4485)
(18,Beautiful Girls,1996,3.5,6755)
(19,Bustin' Loose,1981,3.7,5598)
(20,The Beguiled,1971,3.4,6307)
grunt> c = group b by $2;
grunt> dump c;
:::
ne.util.MapRedUtil - Total input paths to process : 1
(1915,{(13,The Birth of a Nation,1915,2.9,12118)})
(1919,{(11,Broken Blossoms,1919,3.3,5367)})
(1921,{(3,Orphans of the Storm,1921,3.2,9062)})
(1929,{(9,Nosferatu: Original Version,1929,3.5,5651)})
(1932,{(2,The Mummy,1932,3.5,4388)})
(1935,{(17,The Bride of Frankenstein,1935,3.7,4485)})
(1963,{(5,Night Tide,1963,2.8,5126)})
(1971,{(15,Big Doll House,1971,2.9,5696),(20,The Beguiled,1971,3.4,6307)})
(1978,{(14,The Boys from Brazil,1978,3.6,7417)})
(1981,{(19,Bustin' Loose,1981,3.7,5598)})
(1985,{(16,The Breakfast Club,1985,4.0,5823),(6,One Magic Christmas,1985,3.8,5333)})
(1991,{(4,The Object of Beauty,1991,2.8,6150)})
(1993,{(1,The Nightmare Before Christmas,1993,3.9,4568)})
(1994,{(7,Muriel's Wedding,1994,3.5,6323),(8,Mother's Boys,1994,3.4,5733)})
(1995,{(10,Nick of Time,1995,3.4,5333)})
(1996,{(12,Big Night,1996,3.6,6561),(18,Beautiful Girls,1996,3.5,6755)})
grunt>

ubuntu@HDClient:~$ cat sortbag.pig
a = LOAD 'movies_data.csv' USING PigStorage(',');
b = limit a 20;
c = group b by $2;
d = FOREACH c {
    d1 = foreach b generate $1,$3,$4;
    d2 = order d1 by $1 desc; -- sort by rating ($1 of d1), descending
    generate group, d2;
}
dump d;
ubuntu@HDClient:~$ pig -f sortbag.pig
:::
2017-12-29 07:10:05,949 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1915,{(The Birth of a Nation,2.9,12118)})
(1919,{(Broken Blossoms,3.3,5367)})
(1921,{(Orphans of the Storm,3.2,9062)})
(1929,{(Nosferatu: Original Version,3.5,5651)})
(1932,{(The Mummy,3.5,4388)})
(1935,{(The Bride of Frankenstein,3.7,4485)})
(1963,{(Night Tide,2.8,5126)})
(1971,{(The Beguiled,3.4,6307),(Big Doll House,2.9,5696)})
(1978,{(The Boys from Brazil,3.6,7417)})
(1981,{(Bustin' Loose,3.7,5598)})
(1985,{(The Breakfast Club,4.0,5823),(One Magic Christmas,3.8,5333)})
(1991,{(The Object of Beauty,2.8,6150)})
(1993,{(The Nightmare Before Christmas,3.9,4568)})
(1994,{(Muriel's Wedding,3.5,6323),(Mother's Boys,3.4,5733)})
(1995,{(Nick of Time,3.4,5333)})
(1996,{(Big Night,3.6,6561),(Beautiful Girls,3.5,6755)})
2017-12-29 07:10:05,983 [main] INFO  org.apache.pig.Main - Pig script completed in 19 seconds and 633 milliseconds (19633 ms)
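
The nested FOREACH in sortbag.pig projects (name, rating, duration) out of each bag and then orders the bag by its second field, the rating, in descending order. A Python sketch of the same idea is given below; group_and_sort is a hypothetical name, ratings are written as floats for simplicity (the Pig script leaves them untyped), and the four rows come from the dump of b above.

```python
from collections import defaultdict

B = [
    (15, "Big Doll House", 1971, 2.9, 5696),
    (20, "The Beguiled", 1971, 3.4, 6307),
    (16, "The Breakfast Club", 1985, 4.0, 5823),
    (6, "One Magic Christmas", 1985, 3.8, 5333),
]


def group_and_sort(rows):
    """Group by year, keep (name, rating, duration), sort by rating descending."""
    groups = defaultdict(list)
    for _movie_id, name, year, rating, duration in rows:
        groups[year].append((name, rating, duration))
    return {
        year: sorted(bag, key=lambda item: item[1], reverse=True)
        for year, bag in groups.items()
    }


d = group_and_sort(B)
for year in sorted(d):
    print(year, d[year])
```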






Tuesday, November 21, 2017

Open Data PM2.5


Government Open Data Platform
https://data.gov.tw

Open Data EPA
https://opendata.epa.gov.tw

Open data file types: CSV, TXT, JSON, XML


Two views of a JSON file:



http://opendata.epa.gov.tw/ws/Data/POP00049/?$orderby=M_Time%20desc&$skip=0&$top=1000&format=json
<Please open this URL in Firefox or on the JSON Editor Online website>


{"Epb":"嘉義縣","CNO":"Q7600375","Abbr":"南亞塑膠工業股份有限公司嘉義廠","PolNo":"P001","ItemDesc":"氮氧化物監測設施十五分鐘數據紀錄值","Item":"923 ","M_Time":"2017-11-21 18:45:00","M_Val":"83.75","Unit":"ppm    ","Code2":"10","Code2Desc":"正常排放量測值","Std":"250","Std_s":"電力設施空氣污染物排放標準"}



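Two kinds of structure can be seen:
Object: enclosed in braces { }; members are separated by commas, and each name is separated from its value by a colon
Example: {"subject":"Math","score":80}
Array: enclosed in square brackets [ ]; elements are separated by commas
Example: [0,4,5,2,7,8,3]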
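
As a quick check of the two structures, Python's standard json module turns a JSON object into a dict and a JSON array into a list. The sketch uses the two example strings from above; the records variable is an added illustration of an array of objects, which is the shape many open data APIs return.

```python
import json

# A JSON object becomes a Python dict
obj = json.loads('{"subject":"Math","score":80}')
print(type(obj).__name__, obj["subject"], obj["score"])

# A JSON array becomes a Python list
arr = json.loads('[0,4,5,2,7,8,3]')
print(type(arr).__name__, arr[0], len(arr))

# The two can nest: an array whose elements are objects
records = json.loads('[{"subject":"Math","score":80}]')
print(records[0]["score"])
```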






JSON Editor Online
http://www.jsoneditoronline.org

Open this URL:
https://opendata.epa.gov.tw/webapi/api/rest/datastore/355000000I-000207?offset=0&limit=1000



Maxde-MacBook-Pro:taipei_PM2.5 max$ cat taipei_PM2.5_ver1.py 
#!/usr/bin/python
#-*- coding: UTF-8 -*-

import urllib, json

url = "http://opendata2.epa.gov.tw/AQI.json"
response = urllib.urlopen(url)
data_list = json.loads(response.read())

print  "City:"
print   data_list[10]['County']
print  "Site:"
print   data_list[10]['SiteName']
print  "PM2.5 index:"
print   data_list[10]['PM2.5']
print  "Status:"
print   data_list[10]['Status']
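
The script above picks a monitoring site by its position in the list (index 10), which may change whenever the feed is reordered. One possible alternative is to look the site up by name. The sketch below is written for Python 3 (where the fetch itself would use urllib.request.urlopen instead of urllib.urlopen) and works offline on a made-up payload: find_site is a hypothetical helper, and the sample values are placeholders, not real measurements. Only the field names County, SiteName, PM2.5 and Status come from the script.

```python
import json

# Made-up sample payload using the same field names the script reads.
# The values are placeholders, not real measurements.
SAMPLE_JSON = """
[
  {"County": "County A", "SiteName": "Site 1", "PM2.5": "12", "Status": "Status X"},
  {"County": "County B", "SiteName": "Site 2", "PM2.5": "7", "Status": "Status Y"}
]
"""


def find_site(records, site_name):
    """Return the first record whose SiteName matches, or None."""
    for record in records:
        if record.get("SiteName") == site_name:
            return record
    return None


data_list = json.loads(SAMPLE_JSON)
site = find_site(data_list, "Site 2")
if site is not None:
    print(site["County"], site["SiteName"], site["PM2.5"], site["Status"])
```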


======

Connect an RGB LED and add conditionals: air quality -> light the red, blue, or green LED...
The wiring diagram is in the PPT slides

#!/usr/bin/python
#-*- coding: UTF-8 -*-
import urllib, json, pyfirmata
from time import sleep

#ser = serial.Serial("COM4",9600)

port = '/dev/cu.usbmodem1411'
pin1 = 11 #R
pin2 = 10 #G
pin3 = 8  #B

board = pyfirmata.Arduino(port) 
sleep(5)

url = "http://opendata2.epa.gov.tw/AQI.json"
response = urllib.urlopen(url)
data_list = json.loads(response.read())

print  "City:"
print   data_list[3]['County']
print  "Site:"
print   data_list[3]['SiteName']
print  "PM2.5 index:"
print   data_list[3]['PM2.5']
print  "Status:"
print   data_list[3]['Status']

    
a = int(data_list[3]['PM2.5'])
print a

# PM2.5 from 0 up to (but not including) 5: green LED
if a >= 0 and a < 5:
    board.digital[pin2].write(1)
    print 'good'

# PM2.5 from 5 to 10: blue LED
elif a>=5 and a<=10:
    board.digital[pin3].write(1)
    print 'bad'

# PM2.5 above 10: red LED
elif a>10:
    board.digital[pin1].write(1)
    print 'warning'


else:
    board.digital[pin1].write(0)
    board.digital[pin2].write(0)
    board.digital[pin3].write(0)

Success!!
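
The threshold logic can be separated from the hardware so that it is easy to check without an Arduino attached. The following Python 3 sketch follows the ranges given in the script's comments (0 to below 5 green, 5 to 10 blue, above 10 red); led_for_pm25 is a hypothetical helper, and returning None for missing or non-numeric readings is an assumption of this sketch, not something the original script does.

```python
def led_for_pm25(raw_value):
    """Map a PM2.5 reading (string or number) to an LED color name.

    Returns None when the reading is missing, non-numeric, or negative.
    """
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        return None
    if 0 <= value < 5:
        return "green"
    if 5 <= value <= 10:
        return "blue"
    if value > 10:
        return "red"
    return None


for reading in ("3", "5", "10", "11", ""):
    print(repr(reading), led_for_pm25(reading))
```

The returned name could then be used to choose which of the three pins to write to.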

Thursday, November 16, 2017

Arduino, Python Firmata


$sudo pip install pySerial
$sudo pip install pyfirmata
$sudo pip install requests



Upload the Firmata sketch to the Arduino board



>>> import pyfirmata
>>> pin = 13
>>> port = '/dev/cu.usbmodem1411'
>>> board = pyfirmata.Arduino(port)
>>> board.digital[pin].write(1)
>>> board.digital[pin].read()
1
>>> board.digital[pin].write(0)
>>> board.digital[pin].read()
0





1LED_Blink.py


#!/usr/bin/python



import pyfirmata
from time import sleep
pin = 13

port = '/dev/cu.usbmodem1411'
board = pyfirmata.Arduino(port) 
while True:
    board.digital[pin].write(1)  # LED on
    sleep(1)
    board.digital[pin].write(0)  # LED off
    sleep(1)
board.exit()  # never reached: the loop above only ends on Ctrl+C

Maxde-MacBook-Pro:downloads max$ python 1LED_Blink.py

Press Ctrl+C to exit:
^CTraceback (most recent call last):
  File "1LED_Blink.py", line 11, in <module>
    sleep(1)  
KeyboardInterrupt
Maxde-MacBook-Pro:downloads max$
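
The traceback appears because Ctrl+C raises KeyboardInterrupt inside the endless loop, so the final board.exit() call is never reached. One possible way to end the program cleanly is to catch the interrupt and do the cleanup in a finally block. The sketch below is written for Python 3 and is hardware-free: blink is a hypothetical helper that takes the write function as a parameter, so it can be tried with a plain list in place of the board.

```python
import time


def blink(write, cycles=None, delay=1.0, sleep=time.sleep, cleanup=lambda: None):
    """Toggle an output through write(1) and write(0).

    cycles=None loops until Ctrl+C; cleanup() always runs at the end.
    """
    done = 0
    try:
        while cycles is None or done < cycles:
            write(1)
            sleep(delay)
            write(0)
            sleep(delay)
            done += 1
    except KeyboardInterrupt:
        pass  # Ctrl+C: leave the loop quietly instead of printing a traceback
    finally:
        write(0)   # make sure the LED ends up off
        cleanup()  # for example board.exit when used with pyfirmata


# Dry run without hardware: record the written values in a list.
states = []
blink(states.append, cycles=2, delay=0, sleep=lambda seconds: None)
print(states)
```

With the board from the script, the call would presumably be blink(board.digital[pin].write, cleanup=board.exit).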

2console.py

#!/usr/bin/python

import pyfirmata
from time import sleep
pin = 13
port = '/dev/cu.usbmodem1411'
board = pyfirmata.Arduino(port) 
while True:
    board.digital[pin].write(1)
    print(board.digital[pin].read())  # print the LED status (on)
    sleep(1)
    board.digital[pin].write(0)
    print(board.digital[pin].read())  # print the LED status (off)
    sleep(1)
board.exit()  # never reached: the loop above only ends on Ctrl+C


^CTraceback (most recent call last):
  File "2console.py", line 14, in <module>
    sleep(1)
KeyboardInterrupt



4helloGUI.py  (graphical user interface)
#!/usr/bin/python
# -*- coding: UTF-8 -*-

# This code is supporting material for the book
# Python Programming for Arduino
# by Pratik Desai
# published by PACKT Publishing

import Tkinter

# Initialize main windows with title and size
top = Tkinter.Tk()
top.title("Hello GUI")
top.minsize(200, 30)

# Label widget
helloLabel = Tkinter.Label(top,
                           text="Hello World! It is a course這是一個課程")
helloLabel.pack()

# Start and open the window

top.mainloop()

=====7csvWriter.py=====

#!/usr/bin/python
# This code is supporting material for the book
# Python Programming for Arduino
# by Pratik Desai
# published by PACKT Publishing

import csv

data = [[1, 2, 3],['a','b','c'],['Python','for','Arduino']]

with open('PythonforArduino.csv', 'wb') as f:
    w = csv.writer(f)
    for row in data:
        w.writerow(row)


=====7csvReader.py=====


import csv


with open('PythonforArduino.csv', 'r') as f:
    ll = f.read()
print ll

Run:


Maxde-MacBook-Pro:Downloads max$ python 7csvReader.py 

1,2,3
a,b,c
Python,for,Arduino
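
The two scripts above are written for Python 2, where the csv module expects files opened in binary mode ('wb'). In Python 3 the file is opened in text mode with newline='' instead. A minimal round-trip sketch under that assumption follows; write_rows and read_rows are hypothetical helper names, and a temporary directory is used so that no existing file is overwritten. Note that csv.reader returns every field as a string, so the numbers come back as '1', '2', '3'.

```python
import csv
import os
import tempfile

DATA = [[1, 2, 3], ['a', 'b', 'c'], ['Python', 'for', 'Arduino']]


def write_rows(path, rows):
    """Write rows to a CSV file (Python 3: text mode with newline='')."""
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)


def read_rows(path):
    """Read a CSV file back as a list of lists of strings."""
    with open(path, 'r', newline='') as f:
        return list(csv.reader(f))


path = os.path.join(tempfile.mkdtemp(), 'PythonforArduino.csv')
write_rows(path, DATA)
rows = read_rows(path)
print(rows)
```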


=====9plotLive.py=====
Displays the resistor status





Docker Command

#1 pull images
$docker pull chusiang/takaojs1607
#2 list images
$docker images
#3.1 run docker
$docker run -it ### bash
#3.2 run do...