2012年12月17日

Theano scan function

http://deeplearning.net/software/theano/tutorial/loop.html
The benefits of using scan are already explained at the link above, but how to use it still needs some explanation. Starting from the first example, the two programs below both compute the value of A to the k-th power:
result = 1
for i in xrange(k):
    result = result * A

import theano
import theano.tensor as T
theano.config.warn.subtensor_merge_bug = False

k = T.iscalar("k")
A = T.vector("A")

def inner_fct(prior_result, A):
    return prior_result * A

# Symbolic description of the result
result, updates = theano.scan(fn=inner_fct,
                              outputs_info=T.ones_like(A),
                              non_sequences=A, n_steps=k)

'''
Scan has provided us with A ** 1 through A ** k.  
Keep only the last value. Scan notices this and 
does not waste memory saving them.
'''
final_result = result[-1]

power = theano.function(inputs=[A, k], outputs=final_result,
                        updates=updates)

print power(range(10),2)
#[  0.   1.   4.   9.  16.  25.  36.  49.  64.  81.]

  • fn in scan is the function to execute; it can also be defined with a lambda.
  • The second parameter, outputs_info, is set to a tensor of the same size as A with every element equal to 1; this is the initial value of the result.
  • non_sequences holds values that do not change inside scan; here A never changes during the whole loop.
  • n_steps is the number of iterations to run.
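The loop that scan performs here can be sketched in plain Python (no Theano needed); scan_power is a hypothetical helper that mirrors the roles of inner_fct, outputs_info, and n_steps:

```python
def scan_power(A, k):
    """Mimic theano.scan for the power example: start from ones
    (outputs_info) and multiply elementwise by A (non_sequences)
    k times (n_steps), keeping every intermediate step."""
    result = [1.0] * len(A)                           # T.ones_like(A)
    history = []
    for _ in range(k):                                # n_steps = k
        result = [r * a for r, a in zip(result, A)]   # inner_fct(prior_result, A)
        history.append(result)
    return history  # scan returns all steps; take [-1] for A ** k

print(scan_power(range(10), 2)[-1])
# -> [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0]
```

Taking `[-1]` at the end corresponds to `final_result = result[-1]` above.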

Theano shared variable

http://deeplearning.net/software/theano/tutorial/examples.html#using-shared-variables
The reference example on that page is as follows:
from theano import function, shared
import theano.tensor as T

state = shared(0)
inc = T.iscalar('inc')
accumulator = function([inc], state,
                       updates=[(state, state+inc)])

  • The special part here is that state is a shared variable, with 0 as its initial value.
  • This value can be shared across multiple functions; in the program you can read it with state.get_value() and set it with state.set_value(val).
  • The other part that needs explanation is the updates=[(shared-variable, new-expression)] argument of function(): each entry must be a pair of this form (a dict mapping shared variables to expressions also works).
  • Its meaning is that on every call, the value of shared-variable is replaced with the result of evaluating new-expression.
Running the example therefore produces the following:
state.get_value() #nothing has run yet: array(0)
accumulator(1)    #array(0)->array(1)
state.get_value() #array(1)
accumulator(300)  #array(1)->array(301)
state.get_value() #array(301)
#reset shared variable
state.set_value(-1)
accumulator(3)
state.get_value() #array(-1)->array(2)
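The update rule the log above follows can be mimicked in plain Python (a sketch of the semantics only, not Theano's API): the function's output is computed from the old state, and only then is the update pair applied.

```python
def make_accumulator():
    """Plain-Python sketch of updates=[(state, state + inc)]:
    each call returns the OLD state, then replaces state with
    state + inc."""
    state = {"value": 0}              # shared(0)
    def accumulator(inc):
        old = state["value"]          # output uses the old state
        state["value"] = old + inc    # then the update is applied
        return old
    return accumulator, state

acc, st = make_accumulator()
print(acc(1))          # 0  (state becomes 1)
print(acc(300))        # 1  (state becomes 301)
print(st["value"])     # 301
```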
As noted above, a shared variable can be shared by multiple functions, so we define another function, decrementor, that also accesses state:
decrementor = function([inc], state, updates=[(state, state-inc)])
decrementor(2)
state.get_value() #array(2)->array(0)
To evaluate an expression that involves a shared variable while substituting another value for it, use the givens parameter of function() instead; for example:
fn_of_state = state * 2 + inc
# the type (lscalar) must match the shared variable we
# are replacing with the ``givens`` list
foo = T.lscalar() 
skip_shared = function([inc, foo], fn_of_state,
                                   givens=[(state, foo)])
skip_shared(1, 3)  # we're using 3 for the state, not state.value
state.get_value()  # old state still there, but we didn't use it
#array(0)
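What givens does here can be mimicked in plain Python (a sketch, not Theano's API): the expression state * 2 + inc is evaluated with foo substituted for state, and the shared value is never read or written.

```python
def fn_of_state(state, inc):
    # the symbolic expression state * 2 + inc
    return state * 2 + inc

shared_state = 0                 # the shared variable's current value

# skip_shared(1, 3): givens=[(state, foo)] substitutes foo=3 for state
print(fn_of_state(3, 1))         # 7 -- uses 3 for the state
print(shared_state)              # 0 -- the old state is untouched
```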
Although the functions above are quite convenient, the documentation does not say whether race conditions can occur.
http://deeplearning.net/software/theano/tutorial/aliasing.html
In the section "Understanding memory aliasing for speed and correctness", it explains that Theano has its own memory-management mechanism (a pool), and that Theano tracks changes to the variables in that pool.

  • Variables in Theano's pool live in a different memory space from Python's variables, so they do not conflict with each other.
  • Theano functions only modify buffers that are in Theano’s memory space.
  • Theano's memory space includes the buffers allocated to store shared variables and the temporaries used to evaluate functions.
  • Physically, Theano's memory space may be spread across the host, a GPU device(s), and in the future may even include objects on a remote machine.
  • The memory allocated for a shared variable buffer is unique: it is never aliased to another shared variable.
  • Theano's managed memory is constant while Theano functions are not running and Theano's library code is not running.
  • The default behaviour of a function is to return user-space values for outputs, and to expect user-space values for inputs.
The distinction between Theano-managed memory and user-managed memory can be broken down by some Theano functions (e.g. shared.get_value and the constructors for In and Out) by using a borrow=True flag. This can make those methods faster (by avoiding copy operations) at the expense of risking subtle bugs in the overall program (by aliasing memory).
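The copy-versus-alias trade-off that borrow=True controls can be illustrated with plain Python lists (a sketch of the semantics only, not Theano's actual buffers):

```python
internal = [1, 2, 3]    # stand-in for a buffer in Theano's memory space

# default (borrow=False): the caller gets a copy, so user code
# cannot corrupt the managed buffer
safe = list(internal)
safe[0] = 99
print(internal)         # [1, 2, 3] -- unaffected

# borrow=True: the same buffer is handed out; faster, but aliased
aliased = internal
aliased[0] = 99
print(internal)         # [99, 2, 3] -- user write leaked into managed memory
```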

Theano gpu setting

According to the configuration instructions on the official site:

To compute functions on the GPU instead of the CPU, the configuration must be set before importing theano. There are two ways:
  1. Set it in $HOME/.theanorc
  2. Set it in the THEANO_FLAGS environment variable
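Both methods are described on the linked page; a sketch of each, using the same flags as the Eclipse setup below (the script name is a placeholder):

```
# $HOME/.theanorc
[global]
floatX = float32
device = gpu
```

or, as an environment variable set before Python starts:

```
THEANO_FLAGS='floatX=float32,device=gpu' python your_script.py
```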

In the Eclipse development environment, method 2 is more flexible if different files need different settings. To set it up:
  1. Switch to the file to run, choose Run -> Run Configurations -> Environment -> New, enter THEANO_FLAGS as the name and floatX=float32,device=gpu as the value, then click Apply.
  2. Because Eclipse does not seem to read the PATH environment variable correctly, on the same screen also add name=PATH, value=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/cuda/bin (depending on where CUDA is installed).

After that, the program will correctly use the GPU for its computations.
You can also run print theano.config to confirm the settings are correct.