2013年10月30日 星期三

雜事


2013.10.30

錶耳卡住新買的外套毛線,拿到 RSC 請師傅把毛線清一清,順便註冊。

服務小姐手上掛著迷你版的黑水鬼,頗有意思。



Mini-Lua 繼續卡在 symbol table,不知道怎麼實作比較好,

特別是 array 的部分,可以想想。

2013年10月20日 星期日

La Guitarra - Los Autenticos Decadentes


Tuve un problema de difícil solución,
en una época difícil de mi vida.
Estaba entre la espada y la pared,
y aguantando la opinión de mi familia.

Yo no quería una vida normal,
no me gustaban los horarios de oficina.
Mi espíritu rebelde se reía
del dinero, del lujo y el comfort.

Y tuve una revelación,
ya se que quiero en esta vida.
Voy a seguir mi vocación
será la música mi techo y mi comida.

Porque yo
no quiero trabajar,
no quiero ir a estudiar,
no me quiero casar.

Quiero tocar la guitarra todo el día,
y que la gente se enamore de mi voz.

Porque yo
no quiero trabajar,
no quiero ir a estudiar,
no me quiero casar.

Y en la cabeza tenia
la voz de mi viejo,
que me sonaba como
un rulo de tambor.

Vos, mejor que te afeites,
mejor que madures, mejor que labores.
Ya me cansé de que me tomes la cerveza,
te voy a dar con la guitarra en la cabeza.

Vos, mejor que te afeites,
mejor que madures, mejor que labores.
Ya me cansé de ser tu fuente de dinero,
voy a ponerte esa guitarra de sombrero.

Y tuve una revelación,
ya se que quiero en esta vida.
Voy a seguir mi vocación
será la música mi techo y mi comida.

Porque yo
no quiero trabajar,
no quiero ir a estudiar,
no me quiero casar.

Quiero tocar la guitarra todo el día,
y que la gente se enamore de mi voz.

Porque yo
no quiero trabajar,
no quiero ir a estudiar,
no me quiero casar.

Y en la cabeza tenia
la voz de mi viejo,
que me sonaba como
un rulo de tambor.

Vos, mejor que te afeites,
mejor que madures, mejor que labores.
Ya me cansé de que me tomes la cerveza,
te voy a dar con la guitarra en la cabeza.

Vos, mejor que te afeites,
mejor que madures, mejor que labores.
Ya me cansé de ser tu fuente de dinero,
voy a ponerte esa guitarra de sombrero.

2013年10月15日 星期二

Communicate your intentions through your code


Book: Implementation Patterns: Kent Beck



很棒的書,特別是寫了幾年的爛東西後,看這個特別有感覺。

  • Programs are read more often than they are written.
  • There is no such thing as “done”.  Much more investment will be spent modifying programs than developing them initially.
  • They are structured using a basic set of state and control flow concepts.
  • Readers need to understand programs in detail and in concept.  Sometimes they move from detail to concept, sometimes from concept to detail.

Using patterns helps programmers write reasonable solutions to common problems, leaving more time, energy, and creativity to apply to the truly unique problems.


Three values that are consistent with excellence in programming are
  • Communication
  • Simplicity
  • Flexibility

The implementation patterns aren’t the way they are “just because”. Each one expresses one or more of the values of communication, simplicity, and flexibility.  Principles are another level of general ideas, more specific to programming than the values, that also form the foundation of the patterns.


Several principles:
  • Local Consequences
  • Minimize Repetition
  • Logic and Data Together
  • Symmetry
  • Declarative Expression
  • Rate of Change



2013年10月14日 星期一

SQLite Compiler


The Architecture Of SQLite [http://www.sqlite.org/arch.html]


SQL Command Processor 可以往 source code 裡面看:
CAPI3REF: Compiling An SQL Statement
KEYWORDS: {SQL statement compiler}

The sqlite3_prepare function compiles an SQL statement, and produces an equivalent internal object. This object is commonly referred to as a prepared statement in database literature, and is implemented as a bytecode program in SQLite. A bytecode program is an abstract representation of an SQL statement that is run on a virtual machine or an interpreter. For more information, see the later section, Bytecode Programming Language. I use the terms bytecode program and prepared statement interchangeably in this book.


看圖片 SQL compiler 可以再分成三段:
  1. Tokenizer
  2. Parser
  3. Code generator

接下來就往 source code 裡面看進去了。繼續努力!
SQLITE_API int sqlite3_prepare(
  sqlite3 *db,            /* Database handle */
  const char *zSql,       /* SQL statement, UTF-8 encoded */
  int nByte,              /* Maximum length of zSql in bytes. */
  sqlite3_stmt **ppStmt,  /* OUT: Statement handle */
  const char **pzTail     /* OUT: Pointer to unused portion of zSql */
);

註解寫的蠻詳細的

((應該使用 sqlite3_prepare_v2,不過因為相容性,所以舊的 API 也會留下來))

現在追到 sqlite3LockAndPrepare,看字面有 lock ((mutex))。



Lock 的 methods 有一對,sqlite3_mutex_enter() 以及 sqlite3_mutex_leave()。

sqlite3BtreeEnterAll() 及 sqlite3BtreeLeaveAll() 就是 backend,裡面有三個主要 modules,B+ tree,Pager,以及 OS Interface ((可以想像 B+ tree 是最頂層的抽象資料結構,每個 file fragments 以 pagers 抽象化,最後 implementor 當然就是 OS 的 file system,這個就看是 Windows 或者是 Unix 或者是你想的到的))。

((這是我的猜測與想像,B+ tree 應該是這樣子的吧?!))



鬼扯一堆,最重要的還是 sqlite3Prepare(),這個得仔細看下去了!
  1. Initialize.
  2. Try to get a read lock on all database schemas.
  3. sqlite3RunParser()
  4. sqlite3VdbeSetSql()



然後這個應該是關鍵:LEMON-generated LALR(1) parser

The Lemon Parser Generator [http://www.sqlite.org/src/doc/trunk/doc/lemon.html]



The main parser program: sqlite3Parser()


[FWD] TimeUnit


http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/TimeUnit.html



A TimeUnit represents time durations at a given unit of granularity and provides utility methods to convert across units, and to perform timing and delay operations in these units.



這是很不錯的設計。

有時候寫程式常常忘記用 second 為單位,或者是用 millisecond 為單位。

何不讓 code 變成 oral language?

2013年10月11日 星期五

[FWD] Bloom filters

WIKI: http://en.wikipedia.org/wiki/Bloom_filter

Tutorial: http://billmill.org/bloomfilter-tutorial/

實作細節不外乎 bloom filter 變形以及 hash functions 的選擇。



為什麼會講到 bloom filters?

因為雲端病毒碼就是利用這個技術減少 signatures 的大小。

如果能夠穿過許多道 bloom filters 引擎才上網把完整的 signatures 抓下來。

2013年10月10日 星期四

解釋過後好像也沒有比較好


回頭看傻事好有感覺。

那是歷史痕跡不容扭曲。



早已經雲淡風輕。

骨頭裂掉再黏回去就會凸出一塊,轉轉骨頭喀喀啦啦響,那就是痕跡。

以前機車代步,現在公車、捷運、計程車是我的三寶,

每星期騎機車只是為了讓車可以發動。



我也不好意思說有的沒的,倒不是念舊,而是不想讓自己那麼仇恨。

《半生緣》兩人歷盡千帆後再相遇,卻惟有感慨萬千:「世鈞,我們回不過去了。」

這結局才是最美麗的!



至於最近發生的事,暫且不提,反正也是回不過去了。

至於現在發生的事,偷偷告訴大家,完全是個沒進展。

2013年10月9日 星期三

ProjectEuler 185 -- TLE on backtracking


Backtracking 只是比較聰明的暴力法,

所以 time complexity 大約還是 10^16。

import sys

DIGIT_COUNT = 5

#DIGIT_COUNT = 15

ANSWER = [0] * DIGIT_COUNT

GUESSES = { 
    "90342" : 2, 
    "70794" : 0, 
    "39458" : 2, 
    "34109" : 1, 
    "51545" : 2, 
    "12531" : 1,
}

#GUESSES = {
#    "5616185650518293" : 2,
#    "3847439647293047" : 1,
#    "5855462940810587" : 3,
#    "9742855507068353" : 3,
#    "4296849643607543" : 3,
#    "3174248439465858" : 1,
#    "4513559094146117" : 2,
#    "7890971548908067" : 3,
#    "8157356344118483" : 1,
#    "2615250744386899" : 2,
#    "8690095851526254" : 3,
#    "6375711915077050" : 1,
#    "6913859173121360" : 1,
#    "6442889055042768" : 2,
#    "2321386104303845" : 0,
#    "2326509471271448" : 2,
#    "5251583379644322" : 2,
#    "1748270476758276" : 3,
#    "4895722652190306" : 1,
#    "3041631117224635" : 3,
#    "1841236454324589" : 3,
#    "2659862637316867" : 2,
#}

def is_unique(digit, position):
    has_more = False

def check_guessing(dimension):
    for guess in GUESSES:
        correct_count = 0
        for i in range(dimension):
            if ANSWER[i] == int(guess[i]):
                correct_count += 1
            if correct_count > GUESSES[guess]:
                return False
        if correct_count + DIGIT_COUNT - dimension < GUESSES[guess]:
            return False
    return True

def backtracking(dimension):
    if not check_guessing(dimension):
        return
    if dimension == DIGIT_COUNT:
        print('Found =>', ANSWER[0:dimension])
        return
    for i in range(10):
        ANSWER[dimension] = i
        backtracking(dimension + 1)    
    
def solve():
    for i in range(10):
        ANSWER[0] = i
        backtracking(1)
    
def main():
    solve()
    
if __name__ == '__main__':
    sys.exit(main())

2013年10月6日 星期日

[Java] insertion sort & merge sort


要熟悉某個語言,最簡單的方式就是寫一次基本的演算法。

此外,將來要面試科技公司,基本演算法更要知道怎麼寫,

怎麼用原子筆寫,這很重要。



以某科技公司為例,主管當然知道考古題的存在

主管會認為題目已經外洩沒理由考不好,但偏偏就是有面試者考不好,

而且這間公司好的標準異常寬鬆,100分拿35分就可以了



有時候誠意真的很重要。

如果我是主管的話,正妹直接錄取 ((逃))((當作在選後宮))



正題:

Insertion sort,對於小的 array 可以用這個實做

merge sort,divide and conquer 基本應用



public class Sorting {
public static void insertionSort(int[] array) {
for (int j = 1; j < array.length; j++) {
int key = array[j];
int i = j - 1;
while (i >= 0 && array[i] > key) {
array[i+1] = array[i];
i--;
}
array[i + 1] = key;
}
}

public static void mergeSort(int[] array) {
mergeSort(array, 0, array.length - 1);
}

private static void mergeSort(int[] array, int p, int r) {
if (p >= r) {
return;
}
int q = (p + r)/2;
mergeSort(array, p, q);
mergeSort(array, q + 1, r);
merge(array, p, q, r);
}

private static void merge(int[] array, int p, int q, int r) {
int n1 = q - p + 1;
int n2 = r - q;

int[] left = new int[n1 + 1];
int[] right = new int[n2 + 1];

for (int k = 0; k < n1; k++) {
left[k] = array[p + k];
}
for (int k = 0; k < n2; k++) {
right[k] = array[q + 1 + k];
}
left[n1] = Integer.MAX_VALUE;
right[n2] = Integer.MAX_VALUE;

int i = 0;
int j = 0;
for (int k = p; k <= r; k++) {
if (left[i] <= right[j]) {
array[k] = left[i];
i++;
}
else {
array[k] = right[j];
j++;
}
}
}
}

2013年10月1日 星期二

[Programming] TIPS TO BE COMPETITIVE

Question 1

There are n webpages (1 ≤ n ≤ 10M). Each webpage i has different page rank r_i. You want to pick top 10 pages with highest page ranks. Which method is more feasible?
(a) Load all n webpages’ page rank to memory, sort, and pick top 10.
(b) Use priority queue data structure (heap).

Question 2

Given a list L of up to 10K integers, you want to frequently ask the value of sum(i, j), i.e. the sum of L[i] + L[i+1] + ... + L[j]. Which data structure should you use?
(a) Simple Array.
(b) Balanced Binary Search Tree.
(c) Hash Table.
(d) Segment Tree.
(e) Suffix Tree.
(f) Simple Array that is pre-processed with Dynamic Programming.

Question 3

You have to compute the ‘shortest path’ between two vertices on a weighted Directed Acyclic Graph (DAG) with |V |, |E| ≤ 100K. Which algorithm(s) can be used?
(a) Dynamic Programming + Topological Sort.
(b) Breadth First Search.
(c) Dijkstra’s.
(d) Bellman Ford’s.
(e) Floyd Warshall’s.

Question 4

Which algorithm is faster (based on its time complexity) for producing a list of the first 10K prime numbers?
(a) Sieve of Eratosthenes.
(b) For each number i ∈ [1 − 10K], test if i is a prime with prime testing function.