行有餘力則以學文 (When strength remains, devote it to study): 2010

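Thursday, December 9, 2010

How to make a bootable USB flash drive

http://siryeh.com/module-news-display-sid-26.htm

This is the most accurate explanation of the topic among Traditional Chinese websites~~; unfortunately, the image links are broken~~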

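Tuesday, December 7, 2010

Presentation techniques

On Sunday I attended an excellent talk. The speaker used the ppt file to its fullest, with not a dull moment, which reminded me of the Takahashi method of presentation that I once wanted to learn. Coincidentally, the Chinese translation of the book also came out this year, and I will spend some time on it later. Of course every generation produces its own talents, and there are other related introductions online. I find the review Presentation Zen – 書不如blog (the book is not as good as the blog) quite incisive: it mentions Made to Stick (創意黏力學) and once again points everyone to the home of Presentation Zen, presentation zen, for a closer look.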

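Saturday, December 4, 2010

After downloading and unpacking the package from the official site http://nlp.stanford.edu/software/segmenter.shtml, run the following command

segment.bat pku test.simp.utf8 UTF-8 0 > out.txt

The result is saved to out.txt

In actual tests on Traditional Chinese documents the results were not good, but after converting the text to Simplified Chinese the accuracy exceeded 99%, which is excellent. To process Traditional Chinese directly, some documentation says you can pass the -loadClassifier data\traditional.gz parameter, but I could not find that file. The fallback is to convert the source text to Simplified Chinese before processing. Fortunately the output does not need to be converted back to Traditional Chinese: character positions correspond between the Simplified and Traditional text, so keeping the position information is enough to point back into the original document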
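The position trick described above can be sketched in Python. This is only an illustration under stated assumptions: the function name project_segmentation is hypothetical, the segmenter output is assumed to be space-separated tokens, the original text is assumed to contain no whitespace, and the Traditional-to-Simplified conversion is assumed to be strictly one character to one character.

```python
def project_segmentation(original: str, segmented: str) -> list[str]:
    """Cut the original (Traditional) text at the token boundaries found
    in the segmented (Simplified) text.

    Assumes both texts have the same number of characters once the
    spaces inserted by the segmenter are ignored.
    """
    tokens = segmented.split()
    result = []
    position = 0
    for token in tokens:
        # Reuse only the token length; the characters come from the original.
        result.append(original[position:position + len(token)])
        position += len(token)
    if position != len(original):
        raise ValueError("texts are not aligned character by character")
    return result


if __name__ == "__main__":
    print(project_segmentation("我愛臺灣", "我 爱 台湾"))
```

If the conversion ever merges or splits characters, the length check raises an error instead of silently returning misaligned tokens.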

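Friday, December 3, 2010

Online resources on processing Chinese with the Stanford parser

The Chinese grammar file bundled with the official release is chineseFactored.ser.gz; the links below came from searching on that file name

http://114.255.218.78/wikiteamwork/images/d/d7/(3)Stanfold_Parser_in_GATE.ppt 句法分析工具Stanfold Parser及其在GATE中的使用 (the syntactic analysis tool Stanford Parser and its use in GATE)

http://blog.amelielee.com/archives/140#comments Solving the ‘exceeded MAX_ITEMS’ problem in Stanford Parser

https://mailman.stanford.edu/pipermail/parser-user/2010-August/000652.html [parser-user] stanford parser 中文分词关于分词“的” 的特别现象 (a peculiar phenomenon in how the Stanford parser's Chinese word segmentation handles “的”)

http://blog.csdn.net/leeharry/archive/2008/03/06/2153583.aspx stanford parser 使用 (using the Stanford parser). Notably, this article brings out a key point: Chinese text must first be segmented into words by a separate segmenter before it can be fed to the parser, and this holds for both the Stanford and the Berkeley parser.

"如何训练一个中文的Berkeley Parser" (How to train a Chinese Berkeley Parser)

http://playwithnlp.blogspot.com/2010/06/berkeley-parser.html

This is the only article on the entire web that mentions how to use the Chinese grammar chn_sm5.gr, so I am recording it here

"Call Stanford Parser in Perl": really cool

http://layesuen.spaces.live.com/blog/cns!BCB0A55D794BEAF6!1034.entry?wa=wsignin1.0&sa=523996814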

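use strict;
use warnings;

# Compile the Java wrapper below with Inline::Java and expose it to Perl.
use Inline (
    Java => <<'END_JAVA',

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;

// Thin wrapper so that Perl only passes strings in and gets strings back.
class Parser {
    LexicalizedParser lexParser;
    // model: path to a serialized grammar such as englishPCFG.ser.gz
    public Parser(String model) {
        lexParser = new LexicalizedParser(model);
    }
    // Parse one sentence and return the best tree as a bracketed string.
    public String parse(String sentence) {
        lexParser.parse(sentence);
        return lexParser.getBestParse().toString();
    }
}

END_JAVA

    CLASSPATH => 'stanford-parser.jar',   # jar in the same directory as this script
    EXTRA_JAVA_ARGS => '-mx800m'          # JVM heap limit of 800 MB
);

my $p = Parser->new("englishPCFG.ser.gz");
# Parse each input line (STDIN or files named on the command line) as one sentence.
print $p->parse($_)."\n" while (<>);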

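It feels really cool; it mainly uses the Inline-Java bundle.
To run it, put stanford-parser.jar, the parser data file englishPCFG.ser.gz and this Perl program in the same directory. Inline-Java must of course be able to find your JDK, which can be specified through the J2SDK option.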

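A first look at the output of the Stanford parser and the Berkeley parser

Contents of the input file

The strongest rain ever recorded in India shut down the financial hub of Mumbai, snapped communication lines, closed airports and forced thousands of people to sleep in their offices or walk home during the night, officials said today.

The output is much the same for both: a list-style (bracketed) tree representation. The Berkeley parser prints the whole result on a single line, which takes up less screen space; the Stanford parser expands the result over several lines, which makes the structure easier to see. That is not the main point, though. The Stanford parser is updated more often, but the Berkeley parser seems to get more favorable reviews. Either way, in natural language processing, parsing plain text is only the first step; the applications that follow are where the real action is.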
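The difference between the one-line and the expanded layout is purely cosmetic, as a small Python sketch can show. The helper names parse_bracketed and render are hypothetical, the sample tree is a made-up toy sentence rather than real parser output, and the indentation style is only an approximation of what an expanded layout looks like.

```python
def parse_bracketed(text):
    """Turn a one-line bracketed tree into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            children = []
            while tokens[position] != ")":
                children.append(read())
            position += 1  # skip the closing parenthesis
            return children
        return token

    return read()


def render(node, depth=0):
    """Print one constituent per line, indented by nesting depth."""
    pad = "  " * depth
    if isinstance(node, str):
        return pad + node
    if all(isinstance(child, str) for child in node):
        # A tag with its word, e.g. (DT The), stays on one line.
        return pad + "(" + " ".join(node) + ")"
    if isinstance(node[0], str):
        label, children = node[0], node[1:]
    else:
        label, children = "", node  # some outputs wrap the tree in an unlabeled root
    lines = [pad + "(" + label]
    lines += [render(child, depth + 1) for child in children]
    return "\n".join(lines) + ")"


if __name__ == "__main__":
    one_line = "(S (NP (DT The) (NN rain)) (VP (VBD stopped)))"
    print(render(parse_bracketed(one_line)))
```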

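A first look at the Berkeley parser

The corresponding download directory

http://code.google.com/p/berkeleyparser/downloads/list

Download these files into a single working directory. Taking mumbai.txt as the file to analyze, type

java -Xms64m -Xmx512m -jar berkeleyParser.jar -gr eng_sm6.gr.gz -inputFile mumbai.txt

The example in its README does not include the -Xms64m -Xmx512m parameters, so users may get an out-of-memory error. The other available options are as follows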

-render Write rendered tree to image file. (Default: false)
-inputFile Read input from this file instead of reading it from STDIN.
-substates Output subcategories (only for binarized viterbi trees). (Default: false)
-gr Grammarfile (Required) [required]
-binarize Output binarized trees. (Default: false)
-likelihood Output sentence likelihood, i.e. summing out all parse trees: P(w) (Default: false)
-confidence Output confidence measure, i.e. tree likelihood: P(T|w) (Default: false)
-tokenize Tokenize input first. (Default: false=text is already tokenized)
-scores Output inside scores (only for binarized viterbi trees). (Default: false)
-viterbi Compute viterbi derivation instead of max-rule tree (Default: max-rule)
-chinese Enable some Chinese specific features in the lexicon.
-accurate Set thresholds for accuracy. (Default: set thresholds for efficiency)
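When the parser is called from a script, the command line above can be assembled programmatically. The following Python sketch is an illustration only: the function names berkeley_command and parse_file are hypothetical, the jar and grammar file names are the ones used in this post, and only flags listed above are passed through.

```python
import subprocess


def berkeley_command(grammar, input_file, min_heap_mb=64, max_heap_mb=512,
                     jar="berkeleyParser.jar", extra_flags=()):
    """Build the java command line, always including the heap settings."""
    return [
        "java", f"-Xms{min_heap_mb}m", f"-Xmx{max_heap_mb}m",
        "-jar", jar,
        "-gr", grammar,
        "-inputFile", input_file,
        *extra_flags,
    ]


def parse_file(grammar, input_file, **options):
    """Run the parser and return whatever it prints on standard output."""
    command = berkeley_command(grammar, input_file, **options)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


if __name__ == "__main__":
    # Only prints the command; call parse_file() once Java and the files are in place.
    print(" ".join(berkeley_command("eng_sm6.gr.gz", "mumbai.txt")))
```

Keeping the heap flags as defaults means the out-of-memory problem mentioned above cannot be reintroduced by forgetting them.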

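A first look at the Stanford parser

In operating-system development, Stanford and Berkeley have long competed with each other, and the rivalry extends to other areas. In natural language processing, the flagship works are the Stanford parser and the Berkeley parser. Let us look at the Stanford parser first. Visit its home page

http://nlp.stanford.edu/software/lex-parser.shtml

to download the latest version. Unpack it into a convenient directory, create the mumbai.txt file as described near the bottom of the page for the experiment, and enter the following command

java -mx200m -cp stanford-parser.jar edu.stanford.nlp.parser.lexparser.LexicalizedParser -retainTMPSubcategories -outputFormat "wordsAndTags,penn,typedDependencies" englishPCFG.ser.gz mumbai.txt

Because the page's author is a developer, he did not think to include the -cp part; for those of us who are pure users, specifying the class path is necessary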

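Friday, November 26, 2010

"Are you any good or not?": an introduction to Web of Science, a system for sizing up academics

A very interesting tool. I hope EndNote can be integrated into it someday, which would make writing papers more convenient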

Saturday, October 30, 2010

XP won't boot!! Start by looking at the boot log

http://technet.microsoft.com/en-us/library/bb457123.aspx

Simply put, add the /bootlog parameter in boot.ini and see at which step the boot finally died
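Finding the last step in a long boot log is easy to automate. The Python sketch below rests on assumptions that are not stated in this post: that the log is the ntbtlog.txt file written under %SystemRoot%, that it is UTF-16 encoded, and that each entry starts with "Loaded driver" or "Did not load driver". The function names and the sample path are hypothetical.

```python
def last_loaded_driver(log_lines):
    """Return the path from the last 'Loaded driver' entry, or None."""
    prefix = "Loaded driver"
    last = None
    for line in log_lines:
        line = line.strip()
        if line.startswith(prefix):
            last = line[len(prefix):].strip()
    return last


def read_boot_log(path):
    """Read the log file; the UTF-16 encoding is an assumption."""
    with open(path, encoding="utf-16") as handle:
        return handle.read().splitlines()


if __name__ == "__main__":
    sample = [
        r"Loaded driver \WINDOWS\system32\ntoskrnl.exe",
        r"Did not load driver \SystemRoot\System32\Drivers\example.sys",
        r"Loaded driver \WINDOWS\system32\hal.dll",
    ]
    print(last_loaded_driver(sample))
```

The last driver that loaded is only a starting point; the driver that was due to load next is often the real suspect.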

Thursday, September 16, 2010

Email::Send::Gmail - Send Messages using Gmail (3)

Be sure to install ActivePerl 5.10. With ActivePerl 5.12 installed, this Gmail module will not show up no matter what you do

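Sunday, August 22, 2010

Central Cross-Island Highway Hiking Team (中橫健行隊): classic revival session

Truly a classic... The information below is reproduced from http://www.youth.org.tw/index.php

Activity name: 中橫健行隊經典復刻梯 (Central Cross-Island Highway Hiking Team, classic revival session)
Activity code: 92MA08

Highlights: The "Central Cross-Island Highway Hiking Team", the dream activity that senior high, vocational and college students across the country scrambled to sign up for in the seventies and eighties, is about to be faithfully revived by the team leaders and station staff of those years! Overcoming many difficulties, this camp reopens the Ci'en (慈恩) and Luoshao (洛韶) lodges, closed for ten years, just to relive the eternal youth that once came with sweat and tears, laughter and courage. Old friends and new, let us shoulder our packs, put on our little yellow caps, walk all 69 kilometers on our own feet, and write our own story of time in sunshine and smiles.

Day 1: Check in at Taichung Railway Station → chartered bus to Wushe → Hehuanshan → Guanyun → dinner → large group-games evening party at Guanyun.
Day 2: Breakfast at Guanyun → hike 6 km to Jinma Tunnel → lunch (boxed meal) → hike 10 km to Ci'en → dinner → cozy dormitory evening party at Ci'en.
Day 3: Breakfast at Ci'en → hike 10 km to Xinbaiyang → lunch (boxed meal) → hike 11 km to Luoshao → dinner → challenge-games evening party at Luoshao.
Day 4: Breakfast at Luoshao → hike 8 km to Xibao → lunch (boxed meal) → hike 8 km to Tianxiang → dinner → folk-dance evening party at Tianxiang.
Day 5: Breakfast at Tianxiang → hike 9 km to Jinheng Bridge → lunch (boxed meal) → hike 7 km to Changchun Shrine, the end of the hike → chartered bus to 花蓮學苑 (Hualien Xueyuan) → dinner → tour of downtown Hualien → farewell evening party in Hualien.
Day 6: Breakfast (sandwiches) at Hualien Xueyuan → group photo → disband at Hualien Xueyuan.

Note: Luoshao Lodge is currently being renovated as quickly as possible, but because the lodge stood abandoned for a long time, the indoor lodging, dining and washing facilities remain relatively basic. Please take note.

Dates: Session 2, 2010-07-06 13:00:00 to 2010-07-11 11:00:00
Check-in location: Taichung Railway Station
Participants: senior high and vocational school students and above
Fee: NT$5,200
Location: Taichung and Hualien areas
Contact: 文耀忠
Phone: (02)25025858 ext. 404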

Thursday, July 22, 2010

Perl OpenGL (POGL)

http://en.wikipedia.org/wiki/Perl_OpenGL
Perl OpenGL (POGL) is a portable, compiled wrapper library that allows OpenGL to be used in the Perl programming language.

Basically it calls the OpenGL library from within Perl, and it looks quite fast

http://graphcomp.com/pogl.cgi?v=0111s3m8&r=s3m1 has two demos you can run to try it out

Saturday, July 17, 2010

What To Do Once You've Downloaded A Module From The CPAN / If you downloaded a module directly from CPAN, how do you install it?

http://www.cpan.org/modules/INSTALL.html

The parts I highlighted in yellow are the ones people tend to overlook

If you're on Unix, (You can use Andreas König's CPAN module to automate the entire process, from DECOMPRESS through INSTALL.)
A. DECOMPRESS

Decompress the file with gzip -d yourmodule.tar.gz
You can get gzip from ftp://prep.ai.mit.edu/pub/gnu.

Or, you can combine this step with the next to save disk space:

gzip -dc yourmodule.tar.gz | tar -xof -
B. UNPACK

Unpack the result with tar -xof yourmodule.tar
C. BUILD

Go into the newly-created directory and type:

perl Makefile.PL
make
make test
D. INSTALL

While still in that directory, type:

make install

Make sure you have the appropriate permissions to install the module in your Perl 5 library directory. Often, you'll need to be root.
That's all you need to do on Unix systems with dynamic linking. Most Unix systems have dynamic linking -- if yours doesn't, or if for another reason you have a statically-linked perl, and the module requires compilation, you'll need to build a new Perl binary that includes the module. Again, you'll probably need to be root.
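The four steps quoted above (DECOMPRESS, UNPACK, BUILD, INSTALL) can also be scripted. The Python sketch below is only an illustration: unpack_module and build_and_install are hypothetical names, the archive is assumed to be a gzip-compressed tarball with a single top-level directory, and the build commands are exactly the ones listed in the quoted text.

```python
import subprocess
import tarfile
from pathlib import Path

# Steps C (BUILD) and D (INSTALL) from the quoted instructions, in order.
BUILD_COMMANDS = [
    ["perl", "Makefile.PL"],
    ["make"],
    ["make", "test"],
    ["make", "install"],
]


def unpack_module(archive, destination):
    """Steps A and B: decompress and unpack, returning the new directory."""
    with tarfile.open(archive, "r:gz") as tar:
        top_levels = {member.name.split("/")[0] for member in tar.getmembers()}
        tar.extractall(destination)
    if len(top_levels) != 1:
        raise ValueError("expected exactly one top-level directory in the archive")
    return Path(destination) / top_levels.pop()


def build_and_install(module_dir, run=subprocess.run):
    """Run each build command inside the module directory, stopping on failure."""
    for command in BUILD_COMMANDS:
        run(command, cwd=module_dir, check=True)


if __name__ == "__main__":
    print("\n".join(" ".join(command) for command in BUILD_COMMANDS))
```

As the quoted text notes, the final make install step often needs root, so in practice the script would be run with the appropriate permissions.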

Personal Brain

http://www.youtube.com/watch?v=QP3d7P4AERg

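Compared with MindManager, this software puts more emphasis on a "fisheye" view

This has a theoretical basis in both cognitive psychology and graphics

However, once MindManager implements this feature as well, Personal Brain will have little competitive advantage left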

Wednesday, June 2, 2010

How to disable hyperlink warning messages in 2007 Office programs

What kind of boneheaded product makes its users routinely edit the dangerous registry by hand?

Answer:

http://support.microsoft.com/kb/925757/zh-tw

http://support.microsoft.com/?scid=kb;en-us;925757&x=10&y=11

Friday, May 28, 2010

2009 Top 25 fabless IC suppliers

http://www.evertiq.com/news/15994

Friday, May 7, 2010

Europe's Web of Debt

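Thursday, April 8, 2010

Add a button or command button to a worksheet

http://office.microsoft.com/zh-tw/excel/HP102366761028.aspx

In the 2007 version the "Developer" tab is hidden by default, which leaves me speechless...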

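Tuesday, April 6, 2010

scripting mindmanager

Simply put, MindManager provides an object model similar to that of MS Office; just drive it from whatever language you already know

This fellow was clearly fed up with some of MindManager's "features": opening a new document requires a special approach

http://www.slxdeveloper.com/page.aspx?action=viewarticle&articleid=64 this one is in French...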

Monday, March 29, 2010

A comparison of Perl and Python syntax

Xah's Perl and Python Tutorial

http://xahlee.org/perl-python/index.html

For example, the page http://xahlee.org/perl-python/keyed_list.html explains the syntax of Python's dictionary and Perl's hash, which the author groups under the name keyed list
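As a small illustration of the keyed-list idea, here are the basic dictionary operations in Python with the corresponding Perl hash syntax in comments. The variable names and values are made up for this sketch and are not taken from the tutorial.

```python
# Create a keyed list.   Perl: my %ages = (john => 3, mary => 4);
ages = {"john": 3, "mary": 4}

# Add or update a key.   Perl: $ages{vicky} = 7;
ages["vicky"] = 7

# Delete a key.          Perl: delete $ages{john};
del ages["john"]

# Test for a key.        Perl: exists $ages{mary}
has_mary = "mary" in ages

# List the keys, sorted. Perl: sort keys %ages
names = sorted(ages)

if __name__ == "__main__":
    print(names, has_mary)
```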

P.S. This fellow is a very interesting Taiwanese

PerlPhrasebook

http://wiki.python.org/moin/PerlPhrasebook#Declarationofahashoflists

This phrasebook contains a collection of idioms, various ways of accomplishing common tasks, tricks and useful things to know, in Perl and Python side-by-side.

As it says, this page focuses on side-by-side comparison
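The link above points at the phrasebook's entry on declaring a hash of lists. A minimal side-by-side sketch of that idiom, with made-up data rather than the phrasebook's own example, might look like this, with the Perl form in comments:

```python
# Declare a hash of lists.
# Perl: my %features = (perl => ['hash', 'array'], python => ['dict']);
features = {"perl": ["hash", "array"], "python": ["dict"]}

# Append to an existing list.
# Perl: push @{ $features{python} }, 'list';
features["python"].append("list")

# Append under a key that may not exist yet; Perl autovivifies the list.
# Perl: push @{ $features{ruby} }, 'Hash';
features.setdefault("ruby", []).append("Hash")

# Read one element.
# Perl: $features{perl}[0]
first_perl_feature = features["perl"][0]

if __name__ == "__main__":
    print(features, first_perl_feature)
```

The one real difference is the missing-key case: Perl creates the inner list on first use, while Python needs setdefault (or collections.defaultdict) to do the same.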

python 与 perl 的对比学习（Part I）& II (Learning Python and Perl by comparison, Parts I and II)

http://remonstrate.wordpress.com/2009/07/25/python-%E4%B8%8E-perl-%E7%9A%84%E5%AF%B9%E6%AF%94%E5%AD%A6%E4%B9%A0/

http://remonstrate.wordpress.com/2009/07/26/python-%e4%b8%8e-perl-%e7%9a%84%e5%af%b9%e6%af%94%e5%ad%a6%e4%b9%a0%ef%bc%88part-ii%ef%bc%89/

These read more like notes from hands-on use. http://remonstrate.wordpress.com/category/programming-languages/perl/ also discusses using SWIG

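Friday, February 26, 2010

Installing and managing Perl modules: cpanplus/cpanp

perl 學習手札 (Perl Study Notes) is well written, but its information on cpanplus on win32 needs updating

After installing ActiveState Perl, the cpanp command is available from the command line; use its m command to search for modules

It covers more than ppm does: any module on CPAN can be found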

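Thursday, January 21, 2010

Programming the semantic web

This book is a decent introduction to the Semantic Web, but the choice of Python seems a bit... not very concise

Perhaps that is because it wants to show how to "use" the Semantic Web data that exists today

But I suspect that with the code removed, fewer than 100 pages would remain...

I actually rather wish it covered methods for extracting Semantic Web data from natural language...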
