http://opensecrets.pixnet.net/blog/post/27841494
http://azo-freeware.blogspot.com/2008/08/some-pdf-image-extract-14.html
If you are using Linux/Perl, see the following link:
http://nedbatchelder.com/blog/200712/extracting_jpgs_from_pdfs.html
Otherwise, you will need to consult some further references:
http://stackoverflow.com/questions/2693820/extract-images-from-pdf-without-resampling-in-python
http://www.jpedal.org/PDFblog/2010/04/understanding-the-pdf-file-format-how-are-images-stored/
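This also confirms that unless the embedded images are plain JPG files, the task of "extracting images from a PDF file" can be quite troublesome.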
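For the easy case, where an image is stored as an unmodified JPG inside a stream, a rough sketch of the byte-scanning idea might look like the following. The function name `extract_jpegs` and the 4-byte tolerance after the `stream` keyword are assumptions made for illustration. It does not parse the PDF structure at all, so it will miss images stored with other filters and may be fooled by unusual files.

```python
def extract_jpegs(data: bytes) -> list[bytes]:
    """Return raw JPEG payloads found inside PDF stream objects.

    Naive sketch: looks at every stream ... endstream span and keeps it
    only if it begins (almost) immediately with the JPEG start marker.
    """
    images = []
    pos = 0
    while True:
        start = data.find(b"stream", pos)
        if start < 0:
            break
        end = data.find(b"endstream", start)
        if end < 0:
            break
        body = data[start + len(b"stream"):end]
        # JPEG data starts with FF D8 and ends with FF D9.
        soi = body.find(b"\xff\xd8")
        eoi = body.rfind(b"\xff\xd9")
        # Allow a few bytes for the line break after the "stream" keyword
        # (the limit of 4 is an assumption, not something from the spec).
        if 0 <= soi <= 4 and eoi > soi:
            images.append(body[soi:eoi + 2])
        pos = end + len(b"endstream")
    return images


if __name__ == "__main__":
    import sys

    # Usage: python extract_jpegs.py input.pdf
    with open(sys.argv[1], "rb") as fh:
        found = extract_jpegs(fh.read())
    for index, jpg in enumerate(found):
        with open(f"image{index}.jpg", "wb") as out:
            out.write(jpg)
```

Images compressed with other filters, or stored as raw pixel data, would need a real PDF parser, which is where the trouble mentioned above begins.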
If Chinese text is involved, see the following link:
http://ccckmit.wikidot.com/pdf:streamcoding
The original PDF specification:
http://partners.adobe.com/public/developer/en/pdf/PDFReference16.pdf
A concise explanation of the PDF file format:
http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/
The key points are excerpted below:
b     closepath, fill, and stroke path.
B     fill and stroke path.
b*    closepath, eofill, and stroke path.
B*    eofill and stroke path.
BI    begin image.
BMC   begin marked content.
BT    begin text object.
BX    begin section allowing undefined operators.
c     curveto.
cm    concat. Concatenates the matrix to the current transform.
cs    setcolorspace for fill.
CS    setcolorspace for stroke.
d     setdash.
Do    execute the named XObject.
DP    mark a place in the content stream, with a dictionary.
EI    end image.
EMC   end marked content.
ET    end text object.
EX    end section that allows undefined operators.
f     fill path.
f*    eofill. Even/odd fill path.
g     setgray (fill).
G     setgray (stroke).
gs    set parameters in the extended graphics state.
h     closepath.
i     setflat.
ID    begin image data.
j     setlinejoin.
J     setlinecap.
k     setcmykcolor (fill).
K     setcmykcolor (stroke).
l     lineto.
m     moveto.
M     setmiterlimit.
n     end path without fill or stroke.
q     save graphics state.
Q     restore graphics state.
re    rectangle.
rg    setrgbcolor (fill).
RG    setrgbcolor (stroke).
s     closepath and stroke path.
S     stroke path.
sc    setcolor (fill).
SC    setcolor (stroke).
sh    shfill (shaded fill).
Tc    set character spacing.
Td    move text current point.
TD    move text current point and set leading.
Tf    set font name and size.
Tj    show text.
TJ    show text, allowing individual character positioning.
TL    set leading.
Tm    set text matrix.
Tr    set text rendering mode.
Ts    set super/subscripting text rise.
Tw    set word spacing.
Tz    set horizontal scaling.
T*    move to start of next line.
v     curveto.
w     setlinewidth.
W     clip.
y     curveto.

TABLE 1: PDF Page Markup Operators (Note: in the original article, equivalent PostScript operators are in boldface.)
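To make the operator list more concrete, here is a small sketch that labels the operators in an already decompressed page content stream. PDF content streams are postfix: operands come first, then the operator. The names `OPERATORS` and `annotate` are made up for this example, the dictionary holds only a handful of entries from Table 1, and the whitespace split is a deliberate simplification that would break on string operands containing spaces.

```python
# A small subset of Table 1; extend as needed.
OPERATORS = {
    "q": "save graphics state",
    "Q": "restore graphics state",
    "cm": "concatenate the matrix to the current transform",
    "Do": "execute the named XObject",
    "BT": "begin text object",
    "ET": "end text object",
    "Tf": "set font name and size",
    "Td": "move text current point",
    "Tj": "show text",
    "re": "rectangle",
    "f": "fill path",
}


def annotate(content: str) -> list[tuple[list[str], str, str]]:
    """Group tokens into (operands, operator, description) triples.

    Naive sketch: splits on whitespace, so it is only suitable for
    simple streams without spaces inside (...) strings.
    """
    result = []
    operands: list[str] = []
    for token in content.split():
        if token in OPERATORS:
            result.append((operands, token, OPERATORS[token]))
            operands = []
        else:
            # Anything not recognised as an operator is kept as an operand.
            operands.append(token)
    return result


if __name__ == "__main__":
    # A typical sequence for painting an image XObject named /Im1.
    for operands, op, meaning in annotate("q 100 0 0 100 50 50 cm /Im1 Do Q"):
        print(" ".join(operands), op, "#", meaning)
```

A sequence such as `q ... cm /Im1 Do Q` is the usual sign that a page paints an image XObject, which is useful to know when hunting for images inside a PDF.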