I was recently faced with the task of scanning a lot of documents and wanted to preserve them as multi-page PDFs. Converting the raw JPEGs to PDFs proved quite easy using ImageMagick's convert, which I learned about from http://stackoverflow.com/questions/8955425/how-can-i-convert-a-series-of-images-to-a-pdf-from-the-command-line-on-linux. But the file sizes were really large when using convert to create multi-page documents. A little bit of googling turned up pdfunite, which was perfect: I was able to create reasonably sized multi-page PDFs from a set of single-page PDFs :)
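The workflow ends up looking roughly like this sketch (the filenames are made up for illustration; convert comes from ImageMagick and pdfunite from poppler-utils):

```shell
# Convert each scanned JPEG into its own single-page PDF.
for f in scan-*.jpg; do
    convert "$f" "${f%.jpg}.pdf"
done

# Merge the single-page PDFs into one multi-page PDF.
# Merging with pdfunite keeps the output size reasonable,
# unlike handing all the JPEGs to convert in one go.
pdfunite scan-*.pdf document.pdf
```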
Thursday, October 04, 2012
Wednesday, January 18, 2012
Naruto downloader from mangareader.net
After reading Naruto up to chapter 520 (courtesy naruto with elisp) I was eager to read the rest. As of today the latest chapter is 570. I found mangareader.net after a bit of googling and was busy reading my way through the chapters. But, as before, reading in the browser was not to my taste; currently mcomix is the comic reader of my choice. So, I set about writing a script to automatically download the remaining chapters from mangareader.net. I don't know if it's wrong to do so; the site doesn't have any terms of use :|
So, I let it rip and now I'm busy reading... :)
import re
from urllib2 import urlopen
from zipfile import ZipFile, ZIP_DEFLATED
from xml.dom.minidom import parseString

def get_info(line, alt_regex):
    # Parse the <a href=...><img .../></a> snippet around the page image
    # and extract the next-page link, image URL, chapter and page numbers.
    try:
        line = line[:line.index('</a>')] + '</a>'
        line = line[line.index('<a href'):]
        dom = parseString(line)
        info = {}
        a = dom.getElementsByTagName('a')[0]
        info['next'] = a.getAttribute('href')
        img = a.getElementsByTagName('img')[0]
        info['img_url'] = img.getAttribute('src')
        info['img_ext'] = info['img_url'][info['img_url'].rindex('.') + 1:]
        alt = img.getAttribute('alt')
        m = re.search(alt_regex, alt)
        info['chapter'] = int(m.group(1))
        info['page'] = int(m.group(2))
        dom.unlink()
        return info
    except Exception as e:
        print('[ERROR] %s' % line)
        print('[ERROR] %s' % e)

def get_image(url):
    try:
        f = urlopen(url)
        b = f.read()
        f.close()
        return b
    except Exception as e:
        print('[ERROR] %s' % e)

def get_chapter(url_prefix, url_suffix, title, chapter=1):
    need_more = True
    alt_regex = re.compile(r'%s (\d+) - Page (\d+)' % title)
    cbz = ZipFile('%03d.cbz' % chapter, 'w', ZIP_DEFLATED)
    url = '%s%s' % (url_prefix, url_suffix)
    try:
        while need_more:
            f = urlopen(url)
            lines = f.readlines()
            f.close()
            line = filter(lambda x: x.find('id="img"') != -1, lines)[0]
            info = get_info(line, alt_regex)
            need_more = info['chapter'] == chapter
            if need_more:
                cbz.writestr('%02d.%s' % (info['page'], info['img_ext']),
                             get_image(info['img_url']))
                url = '%s%s' % (url_prefix, info['next'])
            else:
                # new chapter: close the current archive and start a new one
                cbz.close()
                chapter = info['chapter']
                need_more = True
                cbz = ZipFile('%03d.cbz' % chapter, 'w', ZIP_DEFLATED)
    except IndexError:
        pass  # no id="img" line found, so we are past the last page

get_chapter('http://www.mangareader.net', '/naruto/521', 'Naruto')
Friday, January 13, 2012
Project Euler Problem 1 functionally
I'm trying to learn a bit of elisp (see Luhn's algorithm in elisp). So, to exercise my elisp some more, I tried Project Euler's simplest problem, Problem 1. I cracked open emacs and tried out a few things, gave up, and went into hibernation. A few weeks later I remembered this problem and had another go at it. I thought I nailed it...
After a bit of googling I found out that elisp only allows a few hundred levels of recursion by default (the max-lisp-eval-depth limit) :(. So, I decided to take the plunge into clisp and started downloading it. Meanwhile I had some old version of Scala available and tried this
Wow, that went well, and by the time I had googled the syntax for Scala etc., clisp had finished downloading, and here goes
https://gist.github.com/fc-unleashed/86ecd9aee76dcd8ec1decb3f15b84d6d
(defun num_check (x) (if (or (eq (mod x 3) 0) (eq (mod x 5) 0)) x 0))
(defun sum (x) (if (eq x 3) 3 (+ (num_check x) (sum (- x 1)))))
(sum 3)
3
(sum 5)
8
(sum 10)
33
(sum 1000)
Debugger entered--Lisp error: (error "Lisp nesting exceeds `max-lisp-eval-depth'")
Welcome to Scala version 2.9.0.1 (Java HotSpot(TM) Server VM, Java 1.6.0_20).
Type in expressions to have them evaluated.
Type :help for more information.

scala> def num_check(n: Int): Int = if ( (n % 3) == 0 || (n % 5) == 0 ) n else 0
num_check: (n: Int)Int

scala> def do_sum(x: Int): Int = if ( x == 3 ) 3 else num_check(x) + do_sum(x - 1)
do_sum: (x: Int)Int

scala> do_sum(10)
res0: Int = 33

scala> do_sum(100)
res1: Int = 2418

scala> do_sum(1000)
res2: Int = 234168
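For comparison, here is a sketch of the same computation as a plain loop in Python, which sidesteps recursion-depth limits entirely. (One caveat worth noting: Problem 1 asks for the multiples of 3 or 5 below 1000, so a loop that stops at 999 gives 233168, while the recursive do_sum(1000) also counts 1000 itself.)

```python
def euler1(limit):
    # Sum of all multiples of 3 or 5 strictly below `limit`.
    # Iterative, so no recursion-depth limit applies.
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)

print(euler1(10))    # 23
print(euler1(1000))  # 233168
```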