Rough scripts [en] [Moved]

Published on 09.29.2007 by Toni

Page moved to http://r.outlyer.net/hacks:rough

These are quick shell scripts, functions and aliases I use. This section needs some cleanup; for now I'll just list them here:

Get ed2k links from an HTML page; first save the page to disk, e.g. page.html, and do:
grep -o 'ed2k://[^"]*' page.html

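Note that the quotes matter: the pattern stops at the first double quote, so it assumes links written as <a href="ed2k...">. If the page uses single-quoted attributes instead (<a href='ed2k...'>, which is equally valid HTML), invert the quotes: wrap the pattern in double quotes and exclude ' rather than ".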
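
If you don't want to check which quote style a page uses, a single pattern can stop at either one. This is only a sketch: ed2k_links is a made-up helper name, and a link whose file name contains a literal apostrophe would be cut short at that character.

```shell
# Sketch: list ed2k links whether the href uses double or single quotes.
# ed2k_links is a hypothetical helper name, not something from the page above.
ed2k_links() {
	# The bracket expression excludes both " and ', so the match ends
	# at whichever quote closes the attribute.
	grep -o "ed2k://[^\"']*" "$@"
}
```

Use it the same way as the plain grep, e.g. ed2k_links page.html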

Site specific

Scripts / filters tied to a specific site.

Easynews

Extract download links. This works better with files put in a zipmanager slot and the slot displayed as a list. First save the page to disk (e.g. to zipmanager.html) and then:
grep -Eo 'https?://downloads\.[^"]*' zipmanager.html > urls

Generate an MD5SUM file from a list of urls (EasyNews' web interface urls include the md5sum hash so the integrity of the download can be checked easily).
I'm assuming you have the list of URLs in a file named urls as generated above.

This one is a bit trickier (and it has stuff that certainly can be done in a nicer way), so it's better put in a file. Save this to e.g. ~/bin/enewsmd5 and make it executable (chmod +x ~/bin/enewsmd5):

#!/bin/bash
# Reads URLs on stdin, writes "MD5 *filename" lines on stdout.
# 1st sed : place MD5 and filename in the correct order
# 2nd sed : transform url encodings in character escapes
# echo -e : expand those escapes; quoting $(cat) keeps one line per file
grep 'downloads\.' |
	sed -r 's#^https?.*news/./././([a-z0-9A-Z]{32}).*/(.*)#\1 *\2#' |
	sed -r 's/%([0-9a-fA-F]{2})/\\x\1/g' |
	echo -e "$(cat)"
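
To make the first sed easier to follow, here is a sketch of it run on a single URL. The URL is invented: host, path and hash are assumptions chosen only to fit the shape the pattern expects (.../news/x/y/z/<32 alphanumeric characters>.../<file name>), and enews_reorder is a hypothetical name. Real Easynews URLs may look different.

```shell
# Invented URL, shaped to match the pattern; not a real download link.
url='https://downloads.example.invalid/news/1/2/3/0123456789abcdef0123456789abcdef.iso/My%20File.iso'

# Hypothetical helper: only the reordering step, written with sed -E.
enews_reorder() {
	# Group 1 = the 32 characters right after news/x/y/z/ (the MD5)
	# Group 2 = everything after the last slash (the file name)
	sed -E 's#^https?.*news/./././([a-z0-9A-Z]{32}).*/(.*)#\1 *\2#'
}

printf '%s\n' "$url" | enews_reorder
# prints: 0123456789abcdef0123456789abcdef *My%20File.iso
```

The file name is still percent-encoded at this point; turning %20 back into a space is the job of the second sed and the final echo -e.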

Generate your md5 file with:
~/bin/enewsmd5 < urls > MD5SUM
And check the files' integrity with:
md5sum -c MD5SUM
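
The "hash, space, asterisk, file name" layout the script produces is the line format md5sum -c reads (the asterisk marks binary mode). A minimal sketch that builds such a line by hand and checks it; sample.bin and the temporary directory are made up for the demo:

```shell
# Sketch: write one MD5SUM line by hand and let md5sum verify it.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/sample.bin"

# Line format: <32 hex digits>, a space, '*', then the file name
hash=$(md5sum "$workdir/sample.bin" | cut -d' ' -f1)
printf '%s *%s\n' "$hash" sample.bin > "$workdir/MD5SUM"

# Names in MD5SUM are relative, so run the check from inside the directory
check_result=$(cd "$workdir" && md5sum -c MD5SUM)
echo "$check_result"
```

The same applies to the generated MD5SUM: run md5sum -c from the directory that holds the downloaded files.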

Note that you could also chain it all together, starting from the zipmanager.html file:

grep -Eo 'https?://downloads\.[^"]*' zipmanager.html | \
    ~/bin/enewsmd5 | md5sum -c -
