Displaying data from Ungdomar.se on your desktop
9/04/2009
If you’re not a member of the Swedish community ungdomar.se or one of the other sites from the same company (e.g. gameplayer.se), this article will be of very little use to you as written. However, if you are a member of said community, I will show you how to get data from there onto your desktop using a little bit of Bash magic combined with the power of Conky. Even if you don’t care about ungdomar.se, the same techniques could be used to fetch weather information, the TV guide, stock prices, or pretty much anything else.
Summary:
I’m going to hold your hand as we build a script to fetch data from a website and then display it on the desktop using conky. If you already know all about this and just want to get to the script, jump straight down to the end of the post to find the download link.
- Prerequisites
- Data structure
- Logging in and downloading the index page
- Fetching the relevant data
- Putting it together
- Setting up Conky
- The result
- Download the script
Prerequisites
Before we begin scripting, make sure you have installed the following:
- Wget
- Conky
If you’re missing one or both, but have the apturl package installed, click on the relevant link next to its name to have your package manager automatically download and install it. If you don’t have apturl or don’t use apt, do a search from the package manager itself and I’m sure you’ll find it. Wget and Conky are both included in pretty much every distribution’s repository.
Installing with apt:
sudo apt-get install wget conky
Installing with yum:
yum install wget conky
Installing with pacman:
pacman -S wget conky
Installing with Zypper:
zypper install wget conky
Data structure
Now we need to take a look at what data we want to extract from the website. I’m constructing this script to fetch data from ungdomar.se but this same technique could be used for pretty much any website data. In this case we want to get the numbers below those icons. If I don’t have any unread messages or whatnot, the icons aren’t there. If we take a look at the markup, it looks like this:
<td style="padding-left: 11px;"><a href="user_guestbook.php"><img src="/gfx/guestbook_comment_unread.png" alt="" title="Min gästbok" height="14" width="16"></a></td>
<td style="padding-left: 11px;"><a href="forum.php?view=3"><img src="/gfx/forum_comment_unread.png" alt="" title="Du har 1 ny forumkommentar" height="12" width="16"></a></td>
<td style="padding-left: 11px;"><a href="user_images.php?user_id=162005&func=display_new_comments"><img src="/gfx/folder_comment_unread.png" alt="" title="Du har 1 ny albumkommentar" height="12" width="17"></a></td>
<td style="padding-left: 11px;"><a href="forum_favorites.php"><img src="/gfx/forum_watch_unread.png" alt="" title="Du har 1 uppdaterad tråd i "bevakade trådar"" height="13" width="15"></a></td>
<td style="padding-left: 11px;" rowspan="2"><img src="/gfx/menu_separator.gif" alt=""></td>
</tr>
<tr>
<td align="center"><a class="page_login_text" href="pm.php" title="Du har 1 oläst pm."><span class="count">1</span></a></td>
<td style="padding-left: 11px;" align="center"><a class="page_login_text" href="user_guestbook.php" title="Min gästbok"><span class="count">1</span></a></td>
<td style="padding-left: 11px;" align="center"><a class="page_login_text" href="forum.php?view=3" title="Du har 1 ny forumkommentar"><span class="count">1</span></a></td>
<td style="padding-left: 11px;" align="center"><a class="page_login_text" href="user_images.php?user_id=162005&func=display_new_comments" title="Du har 1 ny albumkommentar"><span class="count">1</span></a></td>
<td style="padding-left: 11px;" align="center"><a class="page_login_text" href="forum_favorites.php" title="Du har 1 uppdaterad tråd i "bevakade trådar""><span class="count">1</span></a></td>
Remember what the data we’re looking for looks like, and let’s move on to logging in and downloading the index page.
Logging in and downloading the index page
Like I said, we’re going to use wget to access the data. That can be done simply by doing:
MY_USERNAME=username
MY_PASSWORD=password # needs to be urlencoded, this can be done at http://lajm.eu/emil/dump/stringfunctions.php
LOGIN_DATA="action=login&login_nick=$MY_USERNAME&login_pwd=$MY_PASSWORD"
wget --quiet --save-cookies kakburk --keep-session-cookies --post-data "$LOGIN_DATA" --user-agent 'Firefox' -O um.htm http://ungdomar.se/index.php
We get $LOGIN_DATA by looking at the login form on the main page: these are the fields the form submits when you press the button. If your password contains any special characters, you first need to urlencode it. So if your password is Pf9RusOrpGo@, it will be Pf9RusOrpGo%40 once urlencoded. I’ve looked at ways to automate this, but none of the Bash approaches I tried gave exactly the same result as PHP’s urlencode() function, so they didn’t work for me.
Also, there’s a slight problem with this login script. For some baffling reason the developers have decided to put a captcha not only in the registration form, but also in the login form if you are logging in from an IP address you’ve never logged in from before. In other words, the script will only work beautifully once you’ve logged in normally from that IP using a web browser.
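If you’d rather not paste your password into a web page, here is a minimal sketch of a shell function that aims to mimic PHP’s urlencode() for plain ASCII input: letters, digits and -_. are left alone, a space becomes +, and everything else is percent-encoded. The function name urlencode is made up for this example, and it has not been checked against the site’s login with non-ASCII passwords, so treat it as a starting point rather than a drop-in fix.

```shell
# Hypothetical helper, not part of the original script.
# Percent-encodes its first argument roughly the way PHP's urlencode() does.
# The body runs in a subshell ( ... ) so the locale change stays local.
urlencode() (
    LC_ALL=C                   # work byte by byte
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}            # everything after the first byte
        c=${s%"$rest"}         # the first byte itself
        s=$rest
        case $c in
            [a-zA-Z0-9._-]) out=$out$c ;;                         # safe as-is
            ' ')            out=$out+ ;;                          # space -> +
            *)              out=$out$(printf '%%%02X' "'$c") ;;   # -> %XX
        esac
    done
    printf '%s\n' "$out"
)

MY_PASSWORD=$(urlencode 'Pf9RusOrpGo@')
echo "$MY_PASSWORD"    # prints Pf9RusOrpGo%40
```

With something like this in place you could keep the plain password in the script and build LOGIN_DATA from the encoded value.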
Fetching the relevant data
Now that we’ve downloaded the index page, we need a way to filter out those numbers that we need. I’ve chosen to do this in a function that takes a single argument. This argument decides which number to grab. So if I only want the number of unread guestbook entries I have, I would run:
./umscript.sh guestbook
Now for the actual function:
# Extracts the number of unread posts/comments, falls back to 0 when
# nothing is found, and prints the result
getNumber ()
{
    case "$1" in
        pm)
            # Extract the number of PMs
            PM_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*ol.st.*pm.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${PM_NUMBER:=0}"
            ;;
        blog)
            # Extract the number of blog comments
            BLOGG_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*bloggkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${BLOGG_NUMBER:=0}"
            ;;
        quotes)
            # Extract the number of forum comments
            FORUM_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*forumkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${FORUM_NUMBER:=0}"
            ;;
        comments)
            # Extract the number of image comments
            IMAGE_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*albumkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${IMAGE_NUMBER:=0}"
            ;;
        guestbook)
            # Extract the number of guestbook posts
            GUESTBOOK_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*g.stbok.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${GUESTBOOK_NUMBER:=0}"
            ;;
        subscribed)
            # Extract the number of unread subscribed threads
            SUBSCRIBED_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*bevakade.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${SUBSCRIBED_NUMBER:=0}"
            ;;
        *)
            echo "Usage: $0 {pm|blog|quotes|comments|guestbook|subscribed}"
            exit 1
            ;;
    esac
}
It looks long, but it’s quite simple. It looks at the argument it’s given and goes to the relevant case. There it reads the file um.htm (that’s what we saved the index page as when we logged in) and passes the content to sed, which uses a regular expression to find the number associated with each item. It then saves that number to a variable and prints it to the screen.
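To see the regular expression at work without logging in, here is a small sketch that feeds one line of markup to the same kind of sed expression. The helper name extract_count and the sample line are made up for this example (the sample just follows the shape of the markup shown earlier), and the keyword is pure ASCII so the locale doesn’t matter here.

```shell
# Hypothetical helper: read HTML on stdin and print the number inside the
# "count" span on the line that mentions KEYWORD, or 0 if there is none.
extract_count() {
    keyword=$1
    number=$(sed -n "s/.*${keyword}.*count..\([0-9]*\).*/\1/p")
    echo "${number:-0}"
}

# Made-up sample line in the same shape as the site's markup.
sample='<td align="center"><a class="page_login_text" href="forum.php?view=3" title="Du har 12 nya forumkommentarer"><span class="count">12</span></a></td>'

printf '%s\n' "$sample" | extract_count forumkommentar   # prints 12
printf '%s\n' "$sample" | extract_count bloggkommentar   # prints 0
```

The two dots after count swallow the closing quote and bracket of the span tag, and the group in parentheses is the only part that survives the substitution. When no line matches, sed prints nothing and the fallback to 0 kicks in.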
Putting it together
Now we’re getting somewhere. In fact, we actually have a working version right now. However, we’re going to make some changes for the sake of efficiency first. We don’t really need to download the index page every time the script is run. Let’s say you run it automatically once every two minutes for each of the six counters. That’s 3 downloads per minute on average, which feels like a waste. So instead we’re going to download the page only if the copy we already have is more than 100 seconds old. To do this we need some kind of directory to save it in, so I’m going to have the script create a folder called ~/.umscript (only if it doesn’t already exist) where I’ll save the cookies and um.htm. The script will now look like this:
#!/bin/bash
# Ungdomar.se script
#
# This script prints how many PMs, blog comments, guestbook posts, etc.
# are unread.
#-------------------------------------------------
# Write your username and password here. Needs to be urlencoded.
# http://lajm.eu/emil/dump/stringfunctions.php
MY_USERNAME=username
MY_PASSWORD=password
REFRESH_TIME=100 # allowed age for um.htm in seconds
TEMP_DIR=~/.umscript
# No editing beyond this point.
#-------------------------------------------------
# Checks to see if TEMP_DIR exists and is writable.
dirCheck ()
{
    if [ ! -d "$TEMP_DIR" ]; then
        mkdir "$TEMP_DIR"
    fi
    if [ ! -w "$TEMP_DIR" ]; then
        echo "$TEMP_DIR is not writable. Run chmod a+w $TEMP_DIR as root."
        exit 1
    fi
}
fileDownloader ()
{
    # Download if um.htm doesn't exist, or if um.htm's timestamp is
    # less than current timestamp - $REFRESH_TIME
    if [ ! -e "$TEMP_DIR/um.htm" ] || [ "$(date -r "$TEMP_DIR/um.htm" +%s)" -lt "$(($(date +%s) - REFRESH_TIME))" ]; then
        wget --quiet --save-cookies kakburk --keep-session-cookies --post-data "$LOGIN_DATA" --user-agent 'Firefox' -O um.htm http://ungdomar.se/index.php
    fi
}
# Extracts the number of unread posts/comments, falls back to 0 when
# nothing is found, and prints the result
getNumber ()
{
    case "$1" in
        pm)
            # Extract the number of PMs
            PM_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*ol.st.*pm.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${PM_NUMBER:=0}"
            ;;
        blog)
            # Extract the number of blog comments
            BLOGG_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*bloggkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${BLOGG_NUMBER:=0}"
            ;;
        quotes)
            # Extract the number of forum comments
            FORUM_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*forumkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${FORUM_NUMBER:=0}"
            ;;
        comments)
            # Extract the number of image comments
            IMAGE_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*albumkommentar.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${IMAGE_NUMBER:=0}"
            ;;
        guestbook)
            # Extract the number of guestbook posts
            GUESTBOOK_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*g.stbok.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${GUESTBOOK_NUMBER:=0}"
            ;;
        subscribed)
            # Extract the number of unread subscribed threads
            SUBSCRIBED_NUMBER=$(LANG=sv_SE.iso88591 sed -n 's/.*bevakade.*count..\([0-9]*\).*/\1/p' um.htm)
            echo "${SUBSCRIBED_NUMBER:=0}"
            ;;
        *)
            echo "Usage: $0 {pm|blog|quotes|comments|guestbook|subscribed}"
            exit 1
            ;;
    esac
}
#-------------------------------------------------
LOGIN_DATA="action=login&login_nick=$MY_USERNAME&login_pwd=$MY_PASSWORD"
dirCheck
# Changes working directory to TEMP_DIR
cd "$TEMP_DIR" || exit 1
# Downloads the first page you get when you log in
fileDownloader
# Extract the number of unread posts/comments
getNumber "$1"
By now you should be able to read through this script and understand what it does. The dirCheck function checks whether ~/.umscript exists and creates it if it doesn’t; it also warns the user if the directory isn’t writable. fileDownloader is the same basic login action that we wrote before, only I’ve put it into a function and made it run only if um.htm doesn’t exist or is older than REFRESH_TIME seconds.
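If the age check is hard to read as a one-liner, here is the same idea pulled out into a small function of its own. This is only a sketch: the name is_stale is made up for this example, and it assumes GNU date, where -r FILE reports the file’s last modification time.

```shell
# Hypothetical helper: succeeds (exit status 0) if FILE is missing or was
# last modified more than MAX_AGE seconds ago.
# Assumes GNU date, where -r FILE reports the file's modification time.
is_stale() {
    file=$1
    max_age=$2
    [ -e "$file" ] || return 0          # no file yet counts as stale
    modified=$(date -r "$file" +%s)     # mtime in seconds since the epoch
    now=$(date +%s)
    [ $(( now - modified )) -gt "$max_age" ]
}

if is_stale "$HOME/.umscript/um.htm" 100; then
    echo "um.htm is missing or too old, time to download again"
else
    echo "um.htm is fresh enough, reusing it"
fi
```

Note that the timestamp that matters is the one on um.htm itself. The directory’s own timestamp only changes when files are added or removed, not when an existing file is overwritten.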
Setting up Conky
Setting up Conky to suit your exact needs can be a pretty daunting task, one that I will leave for you to tackle on your own. However, I will show you how I got the results from our script onto my desktop.
The layout of Conky is decided by a file called .conkyrc which can be found in your home directory. By default I believe it shows you some system information. In my setup I’ve gotten rid of all that and replaced it with the following:
#avoid flicker
double_buffer yes
#own window to run 2 or more conkys simultaneously
own_window yes
own_window_type normal
own_window_transparent yes
own_window_hints undecorated,below,sticky,skip_taskbar,skip_pager
own_window_title mi_conky
#borders
draw_borders no
border_margin 1
stippled_borders 0
#shades
draw_shades no
draw_outline no
#position
gap_x 0
gap_y 0
alignment top_left
#behaviour
update_interval 1
background no
#colour
default_color 9f907d
#default_shade_color 000000
own_window_colour 3d352a
#font
use_xft yes
xftfont bauhaus:pixelsize=9
#to prevent window from moving
use_spacer none
minimum_size 1435 0
# stuff after 'TEXT' will be formatted on screen
TEXT
${alignr}${color white}Citeringar: ${color}${execi 120 ./Scripts/um.sh quotes} ${color white}PM: ${color}${execi 120 ./Scripts/um.sh pm} ${color white}Uppdaterade bevakade trådar: ${color}${execi 120 ./Scripts/um.sh subscribed} ${color white}Gästboksinlägg: ${color}${execi 120 ./Scripts/um.sh guestbook} ${color white}Bloggkommentarer: ${color}${execi 120 ./Scripts/um.sh blog} ${color white}Bildkommentarer: ${color}${execi 120 ./Scripts/um.sh comments}
Everything below TEXT is printed to the screen, so you can see that every 120 seconds I run the command “./Scripts/um.sh <argument>” once for each counter. Everything above TEXT is settings of various kinds. This setup is designed for a resolution of 1440×900, so if you have a different resolution you might want to change the line “minimum_size 1435 0” to something that works better with your resolution.
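Before wiring the script into Conky it can be handy to run all six arguments from a terminal and check that each one prints a number. A minimal sketch, assuming the script accepts the arguments described in this article (the function name print_counters is made up for this example):

```shell
# Hypothetical helper: call SCRIPT once per counter and print a small table.
# Assumes SCRIPT accepts the same arguments as the script in this article.
print_counters() {
    script=$1
    for item in pm blog quotes comments guestbook subscribed; do
        printf '%-12s %s\n' "$item" "$("$script" "$item")"
    done
}
```

With my setup I would call it as print_counters ./Scripts/um.sh from my home directory. If every row shows a number, Conky should be able to display the same values.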
To see the changes you make to .conkyrc you need to restart conky. You can do this by running:
killall conky
And then:
conky &
For more information on configuring Conky, go here.
The result
All our hard work has led up to this moment. We now have a working script that fetches data from a website, and we have Conky to show it to us on the desktop. Want to see what it looks like?
Download the script
As usual I take no responsibility if you end up frying your hard drive or pissing off your ISP. If you have any suggestions for how to improve the script, have a question, or just want to say something, please leave a comment.
Umscript.sh


