
My Train Orb (also Scraping Munich Train Schedules)

In an effort to optimize my commute, I wanted a glowing orb whose color indicated when I should start walking to the train stop. My train comes fairly frequently (every 10 minutes). The glowing orb lets me tell at a glance if I should be getting my stuff together, walking out the door, or running.

The glowing orb part is easy.
  • Arduino
  • Tri-color LED
  • Ping Pong ball
  • Mix with Python appropriately.
I'll post the code for the orb some other time. The hard part was actually getting the train schedule information. MVV (the mass transit authority in Munich) only provides PDFs for train schedules.



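To parse those, I used pdftohtml. With its -xml option, pdftohtml generates XML that gives the absolute position of every text element on the schedule. Since the times are laid out in a grid (each row is an hour, and the numbers to its right are the minutes past that hour), it wasn't too hard to parse that into something useful.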
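
To make the grid idea concrete, here is a minimal sketch in Python 3 using only the standard library. The XML fragment is made up to resemble what pdftohtml -xml emits (a page element containing text elements with top and left attributes); the coordinates, times, column ranges, and the parse_grid helper are all assumptions for illustration. It matches each minute entry to the nearest hour label vertically, which is a simpler technique than the row-band approach in the full script.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like `pdftohtml -xml` output. The coordinates
# and times are invented for illustration, not taken from a real MVV schedule.
SAMPLE_XML = """
<pdf2xml>
  <page number="1" position="absolute" top="0" left="0" height="1263" width="892">
    <text top="100" left="34" width="10" height="11" font="0">5</text>
    <text top="98" left="60" width="10" height="9" font="1">04</text>
    <text top="98" left="80" width="10" height="9" font="1">24</text>
    <text top="98" left="100" width="10" height="9" font="1">44</text>
    <text top="120" left="34" width="10" height="11" font="0">6</text>
    <text top="118" left="60" width="10" height="9" font="1">04</text>
    <text top="118" left="80" width="10" height="9" font="1">14</text>
  </page>
</pdf2xml>
"""

HOUR_COLUMN = (32, 38)     # assumed 'left' range of the hour labels
MINUTE_COLUMN = (38, 311)  # assumed 'left' range of the minute entries
ROW_TOLERANCE = 5          # assumed max vertical offset within one row


def parse_grid(xml_text):
    """Return {hour: [minutes]}, matching each minute cell to the nearest hour row."""
    page = ET.fromstring(xml_text).find('page')
    # Flatten every text element into (top, left, text).
    cells = [(int(t.get('top')), int(t.get('left')), (t.text or '').strip())
             for t in page.iter('text')]
    # Hour labels are the numeric cells inside the hour column.
    hours = [(top, int(text)) for top, left, text in cells
             if HOUR_COLUMN[0] < left < HOUR_COLUMN[1] and text.isdigit()]
    schedule = {hour: [] for _, hour in hours}
    for top, left, text in cells:
        if not (MINUTE_COLUMN[0] < left < MINUTE_COLUMN[1] and text.isdigit()):
            continue
        # Assign the minute to the hour label closest to it vertically.
        row_top, hour = min(hours, key=lambda h: abs(h[0] - top))
        if abs(row_top - top) <= ROW_TOLERANCE:
            schedule[hour].append(int(text))
    return {hour: sorted(minutes) for hour, minutes in schedule.items()}


print(parse_grid(SAMPLE_XML))  # {5: [4, 24, 44], 6: [4, 14]}
```

The full script that follows applies the same idea with BeautifulSoup and hard-coded column bounds for each day's table.
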
from bs4 import BeautifulSoup
import datetime
import re

_DIGITS_RE = re.compile(r'^\d+$')

# Schedule parameters: (page, hour column bounds, minute column bounds).
# Column bounds are ranges of the 'left' attribute in the pdftohtml XML.
_MON_THUR_SCHEDULE = (1, (32, 38), (37, 311))
_FRI_SCHEDULE = (1, (32, 38), (311, 556))
_SAT_SCHEDULE = (1, (32, 38), (556, 802))
_SUN_SCHEDULE = (2, (41, 49), (48, 830))
# Keyed by datetime.weekday(): 0 is Monday, 6 is Sunday.
_WEEK_SCHEDULE = {
    0: _MON_THUR_SCHEDULE,
    1: _MON_THUR_SCHEDULE,
    2: _MON_THUR_SCHEDULE,
    3: _MON_THUR_SCHEDULE,
    4: _FRI_SCHEDULE,
    5: _SAT_SCHEDULE,
    6: _SUN_SCHEDULE,
}


def _InRange(bounds):
    """Returns a matcher for attribute values strictly inside bounds."""
    return lambda x: x is not None and bounds[0] < int(x) < bounds[1]


class MvvSchedule(object):

    def __init__(self):
        with open('schedule.xml') as schedule_file:
            self.soup = BeautifulSoup(schedule_file.read(), 'html.parser')

    def _GetSchedule(self, page, hour_col, minutes_col):
        """Returns {hour: [minutes]} for one table on the given page."""
        soup = self.soup.find('page', number=str(page))

        hours = soup.findAll('text', left=_InRange(hour_col))
        # Get the row height for each hour to use as bounds.
        hours.sort(key=lambda x: int(x.get('top')))
        hour_bounds = [(int(upper['top']), int(lower['top']))
                       for upper, lower in zip(hours, hours[1:])]
        # We assume the bottom row is a single row high.
        last_top = int(hours[-1]['top'])
        hour_bounds.append((last_top, 2 * last_top - int(hours[-2]['top'])))
        # Shift each band up by 3 so entries level with the hour label fall
        # strictly inside the (exclusive) bounds.
        hour_bounds = [(upper - 3, lower - 3) for upper, lower in hour_bounds]

        schedule = {}
        for hour, bounds in zip(hours, hour_bounds):
            minutes = soup.findAll(
                'text', top=_InRange(bounds), left=_InRange(minutes_col))
            minutes = [int(m.string) for m in minutes
                       if m.string and _DIGITS_RE.match(m.string)]
            schedule[int(hour.string)] = minutes

        return schedule

    def GetTimeToNextTrain(self):
        """Returns a timedelta until the next train, or None if none are left."""
        now = datetime.datetime.now()
        schedule_params = _WEEK_SCHEDULE[now.weekday()]
        schedule = self._GetSchedule(*schedule_params)
        times = [datetime.datetime(now.year, now.month, now.day, hour, minute)
                 for hour, minutes in schedule.items() for minute in minutes]
        times.sort()
        stop_time = None
        # Pop from the latest departure backwards; the last one popped is the
        # earliest departure still in the future. The emptiness check avoids
        # an IndexError when every departure today is still ahead of us.
        while times and times[-1] > now:
            stop_time = times.pop()
        if stop_time is not None:
            return stop_time - now
        return None


if __name__ == '__main__':
    mvv = MvvSchedule()
    time_left = mvv.GetTimeToNextTrain()
    if time_left is None:
        print('No more trains today.')
    else:
        print('The next train arrives in %s.' % time_left)
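
GetTimeToNextTrain returns either a timedelta or None, so turning that into an orb color is a small lookup. The sketch below (Python 3) shows one way it could work; the thresholds, the colors, and the orb_color name are all hypothetical and would need tuning to the actual walk to the stop. It is not the orb code mentioned above.

```python
import datetime

# Hypothetical thresholds; adjust to how long your own walk takes.
PACK_UP_FROM = datetime.timedelta(minutes=8)  # 8+ min left: get stuff together
WALK_FROM = datetime.timedelta(minutes=5)     # 5 to 8 min left: walk out the door
RUN_FROM = datetime.timedelta(minutes=3)      # 3 to 5 min left: run

# Assumed (R, G, B) values for a tri-color LED.
OFF = (0, 0, 0)
GREEN = (0, 255, 0)
YELLOW = (255, 255, 0)
RED = (255, 0, 0)
BLUE = (0, 0, 255)


def orb_color(time_left):
    """Map the time until the next train to an (R, G, B) tuple."""
    if time_left is None:
        return OFF      # no more trains today
    if time_left >= PACK_UP_FROM:
        return GREEN    # plenty of time: start getting ready
    if time_left >= WALK_FROM:
        return YELLOW   # walk out the door
    if time_left >= RUN_FROM:
        return RED      # run
    return BLUE         # too late for this train: wait for the next one


print(orb_color(datetime.timedelta(minutes=6)))  # (255, 255, 0)
```

With a train every 10 minutes, the orb would cycle through these colors once per train. Sending the tuple to the Arduino is left out here.
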
