Not too long ago, we had a big army of snowfall marching in to attack our streets (and gas bills, if we all wished to avoid the accompanying twenty-below-zero wind chill). Seeing this pending attack, local schools did the responsible thing and canceled classes to keep their students safe. Some businesses, often those that cater to children, did the same.
Long story short, my school was waiting to call things off (superintendents from the various districts have phone conferences, discuss what the best plan is, and so forth). Tired of checking the local news station's website and tediously refreshing it by hand, I wrote a little web scraping tool in Python to help me along.
It’s nothing too fancy, and I’m sure I’d have done things differently if I were turning this into a full-fledged program. However, I think it’s still worth posting, even in its immature state, because someone may learn something from it, or someone may point out something I could have done differently, so that I too can learn.
import time
from urllib2 import urlopen

from lxml import html

URL = 'http://www.ksdk.com/weather/severe_weather/cancellations_closings/default.aspx'

# Every closing seen so far, keyed by type, so each one is announced only once.
closings = {
    'School': [],
    'Business': [],
}

def get_closings():
    """Downloads and parses the KSDK school and business cancellation page,
    returning only the data pertaining to closings that we need."""
    data = urlopen(URL).read()
    data = html.document_fromstring(data)
    try:
        data = data.get_element_by_id('schooclosings_dg')
    except KeyError:
        # No closings table on the page.
        return None
    # Keep every second cell, then pair the cells up as (entity, place_type).
    data = data.findall('./tr/td/table/tr/td')[1::2]
    return zip(*[iter(data)]*2)

def add_closings():
    """Maintains the list of closed schools and businesses and notifies only
    for new cancellations."""
    added = False
    count = {'School': 0, 'Business': 0}
    # The freshly scraped rows get their own name; `closings` is the
    # module-level record of everything already announced.
    latest = get_closings()
    if latest is None:
        print "No closings at this time."
    else:
        for entity, place_type in latest:
            entity = entity.text_content()
            place_type = place_type.text_content()
            # An empty type cell is treated as a school.
            if place_type == '':
                place_type = 'School'
            if entity not in closings[place_type]:
                added = True
                closings[place_type].append(entity)
                count[place_type] += 1
                print 'Adding new %s: %s' % (place_type, entity)
    if added:
        now = time.strftime('%I:%M %p', time.localtime())
        print '%s) %d schools (%d new) and %d businesses (%d new) canceled as of now.' % (
            now,
            len(closings['School']),
            count['School'],
            len(closings['Business']),
            count['Business']
        )
        print '=' * 78

# Poll the page every two minutes.
while True:
    add_closings()
    time.sleep(120)
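The least obvious line in the script is probably `zip(*[iter(data)]*2)` in `get_closings`, together with the `[1::2]` slice before it. Here is a minimal sketch of that idiom on its own, written for Python 3 (where `zip` returns an iterator, so it is wrapped in `list`). The `pair_up` helper and the sample cell values are made up for illustration. They are not taken from the KSDK page.

```python
def pair_up(cells):
    """Group a flat sequence into consecutive 2-tuples.

    Equivalent to zip(*[iter(cells)]*2): both arguments to zip are the
    SAME iterator, so each tuple consumes two consecutive items.
    A trailing unpaired item is silently dropped.
    """
    it = iter(cells)
    return list(zip(it, it))


# Hypothetical flattened table cells: the useful text sits at odd indexes.
cells = ['', 'Lincoln Elementary', '', '',
         '', 'Acme Daycare', '', 'Business']

wanted = cells[1::2]      # every second cell, starting at index 1
pairs = pair_up(wanted)   # (entity, place_type) tuples

# Mirror the script's rule: an empty type means 'School'.
labelled = [(name, kind or 'School') for name, kind in pairs]

for name, kind in labelled:
    print('%s: %s' % (kind, name))
```

The pairing relies on a single shared iterator. If `zip` were given two independent iterators over the same list, it would pair every item with itself.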