In this post, I’ll share three things:
- Using the os module to access and print directories,
- Using exceptions to handle runtime errors, and
- Using the urllib module to fetch data from a web page and store it as a string in a variable.
Starting with the os module: the example below prints a file name from the current directory and then shows its relative and absolute paths:
# python program located at: /Users/omkarb/Desktop
import os
## Example pulls filenames from a dir, prints their relative and absolute paths
def printdir(dir):
    filenames = os.listdir(dir)
    print filenames[1] #boarding pass
    print os.path.join(dir, filenames[1]) #./boarding pass
    print os.path.abspath(os.path.join(dir, filenames[1])) #/Users/omkarb/Desktop/boarding pass
printdir('./')
#explanation
#for filename in filenames:
#    print filename ## foo.txt
#    print os.path.join(dir, filename) ## dir/foo.txt (relative to current dir)
#    print os.path.abspath(os.path.join(dir, filename)) ## /home/nick/dir/foo.txt
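The snippets in this post use Python 2 syntax (the print statement). As a rough sketch, the commented-out loop above could look like this in Python 3. The function name list_paths is hypothetical and not part of the original example; it returns the rows instead of printing them so they are easy to reuse.

```python
import os

def list_paths(directory):
    """Return (name, relative path, absolute path) for each entry in directory."""
    rows = []
    for name in sorted(os.listdir(directory)):
        relative = os.path.join(directory, name)  # e.g. ./foo.txt
        rows.append((name, relative, os.path.abspath(relative)))
    return rows

if __name__ == "__main__":
    for name, relative, absolute in list_paths("."):
        print(name, relative, absolute)
```

Sorting the names is a choice made here for predictable output; os.listdir itself returns entries in arbitrary order.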
You can also use the commands module to run a command in the terminal. For example, this is how you run the pwd command:
import commands
import sys
## Run an external 'pwd' command --
## shows how to call an external program
def listdir():
    cmd = 'pwd' #present working directory
    print "Command to run:", cmd ## good to debug cmd before actually running it
    (status, output) = commands.getstatusoutput(cmd)
    if status: ## Error case, print the command's output to stderr and exit
        sys.stderr.write(output)
        sys.exit(status)
    print output ## Otherwise do something with the command's output
listdir() #prints /Users/omkarb/desktop
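The commands module exists only in Python 2; it was removed in Python 3, where subprocess.getstatusoutput offers the same (status, output) interface. Here is a minimal sketch of the same idea for Python 3, assuming a POSIX shell is available. The function name run_command is hypothetical.

```python
import subprocess
import sys

def run_command(cmd):
    """Run cmd through the shell and return its output; exit on failure."""
    print("Command to run:", cmd)  # good to debug cmd before actually running it
    status, output = subprocess.getstatusoutput(cmd)
    if status:
        # Error case: write the command's output to stderr and exit
        sys.stderr.write(output)
        sys.exit(status)
    return output

if __name__ == "__main__":
    print(run_command("pwd"))  # prints the present working directory
```

Returning the output rather than printing it inside the function leaves it to the caller to decide what to do with the result.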
If you know that a piece of code might raise an error, you can put it inside a try/except block as follows:
import sys
filename = 'line.txt'
try:
    ## Either of these two lines could throw an IOError, say
    ## if the file does not exist or the read() encounters a low level error.
    f = open(filename, 'rU')
    text = f.read()
    f.close()
except IOError:
    ## Control jumps directly to here if any of the above lines throws IOError.
    sys.stderr.write('problem reading:' + filename)
## In any case, the code then continues with the line after the try/except
The above code prints the error message because line.txt doesn’t exist on my computer.
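For comparison, here is a hedged Python 3 sketch of the same pattern. It assumes a with block is acceptable for closing the file and catches OSError, of which IOError is an alias in Python 3. The function name read_text is hypothetical.

```python
import sys

def read_text(filename):
    """Return the file's contents, or None if it cannot be opened or read."""
    try:
        # The with block closes the file even if read() fails.
        with open(filename, "r") as f:
            return f.read()
    except OSError as err:
        # Control jumps here if open() or read() raises OSError (IOError).
        sys.stderr.write("problem reading: %s (%s)\n" % (filename, err))
        return None
```

Calling read_text('line.txt') when the file is missing writes the message to stderr and returns None, so the caller can check the result and carry on.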
Next, we can read from a URL using the following code:
import urllib
## Given a url, try to retrieve it. If it's text/html,
## print its base url and its text.
def wget(url):
    ufile = urllib.urlopen(url) ## get file-like object for url
    info = ufile.info() ## meta-info about the url content
    if info.gettype() == 'text/html':
        print 'base url:' + ufile.geturl()
        text = ufile.read() ## read all its text
        # print text #prints text
wget('https://google.com')
However, that version doesn’t include any error handling. If the URL can’t be fetched for some reason, we can handle the failure as follows:
import urllib
## Given a url, try to retrieve it. If it's text/html,
## print its base url and its text.
def wget2(url):
    try:
        ufile = urllib.urlopen(url) ## get file-like object for url
        if ufile.info().gettype() == 'text/html':
            print 'base url:' + ufile.geturl()
            text = ufile.read() ## read all its text
            # print text #prints text
    except IOError:
        print 'problem reading url:', url
wget2('https://google.com')
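In Python 3, urlopen lives in urllib.request, read() returns bytes, and gettype() is replaced by get_content_type(). Below is a minimal sketch of wget2 under those assumptions. The function name fetch_html and the UTF-8 fallback (used when the server doesn’t declare a charset) are my own choices, not part of the original code.

```python
import urllib.request

def fetch_html(url):
    """Return the page text if url serves text/html, otherwise None."""
    try:
        with urllib.request.urlopen(url) as ufile:  # file-like object for url
            info = ufile.info()  # meta-info about the url content
            if info.get_content_type() != "text/html":
                return None
            print("base url:", ufile.geturl())
            # Assumption: fall back to UTF-8 when no charset is declared.
            charset = info.get_content_charset() or "utf-8"
            return ufile.read().decode(charset, errors="replace")
    except OSError:
        # urllib.error.URLError is a subclass of OSError.
        print("problem reading url:", url)
        return None
```

A call such as text = fetch_html('https://google.com') then leaves the page in a string, or None if the fetch failed or the content was not HTML.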
There’s also a simpler way to read a web page: the urllib.urlretrieve method downloads the page to a local temporary file and returns a (filename, headers) tuple:
import urllib
result = urllib.urlretrieve("http://wordpress.org/")
print open(result[0]).read()
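In Python 3 the same function is available as urllib.request.urlretrieve. Here is a short sketch, assuming the downloaded content can be decoded as UTF-8 (undecodable bytes are replaced). The function name download_text is hypothetical.

```python
import urllib.request

def download_text(url):
    """Download url to a local file and return that file's contents as text."""
    # urlretrieve returns (local filename, headers)
    local_path, headers = urllib.request.urlretrieve(url)
    # Assumption: the content is UTF-8 text.
    with open(local_path, encoding="utf-8", errors="replace") as f:
        return f.read()
```

For example, print(download_text("http://wordpress.org/")) would print the page source. This sketch has no error handling, so a failed download raises an exception.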
That’s all in this post 🙂
