Title: Recursive generators
Author: Sandro Tosi
Last modified: 2007-08-26

Since I lost a couple of days on this problem, I'm going to describe
how to write recursive generator functions.

The problem I needed to solve was to recursively find PDF files in a
directory tree.

Drawing on what I knew from other programming languages, I started by
writing this code:

import dircache  # Python 2 only: the module was removed in Python 3
import glob
import os

def find_pdf_files_in_dir_recursive(directory):
    for path in directory.split(os.pathsep):
        # check first in the dir passed as parameter
        for found_file in glob.glob(os.path.join(path, "*.pdf")):
            yield found_file
        # dircache output is sorted and cached
        for item in dircache.listdir(path):
            # if it's a dir, then go recursive on it
            if os.path.isdir(path+"/"+item):
                # the generator returned by this call is never iterated
                find_pdf_files_in_dir_recursive(path+"/"+item)

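but with it I received only the files in the first directory, the one
passed as the generator's parameter, and nothing from the rest of the
tree. Calling a generator function only creates a generator object;
since nothing iterates over it, its body never runs and whatever it
would yield is lost.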
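
The behaviour is easy to reproduce in isolation. Here is a minimal
sketch; count_down is a hypothetical function made up for this
illustration and is not part of my script:

```python
import types

def count_down(n):
    """Intended to yield n, n-1, ..., 1, but has the same bug as above."""
    if n > 0:
        yield n
        # This call only builds a generator object; nothing iterates
        # over it, so none of its values ever reach our caller.
        count_down(n - 1)

gen = count_down(3)
# calling the function runs none of its body, it just returns a generator
is_generator = isinstance(gen, types.GeneratorType)
# only the value yielded directly by the outermost call comes out
values = list(gen)
```

Here values holds just the first number, exactly like the PDF search
that returned only the files of the top directory.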

The solution is: LOOP OVER THE RECURSIVE CALL! It's somewhat
counterintuitive to me, but that's it:

def find_pdf_files_in_dir_recursive(directory):
    for path in directory.split(os.pathsep):
        # check first in the dir passed as parameter
        for match in glob.glob(os.path.join(path, "*.pdf")):
            yield match
        # dircache output is sorted and cached
        # join path and item, since the names returned
        # by listdir have the directory stripped off
        for subpath in [os.path.join(path, item) for item in dircache.listdir(path)]:
            # if it's a dir, then go recursive on it
            if os.path.isdir(subpath):
                # yield every item found in the recursive call!
                for subfile in find_pdf_files_in_dir_recursive(subpath):
                    yield subfile

And there you have your recursive generator.
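
For newer interpreters, the same idea can probably be written more
compactly. The following is only a sketch under some assumptions:
it needs Python 3.3 or later for "yield from", it uses os.listdir
with sorted() in place of dircache (which does not exist in Python 3),
it takes a single directory instead of an os.pathsep-separated list,
and the name find_pdf_files is made up for this example:

```python
import glob
import os

def find_pdf_files(directory):
    """Yield every *.pdf path under directory, parent directories first."""
    # files located directly in this directory
    for match in sorted(glob.glob(os.path.join(directory, "*.pdf"))):
        yield match
    # listdir returns bare names, so join them with the directory
    for item in sorted(os.listdir(directory)):
        subpath = os.path.join(directory, item)
        if os.path.isdir(subpath):
            # "yield from" loops over the recursive call for us
            # and re-yields every item it produces
            yield from find_pdf_files(subpath)
```

It is used like any other generator, for example
"for pdf in find_pdf_files('/some/dir'): print(pdf)", where the
directory name is just a placeholder.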