Recursion with Python

Python recursion

I had a bug that was difficult to trace down. I had a double list that I removed some of the elements using the remove() method, however, not all of the elements were removed. In fact, the code was removing only every other element. The bug turned out to be the call to remove() would shorten the list by one, and thus cause a skipping effect. For example:

some_lst = [['a','b'],['c','d'],['e','f']]

for i in some_lst:
  if re.match(regexToMatch, i):
     some_lst.remove(i)

I needed to rewind the list if remove() was called. At first, I thought this would be perfect for recursion. As it turned out, with the amount of data I needed the algorithm to search through, recursion was not a the best solution.

A recursive algorithm is one that calls itself. These functions will additionally have a decrementing counter or meet some condition to exit the recursive call. Having an exit condition is required since the function calls itself, otherwise, the function will loop until the system runs out of memory. This is a fast algorithm, however, it is not suited for all conditions. For example, small datasets recursion is best with regards to performance, but as the data grows, performance decreases quickly. This is because memory is consumed and the process incurs a context switch for every function call. The code below worked great on a small subset of data, but did not scale to what was needed for the project.

import re

def remove_processes(the_list: list):
    for (hostname, process) in the_list:
        for line in exclude_processes:
            try:
                if re.search(line.strip(), process):
                    the_list.remove([hostname,process])
                    remove_processes(theList)
            except ValueError:
                continue
    return the_list

Eventually, I replaced the for loop with while() and a counter. When a match is found, count is subtracted by 1 so every element in theList is evaluated. This also removes any duplicates.

def remove_processes(the_list):
    count = 0
    while count <= len(the_list) - 1:
        hostname = the_list[count][0]
        process = the_list[count][1]
        count += 1
    try:
        if re.search(exclude_process, process):
            the_list.remove([hostname, process])
            # reduce the count by 1 which resolves skipping
            # every other element
            count -= 1
    except ValueError:
        continue

    return the_list

Using Recursion with JSON

Here is another example of recursion. recurs() searches for k in a JSON document. If key isn’t found, it continues to the next element until all elements are searched. Once key is found, the function updates the value to “changed”. I used this function to remove passwords in a JSON document so it could be sent externally.

def recurs(k: str, json_doc: dict):
    """
    Takes a string, k and searches through json_doc
    :param k: search string
    :param json_doc: JSON / dictionary to search through for k
    """
    for i in json_doc.keys():
        if isinstance(json_doc[i], list):
            for p in range(len(json_doc[i])):
                json_doc[i][p][k] = 'changed'
        elif isinstance(json_doc[i], str):
           if k in json_doc:
               json_doc[k] = 'changed'
        elif isinstance(json_doc[i], dict):
           json_doc[i][k] = 'changed'
        try:
            for v in json_doc[i].values():
                if isinstance(v, dict):
                   return recurs(k, v)
        except AttributeError:
            pass
    return json_doc