Computational Thinking (CO2412) Tutorial 2.3: Calendar Week 46

2.3.1 Linked-list data structures

  1. We will look at the singly linked list example implementation and discuss how the code would need to be modified in order to implement a doubly linked list. (Task for discussion during the tutorial session. The outcome of this session is available here as a notebook.)
  2. In the lecture, we defined the modulo-3 copying problem as follows:

    The input/argument is given by a sequential (list-like) data structure (e.g., dynamic array or linked list). Initially, that list contains n integer numbers.

    The function does not have an output/return value as such; its outcome should be that the content of the list is modified such that each list element x is replaced by x modulo 3 copies of itself. In other words, elements that are multiples of 3 are deleted (zero copies are retained). If their remainder upon division by 3 equals 1, nothing is done (one copy is retained). If the remainder is two (x % 3 == 2), a copy of the element is inserted into the list, next to the original element.

    In the linked-list Jupyter Notebook we implemented this for dynamic arrays (Python lists) and singly linked lists, comparing their performance. Do an additional performance measurement for doubly linked lists, evaluating whether they are a suitable data structure for addressing this problem.

2.3.2 Knapsack problem

Problem definition

The knapsack problem is one of the classical problems from computing with many relevant practical applications situations where finite resources need to be allocated, e.g., in scheduling. It is defined as follows. As input/arguments, we are given:

A function solving the knapsack problem should create and return a list of selected item indices such that the sum over all weights does not exceed the load capacity, while the total value of the selected items is as large as possible.

Dantzig's algorithm

A greedy algorithm for the knapsack problem was proposed by Dantzig: Compute the value-per-weight ratio v[i]/w[i] for each item, sort the items by that ratio, and then keep selecting the items with the greatest value-per-weight ratio as long as they fit into the load capacity. This is implemented in the knapsack Jupyter notebook as follows:

def knapsack_dantzig(c, w, v):
    
    # below, we are relying on Python's built-in sorting functionality:
    #
    # first, a list of (v[i]/w[i], i) pairs is created;
    # sorted() sorts that list by its first element, i.e., in ascending order of value per weight
    # reverse() turns that list around, so that it is arranged in descending order
    #
    value_per_weight = sorted([(v[i]/w[i], i) for i in range(len(w))])
    value_per_weight.reverse()
    
    selected_items = []
    remaining_capacity = c
    for (ratio, i) in value_per_weight:
        if remaining_capacity >= w[i]:
            selected_items.append(i)
            remaining_capacity -= w[i]
    return selected_items

Program analysis

Does Dantzig's greedy algorithm for the knapsack problem determine the best possible selection of items in all cases, i.e., the one with the greatest total value under the constraint that the total weight may not exceed the load capacity?

  1. If no, provide a counterexample, and assess to what extent the greedy algorithm is still suitable as a tool for determining approximations to the best possible solution in representative cases.
  2. If yes, document its correctness, e.g., by analysing a program flow graph.

Submission deadline: 4th December 2021; discussion planned for 16th December 2021. Group work by up to four people is welcome.