DSPy

Intro

DSPy kept popping up on my X timeline and I thought it looked pretty interesting, so I decided to take a few days to look into it. I didn't get super deep into it yet, but I think I have a high level understanding. The library is fairly new IMO (as of writing this). There is excitement around it though and a growing community. I am hopeful that the documentation and library will continue to improve throughout the year. If you are completely new to DSPy I would suggest the following resources below.

ENV Setup

python3 -m venv env
source env/bin/activate
pip install dspy-ai
pip install openai --upgrade
pip install --upgrade notebook ipywidgets
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

BIG-Bench Hard Dataset - Penguins In a Table - Example

Within the BIG-Bench Hard dataset [@suzgun2022challenging] there are various tasks. You can use one of these strings when using load_dataset to load in the corresponding records for that task.

['tracking_shuffled_objects_seven_objects', 'salient_translation_error_detection', 'tracking_shuffled_objects_three_objects', 'geometric_shapes', 'object_counting', 'word_sorting', 'logical_deduction_five_objects', 'hyperbaton', 'sports_understanding', 'logical_deduction_seven_objects', 'multistep_arithmetic_two', 'ruin_names', 'causal_judgement', 'logical_deduction_three_objects', 'formal_fallacies', 'snarks', 'boolean_expressions', 'reasoning_about_colored_objects', 'dyck_languages', 'navigate', 'disambiguation_qa', 'temporal_sequences', 'web_of_lies', 'tracking_shuffled_objects_five_objects', 'penguins_in_a_table', 'movie_recommendation', 'date_understanding']

We will use the penguins_in_a_table task.

import dspy
from datasets import load_dataset

ds = load_dataset("maveriq/bigbenchhard", "penguins_in_a_table")["train"]
examples = [dspy.Example({"question": r["input"], "answer": r["target"]}).with_inputs("question") for r in ds]
print(f"There are {len(examples)} examples.")
trainset = examples[0:20]
valset = examples[20:]
There are 146 examples.
example = trainset[10]
for k, v in example.items():
    print(f"\n{k.upper()}:\n")
    print(v)

QUESTION:

Here is a table where the first line is a header and each subsequent line is a penguin:  name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15  For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm.  We then delete the penguin named Bernard from the table.
How many penguins are more than 8 years old?
Options:
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5

ANSWER:

(A)

We will use the DSPy OpenAI connector to make calls to gpt-3.5. Note that DSPy caches API calls so that subsequent calls with the same input will read from the cache instead of calling the OpenAI API a second time.

llm = dspy.OpenAI(model="gpt-3.5-turbo-0125", max_tokens=250)
dspy.settings.configure(lm=llm)

We can test that the calls to OpenAI are working:

llm("Testing testing, is anyone out there?")
["Hello! I'm here to help. What can I assist you with today?"]
llm(example.question)
['There are 2 penguins who are more than 8 years old: Vincent (9 years old) and Gwen (8 years old). \n\nTherefore, the answer is (B) 2.']

At any point we can look at the last n calls to the llm:

llm.inspect_history(n=2)




Testing testing, is anyone out there? Hello! I'm here to help. What can I assist you with today?







Here is a table where the first line is a header and each subsequent line is a penguin:  name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15  For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm.  We then delete the penguin named Bernard from the table.
How many penguins are more than 8 years old?
Options:
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5 There are 2 penguins who are more than 8 years old: Vincent (9 years old) and Gwen (8 years old). 

Therefore, the answer is (B) 2.


Our evaluation metric will check if the llm output contains the correct multiple choice answer. To define an evaluation metric in DSPy we create a function like the example below. The first two inputs should be instances of dspy.Example. The metric function can contain any logic you need to evaluate your task. You can read more about the trace argument in the documentation. It needs to be there, even if not explicitly using it.

import re


def eval_metric(true, prediction, trace=None):
    pred = prediction.answer
    matches = re.findall(r"\([A-Z]\)", pred)
    parsed_answer = matches[-1] if matches else ""
    return parsed_answer == true.answer

We set up an evaluation pipeline:

from dspy.evaluate import Evaluate

evaluate = Evaluate(devset=valset, metric=eval_metric, num_threads=6, display_progress=True, display_table=10)

Here is a simple module in DSPy for basic question and answer.

class BasicQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.Predict("question -> answer")

    def forward(self, question):
        return self.prog(question=question)


basic_qa = BasicQA()

The forward method calls __call__ similar to how things work in pytorch.

pred = basic_qa(question=example.question)
print("\nQUESTION:\n")
print(example.question)
print("\nANSWER:\n")
print(example.answer)
print("\nPREDICTION:\n")
print(pred.answer)

QUESTION:

Here is a table where the first line is a header and each subsequent line is a penguin:  name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15  For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm.  We then delete the penguin named Bernard from the table.
How many penguins are more than 8 years old?
Options:
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5

ANSWER:

(A)

PREDICTION:

(B) 2
eval_metric(example, pred)
False
llm.inspect_history(n=1)




Given the fields `question`, produce the fields `answer`.

---

Follow the following format.

Question: ${question}
Answer: ${answer}

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We then delete the penguin named Bernard from the table. How many penguins are more than 8 years old? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Answer: (B) 2


Now we can pass each example question through the LLM in the validation set and check if we get the correct answer:

evaluate(basic_qa)
Average Metric: 44 / 126  (34.9): 100%|██████████| 126/126 [00:00<00:00, 1308.82it/s]
Average Metric: 44 / 126  (34.9%)

/Users/christopher/personal_projects/mysite/posts/dspy/env/lib/python3.11/site-packages/dspy/evaluate/evaluate.py:137: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
  df = df.applymap(truncate_cell)
  question example_answer pred_answer eval_metric
0 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) 3 False
1 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) (C) 50 False
2 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) Answer: (C) 3 False
3 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) Answer: (B) 2 False
4 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (B) (B) 5 ✔️ [True]
5 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) (B) 2 False
6 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (E) James False
7 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) (B) 2 False
8 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) Answer: Vincent False
9 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) Answer: Donna False
... 116 more rows not displayed ...
34.92

DSPy uses optimizers to optimize the modules. In this example, optimization is a process that will choose which demos/examples are best to put into the prompt in order to increase the evaluation metric. At the time of writing the optimizers are called teleprompters (prompting from a distance). I think they will change the name though to optimizers in future refactoring. The DSPy documentation states that the optimizer can adjust/edit:

  • Demo examples in the prompt.
  • Instructions of the prompt.
  • Weights of the actual LLM (for example fine tuning an open source model).

I have only played around with optimizers that optimize which demos/examples are put into the prompt.

from dspy.teleprompt import BootstrapFewShotWithRandomSearch

config = dict(max_bootstrapped_demos=2, max_labeled_demos=4, num_candidate_programs=2, num_threads=6)

teleprompter = BootstrapFewShotWithRandomSearch(metric=eval_metric, **config)
optimized_qa = teleprompter.compile(basic_qa, trainset=trainset, valset=valset)

There is a lot of output from the above code block which I am hiding to keep things cleaner. You can now evaluate the optimized model to see if the accuracy has improved.

evaluate(optimized_qa)
Average Metric: 54 / 126  (42.9): 100%|██████████| 126/126 [00:00<00:00, 1346.62it/s]
Average Metric: 54 / 126  (42.9%)
  question example_answer pred_answer eval_metric
0 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) (C) False
1 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) (C) 50 False
2 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) (B) False
3 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) (B) False
4 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (B) (B) ✔️ [True]
5 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) (C) ✔️ [True]
6 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (E) (D) False
7 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) (B) False
8 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) (D) Gwen False
9 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) (D) Donna ✔️ [True]
... 116 more rows not displayed ...
42.86
llm.inspect_history()




Given the fields `question`, produce the fields `answer`.

---

Follow the following format.

Question: ${question}
Answer: ${answer}

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. And here is a similar table, but listing giraffes: name, age, height (cm), weight (kg) Jody, 5, 430, 620 Gladys, 10, 420, 590 Marian, 2, 310, 410 Donna, 9, 440, 650 How many giraffes are more than 5 years old? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Answer: (B)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. What is the name of the last penguin sorted by alphabetic order? Options: (A) Louis (B) Bernard (C) Vincent (D) Gwen (E) James
Answer: (C)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We now add a penguin to the table: James, 12, 90, 12 We then delete the penguin named Bernard from the table. How many penguins are more than 5 years old and weight more than 12 kg? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Answer: (A)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. How many animals are listed in the table? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Answer: (D)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. Which is the second heaviest penguin? Options: (A) Louis (B) Bernard (C) Vincent (D) Gwen (E) James
Answer: (B) Bernard


Now we can try a Chain of Thought [@wei2023chainofthought] prompt.

class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)


cot_qa = CoT()
evaluate(cot_qa)
Average Metric: 90 / 126  (71.4): 100%|██████████| 126/126 [00:00<00:00, 1254.30it/s]
Average Metric: 90 / 126  (71.4%)
  question example_answer rationale pred_answer eval_metric
0 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We first identify the penguins who are less than 8 years old. From the table, we see that Louis is 7 years... (B) 2 False
1 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) produce the answer. We need to add up the weights of all the penguins in the table. Louis weighs 11 kg, Bernard weighs 13 kg,... (D) 62 ✔️ [True]
2 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We need to go through each penguin's age and count how many are more than 8 years old. (C) 3 False
3 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We need to identify the penguins who are both more than 5 years old and weigh more than 12 kg. Looking at... (C) 3 False
4 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (B) produce the answer. We can see from the table that Bernard's age is 5. (B) 5 ✔️ [True]
5 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) produce the answer. We first identify the penguins who are less than 10 years old. Louis is 7 years old, Bernard is 5 years old,... (D) 4 False
6 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (E) produce the answer. We need to identify the last penguin added to the table. By looking at the last entry in the penguin table, we... James False
7 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We first need to identify the penguins who are more than 5 years old and weigh more than 12 kg. From the... (A) 1 ✔️ [True]
8 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) produce the answer. We need to find the penguin with a height of 60 cm. Looking at the table, we see that Vincent is the... (C) Vincent ✔️ [True]
9 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) produce the answer. We need to look at the last entry in the table listing giraffes. The last giraffe listed is Donna. (D) Donna ✔️ [True]
... 116 more rows not displayed ...
71.43
llm.inspect_history()




Given the fields `question`, produce the fields `answer`.

---

Follow the following format.

Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We now add a penguin to the table: James, 12, 90, 12 Which penguin is taller than the other ones? Options: (A) Louis (B) Bernard (C) Vincent (D) Gwen (E) James
Reasoning: Let's think step by step in order to produce the answer. We need to compare the height of each penguin in the table and determine which one is the tallest. Louis is 50 cm tall, Bernard is 80 cm tall, Vincent is 60 cm tall, Gwen is 70 cm tall, and James is 90 cm tall. Therefore, James is taller than all the other penguins.
Answer: (E) James


Now we will try and optimize our chain of thought program. I am also hiding the output from this cell to keep things cleaner.

tqdm._instances.clear()
config = dict(max_bootstrapped_demos=1, max_labeled_demos=4, num_candidate_programs=4, num_threads=6)
teleprompter = BootstrapFewShotWithRandomSearch(metric=eval_metric, **config)
optimized_cot_qa = teleprompter.compile(cot_qa, trainset=trainset, valset=valset)
evaluate(optimized_cot_qa)
Average Metric: 102 / 126  (81.0): 100%|██████████| 126/126 [00:00<00:00, 4503.05it/s]
Average Metric: 102 / 126  (81.0%)
  question example_answer rationale pred_answer eval_metric
0 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. After deleting Bernard, the penguins left are Louis, Vincent, and Gwen. Among them, Louis and Gwen are less than 8 years old. (B) 2 False
1 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) produce the answer. We sum up the weights of all the penguins: 11 + 13 + 11 + 15 + 12 = 62. (D) 62 ✔️ [True]
2 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We know that after deleting Bernard, the penguins left are Louis, Vincent, and Gwen. Among them, only Vincent is more than 8... (A) 1 ✔️ [True]
3 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. We have Louis, Vincent, Gwen, and James in the table. Among them, only James is more than 5 years old and weighs... (A) 1 ✔️ [True]
4 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (B) produce the answer. We know that the age of Bernard is 5 years old. (B) 5 ✔️ [True]
5 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) produce the answer. We know that after deleting Bernard, the penguins left are Louis, Vincent, and Gwen. Among them, Louis and Gwen are less than... (B) 2 False
6 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (E) produce the answer. We know that the last penguin added to the table is James. (E) James ✔️ [True]
7 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (A) produce the answer. After deleting Bernard, we are left with Louis, Vincent, and Gwen. Among them, only Gwen is more than 5 years old and... (A) 1 ✔️ [True]
8 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (C) produce the answer. We know that the only penguin with a height of 60 cm is Vincent. (C) Vincent ✔️ [True]
9 Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis,... (D) produce the answer. We know that the last giraffe listed is Donna. (D) Donna ✔️ [True]
... 116 more rows not displayed ...
80.95
llm.inspect_history(n=1)




Given the fields `question`, produce the fields `answer`.

---

Follow the following format.

Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We then delete the penguin named Bernard from the table. How many penguins are more than 8 years old? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Reasoning: Let's think step by step in order to produce the answer. We know that after deleting Bernard, the penguins left are Louis, Vincent, and Gwen. Among them, only Vincent is more than 8 years old.
Answer: (A) 1

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. How many penguins are more than 5 years old? Options: (A) 1 (B) 2 (C) 3 (D) 4 (E) 5
Answer: (C)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. And here is a similar table, but listing giraffes: name, age, height (cm), weight (kg) Jody, 5, 430, 620 Gladys, 10, 420, 590 Marian, 2, 310, 410 Donna, 9, 440, 650 How many animals are more than 5 years old? Options: (A) 5 (B) 6 (C) 7 (D) 8 (E) 9
Answer: (A)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. Which penguin is older than Gwen? Options: (A) Louis (B) Bernard (C) Vincent (D) Gwen (E) James
Answer: (C)

---

Question: Here is a table where the first line is a header and each subsequent line is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard, 5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We then delete the penguin named Bernard from the table. What is the name of the last penguin sorted by alphabetic order? Options: (A) Louis (B) Bernard (C) Vincent (D) Gwen (E) James
Reasoning: Let's think step by step in order to produce the answer. After deleting Bernard, the remaining penguins are Louis, Vincent, and Gwen. Sorting them alphabetically, the last penguin is Vincent.
Answer: (C) Vincent


It's really nice that the above focused on:

  • Writing small modules/programs.
  • Choosing an optimizer.
  • Running the compile/optimization step.
  • Running an evaluation.

I really like this idea instead of manually writing prompts and hoping for the best.