
Developing Parallax Maker with code help from LLMs

My experience using LLMs to develop Parallax Maker in Python.

I no longer write code on a regular basis, but on occasion, I feel the need to get back into it. For the last few months, I have been writing a mobile game in Unreal Engine 5.3 that will teach players about the different musical modes. I need an introductory level for each musical mode and thought that using 2.5D parallax animation would be a good way to create the visuals. Unfortunately, the manual route via Photoshop can be very time-consuming. Instead, I decided to write an AI-powered frontend that would help me create these 2.5D animations more quickly.

I also wanted to take the opportunity to explore new tools and technologies that would allow me to develop faster. In addition to GitHub Copilot, which helps with autocompletion, I also used GPT-4 Turbo and Claude 3 Opus. This post details my experiences: the tools’ capabilities and limitations, and the places where multiple prompts were necessary.

Starting with GitHub Copilot

Writing with GitHub Copilot is quite convenient. It has very good native integration with VS Code and often provides autocompletions for very mechanical code. For example, I like to use longer variable names and constants, and Copilot usually completes them faster than I can type on the keyboard. The same is true for function calls with many input parameters. On the other hand, GitHub Copilot falls apart quickly on more complex tasks.

Transitioning to Large Language Models

Both GPT-4 Turbo and Claude 3 Opus are quite helpful with more complex challenges. For Parallax Maker, I encountered my first problem when I realized I needed a Python-powered web frontend but did not know which technology to use. I was aware of Gradio and Streamlit from previous explorations but did not know their API surfaces or their limitations. My first task for the LLM was to rewrite my CLI Python tool to accept configurable parameters from the web UI. Claude provided the following to me:

import streamlit as st
from your_script import process_image

num_slices = st.slider("Number of slices", 2, 10, 5, 1)
use_simple = st.checkbox("Use simple thresholds")

process_image("input.png", "output", num_slices, use_simple, ...)

st.image("output/rendered_image.png")

Looking at the code, I quickly realized that Streamlit re-executes the whole program on every interaction, which would make it difficult to create an interactive, stateful UI. I asked the model for recommendations for other frameworks, and it provided a good list:

  • Dash (which it also listed a second time as Plotly Dash)
  • Gradio
  • Panel
  • Bokeh

Since I was aware of Gradio, I asked the model to instead turn my Python script into a Gradio App:

import gradio as gr
image_input = gr.Image(label="Input Image")
num_slices_slider = gr.Slider(minimum=2, maximum=10, step=1, value=5, label="Number of Slices")
use_simple_thresholds_checkbox = gr.Checkbox(label="Use Simple Thresholds")
skip_depth_map_checkbox = gr.Checkbox(label="Skip Depth Map Generation")
skip_image_slices_checkbox = gr.Checkbox(label="Skip Image Slices Generation")
skip_image_animation_checkbox = gr.Checkbox(label="Skip Image Animation Generation")

image_output = gr.Image(label="Output Image")

inputs = [
    image_input,
    num_slices_slider,
    use_simple_thresholds_checkbox,
    skip_depth_map_checkbox,
    skip_image_slices_checkbox,
    skip_image_animation_checkbox
]

title = "2.5D Parallax Animation Generator"
description = "Upload an image and adjust parameters to generate a 2.5D parallax animation."

gr.Interface(fn=update_image, inputs=inputs, outputs=image_output, title=title, description=description).launch()

It required three iterations, telling the model about Python errors each time, before I got code that seemed to work. Since I knew next to nothing about Gradio, this was still faster than trying to read documentation or tutorials. In the end, Gradio did not suit my needs either, so I finally landed on Dash. The ability to quickly prototype with different frameworks was a huge time saver for me.

Leveraging AI for Code Generation

The LLMs were incredibly helpful at quickly navigating interfaces for modules or libraries that were new to me. For instance, I wanted to process a click event on an image within the Dash app and needed an example of how to set that up:

app.layout = html.Div([
    html.Img(id='clickable-image', src='/path/to/image.png'),
    html.Div(id='output-container')
])

@app.callback(
    Output('output-container', 'children'),
    [Input('clickable-image', 'n_clicks')],
    [State('clickable-image', 'n_clicks_timestamp')]
)
def display_click_data(n_clicks, n_clicks_timestamp):
    if n_clicks:
        return f'Image clicked at {n_clicks_timestamp}'
    return "Image not clicked yet"

While the models are quite powerful, they get a lot of the code wrong. Their benefit to me was that, even when the details were off, the interfaces used and the overall structure pointed me in the right direction. The AI-generated code snippets for features like dynamically generated sliders looked like this:

@app.callback(
    Output('slider-container', 'children'),
    [Input('num-slider', 'value')]
)
def update_sliders(number_of_sliders):
    sliders = []
    for i in range(number_of_sliders):
        sliders.append(dcc.Slider(id=f'slider-{i}'))
    return html.Div(sliders)

Human Intervention Required

This implementation still required me to write the concrete code and set it up to use Dash pattern-matching callbacks. And yes, I had to read the documentation on how to use them:

@app.callback(
    Output('thresholds-container', 'children'),
    Input('update-thresholds-container', 'data'),
    State('application-state-filename', 'data'),
    prevent_initial_call=True
)
def update_thresholds_html(value, filename):
    state = AppState.from_cache(filename)
    thresholds = []
    for i in range(state.num_slices):
        thresholds.append(html.Div([
            dcc.Slider(
                id={'type': 'threshold-slider', 'index': i},
                min=0,
                max=255,
                value=state.imgThresholds[i],
            )
        ]))

    return thresholds

Moments of Delight

There were also moments where I was completely blown away. I was writing code to patch part of an image with the nearest surrounding pixels, to help Stable Diffusion be less anchored on the parts of the picture that needed to be painted out. I had written a Python function to do so and asked the model:

Is there a way to compile this part of the program to make it run faster?

Niels

Anthropic’s Claude 3 Opus enlightened me:

Yes, you can use numba, a Just-In-Time (JIT) compiler for Python, to compile this part of the program and make it run faster.

Claude 3 Opus

And it then provided a code snippet that pointed me in the right direction:

import numpy as np
import numba as nb

@nb.jit(nopython=True, parallel=True)
def fill_nearest_alpha(alpha, nearest_alpha, height, width):
    for i in nb.prange(height):
        for j in range(width):
            if alpha[i, j] == 255:
                nearest_alpha[i, j] = np.array([[i, j], [i, j], [i, j], [i, j]])
            else:
                # other code

# Initialize the nearest_alpha array with -1 (indicating no nearest alpha pixel)
nearest_alpha = np.full((height, width, 4, 2), -1, dtype=np.int64)

fill_nearest_alpha(alpha, nearest_alpha, height, width)

For additional context, I had used dynamic programming to speed up the calculations. This meant that each data row depended on the previous one, so neither parallel=True nor nb.prange was applicable here. I was not aware of Numba, though, nor that its llvmlite backend could translate Python and NumPy code into LLVM IR to be compiled. The resulting code was 10x faster in the end.
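To illustrate the dependency that ruled out prange: a row-wise dynamic program like the one below can only compute row i after row i - 1. This is a simplified, pure-NumPy stand-in for my actual function, not the real code; Numba's @nb.jit(nopython=True) could still compile the sequential loop.

```python
import numpy as np

def nearest_opaque_above(alpha):
    """For each pixel, the row index of the nearest pixel with alpha == 255
    at or above it in the same column, or -1 if there is none."""
    height, width = alpha.shape
    nearest = np.full((height, width), -1, dtype=np.int64)
    for i in range(height):
        opaque = alpha[i] == 255
        nearest[i, opaque] = i
        if i > 0:
            # Row i's answer is carried forward from row i - 1:
            # this data dependency prevents parallelizing over rows.
            carry = ~opaque
            nearest[i, carry] = nearest[i - 1, carry]
    return nearest
```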

Conclusion

Overall, using LLMs to help me write code has significantly sped up my Dash application development. These tools are great at exploring large API surfaces and generating initial code drafts. However, as developers, we still need to pay a lot of attention. Our intuition and expertise remain crucial, particularly when writing complex logic that involves tricky corner cases. That's where test-driven development can help, and it turns out GitHub Copilot happily offers to write test cases for you, too. They just need additional help to work correctly.

You can find my Parallax Maker code on GitHub.

The views expressed on these pages are my own and do not represent the views of anyone else.