Feb 2026 · 8 min read

Fixing Lost Rich Output in CAMEL's E2B Code Interpreter

An early-return chain silently dropped matplotlib plots and images. The fix: replace it with an accumulation pattern that collects output from all sources.


CAMEL, Python, Debugging

The Problem

CAMEL's E2BInterpreter is a sandboxed code execution environment powered by the e2b SDK. Users reported that when their agents generated matplotlib plots or other visual outputs, the results were silently lost — the interpreter returned only text output or an empty string.

For example, running:

python

import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [4, 5, 6])
plt.savefig("plot.png")
plt.show()

This returned empty output, even though the plot was successfully generated in the sandbox.

Finding the Root Cause

I opened camel/interpreters/e2b_interpreter.py and found the run() method's output handling:

python

if execution.text and execution.text.lower() != "none":
    return execution.text

if execution.logs:
    if execution.logs.stdout:
        return ",".join(execution.logs.stdout)
    elif execution.logs.stderr:
        return ",".join(execution.logs.stderr)

return str(execution.error)

The problem is an early-return chain that only checks two places: execution.text and execution.logs. But the e2b SDK stores rich outputs (images, HTML, SVG) in a third location: execution.results. This field was completely ignored.

I verified this by reading the e2b SDK source — execution.results is a list of Result objects, each with methods like _repr_png_(), _repr_jpeg_(), _repr_svg_(), and _repr_html_(). This is the standard Jupyter-style _repr_*() protocol.
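To make the protocol concrete, here is a minimal sketch of how a consumer can probe an object for Jupyter-style representations. FakePlotResult and first_available_repr are hypothetical names invented for illustration; this is not the e2b Result class, and it assumes (as the data URIs later in this post do) that the PNG representation is a base64 string.

```python
import base64


class FakePlotResult:
    """Hypothetical stand-in for one rich result; not the e2b Result class."""

    def __init__(self, png_bytes: bytes):
        # Assumption: image data is exposed as base64 text, not raw bytes.
        self._png_b64 = base64.b64encode(png_bytes).decode("ascii")

    def _repr_png_(self):
        return self._png_b64

    def _repr_jpeg_(self):
        return None  # this result has no JPEG form

    def _repr_svg_(self):
        return None

    def _repr_html_(self):
        return None


def first_available_repr(result):
    """Return (format, data) for the first non-empty representation, or None."""
    for fmt in ("png", "jpeg", "svg", "html"):
        method = getattr(result, f"_repr_{fmt}_", None)
        data = method() if callable(method) else None
        if data:
            return fmt, data
    return None


PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG file signature
picked = first_available_repr(FakePlotResult(PNG_MAGIC))
print(picked[0])  # the format that was found first
```

The point of the protocol is that the caller never needs to know what kind of object it holds: it asks for each representation in turn and uses the first one that comes back non-empty.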

I also noticed two other bugs in the original code:

stdout and stderr were mutually exclusive because of the elif: if both existed, stderr was lost
str(execution.error) returned the string "None" when there was no error, instead of something meaningful
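Both bugs can be reproduced without a sandbox. The sketch below wraps the original chain in a function and feeds it stub objects; legacy_output and fake_execution are hypothetical helpers, and the stubs carry only the attributes the chain reads, so they are not real e2b execution objects.

```python
from types import SimpleNamespace


def legacy_output(execution):
    """The original early-return chain, wrapped in a function for testing."""
    if execution.text and execution.text.lower() != "none":
        return execution.text
    if execution.logs:
        if execution.logs.stdout:
            return ",".join(execution.logs.stdout)
        elif execution.logs.stderr:
            return ",".join(execution.logs.stderr)
    return str(execution.error)


def fake_execution(text=None, stdout=(), stderr=(), error=None):
    """Hypothetical stub exposing only text, logs and error."""
    logs = SimpleNamespace(stdout=list(stdout), stderr=list(stderr))
    return SimpleNamespace(text=text, logs=logs, error=error)


# Bug 1: stderr is dropped whenever stdout is also present.
both = legacy_output(fake_execution(stdout=["done"], stderr=["DeprecationWarning"]))

# Bug 2: no output and no error yields the literal string "None".
silent = legacy_output(fake_execution())

print(repr(both))    # 'done'
print(repr(silent))  # 'None'
```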

The Fix

I replaced the early-return chain with an accumulation pattern that collects output from all sources:

python

output_parts = []

# 1. Text output
if execution.text and execution.text.lower() != "none":
    output_parts.append(execution.text)

# 2. Rich outputs from execution.results
if execution.results:
    for result in execution.results:
        png_data = result._repr_png_()
        if png_data:
            output_parts.append(f"\n![image](data:image/png;base64,{png_data})\n")
            continue
        jpeg_data = result._repr_jpeg_()
        if jpeg_data:
            output_parts.append(f"\n![image](data:image/jpeg;base64,{jpeg_data})\n")
            continue
        svg_data = result._repr_svg_()
        if svg_data:
            output_parts.append(f"\n{svg_data}\n")
            continue
        html_data = result._repr_html_()
        if html_data:
            output_parts.append(html_data)
            continue

# 3. Logs (both stdout AND stderr now)
if execution.logs:
    if execution.logs.stdout:
        output_parts.append(",".join(execution.logs.stdout))
    if execution.logs.stderr:  # Changed from elif to if
        output_parts.append(f"[stderr] {','.join(execution.logs.stderr)}")

# 4. Combine all parts
if output_parts:
    return "\n".join(output_parts)

# 5. Error handling (only when there's no output)
if execution.error:
    return f"{execution.error.name}: {execution.error.value}\n{execution.error.traceback}"

return "Code executed successfully (no output)."
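A quick way to check this behavior without an E2B sandbox is to run the same logic against stubs. The sketch below is a condensed restatement of the accumulation code, not the code from the pull request; StubResult, collect_output and the stub execution object are hypothetical and exist only for this check.

```python
import base64
from types import SimpleNamespace


class StubResult:
    """Hypothetical stand-in exposing the four _repr_*_() methods."""

    def __init__(self, png=None, jpeg=None, svg=None, html=None):
        self._data = {"png": png, "jpeg": jpeg, "svg": svg, "html": html}

    def _repr_png_(self):
        return self._data["png"]

    def _repr_jpeg_(self):
        return self._data["jpeg"]

    def _repr_svg_(self):
        return self._data["svg"]

    def _repr_html_(self):
        return self._data["html"]


def collect_output(execution):
    """Condensed version of the accumulation logic shown above."""
    parts = []
    if execution.text and execution.text.lower() != "none":
        parts.append(execution.text)
    for result in execution.results or []:
        if result._repr_png_():
            parts.append(f"\n![image](data:image/png;base64,{result._repr_png_()})\n")
        elif result._repr_jpeg_():
            parts.append(f"\n![image](data:image/jpeg;base64,{result._repr_jpeg_()})\n")
        elif result._repr_svg_():
            parts.append(f"\n{result._repr_svg_()}\n")
        elif result._repr_html_():
            parts.append(result._repr_html_())
    if execution.logs:
        if execution.logs.stdout:
            parts.append(",".join(execution.logs.stdout))
        if execution.logs.stderr:
            parts.append(f"[stderr] {','.join(execution.logs.stderr)}")
    if parts:
        return "\n".join(parts)
    if execution.error:
        err = execution.error
        return f"{err.name}: {err.value}\n{err.traceback}"
    return "Code executed successfully (no output)."


# A plot, a stdout line and a warning on stderr, all in one execution.
png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
execution = SimpleNamespace(
    text=None,
    results=[StubResult(png=png_b64)],
    logs=SimpleNamespace(stdout=["saved plot.png"], stderr=["UserWarning"]),
    error=None,
)
output = collect_output(execution)

# Nothing at all: no text, no results, no logs, no error.
empty = collect_output(
    SimpleNamespace(text=None, results=[], logs=None, error=None)
)
```

With the stubbed plot execution, the image embed, the stdout line and the prefixed stderr line all appear in the returned string, which is exactly what the early-return version could not do.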

Key design decisions:

Images are returned as markdown image embeds with base64 data URIs. This means downstream consumers (chat UIs, notebooks) can render them directly without file I/O.
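Priority order for rich results: PNG > JPEG > SVG > HTML, and each result contributes only the first format it provides. Plain text is handled separately through execution.text.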
Stderr is prefixed with [stderr] so users can distinguish it from stdout.
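On the consuming side, the data-URI choice means an image can be recovered from the returned string with nothing but the standard library. The following is a minimal sketch of one way a downstream consumer might do that; DATA_URI_IMAGE and extract_images are hypothetical names, and the regular expression assumes the exact ![image](...) format used in the code above.

```python
import base64
import re

# Matches the markdown image embeds produced by the fix:
# ![image](data:image/<png|jpeg>;base64,<payload>)
DATA_URI_IMAGE = re.compile(
    r"!\[image\]\(data:image/(?P<fmt>png|jpeg);base64,(?P<payload>[A-Za-z0-9+/=]+)\)"
)


def extract_images(interpreter_output: str):
    """Return a list of (format, raw_bytes) for each embedded image."""
    return [
        (m.group("fmt"), base64.b64decode(m.group("payload")))
        for m in DATA_URI_IMAGE.finditer(interpreter_output)
    ]


# Build a sample output string containing one tiny "image" (just the PNG signature).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
payload = base64.b64encode(PNG_SIGNATURE).decode("ascii")
sample = f"\n![image](data:image/png;base64,{payload})\n\nsaved plot.png"

images = extract_images(sample)
print(len(images), images[0][0])
```

A chat UI would more likely hand the markdown to its renderer unchanged; decoding to bytes like this is useful mainly when an agent wants to save or post-process the image.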

Key Takeaways

Read the SDK documentation. The e2b SDK had a clear API for rich output (execution.results with _repr_*() methods), but the original code simply didn't use it.
Early-return chains can hide bugs. The return execution.text at the top meant that nothing below it ever ran when text output existed.
Accumulate, don't early-return. When output can come from multiple sources, collect everything first and combine at the end.

Impact & Reflection

Impact: CAMEL is one of the leading multi-agent frameworks with 10,000+ GitHub stars. This fix restored a core capability — visual output from code execution — that was completely broken. Any agent workflow involving data visualization (matplotlib plots, chart generation, image processing) was silently returning empty strings. The fix also improved stdout/stderr handling for all E2B code execution, not just rich output.

What I learned about reading unfamiliar codebases: This was my first contribution to CAMEL, and I had to understand their interpreter abstraction layer quickly. The key was comparing the broken E2BInterpreter against the working JupyterKernelInterpreter in the same codebase — the Jupyter version already handled execution.results correctly. This "find a working sibling" pattern has become my go-to strategy when reading new codebases: before debugging broken code, find similar code that works and diff the approaches.

A pattern I now watch for: Early-return chains that silently swallow data are a code smell I actively look for during code review. The pattern if A: return A; if B: return B; return C is fine when the cases are mutually exclusive, but dangerous when they can overlap. I now default to the accumulation pattern (collect all outputs, combine at the end) whenever multiple data sources might produce results simultaneously.
