Understanding LLMs and Using Chain of Thought

November 4, 2024

***This article was originally published on Understanding LLM and using Chain of Thoughts on July 24, 2023 by Rick Hightower.***

Author: Rick Hightower

We will look at a real-world use case that most developers and tech managers should understand. We will give ChatGPT a Java method and ask it to create a Mermaid sequence diagram.

Chain of Thought (CoT) prompting is a technique that improves the performance of Large Language Models (LLMs) on reasoning tasks. In its original form it relies on few-shot learning. According to Towards Data Science, CoT helps LLMs handle complex tasks like common sense reasoning and arithmetic by breaking multi-step requests into smaller steps. This makes the process visible and understandable, and it makes both the input and the output easier to manage and tweak.

CoT prompting breaks a problem into a series of intermediate reasoning steps, which greatly improves the ability of LLMs to perform complex reasoning. There are different ways to use CoT prompting, such as few-shot CoT and zero-shot CoT. In few-shot CoT, you provide example question-answer pairs in which each answer is explained step by step. In zero-shot CoT, you append “Let’s think step by step” to the prompt, just before the model’s answer, so that the LLM explains its reasoning. The benefits are more noticeable with larger models, where CoT performs much better than standard prompting. These findings are supported by experiments on three large language models, as described in the Google AI Blog and on arXiv.
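As a minimal sketch of the zero-shot variant, the following Java snippet only builds the prompt text; it does not call any model. The class name, the `Q:`/`A:` layout, and the sample question are illustrative assumptions, not part of any library.

```java
/** Sketch: zero-shot CoT appends a trigger phrase; no worked examples are needed. */
final class ZeroShotCot {
    // The trigger phrase described above (straight apostrophe used in code).
    static final String TRIGGER = "Let's think step by step.";

    /** Builds a prompt that ends with the trigger so the model continues with its reasoning. */
    static String prompt(String question) {
        return "Q: " + question.trim() + "\nA: " + TRIGGER;
    }

    public static void main(String[] args) {
        // Hypothetical question, used only to show the resulting prompt text.
        System.out.println(prompt("A train travels 60 km in 1.5 hours. What is its average speed?"));
    }
}
```

The resulting string would be sent as the user message; the model is then expected to continue after the trigger phrase with its reasoning.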

It is the context.

ChatGPT is an AI language model that can generate human-like text and have conversations. It is like a smart computer program that can understand and create text based on context. But it is important to remember that ChatGPT is not perfect. It can sometimes forget or misunderstand the context, especially in long or complex conversations. To help ChatGPT understand and remember the context better, you can try these things:

  1. Keep the context clear and concise: Make sure your instructions are easy to understand and short. This will help ChatGPT focus on the important information and reduce the chances of it forgetting the context.
  2. Repeat important information: If there is a specific rule or piece of information you want ChatGPT to remember, you can remind it of that information during the conversation. This will help reinforce the context and make it more likely that ChatGPT will remember it.
  3. Be patient and adjust your expectations: ChatGPT is an impressive AI tool, but it is not perfect. It might not always remember the context or follow the rules you set. In those cases, you can remind it of the context or rules and try again.
  4. Use Chain of Thought (CoT): Give ChatGPT context step by step so that the output it generates is more likely to be what you want.

ChatGPT is like predictive text, so it is important to give it the right context to get the desired output. By following these tips, you can help improve ChatGPT’s understanding of the context and increase the chances of it generating the text you want. Let’s break down what CoT is and then look at an example. This article focuses on CoT.
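The "repeat important information" tip can be sketched in code. The snippet below is a rough illustration under assumptions: the class name, the `Reminder:` prefix, and the idea of restating the rule every few turns are all hypothetical choices, and nothing here talks to ChatGPT.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: restate an important rule at regular intervals in a long conversation. */
final class ContextReminder {

    /**
     * Returns the user turns with the rule restated on the first turn
     * and then on every `every`-th turn after it.
     */
    static List<String> withReminders(String rule, List<String> userTurns, int every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < userTurns.size(); i++) {
            String turn = userTurns.get(i);
            // Prefix the reminder only on turns 0, every, 2 * every, ...
            out.add(i % every == 0 ? "Reminder: " + rule + "\n" + turn : turn);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> turns = List.of("First request", "Second request", "Third request");
        withReminders("Answer with Mermaid markup only.", turns, 2).forEach(System.out::println);
    }
}
```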

Basics of Chain of Thought

Let’s break CoT down a bit.

Imagine you are teaching a child to solve a problem step by step. Instead of just giving them the answer, you guide them through each step of the process. This helps them understand how to get to the solution. CoT is a similar approach used with LLMs, which are smart computer programs that can understand and generate text. Using CoT with LLMs means breaking down a problem into smaller, more manageable steps, just like you would with a child. This helps the LLM understand the problem better and improves its ability to reason and solve it.

Let’s look at a simple example of CoT prompting for a math word problem:

Question: Alice has 3 apples and Bob has 4 apples. If they both give 2 apples to Charlie, how many apples will Charlie have?

To solve this problem using CoT, we can break it down into these reasoning steps:

  1. Find out how many apples Alice gives to Charlie: Alice gives 2 apples.
  2. Find out how many apples Bob gives to Charlie: Bob gives 2 apples.
  3. Calculate the total number of apples Charlie gets: 2 (from Alice) + 2 (from Bob) = 4.

Answer: Charlie will have four apples, assuming he started with none.

In this example, the CoT approach helps to break the problem into smaller, more manageable steps. This makes it easier for an LLM to understand and solve the problem. By guiding the LLM through a structured sequence of steps, CoT prompting encourages the model to follow a logical thought process and improves its reasoning abilities.
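To use a worked example like this as a few-shot exemplar, it can be formatted into the prompt ahead of the new question. The sketch below assumes a simple `Question:` / numbered steps / `Answer:` layout; the class names and that layout are illustrative choices, and the follow-up question in `main` is made up.

```java
import java.util.List;

/** Sketch: few-shot CoT prompt built from worked examples (names are illustrative). */
final class FewShotCot {

    /** A worked example: the question, its reasoning steps, and the final answer. */
    static final class Exemplar {
        final String question;
        final List<String> steps;
        final String answer;

        Exemplar(String question, List<String> steps, String answer) {
            this.question = question;
            this.steps = steps;
            this.answer = answer;
        }
    }

    /** Writes every exemplar with numbered steps, then the new question without an answer. */
    static String prompt(List<Exemplar> exemplars, String newQuestion) {
        StringBuilder sb = new StringBuilder();
        for (Exemplar e : exemplars) {
            sb.append("Question: ").append(e.question).append('\n');
            for (int i = 0; i < e.steps.size(); i++) {
                sb.append(i + 1).append(". ").append(e.steps.get(i)).append('\n');
            }
            sb.append("Answer: ").append(e.answer).append("\n\n");
        }
        // The new question is left unanswered so the model continues in the same style.
        return sb.append("Question: ").append(newQuestion).append('\n').toString();
    }

    public static void main(String[] args) {
        Exemplar apples = new Exemplar(
                "Alice has 3 apples and Bob has 4 apples. If they both give 2 apples to Charlie, "
                        + "how many apples will Charlie have?",
                List.of("Alice gives 2 apples.", "Bob gives 2 apples.", "Charlie gets 2 + 2 = 4 apples."),
                "Charlie will have four apples.");
        // Hypothetical follow-up question.
        System.out.println(prompt(List.of(apples), "Dana has 5 pens and gives 2 to Eli. How many pens does Dana have left?"));
    }
}
```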

A more real-world example

To test this, we will give ChatGPT a Java method and ask it to create a Mermaid sequence diagram. We want something that looks like this:

sequenceDiagram
    participant S as System
    participant M as Method
    participant O as OpenAIClient
    participant F as File
    participant B as Byte Array
    participant R as Request
    participant C as ClientResponse

    M->>S: Get OPENAI_API_KEY
    M->>O: Initialize with OPENAI_API_KEY
    M->>F: Create file object
    M->>F: Read all bytes from file
    F-->>M: Return byte array
    M->>B: Store byte array
    M->>R: Create transcription request
    M->>O: Call transcribe method with request
    O-->>M: Return ClientResponse
    M->>C: Store ClientResponse
    M->>C: Get response from ClientResponse
    C-->>M: Print body of response (if present)
    M->>C: Get exception from ClientResponse
    C-->>M: Print exception (if present)
    M->>C: Get status message from ClientResponse
    C-->>M: Print status message and status code (if present)

Do not worry if you do not know what Mermaid is. You will by the end of this article because we will have ChatGPT explain it multiple times as part of our CoT.

Failed one-shot attempt

Let’s start with an attempt that fails; we will fix it later. We ask ChatGPT to create a Mermaid sequence diagram from a Java method called callTranscribe(), which interacts with several different classes and methods. In this first attempt we do not yet ask for a step-by-step breakdown of how to evaluate the Java code; we simply ask for the diagram.

With this first one-shot attempt, ChatGPT fails to create the Mermaid diagram.

Prompt

here is a method, can you create a mermaind sequence diagram from it?
private static void callTranscribe() throws IOException {
    // Create the client
    final var openAiKey = System.getenv("OPENAI_API_KEY");
    final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();

    File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");

    byte[] bytes = Files.readAllBytes(file.toPath());
    // Create the chat request
    final TranscriptionRequest request = TranscriptionRequest.builder()
            .model("whisper-1").prompt("Write up notes").language("en").file(bytes)
            .build();

    // Call Open AI API with chat message
    final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);

    response.getResponse().ifPresent(r -> System.out.println(r.getBody()));

    response.getException().ifPresent(Throwable::printStackTrace);

    response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}

ChatGPT Response

In a UML sequence diagram, we typically show how objects interact in a system. These diagrams focus on the time sequence of interactions. The arrows between objects show which is sending the message and which is receiving it.

Here is the textual representation of the sequence diagram for your method:

  1. callTranscribe() -> System: getenv("OPENAI_API_KEY")
  2. callTranscribe() -> OpenAIClient: builder()
  3. callTranscribe() -> OpenAIClient.Builder: setApiKey(openAiKey)
  4. OpenAIClient.Builder -> OpenAIClient: build()
  5. callTranscribe() -> File: new File(path)
  6. callTranscribe() -> Files: readAllBytes(file.toPath())
  7. callTranscribe() -> TranscriptionRequest: builder()
  8. callTranscribe() -> TranscriptionRequest.Builder: model("whisper-1").prompt("Write up notes").language("en").file(bytes)
  9. TranscriptionRequest.Builder -> TranscriptionRequest: build()
  10. callTranscribe() -> OpenAIClient: transcribe(request)
  11. OpenAIClient -> ClientResponse<TranscriptionRequest, AudioResponse>: return response
  12. callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getResponse()
  13. ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<AudioResponse>: return Optional
  14. callTranscribe() -> Optional<AudioResponse>: ifPresent(System.out.println(r.getBody()))
  15. callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getException()
  16. ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<Throwable>: return Optional
  17. callTranscribe() -> Optional<Throwable>: ifPresent(Throwable::printStackTrace)
  18. callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getStatusMessage()
  19. ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<String>: return Optional
  20. callTranscribe() -> Optional<String>: ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)))

Second attempt at a one-shot prompt with some prompt engineering

What if we give it more context in our prompt? I will add this extra hint: “using mermaid markup for sequence diagrams.”

Prompt

using mermaid markup for sequnece diagrams...
here is a method, can you create a mermaind sequence diagram from it?
private static void callTranscribe() throws IOException {
    // Create the client
    final var openAiKey = System.getenv("OPENAI_API_KEY");
    final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();

    File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");

    byte[] bytes = Files.readAllBytes(file.toPath());
    // Create the chat request
    final TranscriptionRequest request = TranscriptionRequest.builder()
            .model("whisper-1").prompt("Write up notes").language("en").file(bytes)
            .build();

    // Call Open AI API with chat message
    final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);

    response.getResponse().ifPresent(r -> System.out.println(r.getBody()));

    response.getException().ifPresent(Throwable::printStackTrace);

    response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}

ChatGPT Response

Mermaid markup language can be used to generate sequence diagrams. Here’s a sequence diagram representing the callTranscribe() method flow:

sequenceDiagram
    participant SysEnv as System Environment
    participant U as User
    participant C as callTranscribe Method
    participant OAC as OpenAIClient
    participant F as File
    participant FS as File System
    participant TR as TranscriptionRequest
    participant CR as ClientResponse

    U->>C: callTranscribe()
    C->>SysEnv: get OPENAI_API_KEY
    SysEnv-->>C: return OPENAI_API_KEY
    C->>OAC: create OpenAIClient
    C->>F: create File object
    F->>FS: locate the file
    FS-->>F: return the file's path
    C->>F: readAllBytes from file
    F-->>C: return bytes
    C->>TR: create TranscriptionRequest
    C->>OAC: transcribe(request)
    OAC-->>C: return ClientResponse
    C->>CR: getResponse
    CR-->>C: return AudioResponse or null
    C->>CR: getException
    CR-->>C: return exception or null
    C->>CR: getStatusMessage
    CR-->>C: return status message or null
    C->>U: end of callTranscribe()

Please note that this diagram only covers the flow in the code you provided, not potential branches for error handling. MermaidJS uses a simple language structure to create diagrams, so it may not completely or accurately represent all the nuanced branches of a complex process.

How did it do? I call this a pass because it is a valid Mermaid diagram, but I give it a D. Why? It uses code for the message descriptions, which is verbose. That approach breaks down when the Java code is long (like a stream.filter().filter().map().map().Collects.collect(…)), and I want something human-readable in a sequence diagram. So it passed, but it is not what I wanted.
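The "code instead of human-readable text" complaint can be checked roughly in code. The sketch below is a heuristic under assumptions: it handles only a simplified `sender->>receiver: label` message shape, and it treats a label as code-like if it contains an identifier directly followed by `(`, a `::` method reference, or a `->` lambda arrow. The class name and the regular expressions are illustrative, not part of Mermaid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: flag Mermaid message labels that look like Java code rather than prose. */
final class CodeLikeLabels {
    // Simplified message shape: <sender><arrow><receiver>: <label>
    private static final Pattern MESSAGE = Pattern.compile(
            "^\\s*([^\\s:>-]+)\\s*(?:--?>>|--?x|--?>)\\s*([^\\s:]+)\\s*:\\s*(.+)$");
    // Heuristic only: identifier directly followed by "(", a method reference, or a lambda arrow.
    private static final Pattern CODE_LIKE = Pattern.compile("[A-Za-z_]\\w*\\(|::|->");

    /** Returns the labels of all message lines whose text looks like code. */
    static List<String> find(List<String> diagramLines) {
        List<String> flagged = new ArrayList<>();
        for (String line : diagramLines) {
            Matcher m = MESSAGE.matcher(line);
            if (m.matches() && CODE_LIKE.matcher(m.group(3)).find()) {
                flagged.add(m.group(3).trim());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "sequenceDiagram",
                "    M->>S: Get OPENAI_API_KEY",
                "    callTranscribe->>System: getenv(\"OPENAI_API_KEY\")",
                "    C-->>M: Print body of response (if present)");
        find(lines).forEach(System.out::println);
    }
}
```

A label such as "Print body of response (if present)" is not flagged, because its parenthesis follows a space rather than an identifier.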

First attempt at the chain of thought

Let’s try our chain of thought approach. First, we will make ChatGPT think about the problem.

Prompt

how do you go about generating a mermaid sequence diagram from code
from a java method?

ChatGPT Response

Mermaid.js is a powerful tool that lets you create diagrams and flowcharts, but it does not have built-in functionality for automatically generating sequence diagrams from Java code. However, you can create a Mermaid sequence diagram based on the logic in your Java method.

Here’s an example of a sequence diagram in Mermaid:

sequenceDiagram
    participant Alice
    participant Bob
    Alice->>Bob: Hello Bob, how are you?
    Bob-->>Alice: Not bad, thank you!

However, if you are looking to generate a sequence diagram automatically from Java code, you might need to use a tool specifically designed for that purpose, such as PlantUML, UMLet, or ObjectAid.

Then, if you still want to use Mermaid.js for your diagrams, you would need to manually convert the output from these UML tools into the format used by Mermaid.js. This might involve writing a script or tool to automate the process, depending on the complexity and size of your codebase.

To summarize, the general process might look like this:

  1. Use a tool like PlantUML, UMLet, or ObjectAid to generate a UML sequence diagram from your Java code.
  2. Manually or programmatically convert the UML diagram into Mermaid.js syntax.
  3. Use Mermaid.js to render the sequence diagram based on the converted syntax.

Remember that automatic generation might not always capture the logic perfectly, especially for complex code, and manual adjustments may be necessary. It is also worth noting that generating diagrams from code is a form of reverse engineering, and its usefulness can depend on the complexity of your code and your specific needs.

Prompt

Break it down step by step how you would evalauate the Java code

This prompt forces it to think about what steps are involved, and we prime the context.

ChatGPT Response

Generating a Mermaid sequence diagram manually from Java code involves understanding the logic of the code and then translating that logic into a sequence diagram. Below is an example process.

Let’s assume we have the following simple Java code snippet:

class Main {
    public static void main(String[] args) {
        System.out.println("Hello, world!");

        Foo foo = new Foo();
        foo.bar();
    }
}

class Foo {
    void bar() {
        System.out.println("Hello from bar!");
    }
}

Here is the step-by-step process to convert this to a Mermaid sequence diagram:

  1. Identify the objects (or “participants” in sequence diagram terms) involved in the code. In this case, we have Main and Foo.
  2. Identify the method calls and the flow of interaction between the objects. In this case, Main calls Foo’s bar method.
  3. Using the Mermaid syntax, we will create a sequence diagram. The general syntax of a sequence diagram is:
sequenceDiagram
    participant Object1
    participant Object2
    Object1->>Object2: MethodCall
  4. Now let’s apply our objects and method calls from step 2:
sequenceDiagram
    participant Main
    participant Foo
    Main->>Foo: bar()

In this diagram, we see that Main calls the bar method on Foo.

Remember that real-world applications often involve more complex interactions and may include multiple method calls, branches (like if-else conditions), loops, etc. These can all be represented in a sequence diagram, but will require a more complex syntax. Make sure to refer to the Mermaid documentation for the full syntax options.

Now it has had time to think about the problem, and we have primed the context of the chat. Remember, ChatGPT is like predictive text, so if we do not prime its context, it will be less likely to produce what we want.
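Priming amounts to ordering the conversation so that the model's own explanations come before the real task. The sketch below models that ordering only. The `Turn` type and the `user`/`assistant` role names are illustrative assumptions and are not tied to a specific client library; no API call is made.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: keep priming exchanges in the conversation, then ask the real task last. */
final class PrimedConversation {

    /** One chat turn; the role names are illustrative. */
    static final class Turn {
        final String role;
        final String content;

        Turn(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    private final List<Turn> turns = new ArrayList<>();

    /** Records a priming question together with the model's reply so both stay in context. */
    PrimedConversation prime(String question, String modelReply) {
        turns.add(new Turn("user", question));
        turns.add(new Turn("assistant", modelReply));
        return this;
    }

    /** Returns the full message list with the task appended after all priming exchanges. */
    List<Turn> ask(String task) {
        List<Turn> all = new ArrayList<>(turns);
        all.add(new Turn("user", task));
        return all;
    }

    public static void main(String[] args) {
        List<Turn> messages = new PrimedConversation()
                .prime("how do you go about generating a mermaid sequence diagram from a java method?",
                        "(model reply goes here)")
                .prime("Break it down step by step how you would evaluate the Java code",
                        "(model reply goes here)")
                .ask("here is a method, can you create a mermaid sequence diagram from it? ...");
        for (Turn t : messages) {
            System.out.println(t.role + ": " + t.content);
        }
    }
}
```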

Prompt

here is a method, can you create a mermaind sequence diagram from it?
private static void callTranscribe() throws IOException {
    // Create the client
    final var openAiKey = System.getenv("OPENAI_API_KEY");
    final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();

    File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");

    byte[] bytes = Files.readAllBytes(file.toPath());
    // Create the chat request
    final TranscriptionRequest request = TranscriptionRequest.builder()
            .model("whisper-1").prompt("Write up notes").language("en").file(bytes)
            .build();

    // Call Open AI API with chat message
    final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);

    response.getResponse().ifPresent(r -> System.out.println(r.getBody()));

    response.getException().ifPresent(Throwable::printStackTrace);

    response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}

ChatGPT Response

Here is the corresponding Mermaid sequence diagram based on the provided Java code:

sequenceDiagram
    participant callTranscribe as callTranscribe
    participant System as System
    participant OpenAIClient as OpenAIClient
    participant File as File
    participant Files as Files
    participant TranscriptionRequest as TranscriptionRequest
    participant ClientResponse as ClientResponse
    participant Optional as Optional
    participant System.out as System.out
    participant Throwable as Throwable
    participant StatusCode as StatusCode

    callTranscribe->>System: getenv("OPENAI_API_KEY")
    note over callTranscribe: Create OpenAIClient using the retrieved API key
    callTranscribe->>OpenAIClient: builder().setApiKey(openAiKey).build()
    note over callTranscribe: Create a File object with specified path
    callTranscribe->>File: new File(path)
    callTranscribe->>Files: readAllBytes(file.toPath())
    note over callTranscribe: Create TranscriptionRequest using builder pattern
    callTranscribe->>TranscriptionRequest: builder().model().prompt().language().file().build()
    note over callTranscribe: Call OpenAI API with chat message
    callTranscribe->>OpenAIClient: transcribe(request)
    callTranscribe->>ClientResponse: response
    callTranscribe->>Optional: getResponse()
    Optional->>System.out: println(r.getBody())
    callTranscribe->>Optional: getException()
    Optional->>Throwable: printStackTrace()
    callTranscribe->>Optional: getStatusMessage()
    Optional->>System.out: printf("status message %s %d \n", error, StatusCode)

Note: Sequence diagrams are typically used to visualize interactions between different objects or classes. Using it to represent the flow within a single method might be overkill or not particularly useful. Nevertheless, the diagram above illustrates how the callTranscribe method interacts with different classes and their methods within its body.

This works, but it is not what I wanted. It is verbose: the message descriptions are all code instead of human-readable text. It even has an extra dangling participant. It is worse than the result we got with some light prompt engineering. If the last one was a D, this is a D-. We got ChatGPT to think more, but perhaps not about the right things. It seems like it needs to think more. We can do better, so let’s try again.

Second attempt at the chain of thought

This time we will break down the syntax more and have it think about the elements it failed at. This one does better.

Let’s prime it with what a Mermaid sequence diagram is and what its major elements and components are.

Prompt

describe all of the main elements in a mermaid markup sequence diagram

ChatGPT Response

Mermaid is a popular open-source tool used to generate diagrams and flowcharts from text in a similar way to Markdown. This is especially handy when working on software documentation, as it allows for easy updates and version control.

Here are the main elements in a Mermaid sequence diagram:

  1. Participants: These are the main actors interacting in the system. They are typically represented by labeled vertical lines. They can be users, systems, or subsystems. In Mermaid, you declare them like this:
participant Alice
participant Bob
  2. Messages: These are the interactions between participants. They are represented by horizontal arrows between the participants’ lines. You can denote messages in Mermaid like this:
Alice->>Bob: Hello Bob, how are you?
  3. Response Messages: The recipient of a message can also send a message back, which is represented by an arrow pointing in the other direction:
Bob-->>Alice: Not bad, thank you!
  4. Activation Bars / Lifelines: These bars show the period during which an actor is performing an action. They are represented as rectangles on the participant’s line. In Mermaid, the activation bars will appear automatically as you define the messages.

  5. Notes: These can be added to provide additional information about parts of the sequence diagram. They can be attached to the diagram in different ways:

Note right of Alice: Alice thinks about it
Note left of Bob: Bob responds
  6. Loops: A loop can be used when a specific sequence of messages is repeated.
loop Every Minute
    Alice->>Bob: Ping
    Bob-->>Alice: Pong
end
  7. Conditions/Alternative paths: These are used to represent a decision in the flow of interaction:
alt successful case
    Alice->>Bob: Nice to meet you
else unsuccessful case
    Alice-xBob: I am sad
end
  8. Parallel Paths: If two or more processes are happening at the same time, you can use parallel paths:
par
    Alice->>Bob: Hello Bob
    Bob->>Charlie: Hello Charlie
end

This is not an exhaustive list, but these are the main elements you will likely use when constructing a Mermaid sequence diagram. To combine these elements in your sequence diagram, ensure you structure them within a sequenceDiagram code block like so:

sequenceDiagram
    participant Alice
    participant Bob
    Alice->>Bob: Hello Bob, how are you?
    Bob-->>Alice: Not bad, thank you!
    Note right of Alice: Alice thinks about it

Please ensure that you are using a version of Mermaid that supports the features you are using, as the library is regularly updated and extended with new features.

Ok. Now that the model is primed, let’s ask again for it to do what we want.

Prompt

here is a method, can you create a mermaind sequence diagram from it?
private static void callTranscribe() throws IOException {
    // Create the client
    final var openAiKey = System.getenv("OPENAI_API_KEY");
    final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();

    File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");

    byte[] bytes = Files.readAllBytes(file.toPath());
    // Create the chat request
    final TranscriptionRequest request = TranscriptionRequest.builder()
            .model("whisper-1").prompt("Write up notes").language("en").file(bytes)
            .build();

    // Call Open AI API with chat message
    final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);

    response.getResponse().ifPresent(r -> System.out.println(r.getBody()));

    response.getException().ifPresent(Throwable::printStackTrace);

    response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}

ChatGPT Response

Certainly! Below is the sequence diagram for the method callTranscribe(). It illustrates how the method creates an OpenAIClient instance, transcribes an audio file, and handles the response.

sequenceDiagram
    participant Environment as Env
    participant OpenAIClient as Client
    participant File
    participant TranscriptionRequest as Request
    participant ClientResponse as Response

    activate Client
    Env->>Client: Get OPENAI_API_KEY
    Client->>Client: Build with OPENAI_API_KEY
    deactivate Client

    activate File
    File->>File: Read bytes from meeting_notes1.m4a
    deactivate File

    activate Request
    Client->>Request: Build with "whisper-1", "Write up notes", "en", file bytes
    deactivate Request

    activate Response
    Client->>Response: Transcribe Request
    Response->>Response: Get response body
    note over Response: Print the response body if it exists
    Response->>Response: Get exception
    note over Response: Print the exception stack trace if it exists
    Response->>Response: Get status message
    note over Response: Print the status message if it exists
    deactivate Response

This diagram simplifies the process into primary actions and interactions. You can modify it to suit your needs. For instance, it does not illustrate conditional flows (using the alt or opt blocks) for response.getResponse(), response.getException(), and response.getStatusMessage() because it is not explicitly indicated in the method whether these are alternative paths. If these calls are exclusive alternatives in your actual use case, you might want to represent them using alt blocks.

FAIL!

It looks great except for one small part, and that part makes it fail. I see this same failure a lot when generating Mermaid files. If it had not mixed up the syntax of the participants, it would have been perfect.

participant Environment as Env

should be

participant Env as Environment

Again, it looked good but did not work as a Mermaid file. It needs to consider how Mermaid participants work a bit more. (I even switched it manually, and it worked well. After the switch it would have been an A, but since it failed, it gets an F.)
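This kind of mix-up can be caught before rendering. In Mermaid's `participant A as Alice` form, the first name is the id used in messages and the second is the displayed label. The sketch below reports messages that refer to a label instead of its id. It is a simplified check under assumptions: it parses only `participant` lines and basic arrow messages, and the class name and regular expressions are illustrative rather than part of Mermaid.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: detect messages that use a participant's label where its id is expected. */
final class ParticipantCheck {
    // "participant <id>" or "participant <id> as <label>"
    private static final Pattern PARTICIPANT =
            Pattern.compile("^\\s*participant\\s+(\\S+)(?:\\s+as\\s+(.+?))?\\s*$");
    // Simplified message shape: <sender><arrow><receiver>: <text>
    private static final Pattern MESSAGE =
            Pattern.compile("^\\s*([^\\s:>-]+)\\s*(?:--?>>|--?x|--?>)\\s*([^\\s:]+)\\s*:.*$");

    static List<String> findLabelUsedAsId(List<String> lines) {
        List<String> ids = new ArrayList<>();
        Map<String, String> idByLabel = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher p = PARTICIPANT.matcher(line);
            if (p.matches()) {
                ids.add(p.group(1));
                if (p.group(2) != null) {
                    idByLabel.put(p.group(2), p.group(1));
                }
            }
        }
        List<String> problems = new ArrayList<>();
        for (String line : lines) {
            Matcher m = MESSAGE.matcher(line);
            if (!m.matches()) {
                continue;
            }
            for (String name : new String[] {m.group(1), m.group(2)}) {
                // A name that is not a declared id but matches a declared label is suspicious.
                if (!ids.contains(name) && idByLabel.containsKey(name)) {
                    String problem = "'" + name + "' is a label; use id '" + idByLabel.get(name) + "'";
                    if (!problems.contains(problem)) {
                        problems.add(problem);
                    }
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> diagram = List.of(
                "sequenceDiagram",
                "    participant Environment as Env",
                "    participant OpenAIClient as Client",
                "    Env->>Client: Get OPENAI_API_KEY");
        findLabelUsedAsId(diagram).forEach(System.out::println);
    }
}
```

Running the same check on the corrected declaration, `participant Env as Environment`, reports nothing for `Env`, because `Env` is then the declared id.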

Let’s improve this by asking for more details on the participants, and let’s lock down what we want on messages by asking it to reason on the components of a message in mermaid sequence diagrams.

Chain of thought with prompt engineering and more thought

Prompt

Pretend you are a world class software architect well versed in UML and Mermaid.
Describe all of the main elements in a mermaid markup sequence diagram.

ChatGPT Response

Sure, Mermaid is a simple markdown-like language that allows developers to generate diagrams and flowcharts from text in a similar way to Markdown. It is often used for documenting software architectures, data flows, etc., because of its simplicity and the ability to keep documentation alongside the code.

Now, for a sequence diagram in Mermaid, there are several key elements:

  1. Participants: They represent the different entities in your diagram. They could be classes, systems, subsystems, or actors interacting in the system. They are declared by the keyword ‘participant’, followed by an alias and a descriptor.

    Example:

    sequenceDiagram
        participant A as Alice
        participant B as Bob
    
  2. Messages: These are shown as arrows between participants. The arrow’s direction indicates the sender and receiver of the message. Solid lines represent synchronous calls, while dashed lines represent asynchronous ones.

    Example:

    Alice->>Bob: Hello Bob, how are you?
    Note right of Bob: Bob thinks
    Bob-->>Alice: I'm good thanks!
    

    Here, Alice->>Bob: Hello Bob, how are you? represents a synchronous message from Alice to Bob, while Bob-->>Alice: I'm good thanks! represents an asynchronous message from Bob to Alice.

  3. Activations: To show that a participant is active, you can use the activate and deactivate keywords.

    Example:

    Alice->>Bob: Hello Bob, how are you?
    activate Bob
    Note right of Bob: Bob thinks
    Bob-->>Alice: I'm good thanks!
    deactivate Bob
    
  4. Notes: Notes can be used to add comments or explanations to your diagram. They can be placed to the right or left of a participant.

    Example:

    Note right of Bob: Bob thinks
    
  5. Loops: Loops are used to indicate repeating interactions. They are defined with the loop keyword, an end keyword, and a descriptor of what is being looped over.

    Example:

    loop Every Minute
        Alice->>Bob: Ping
        Bob-->>Alice: Pong
    end
    
  6. Conditions: Conditional interactions can be defined using the opt keyword (which stands for “Optional”). It is used like a loop, with a descriptor and an end keyword.

    Example:

    opt Successful Login
        Alice->>Bob: Login
        Bob-->>Alice: Success
    end
    
  7. Groups: These are used to visually group interactions together.

    Example:

    group My Group
        Alice->>Bob: Message within group
    end
    
  8. Parallel Interactions: You can define interactions that happen at the same time with the par and and keywords.

    Example:

    par
        Alice->>Bob: Message 1
    and
        Alice->>Bob: Message 2
    end
    

Remember, in Mermaid, the syntax is very sensitive, and a missing or misplaced character might cause an error in the diagram.

Now we will break it down further and ask it to reason on participants.

Prompt

Great please go into detail how participants work and the different
options for participants.

ChatGPT Response

Participants in a Mermaid sequence diagram are the entities that interact with each other over the course of the diagram. They represent the “actors” in the sequence, which can be individuals, systems, components, objects, or even another sequence.

The declaration of a participant begins with the keyword participant, followed by a user-defined alias, and then a descriptor. The alias is used as a shorthand to represent the participant in the rest of the diagram.

Here’s an example:

sequenceDiagram
    participant A as Alice
    participant B as Bob

In this case, ‘A’ and ‘B’ are aliases representing ‘Alice’ and ‘Bob’. All the messages that ‘Alice’ and ‘Bob’ send or receive are referenced using these aliases.

It is important to note that the order of declaration of participants defines their order from left to right in the diagram. So, in the above example, ‘Alice’ will be shown on the left and ‘Bob’ on the right.

However, Mermaid also provides flexibility for changing the order in which participants are displayed without changing their order of declaration. This can be done using the participant keyword with an index.

For instance:

sequenceDiagram
    participant B as Bob
    participant A as Alice
    participant C as Charlie index 2

In this case, even though ‘Charlie’ is declared last, they will be displayed second in the diagram due to the index provided.

If you do not provide a descriptor for the participant, then the alias is displayed in the diagram:

sequenceDiagram
    participant A

In this case, ‘A’ is both the alias and the descriptor, so ‘A’ will be displayed in the diagram.

That is about it for participants in Mermaid sequence diagrams. They are a powerful feature, enabling a clear and concise depiction of the interactions between different entities in your diagrams. Remember to choose meaningful aliases and descriptors to make your diagrams easier to understand.
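
To make the declaration rules concrete, here is a minimal Java sketch that writes participant lines from an ordered map of alias to display name. ParticipantLines and declare are hypothetical names used only for illustration. The sketch relies on declaration order alone for left-to-right placement and does not use the index form shown in the response, which is worth checking against Mermaid’s documentation before relying on it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds Mermaid participant declarations (hypothetical helper for illustration). */
final class ParticipantLines {

    /** Emits the diagram header plus one participant line per entry, in insertion order. */
    static String declare(Map<String, String> aliasToLabel) {
        StringBuilder out = new StringBuilder("sequenceDiagram\n");
        for (Map.Entry<String, String> entry : aliasToLabel.entrySet()) {
            String alias = entry.getKey();
            String label = entry.getValue();
            out.append("    participant ").append(alias);
            // Without a separate label, the alias itself is what the diagram displays.
            if (label != null && !label.equals(alias)) {
                out.append(" as ").append(label);
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, which becomes the left-to-right order.
        Map<String, String> participants = new LinkedHashMap<>();
        participants.put("A", "Alice");
        participants.put("B", "Bob");
        participants.put("C", null); // no label: declared as "participant C"
        System.out.print(declare(participants));
    }
}
```

A LinkedHashMap is used here because it preserves insertion order; a plain HashMap could shuffle the participants and change the layout of the diagram.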

Now let’s lock in our gains on message generation and have it reason on how messages work in Mermaid.

Prompt

Please go into detail how messages work

ChatGPT Response

Sure, messages are a core component of sequence diagrams in Mermaid. They are used to show interactions between participants. In a sequence diagram, a message is represented as an arrowed line and can be of two types: synchronous and asynchronous.

  1. Synchronous Messages: These are shown with a solid line and a solid arrow between the participants. Synchronous messages are blocking, meaning the sender waits for the receiver to process the message and respond before moving on.

    Syntax:

    Alice->>Bob: Message
    

    This represents a synchronous message from Alice to Bob. The message can be any text that describes the interaction.

  2. Asynchronous Messages: These are shown with a dotted line and an open arrow between the participants. Asynchronous messages are non-blocking, meaning the sender does not wait for the receiver to process the message and can move on to other tasks.

    Syntax:

    Alice-->>Bob: Message
    

    This represents an asynchronous message from Alice to Bob.

    A single participant can also send a message to itself. This is typically used to denote internal processing or a state change.

    Syntax:

    Alice->>Alice: Self Message
    
  3. Message Responses: Mermaid sequence diagrams also support the drawing of a response to a message. Responses are depicted as dashed arrows pointing in the opposite direction of the original message. The sender and receiver are reversed.

    Syntax:

    Alice->>Bob: Request
    Bob-->>Alice: Response
    

    In this case, Alice sends a request to Bob, who then sends a response back to Alice.

That is it for messages in Mermaid. The simplicity and expressiveness of the Mermaid syntax allow complex interactions to be quickly and clearly visualized in a diagram, making it an excellent tool for documenting software systems and designs.

Just to be pedantic, let’s make sure it knows and reasons on exactly what a message description should be.

Prompt

What is the purpose of the part of the message after the :

ChatGPT Response

The text after the : (colon) in a Mermaid message declaration is used to describe the nature of the message or interaction that is happening between the participants. It provides context to the arrowed line that connects the participants.

For example, consider the following message syntax:

Alice->>Bob: Hello Bob, how are you?

Here, “Hello Bob, how are you?” is the description of the message that Alice is sending to Bob. It is a brief way to represent what that interaction or message is about.

In the context of software architecture, it could be the name of the method being called, the event being sent, or some other kind of interaction. For instance:

Client->>Server: GET /api/v1/users

Here, “GET /api/v1/users” describes an HTTP GET request being sent from a client to a server. This description helps provide context and make the sequence diagram more understandable.

In summary, the part after the colon : in the message is used to give meaningful context to the interactions happening between the participants in the sequence diagram.
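
The anatomy of a message line (sender, arrow, receiver, then the description after the colon) can be sketched in Java as a small parser. This is an illustration under assumptions, not Mermaid’s own grammar: MermaidMessage and parse are hypothetical names, only the two arrows used in this article (->> and -->>) are recognized, and participant names are assumed to be single words.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed Mermaid message line (hypothetical helper for illustration). */
final class MermaidMessage {
    // sender, arrow (->> or -->>), receiver, then the description after the colon.
    private static final Pattern LINE =
            Pattern.compile("^\\s*(\\w+)\\s*(-->>|->>)\\s*(\\w+)\\s*:\\s*(.*)$");

    final String sender;
    final String arrow;
    final String receiver;
    final String description;

    private MermaidMessage(String sender, String arrow, String receiver, String description) {
        this.sender = sender;
        this.arrow = arrow;
        this.receiver = receiver;
        this.description = description;
    }

    /** Returns the parsed message, or an empty Optional when the line is not a message. */
    static Optional<MermaidMessage> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(
                new MermaidMessage(m.group(1), m.group(2), m.group(3), m.group(4).trim()));
    }

    /** Writes the message back out in Mermaid syntax. */
    String toLine() {
        return sender + arrow + receiver + ": " + description;
    }

    public static void main(String[] args) {
        parse("Client->>Server: GET /api/v1/users")
                .ifPresent(msg -> System.out.println(msg.sender + " tells " + msg.receiver
                        + ": " + msg.description));
    }
}
```

Everything after the first colon is kept as free text, which matches how the description is used above: it can hold a greeting, a method name, or an HTTP request line.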

Ok, we have primed the pump. Let’s ask our question and see if it generates the valid Mermaid Markdown that we want.

Prompt

here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
    // Create the client
    final var openAiKey = System.getenv("OPENAI_API_KEY");
    final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();

    File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");

    byte[] bytes = Files.readAllBytes(file.toPath());
    // Create the chat request
    final TranscriptionRequest request = TranscriptionRequest.builder()
            .model("whisper-1").prompt("Write up notes").language("en").file(bytes)
            .build();

    // Call Open AI API with chat message
    final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);

    response.getResponse().ifPresent(r -> System.out.println(r.getBody()));

    response.getException().ifPresent(Throwable::printStackTrace);

    response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}

ChatGPT Response

Sure, here’s a Mermaid sequence diagram based on the given Java method:

sequenceDiagram
    participant S as System
    participant M as Method
    participant O as OpenAIClient
    participant F as File
    participant B as Byte Array
    participant R as Request
    participant C as ClientResponse

    M->>S: Get OPENAI_API_KEY
    M->>O: Initialize with OPENAI_API_KEY
    M->>F: Create file object
    M->>F: Read all bytes from file
    F-->>M: Return byte array
    M->>B: Store byte array
    M->>R: Create transcription request
    M->>O: Call transcribe method with request
    O-->>M: Return ClientResponse
    M->>C: Store ClientResponse
    M->>C: Get response from ClientResponse
    C-->>M: Print body of response (if present)
    M->>C: Get exception from ClientResponse
    C-->>M: Print exception (if present)
    M->>C: Get status message from ClientResponse
    C-->>M: Print status message and status code (if present)

This diagram visually represents the flow of control and data in the callTranscribe method. It shows interactions with the system environment, the OpenAI client, a file, a byte array, a transcription request, and a client response object.

Note that this is a high-level representation. Depending on the level of detail needed, you might include more or less detail in the diagram.

This was my desired end state.

We did it. We used Chain of Thought to turn a one-shot attempt that did not work into one that does, and made ChatGPT into a Mermaid-generating machine!
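
The flow we just walked through (ask priming questions first, keep every answer in the conversation, then ask the real task last) can be sketched in Java. This is a minimal illustration under assumptions: PrimedConversation, Turn, and primeThenAsk are hypothetical names, and the model is a plain function standing in for whatever chat client you use. No real API is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Keeps a chat history so later prompts see earlier answers (hypothetical sketch). */
final class PrimedConversation {

    /** One chat turn; role is "user" or "assistant". */
    static final class Turn {
        final String role;
        final String content;

        Turn(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    private final List<Turn> history = new ArrayList<>();
    // Stand-in for a chat model (an assumption): takes the full history, returns a reply.
    private final Function<List<Turn>, String> model;

    PrimedConversation(Function<List<Turn>, String> model) {
        this.model = model;
    }

    /** Sends one prompt with all earlier turns as context and records the reply. */
    String ask(String prompt) {
        history.add(new Turn("user", prompt));
        String reply = model.apply(Collections.unmodifiableList(history));
        history.add(new Turn("assistant", reply));
        return reply;
    }

    /** Asks each priming question in order, then the real task last. */
    String primeThenAsk(List<String> primingQuestions, String task) {
        for (String question : primingQuestions) {
            ask(question);
        }
        return ask(task);
    }

    List<Turn> history() {
        return Collections.unmodifiableList(history);
    }

    public static void main(String[] args) {
        // Stub model for demonstration only; a real one would call a chat API.
        PrimedConversation chat = new PrimedConversation(
                turns -> "(reply after seeing " + turns.size() + " turns)");
        String answer = chat.primeThenAsk(
                List.of(
                        "Please go into detail how participants work",
                        "Please go into detail how messages work",
                        "What is the purpose of the part of the message after the :"),
                "Here is a method, can you create a Mermaid sequence diagram from it?");
        System.out.println(answer);
    }
}
```

The point of the design is that the final task is sent with the whole history, so the model’s own explanations of participants, messages, and descriptions are part of the context it predicts from.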

Conclusion

In this article, we covered the basics of Chain of Thought (CoT) and how letting ChatGPT think about a problem can improve context and help it produce better results. We explored the generation of Mermaid sequence diagrams for Java source code with ChatGPT. We also discussed how to improve the generation by using a CoT approach. We showed how a one-shot chat can fail because it needs more context and how simple prompt engineering gets better results, but not what we wanted. We provided some examples of CoT that did not work because they did not provide the proper context. Finally, we showed a model that worked well using CoT by focusing on the context we needed for the Mermaid sequence diagram generation.

We showed an existing Mermaid diagram as the end state and then explored the various ways to produce this diagram by making ChatGPT think about the task. We did this by asking ChatGPT questions to help it refine its understanding of Mermaid diagrams, giving it more context to produce better results. Including the role of participants, messages, and message descriptions helped ChatGPT to produce the right results.

Finally, we tested the context we had gathered via CoT by asking ChatGPT to generate a Mermaid sequence diagram from a Java method. With this approach, we can get ChatGPT to understand Mermaid sequence diagrams thoroughly, and to generate them, just by loading the proper context so that its predicted text works better.

Key Concepts and Review

  • Chain of Thought: A method of iteratively prompting an AI language model with questions to refine its understanding of a topic or problem.
  • Mermaid: A simple markdown-like language used to generate diagrams and flowcharts from text.
  • Mermaid Sequence Diagram: A type of diagram used to visualize interactions between different entities in a system.
  • Participants: The different entities in a Mermaid sequence diagram.
  • Messages: The interactions between participants in a Mermaid sequence diagram.
  • Synchronous Messages: In the ChatGPT responses above, messages written with ->>, which Mermaid draws as a solid line with an arrowhead.
  • Asynchronous Messages: In the ChatGPT responses above, messages written with -->>, which Mermaid draws as a dotted line with an arrowhead. Mermaid’s own syntax reference uses a separate open-arrow notation, -), for async messages.
  • Message Descriptions: The text following the colon in a Mermaid message declaration, used to describe the nature of the interaction or message between participants.
  • Prompt Engineering: The process of carefully crafting prompts to provide the necessary context for an AI language model to generate accurate and relevant responses.
  • Chain of Thought (CoT) prompting: A technique that improves the performance of Large Language Models (LLMs) on reasoning-based tasks through few-shot learning.
  • CoT enables LLMs to address complex tasks by breaking them into intermediate steps, allowing for manageable granularity for both input and output.
  • CoT prompting breaks a problem down into a series of intermediate reasoning steps, improving the ability of LLMs to perform complex reasoning.
  • There are different strategies for implementing CoT prompting, such as few-shot CoT and zero-shot CoT.
  • Few-shot CoT involves providing examples of Question-Answer pairs where the answer is explained step by step.
  • Zero-shot CoT involves prefixing the Answer block with “Let’s think step by step” to prompt the LLM to complete the output in that format.
  • The benefits of CoT prompting become more apparent as the model scale increases, leading to improved performance that substantially outperforms standard prompting for large model sizes.
  • Using CoT with an LLM is similar to teaching a child to solve a problem step by step.
  • CoT helps to decompose the problem into smaller, more manageable steps, making it easier for an LLM to understand and solve the problem.
  • CoT prompting encourages the model to follow a logical thought process and improves its reasoning abilities.
  • For more information on CoT prompting, see the Google AI Blog and arXiv sources mentioned in the introduction.
  • It is essential to keep the context clear and concise.
  • Remember that ChatGPT is like predictive text, so it is essential to prime its context to produce the desired output.
  • Use Chain of Thought (CoT) to prime ChatGPT with context so the generation it does is more likely to be your desired results.
  • Repeat vital information to reinforce the context.
  • Be patient and adjust your expectations.

Review questions

  • What is Chain of Thought (CoT), and how does it improve the performance of Large Language Models (LLMs) on reasoning-based tasks?
  • What strategies are there for implementing CoT prompting?
  • How can CoT prompting improve the ability of LLMs to perform complex reasoning?
  • Why is clear and concise context important when working with an AI language model?
  • What are some tips for using CoT with an AI language model?
  • What is Mermaid, and how is it used?
  • What are the critical elements of a Mermaid sequence diagram?
  • What are participants in a Mermaid sequence diagram, and how are they declared?
  • What is the purpose of messages in a Mermaid sequence diagram, and what are the two types of messages?
  • What is the purpose of the text following the colon in a Mermaid message declaration?

| Term | Definition | Relevance to AI/LLMs |
| --- | --- | --- |
| Chain of Thought (CoT) | A technique that improves LLM performance on reasoning tasks through few-shot learning | Enables LLMs to break complex tasks into manageable steps |
| Large Language Models (LLMs) | AI models trained on vast amounts of text data to understand and generate human-like text | Core technology behind advanced AI systems like ChatGPT |
| Prompt Engineering | The process of crafting inputs to AI models to achieve desired outputs | Critical for optimizing LLM performance and accuracy |
| Few-shot CoT | Providing examples of question-answer pairs with step-by-step explanations | Improves LLM’s ability to break down and solve complex problems |
| Zero-shot CoT | Prefixing the answer block with “Let’s think step by step” to guide the LLM’s output | Encourages structured thinking without specific examples |
| Mermaid | A markdown-like language for creating diagrams and flowcharts from text | Used to visualize processes and interactions in AI systems |
| Embeddings | Vector representations of words or phrases in a high-dimensional space | Used in AI for semantic understanding and similarity comparisons |

About the Author

Rick Hightower is a seasoned software engineer, tech entrepreneur, and thought leader in the field of artificial intelligence and large language models. With decades of experience in Java development and a passion for emerging technologies, Rick has established himself as a prominent voice in the tech community.

As the creator of JAI (Java AI Open API Client) and a prolific writer on topics ranging from AI implementation to software engineering best practices, Rick consistently shares his expertise through articles, tutorials, and open-source contributions. His work focuses on bridging the gap between complex AI concepts and practical applications for developers and businesses alike.

Rick’s insights into Chain of Thought prompting, LLMs, and their real-world applications have helped countless professionals navigate the rapidly evolving landscape of AI technology. Through his writing and development work, he continues to push the boundaries of what is possible in AI-driven software solutions.

For more of Rick’s work and the latest in AI and software development, follow him on LinkedIn and check out his GitHub repositories.

                                                                           