November 4, 2024
Understanding LLMs and Using Chain of Thought
Author: Rick Hightower
We will look at a real-world use case that most developers and tech managers should understand. We will give ChatGPT a Java method and ask it to create a Mermaid sequence diagram.
When working with ChatGPT, the most important thing to manage is the context.
ChatGPT is an AI language model that can generate human-like text and have conversations. It is like a smart computer program that can understand and create text based on context. But it is important to remember that ChatGPT is not perfect. It can sometimes forget or misunderstand the context, especially in long or complex conversations. To help ChatGPT understand and remember the context better, you can try these things:
- Keep the context clear and concise: Make sure your instructions are easy to understand and short. This will help ChatGPT focus on the important information and reduce the chances of it forgetting the context.
- Repeat important information: If there is a specific rule or piece of information you want ChatGPT to remember, you can remind it of that information during the conversation. This will help reinforce the context and make it more likely that ChatGPT will remember it.
- Be patient and adjust your expectations: ChatGPT is an impressive AI tool, but it is not perfect. It might not always remember the context or follow the rules you set. In those cases, you can remind it of the context or rules and try again.
- Use Chain of Thought (CoT) to give ChatGPT context so that the output it generates is more likely to be what you want.
ChatGPT is like predictive text, so it is important to give it the right context to get the desired output. By following these tips, you can help improve ChatGPT’s understanding of the context and increase the chances of it generating the text you want. Let’s break down what CoT is and then look at an example. This article focuses on CoT.
Basics of Chain of Thought
Chain of Thought (CoT) prompting is a technique that improves the performance of Large Language Models (LLMs) on reasoning tasks. It uses few-shot learning. According to Toward Data Science, CoT helps LLMs handle complex tasks like common sense reasoning and arithmetic. It does this by breaking down multi-step requests into smaller steps. This creates a way to see and understand the process. It makes both the input and output easier to manage and tweak.
CoT prompting breaks a problem into a series of intermediate reasoning steps. This greatly improves the ability of LLMs to perform complex reasoning. There are different ways to use CoT prompting, such as few-shot CoT and zero-shot CoT. In few-shot CoT, you provide examples of Question-Answer pairs where the answer is explained step by step. In zero-shot CoT, you add “Let’s think step by step” before the answer to prompt the LLM to explain its reasoning. The benefits of CoT prompting are more noticeable with larger models. It leads to much better performance than standard prompting for large models. These findings are supported by experiments on three large language models, as described in the Google AI Blog and arXiv.
Let’s break CoT down a bit.
Imagine you are teaching a child to solve a problem step by step. Instead of just giving them the answer, you guide them through each step of the process. This helps them understand how to get to the solution. CoT is a similar approach used with LLMs, which are smart computer programs that can understand and generate text. Using CoT with LLMs means breaking down a problem into smaller, more manageable steps, just like you would with a child. This helps the LLM understand the problem better and improves its ability to reason and solve it.
Let’s look at a simple example of CoT prompting for a math word problem:
Question: Alice has 3 apples and Bob has 4 apples. If they both give 2 apples to Charlie, how many apples will Charlie have?
To solve this problem using CoT, we can break it down into these reasoning steps:
- Find out how many apples Alice gives to Charlie: Alice gives 2 apples.
- Find out how many apples Bob gives to Charlie: Bob gives 2 apples.
- Calculate the total number of apples Charlie gets: 2 (from Alice) + 2 (from Bob) = 4.
Answer: Charlie will have four apples.
In this example, the CoT approach helps to break the problem into smaller, more manageable steps. This makes it easier for an LLM to understand and solve the problem. By guiding the LLM through a structured sequence of steps, CoT prompting encourages the model to follow a logical thought process and improves its reasoning abilities.
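To make the zero-shot variant concrete, here is roughly how the same apples problem could be phrased as a zero-shot CoT prompt (a minimal sketch of the technique, not a quote from any of the references below):
Alice has 3 apples and Bob has 4 apples. If they both give 2 apples to Charlie, how many apples will Charlie have? Let’s think step by step.
The trailing “Let’s think step by step” is the only addition; it nudges the model to write out the intermediate steps before giving the final answer.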
For more information on CoT prompting, see these references:
- Toward Data Science article on CoT prompting for LLMs
- Google AI Blog on LLMs performing reasoning via CoT prompting
- arXiv paper on CoT prompting for LLMs
A more real-world example
To test this, we will give ChatGPT a Java method and ask it to create a Mermaid sequence diagram. We want something that looks like this:
sequenceDiagram
participant S as System
participant M as Method
participant O as OpenAIClient
participant F as File
participant B as Byte Array
participant R as Request
participant C as ClientResponse
M->>S: Get OPENAI_API_KEY
M->>O: Initialize with OPENAI_API_KEY
M->>F: Create file object
M->>F: Read all bytes from file
F-->>M: Return byte array
M->>B: Store byte array
M->>R: Create transcription request
M->>O: Call transcribe method with request
O-->>M: Return ClientResponse
M->>C: Store ClientResponse
M->>C: Get response from ClientResponse
C-->>M: Print body of response (if present)
M->>C: Get exception from ClientResponse
C-->>M: Print exception (if present)
M->>C: Get status message from ClientResponse
C-->>M: Print status message and status code (if present)
Do not worry if you do not know what Mermaid is. You will by the end of this article because we will have ChatGPT explain it multiple times as part of our CoT.
Failed one-shot attempt
Let’s look at an example of creating a Mermaid sequence diagram from a Java method using ChatGPT that fails; we will fix this later. The code provided is a method called callTranscribe(), which interacts with several different classes and methods. The goal is to have ChatGPT evaluate the Java code and produce a Mermaid sequence diagram.
ChatGPT fails to create the Mermaid diagram on our first one-shot attempt.
Prompt
here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
// Create the client
final var openAiKey = System.getenv("OPENAI_API_KEY");
final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();
File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");
byte[] bytes = Files.readAllBytes(file.toPath());
// Create the chat request
final TranscriptionRequest request = TranscriptionRequest.builder()
.model("whisper-1").prompt("Write up notes").language("en").file(bytes)
.build();
// Call Open AI API with chat message
final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);
response.getResponse().ifPresent(r -> System.out.println(r.getBody()));
response.getException().ifPresent(Throwable::printStackTrace);
response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}
ChatGPT Response
In a UML sequence diagram, we typically show how objects interact in a system. These diagrams focus on the time sequence of interactions. The arrows between objects show which is sending the message and which is receiving it.
Here is the textual representation of the sequence diagram for your method:
callTranscribe() -> System: getenv("OPENAI_API_KEY")
callTranscribe() -> OpenAIClient: builder()
callTranscribe() -> OpenAIClient.Builder: setApiKey(openAiKey)
OpenAIClient.Builder -> OpenAIClient: build()
callTranscribe() -> File: new File(path)
callTranscribe() -> Files: readAllBytes(file.toPath())
callTranscribe() -> TranscriptionRequest: builder()
callTranscribe() -> TranscriptionRequest.Builder: model("whisper-1").prompt("Write up notes").language("en").file(bytes)
TranscriptionRequest.Builder -> TranscriptionRequest: build()
callTranscribe() -> OpenAIClient: transcribe(request)
OpenAIClient -> ClientResponse<TranscriptionRequest, AudioResponse>: return response
callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getResponse()
ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<AudioResponse>: return Optional
callTranscribe() -> Optional<AudioResponse>: ifPresent(System.out.println(r.getBody()))
callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getException()
ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<Throwable>: return Optional
callTranscribe() -> Optional<Throwable>: ifPresent(Throwable::printStackTrace)
callTranscribe() -> ClientResponse<TranscriptionRequest, AudioResponse>: getStatusMessage()
ClientResponse<TranscriptionRequest, AudioResponse> -> Optional<String>: return Optional
callTranscribe() -> Optional<String>: ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)))
Second attempt at a one-shot prompt with some prompt engineering
What if we give it more context in our prompt? I will add this extra hint: “using mermaid markup for sequence diagrams.”
Prompt
using mermaid markup for sequence diagrams...
here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
// Create the client
final var openAiKey = System.getenv("OPENAI_API_KEY");
final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();
File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");
byte[] bytes = Files.readAllBytes(file.toPath());
// Create the chat request
final TranscriptionRequest request = TranscriptionRequest.builder()
.model("whisper-1").prompt("Write up notes").language("en").file(bytes)
.build();
// Call Open AI API with chat message
final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);
response.getResponse().ifPresent(r -> System.out.println(r.getBody()));
response.getException().ifPresent(Throwable::printStackTrace);
response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}
ChatGPT Response
Mermaid markup language can be used to generate sequence diagrams. Here’s a sequence diagram representing the callTranscribe() method flow:
sequenceDiagram
participant SysEnv as System Environment
participant U as User
participant C as callTranscribe Method
participant OAC as OpenAIClient
participant F as File
participant FS as File System
participant TR as TranscriptionRequest
participant CR as ClientResponse
U->>C: callTranscribe()
C->>SysEnv: get OPENAI_API_KEY
SysEnv-->>C: return OPENAI_API_KEY
C->>OAC: create OpenAIClient
C->>F: create File object
F->>FS: locate the file
FS-->>F: return the file's path
C->>F: readAllBytes from file
F-->>C: return bytes
C->>TR: create TranscriptionRequest
C->>OAC: transcribe(request)
OAC-->>C: return ClientResponse
C->>CR: getResponse
CR-->>C: return AudioResponse or null
C->>CR: getException
CR-->>C: return exception or null
C->>CR: getStatusMessage
CR-->>C: return status message or null
C->>U: end of callTranscribe()
Please note that this diagram only covers the flow in the code you provided, not potential branches for error handling. MermaidJS uses a simple language structure to create diagrams, so it may not completely or accurately represent all the nuanced branches of a complex process.
How did it do? I call this a pass because it is a valid Mermaid diagram, but I give it a D. Why? It uses raw Java code for the message descriptions, which is verbose. That breaks down when the code is long (like a stream.filter().filter().map().map().Collects.collect(…) chain), and I want something human-readable in a sequence diagram. So it passed, but it is not what I wanted.
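To show the difference I am after, here is a hand-written sketch (my own labels, not ChatGPT output) contrasting a code-literal message description with the human-readable kind I want:
sequenceDiagram
participant M as Method
participant O as OpenAIClient
%% code-literal description: verbose and brittle for long call chains
M->>O: builder().setApiKey(openAiKey).build()
%% human-readable description: what I actually want
M->>O: Initialize client with API key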
First attempt at the chain of thought
Let’s try our chain of thought approach. First, we will make ChatGPT think about the problem.
Prompt
how do you go about generating a mermaid sequence diagram from code
from a java method?
ChatGPT Response
Mermaid.js is a powerful tool that lets you create diagrams and flowcharts, but it does not have built-in functionality for automatically generating sequence diagrams from Java code. However, you can create a Mermaid sequence diagram based on the logic in your Java method.
Here’s an example of a sequence diagram in Mermaid:
sequenceDiagram
participant Alice
participant Bob
Alice->>Bob: Hello Bob, how are you?
Bob-->>Alice: Not bad, thank you!
However, if you are looking to generate a sequence diagram automatically from Java code, you might need to use a tool specifically designed for that purpose, such as PlantUML, UMLet, or ObjectAid.
Then, if you still want to use Mermaid.js for your diagrams, you would need to manually convert the output from these UML tools into the format used by Mermaid.js. This might involve writing a script or tool to automate the process, depending on the complexity and size of your codebase.
To summarize, the general process might look like this:
- Use a tool like PlantUML, UMLet, or ObjectAid to generate a UML sequence diagram from your Java code.
- Manually or programmatically convert the UML diagram into Mermaid.js syntax.
- Use Mermaid.js to render the sequence diagram based on the converted syntax.
Remember that automatic generation might not always capture the logic perfectly, especially for complex code, and manual adjustments may be necessary. It is also worth noting that generating diagrams from code is a form of reverse engineering, and its usefulness can depend on the complexity of your code and your specific needs.
Prompt
Break it down step by step how you would evaluate the Java code
This prompt forces it to think about what steps are involved, and we prime the context.
ChatGPT Response
Generating a Mermaid sequence diagram manually from Java code involves understanding the logic of the code and then translating that logic into a sequence diagram. Below is an example process.
Let’s assume we have the following simple Java code snippet:
class Main {
public static void main(String[] args) {
System.out.println("Hello, world!");
Foo foo = new Foo();
foo.bar();
}
}
class Foo {
void bar() {
System.out.println("Hello from bar!");
}
}
Here is the step-by-step process to convert this to a Mermaid sequence diagram:
- Identify the objects (or “participants” in sequence diagram terms) involved in the code. In this case, we have Main and Foo.
- Identify the method calls and the flow of interaction between the objects. In this case, Main calls Foo’s bar method.
- Using the Mermaid syntax, we will create a sequence diagram. The general syntax of a sequence diagram is:
sequenceDiagram
participant Object1
participant Object2
Object1->>Object2: MethodCall
- Now let’s apply our objects and method calls from step 2:
sequenceDiagram
participant Main
participant Foo
Main->>Foo: bar()
In this diagram, we see that Main calls the bar method on Foo.
Remember that real-world applications often involve more complex interactions and may include multiple method calls, branches (like if-else conditions), loops, etc. These can all be represented in a sequence diagram, but will require a more complex syntax. Make sure to refer to the Mermaid documentation for the full syntax options.
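As a quick aside (mine, not ChatGPT’s): here is a minimal sketch of what that branching syntax looks like, using an alt block around a hypothetical response check:
sequenceDiagram
participant M as Method
participant C as ClientResponse
M->>C: Get response
alt response present
C-->>M: Return body
else exception present
C-->>M: Return exception
end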
Now it has had time to think about the problem, and we have primed the context of the chat. Remember, ChatGPT is like predictive text, so if we do not prime its context, it will be less likely to produce what we want.
Prompt
here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
// Create the client
final var openAiKey = System.getenv("OPENAI_API_KEY");
final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();
File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");
byte[] bytes = Files.readAllBytes(file.toPath());
// Create the chat request
final TranscriptionRequest request = TranscriptionRequest.builder()
.model("whisper-1").prompt("Write up notes").language("en").file(bytes)
.build();
// Call Open AI API with chat message
final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);
response.getResponse().ifPresent(r -> System.out.println(r.getBody()));
response.getException().ifPresent(Throwable::printStackTrace);
response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}
ChatGPT Response
Here is the corresponding Mermaid sequence diagram based on the provided Java code:
sequenceDiagram
participant callTranscribe as callTranscribe
participant System as System
participant OpenAIClient as OpenAIClient
participant File as File
participant Files as Files
participant TranscriptionRequest as TranscriptionRequest
participant ClientResponse as ClientResponse
participant Optional as Optional
participant System.out as System.out
participant Throwable as Throwable
participant StatusCode as StatusCode
callTranscribe->>System: getenv("OPENAI_API_KEY")
note over callTranscribe: Create OpenAIClient using the retrieved API key
callTranscribe->>OpenAIClient: builder().setApiKey(openAiKey).build()
note over callTranscribe: Create a File object with specified path
callTranscribe->>File: new File(path)
callTranscribe->>Files: readAllBytes(file.toPath())
note over callTranscribe: Create TranscriptionRequest using builder pattern
callTranscribe->>TranscriptionRequest: builder().model().prompt().language().file().build()
note over callTranscribe: Call OpenAI API with chat message
callTranscribe->>OpenAIClient: transcribe(request)
callTranscribe->>ClientResponse: response
callTranscribe->>Optional: getResponse()
Optional->>System.out: println(r.getBody())
callTranscribe->>Optional: getException()
Optional->>Throwable: printStackTrace()
callTranscribe->>Optional: getStatusMessage()
Optional->>System.out: printf("status message %s %d \n", error, StatusCode)
Note: Sequence diagrams are typically used to visualize interactions between different objects or classes. Using one to represent the flow within a single method might be overkill or not particularly useful. Nevertheless, the diagram above illustrates how the callTranscribe method interacts with different classes and their methods within its body.
This works, but it is not what I wanted. It is verbose: the message descriptions are all code instead of human-readable text, and there is even an extra dangling participant. It is actually worse than the attempt with some light prompt engineering; if that one was a D, this is a D-. We got ChatGPT to think more, but perhaps not about the right things. We can do better. Let’s try again.
Second attempt of Chain of Thought
This time we will break down the syntax more and have it think about the elements it failed at. This one does better.
Let’s prime it with what a mermaid sequence diagram is and what its major elements and components are.
Prompt
describe all of the main elements in a mermaid markup sequence diagram
ChatGPT Response
Mermaid is a popular open-source tool used to generate diagrams and flowcharts from text in a similar way to Markdown. This is especially handy when working on software documentation, as it allows for easy updates and version control.
Here are the main elements in a Mermaid sequence diagram:
- Participants: These are the main actors interacting in the system. They are typically represented by labeled vertical lines. They can be users, systems, or subsystems. In Mermaid, you declare them like this:
participant Alice
participant Bob
- Messages: These are the interactions between participants. They are represented by horizontal arrows between the participants’ lines. You can denote messages in Mermaid like this:
Alice->>Bob: Hello Bob, how are you?
- Response Messages: The recipient of a message can also send a message back, which is represented by an arrow pointing in the other direction:
Bob-->>Alice: Not bad, thank you!
- Activation Bars / Lifelines: These bars show the period during which an actor is performing an action. They are represented as rectangles on the participant’s line. In Mermaid, the activation bars will appear automatically as you define the messages.
- Notes: These can be added to provide additional information about parts of the sequence diagram. They can be attached to the diagram in different ways:
Note right of Alice: Alice thinks about it
Note left of Bob: Bob responds
- Loops: A loop can be used when a specific sequence of messages is repeated.
loop Every Minute
Alice->>Bob: Ping
Bob-->>Alice: Pong
end
- Conditions/Alternative paths: These are used to represent a decision in the flow of interaction:
alt successful case
Alice->>Bob: Nice to meet you
else unsuccessful case
Alice-xBob: I am sad
end
- Parallel Paths: If two or more processes are happening at the same time, you can use parallel paths:
par
Alice->>Bob: Hello Bob
Bob->>Charlie: Hello Charlie
end
This is not an exhaustive list, but these are the main elements you will likely use when constructing a Mermaid sequence diagram. To combine these elements in your sequence diagram, ensure you structure them within a sequenceDiagram code block like so:
sequenceDiagram
participant Alice
participant Bob
Alice->>Bob: Hello Bob, how are you?
Bob-->>Alice: Not bad, thank you!
Note right of Alice: Alice thinks about it
Please ensure that you are using a version of Mermaid that supports the features you are using, as the library is regularly updated and extended with new features.
Ok. Now that the model is primed, let’s ask again for it to do what we want.
Prompt
here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
// Create the client
final var openAiKey = System.getenv("OPENAI_API_KEY");
final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();
File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");
byte[] bytes = Files.readAllBytes(file.toPath());
// Create the chat request
final TranscriptionRequest request = TranscriptionRequest.builder()
.model("whisper-1").prompt("Write up notes").language("en").file(bytes)
.build();
// Call Open AI API with chat message
final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);
response.getResponse().ifPresent(r -> System.out.println(r.getBody()));
response.getException().ifPresent(Throwable::printStackTrace);
response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}
ChatGPT Response
Certainly! Below is the sequence diagram for the method callTranscribe(). It illustrates how the method creates an OpenAIClient instance, transcribes an audio file, and handles the response.
sequenceDiagram
participant Environment as Env
participant OpenAIClient as Client
participant File
participant TranscriptionRequest as Request
participant ClientResponse as Response
activate Client
Env->>Client: Get OPENAI_API_KEY
Client->>Client: Build with OPENAI_API_KEY
deactivate Client
activate File
File->>File: Read bytes from meeting_notes1.m4a
deactivate File
activate Request
Client->>Request: Build with "whisper-1", "Write up notes", "en", file bytes
deactivate Request
activate Response
Client->>Response: Transcribe Request
Response->>Response: Get response body
note over Response: Print the response body if it exists
Response->>Response: Get exception
note over Response: Print the exception stack trace if it exists
Response->>Response: Get status message
note over Response: Print the status message if it exists
deactivate Response
This diagram simplifies the process into primary actions and interactions. You can modify it to suit your needs. For instance, it does not illustrate conditional flows (using the alt or opt blocks) for response.getResponse(), response.getException(), and response.getStatusMessage() because it is not explicitly indicated in the method whether these are alternative paths. If these calls are exclusive alternatives in your actual use case, you might want to represent them using alt blocks.
FAIL!
It looks great except for one tiny part. It fails, and when generating Mermaid files, I see this same failure a lot. If it had not messed up the syntax of the participants, it would have been perfect.
participant Environment as Env
should be
participant Env as Environment
Again, it looked good but did not work as a Mermaid file. ChatGPT needs to consider how Mermaid participants work a bit more. (I even manually switched it, and it worked well. After the switch it would have been an A, but since it failed, it gets an F.)
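For reference, here is roughly what the corrected participant block looks like after that manual switch; the message lines below it stay exactly the same:
sequenceDiagram
participant Env as Environment
participant Client as OpenAIClient
participant File
participant Request as TranscriptionRequest
participant Response as ClientResponse
Env->>Client: Get OPENAI_API_KEY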
Let’s improve this by asking for more details on the participants, and let’s lock down what we want on messages by asking it to reason on the components of a message in mermaid sequence diagrams.
Chain of thought with prompt engineering and more thought
Prompt
Pretend you are a world class software architect well versed in UML and Mermaid.
Describe all of the main elements in a mermaid markup sequence diagram.
ChatGPT Response
Sure, Mermaid is a simple markdown-like language that allows developers to generate diagrams and flowcharts from text in a similar way to Markdown. It is often used for documenting software architectures, data flows, etc., because of its simplicity and the ability to keep documentation alongside the code.
Now, for a sequence diagram in Mermaid, there are several key elements:
- Participants: They represent the different entities in your diagram. They could be classes, systems, subsystems, or actors interacting in the system. They are declared by the keyword ‘participant’, followed by an alias and a descriptor.
Example:
sequenceDiagram
participant A as Alice
participant B as Bob
- Messages: These are shown as arrows between participants. The arrow’s direction indicates the sender and receiver of the message. Solid lines represent synchronous calls, while dashed lines represent asynchronous ones.
Example:
Alice->>Bob: Hello Bob, how are you?
Note right of Bob: Bob thinks
Bob-->>Alice: I'm good thanks!
Here, Alice->>Bob: Hello Bob, how are you? represents a synchronous message from Alice to Bob, while Bob-->>Alice: I'm good thanks! represents an asynchronous message from Bob to Alice.
- Activations: To show that a participant is active, you can use the activate and deactivate keywords.
Example:
Alice->>Bob: Hello Bob, how are you?
activate Bob
Note right of Bob: Bob thinks
Bob-->>Alice: I'm good thanks!
deactivate Bob
- Notes: Notes can be used to add comments or explanations to your diagram. They can be placed to the right or left of a participant.
Example:
Note right of Bob: Bob thinks
- Loops: Loops are used to indicate repeating interactions. They are defined with the loop keyword, an end keyword, and a descriptor of what is being looped over.
Example:
loop Every Minute
Alice->>Bob: Ping
Bob-->>Alice: Pong
end
- Conditions: Conditional interactions can be defined using the opt keyword (which stands for “Optional”). It is used like a loop, with a descriptor and an end keyword.
Example:
opt Successful Login
Alice->>Bob: Login
Bob-->>Alice: Success
end
- Groups: These are used to visually group interactions together.
Example:
group My Group
Alice->>Bob: Message within group
end
- Parallel Interactions: You can define interactions that happen at the same time with the par and and keywords.
Example:
par
Alice->>Bob: Message 1
and
Alice->>Bob: Message 2
end
Remember, in Mermaid, the syntax is very sensitive, and a missing or misplaced character might cause an error in the diagram.
Now we will break it down further and ask it to reason on participants.
Prompt
Great please go into detail how participants work and the different
options for participants.
ChatGPT Response
Participants in a Mermaid sequence diagram are the entities that interact with each other over the course of the diagram. They represent the “actors” in the sequence, which can be individuals, systems, components, objects, or even another sequence.
The declaration of a participant begins with the keyword participant, followed by a user-defined alias, and then a descriptor. The alias is used as a shorthand to represent the participant in the rest of the diagram.
Here’s an example:
sequenceDiagram
participant A as Alice
participant B as Bob
In this case, ‘A’ and ‘B’ are aliases representing ‘Alice’ and ‘Bob’. All the messages that ‘Alice’ and ‘Bob’ send or receive are referenced using these aliases.
It is important to note that the order of declaration of participants defines their order from left to right in the diagram. So, in the above example, ‘Alice’ will be shown on the left and ‘Bob’ on the right.
However, Mermaid also provides flexibility for changing the order in which participants are displayed without changing their order of declaration. This can be done using the participant keyword with an index.
For instance:
sequenceDiagram
participant B as Bob
participant A as Alice
participant C as Charlie index 2
In this case, even though ‘Charlie’ is declared last, they will be displayed second in the diagram due to the index provided.
If you do not provide a descriptor for the participant, then the alias is displayed in the diagram:
sequenceDiagram
participant A
In this case, ‘A’ is both the alias and the descriptor, so ‘A’ will be displayed in the diagram.
That is about it for participants in Mermaid sequence diagrams. They are a powerful feature, enabling a clear and concise depiction of the interactions between different entities in your diagrams. Remember to choose meaningful aliases and descriptors to make your diagrams easier to understand.
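This is exactly where the earlier attempt went wrong: the alias comes first, the descriptor second, and the message lines must reference the alias. A minimal sketch:
sequenceDiagram
participant A as Alice
participant B as Bob
%% messages reference the alias (A, B), not the descriptor
A->>B: Request the report
B-->>A: Return the report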
Now let’s lock in our gains on message generation and have it reason on how messages work in mermaid.
Prompt
Please go into detail how messages work
ChatGPT Response
Sure, messages are a core component of sequence diagrams in Mermaid. They are used to show interactions between participants. In a sequence diagram, a message is represented as an arrowed line and can be of two types: synchronous and asynchronous.
- Synchronous Messages: These are shown with a solid line and a solid arrow between the participants. Synchronous messages are blocking, meaning the sender waits for the receiver to process the message and respond before moving on.
Syntax:
Alice->>Bob: Message
This represents a synchronous message from Alice to Bob. The message can be any text that describes the interaction.
- Asynchronous Messages: These are shown with a dotted line and an open arrow between the participants. Asynchronous messages are non-blocking, meaning the sender does not wait for the receiver to process the message and can move on to other tasks.
Syntax:
Alice-->>Bob: Message
This represents an asynchronous message from Alice to Bob.
A single participant can also send a message to itself. This is typically used to denote internal processing or a state change.
Syntax:
Alice->>Alice: Self Message
- Message Responses: Mermaid sequence diagrams also support the drawing of a response to a message. Responses are depicted as dashed arrows pointing in the opposite direction of the original message. The sender and receiver are reversed.
Syntax:
Alice->>Bob: Request
Bob-->>Alice: Response
In this case, Alice sends a request to Bob, who then sends a response back to Alice.
That is it for messages in Mermaid. The simplicity and expressiveness of the Mermaid syntax allow complex interactions to be quickly and clearly visualized in a diagram, making it an excellent tool for documenting software systems and designs.
Just to be pedantic, let’s make sure it knows and reasons on exactly what a message description should be.
Prompt
What is the purpose of the part of the message after the :
ChatGPT Response
The text after the : (colon) in a Mermaid message declaration is used to describe the nature of the message or interaction that is happening between the participants. It provides context to the arrowed line that connects the participants.
For example, consider the following message syntax:
Alice->>Bob: Hello Bob, how are you?
Here, “Hello Bob, how are you?” is the description of the message that Alice is sending to Bob. It is a brief way to represent what that interaction or message is about.
In the context of software architecture, it could be the name of the method being called, the event being sent, or some other kind of interaction. For instance:
Client->>Server: GET /api/v1/users
Here, “GET /api/v1/users” describes an HTTP GET request being sent from a client to a server. This description helps provide context and make the sequence diagram more understandable.
In summary, the part after the colon : in the message is used to give meaningful context to the interactions happening between the participants in the sequence diagram.
Ok, we have primed the pump. Let’s ask our question and see if it generates the valid Mermaid Markdown that we want.
Prompt
here is a method, can you create a mermaid sequence diagram from it?
private static void callTranscribe() throws IOException {
// Create the client
final var openAiKey = System.getenv("OPENAI_API_KEY");
final OpenAIClient client = OpenAIClient.builder().setApiKey(openAiKey).build();
File file = new File("/Users/richardhightower/Documents/audio_notes_patents/meeting_notes1.m4a");
byte[] bytes = Files.readAllBytes(file.toPath());
// Create the chat request
final TranscriptionRequest request = TranscriptionRequest.builder()
.model("whisper-1").prompt("Write up notes").language("en").file(bytes)
.build();
// Call Open AI API with chat message
final ClientResponse<TranscriptionRequest, AudioResponse> response = client.transcribe(request);
response.getResponse().ifPresent(r -> System.out.println(r.getBody()));
response.getException().ifPresent(Throwable::printStackTrace);
response.getStatusMessage().ifPresent(error -> System.out.printf("status message %s %d \n", error, response.getStatusCode().orElse(0)));
}
ChatGPT Response
Sure, here’s a Mermaid sequence diagram based on the given Java method:
sequenceDiagram
participant S as System
participant M as Method
participant O as OpenAIClient
participant F as File
participant B as Byte Array
participant R as Request
participant C as ClientResponse
M->>S: Get OPENAI_API_KEY
M->>O: Initialize with OPENAI_API_KEY
M->>F: Create file object
M->>F: Read all bytes from file
F-->>M: Return byte array
M->>B: Store byte array
M->>R: Create transcription request
M->>O: Call transcribe method with request
O-->>M: Return ClientResponse
M->>C: Store ClientResponse
M->>C: Get response from ClientResponse
C-->>M: Print body of response (if present)
M->>C: Get exception from ClientResponse
C-->>M: Print exception (if present)
M->>C: Get status message from ClientResponse
C-->>M: Print status message and status code (if present)
This diagram visually represents the flow of control and data in the callTranscribe method. It shows interactions with the system environment, the OpenAI client, a file, a byte array, a transcription request, and a client response object. Note that this is a high-level representation. Depending on the level of detail needed, you might include more or less detail in the diagram.
This was my desired end state.
We did it. We used Chain of Thought to improve a one-shot attempt that did not work to make ChatGPT into a mermaid-generating machine!
Conclusion
In this article, we covered the basics of Chain of Thought (CoT) and how letting ChatGPT think about a problem can improve context and help it produce better results. We explored the generation of Mermaid sequence diagrams for Java source code with ChatGPT. We also discussed how to improve the generation by using a CoT approach. We showed how a one-shot chat can fail because it needs more context and how simple prompt engineering gets better results, but not what we wanted. We provided some examples of CoT that did not work because they did not provide the proper context. Finally, we showed a model that worked well using CoT by focusing on the context we needed for the Mermaid sequence diagram generation.
We showed an existing Mermaid diagram as the end state and then explored the various ways to produce this diagram by making ChatGPT think about the task. We did this by asking ChatGPT questions to help it refine its understanding of Mermaid diagrams, giving it more context to produce better results. Including the role of participants, messages, and message descriptions helped ChatGPT to produce the right results.
Finally, we tested our context gathering via CoT by asking ChatGPT to generate a Mermaid sequence diagram from a Java method. With this approach, ChatGPT gains a solid understanding of Mermaid sequence diagrams and how to generate them, simply because we loaded the proper context to make the predicted text work better.
Key Concepts and Review
- Chain of Thought: A method of iteratively prompting an AI language model with questions to refine its understanding of a topic or problem.
- Mermaid: A simple markdown-like language used to generate diagrams and flowcharts from text.
- Mermaid Sequence Diagram: A type of diagram used to visualize interactions between different entities in a system.
- Participants: The different entities in a Mermaid sequence diagram.
- Messages: The interactions between participants in a Mermaid sequence diagram.
- Synchronous Messages: A type of message in a Mermaid sequence diagram represented by a solid line and solid arrow between participants.
- Asynchronous Messages: A type of message in a Mermaid sequence diagram represented by a dotted line and open arrow between participants.
- Message Descriptions: The text following the colon in a Mermaid message declaration, used to describe the nature of the interaction or message between participants.
- Prompt Engineering: The process of carefully crafting prompts to provide the necessary context for an AI language model to generate accurate and relevant responses.
- Chain of Thought (CoT) prompting: A technique that improves the performance of Large Language Models (LLMs) on reasoning-based tasks through few-shot learning.
- CoT enables LLMs to address complex tasks by breaking them into intermediate steps, allowing for manageable granularity for both input and output.
- CoT prompting breaks a problem down into a series of intermediate reasoning steps, improving the ability of LLMs to perform complex reasoning.
- There are different strategies for implementing CoT prompting, such as few-shot CoT and zero-shot CoT.
- Few-shot CoT involves providing examples of Question-Answer pairs where the answer is explained step by step.
- Zero-shot CoT involves prefixing the Answer block with “Let’s think step by step” to prompt the LLM to complete the output in that format.
- The benefits of CoT prompting become more apparent as the model scale increases, leading to improved performance that substantially outperforms standard prompting for large model sizes.
- CoT is a similar approach used with LLMs as teaching a child to solve a problem step by step.
- CoT helps to decompose the problem into smaller, more manageable steps, making it easier for an LLM to understand and solve the problem.
- CoT prompting encourages the model to follow a logical thought process and improves its reasoning abilities.
- For more information on CoT prompting, see the references listed earlier in this article.
- It is essential to keep the context clear and concise.
- Remember that ChatGPT is like predictive text, so it is essential to prime its context to produce the desired output.
- Keep the context clear and concise.
- Use Chain of Thought (CoT) to prime ChatGPT with context so the generation it does is more likely to be your desired results.
- Repeat vital information to reinforce the context.
- Be patient and adjust your expectations.
Review questions
- What is Chain of Thought (CoT), and how does it improve the performance of Large Language Models (LLMs) on reasoning-based tasks?
- What strategies are there for implementing CoT prompting?
- How can CoT prompting improve the ability of LLMs to perform complex reasoning?
- Why is clear and concise context important when working with an AI language model?
- What are some tips for using CoT with an AI language model?
- What is Mermaid, and how is it used?
- What are the critical elements of a Mermaid sequence diagram?
- What are participants in a Mermaid sequence diagram, and how are they declared?
- What is the purpose of messages in a Mermaid sequence diagram, and what are the two types of messages?
- What is the purpose of the text following the colon in a Mermaid message declaration?
Follow up links
- Java Open AI Client
- Using ChatGPT embeddings and HyDE to improve search results
- Anthropic's Claude Chatbot Gets Upgrade
- Elon Musk's xAI's new frontier for artificial intelligence
- Using Mockito to test JAI Java Open AI Client
- Fine-tuning journey with Open AI API
- Using Open AI to create callback functions, the basis for plugins
- Using Java Open AI Client Async
- Fastest Java JSON Parser
- Java Open AI API Client on Github
- Medium: Introducing Java Open AI Client
- Medium: Using ChatGPT, Embeddings, and HyDE to Improve Search Results
| Term | Definition | Relevance to AI/LLMs |
|---|---|---|
| Chain of Thought (CoT) | A technique that improves LLM performance on reasoning tasks through few-shot learning | Enables LLMs to break complex tasks into manageable steps |
| Large Language Models (LLMs) | AI models trained on vast amounts of text data to understand and generate human-like text | Core technology behind advanced AI systems like ChatGPT |
| Prompt Engineering | The process of crafting inputs to AI models to achieve desired outputs | Critical for optimizing LLM performance and accuracy |
| Few-shot CoT | Providing examples of question-answer pairs with step-by-step explanations | Improves LLM’s ability to break down and solve complex problems |
| Zero-shot CoT | Prefixing the answer block with “Let’s think step by step” to guide the LLM’s output | Encourages structured thinking without specific examples |
| Mermaid | A markdown-like language for creating diagrams and flowcharts from text | Used to visualize processes and interactions in AI systems |
| Embeddings | Vector representations of words or phrases in a high-dimensional space | Used in AI for semantic understanding and similarity comparisons |
About the Author
Rick Hightower is a seasoned software engineer, tech entrepreneur, and thought leader in the field of artificial intelligence and large language models. With decades of experience in Java development and a passion for emerging technologies, Rick has established himself as a prominent voice in the tech community.
As the creator of JAI (Java AI Open API Client) and a prolific writer on topics ranging from AI implementation to software engineering best practices, Rick consistently shares his expertise through articles, tutorials, and open-source contributions. His work focuses on bridging the gap between complex AI concepts and practical applications for developers and businesses alike.
Rick’s insights into Chain of Thought prompting, LLMs, and their real-world applications have helped countless professionals navigate the rapidly evolving landscape of AI technology. Through his writing and development work, he continues to push the boundaries of what is possible in AI-driven software solutions.
For more of Rick’s work and the latest in AI and software development, follow him on LinkedIn and check out his GitHub repositories.