Greetings all,
You know that ORI has been active in a variety of regulatory work for several years now. One of those areas has been AI/ML and its effect on communications. We served on the US FCC TAC in 2022, have presented and written on the subject, and are working towards an AI/ML Handbook for Amateur Radio.
You may have seen news about OpenAI GPT, or ChatGPT.
https://openai.com/api/
Many people are testing this chatbot for reverse engineering and for writing software. There have been some successes in cases where the model would have had access during training to code or documentation about the relevant protocols, observations, or technical behavior, and where the requests match up well with what the model learned.
For example, a friend asked the chatbot to create a Python script to access the webcam on their computer, and they shared a still image from the resulting feed. Another example from another friend resulted in better reverse engineering results than they had gotten with either IDA or Ghidra over many weeks.
So, how does ChatGPT handle VHDL?
We have something specific to compare against, since we just completed a COBS decoder in VHDL for our open source HEO/GEO transponder work for #Haifuraiya.
Let’s start off simple. Here’s the first dialogue with the chatbot, and the results.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Abraxas3d: Design a COBS decoder in VHDL
Sure, here’s an example of a COBS decoder in VHDL:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity COBS_decoder is
  Port (
    input_data : in std_logic_vector;
    decoded_data : out std_logic_vector
  );
end COBS_decoder;

architecture Behavioral of COBS_decoder is
  -- Declare a variable to keep track of the current index
  -- in the input and output data streams
  variable current_index : integer := 1;
  -- Declare a variable to store the length of the
  -- run of zeros in the input data
  variable zero_run_length : integer := 0;
begin
  -- Iterate over the input data
  for i in input_data'range loop
    -- If the current input value is a zero, increment
    -- the zero run length variable
    if input_data(i) = '0' then
      zero_run_length := zero_run_length + 1;
    -- Otherwise, the current input value is not a zero,
    -- so we've reached the end of the zero run
    else
      -- Set the decoded value at the current index to
      -- the length of the zero run
      decoded_data(current_index) := conv_std_logic_vector(zero_run_length, 8);
      -- Increment the current index and reset the zero
      -- run length for the next iteration
      current_index := current_index + 1;
      zero_run_length := 0;
    end if;
  end loop;
end Behavioral;
This code decodes the input data using the COBS (Consistent Overhead Byte Stuffing) algorithm, which involves replacing runs of zeros in the input data with a single byte that indicates the length of the zero run. For example, the input ‘101001000’ would be decoded as ‘103101’.
This decoder iterates over the input data and maintains a variable zero_run_length that is incremented for each zero in the input data. When a non-zero value is encountered, the length of the zero run is written to the output data and the zero_run_length variable is reset.
I hope this helps! Let me know if you have any other questions.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
It knew what COBS meant. It produced code, with comments, and a brief “design document”, with examples.
The code produced is not licensed or cited.
Our open source HEO transponder project uses AXI-based blocks in the FPGA. So, let’s add AXI to the request.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Abraxas3d: Design a COBS decoder with AXI interface in VHDL
Sure, here’s an example of a COBS decoder with an AXI interface in VHDL:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- AXI interface definitions
library unisim;
use unisim.vcomponents.all;

entity COBS_decoder is
  Port (
    AXI_IN : in AXI4Stream;
    AXI_OUT : out AXI4Stream
  );
end COBS_decoder;

architecture Behavioral of COBS_decoder is
  -- Declare a variable to keep track of the current index
  -- in the input and output data streams
  variable current_index : integer := 1;
  -- Declare a variable to store the length of the
  -- run of zeros in the input data
  variable zero_run_length : integer := 0;
begin
  -- Iterate over the input data
  for i in AXI_IN.data'range loop
    -- If the current input value is a zero, increment
    -- the zero run length variable
    if AXI_IN.data(i) = '0' then
      zero_run_length := zero_run_length + 1;
    -- Otherwise, the current input value is not a zero,
    -- so we've reached the end of the zero run
    else
      -- Set the decoded value at the current index to
      -- the length of the zero run
      AXI_OUT.data(current_index) := conv_std_logic_vector(zero_run_length, 8);
      -- Increment the current index and reset the zero
      -- run length for the next iteration
      current_index := current_index + 1;
      zero_run_length := 0;
    end if;
  end loop;
  -- Set the valid and last flags in the output AXI stream
  AXI_OUT.valid <= '1';
  AXI_OUT.last <= '1';
end Behavioral;
This code is similar to the previous example, but it uses an AXI (Advanced eXtensible Interface) interface for the input and output data. The AXI interface is a standard interface used in many FPGA designs, and it provides a way to stream data into and out of the decoder.
In this code, the input data is read from the AXI_IN.data port, and the decoded output is written to the AXI_OUT.data port. The valid and last flags are also set in the output AXI stream to indicate that the data is valid and that it is the last data in the stream.
I hope this helps! Let me know if you have any other questions.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
It remembered the previous request and updated the design. Again, there is no license.
Here’s a review of the COBS implementation from Paul Williamson KB5MU.
“I don’t know what this is, but it isn’t a COBS decoder with AXI interfaces.
It does seem at first glance to be some sort of thing with AXI interfaces, but it does not implement an AXI interface at all. It just relies on a (possibly imaginary) library type called AXI4Stream that somehow magically does AXI interfaces, in a way that’s both implausibly smart and inexplicably dumb. I think the classic phrase “not even wrong” applies here.
The AXI4Stream object apparently has a member .data that is an array of bytes, which might make sense if the entire AXI transfer is being treated as a single entity here. The code iterates through the input .data using the ’range attribute, as if the AXI4Stream object has a defined fixed transfer length.
Which it can’t have, if it’s a general purpose AXI stream implementation.
So, if this was ever anything real, it must have been a special purpose AXI implementation with a fixed transfer length. Assuming that the AXI4Stream .data actually works that way, the entity does do some sort of transformation on the input data to create the output data.
We asked for that to be a COBS decoder, but the function that it actually implements bears only the slightest resemblance to a COBS decoder. What it actually does is output a stream of words that are the lengths of the runs of zeroes in the input data. Nonzero data does not get passed through at all.
Instead, a run of nonzero bytes gets translated into a run of zeroes, as if two adjacent nonzero bytes meant nothing but a zero-length run of zeroes.
Here’s an example of what this code seems to do:
Input .data: 1 2 3 4 0 0 0 0 1 2 3 4
Output .data: 0 0 0 0 4 0 0 0
That’s certainly not a COBS decoder, or even anything useful at all.
Note that the output is variable length, but never longer than the input.
We had to assume that the AXI4Stream object had a fixed .data size to get this far, but now it has to have a variable data size for the output.
Supposing we had successfully suspended disbelief up to here, we now examine the final two lines of code in the Behavioral architecture. It sets the .valid and .last members of the AXI_OUT port to ‘1’.
That makes absolutely no sense if it’s supposed to encapsulate an entire AXI stream transfer. Those signals only make sense if we’re treating one word at a time, as you’d expect to see in a normal, general-purpose AXI stream object.
The extremely verbose commenting style in this code suggests that the original source material was from an introductory tutorial. Maybe the AI has been tuned to favor heavily commented code, and ends up picking up partially completed junk from the middle of tutorials?
Anyhow, this is not an example of automated programming in response to a high-level query. This is gibberish that succeeds only in having a superficial resemblance to the code requested.”
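A few lines of Python are enough to check this reading of the generated code. The sketch below is a hypothetical model of the loop, not the chatbot's output: it assumes, as the review does, that .data is a fixed-length sequence of byte values, and the function name `zero_run_lengths` is made up for illustration.

```python
def zero_run_lengths(data):
    """Model of the generated loop: at every nonzero byte, emit the length
    of the run of zeros that came before it.

    Assumes `data` is a fixed-length sequence of byte values. That is an
    assumption borrowed from the review; the generated VHDL never defines it.
    """
    output = []
    zero_run_length = 0
    for value in data:
        if value == 0:
            # A zero only extends the current run; nothing is emitted.
            zero_run_length += 1
        else:
            # The nonzero byte itself is discarded; only the length of
            # the preceding zero run reaches the output.
            output.append(zero_run_length)
            zero_run_length = 0
    return output


example_input = [1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4]
print(zero_run_lengths(example_input))  # [0, 0, 0, 0, 4, 0, 0, 0]
```

In this model the nonzero payload bytes never reach the output, and a trailing run of zeros is dropped without a trace, so the original data cannot be recovered from the result.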
So, does this work?
Let’s try it out in the lab!
The first thing we notice is that there is an obvious syntax error with a missing semicolon. Aside from that, there is no AXI4Stream port type. It’s not recognized. Vivado appears to be disappointed in this file, and wants little to do with this code. This is where we should stop, but I spent some time looking to see if the ports were at least close to being declared correctly.
The model appears to have taken some custom or maybe enumerated type from somewhere, and come up with a custom AXI4Stream type. This is not recognized by the unisim library right off the bat, and that’s the sort of thing that automated code writing needs to get right.
Let’s look at the code written by humans from ORI. The decoder can be found here:
https://github.com/Abraxas3d/fpga_cores/blob/master/src/decoder.vhd
-=-=-=-=-=-=-=-=-=-=-=-=-=-
-- Company: Open Research Institute, Inc.
-- Engineer: Skunkwrx, Abraxas3d
-- Design Name: COBS protocol decoder
-- Module Name: decoder - Behavioral
-- Project Name: Phase 4 "Haifuraiya"
-- Target Devices: 7000 Zynq
-- Tool Versions: 2021.1
-- Description: COBS protocol decoder.
-- Additional Comments: This work is Open Source and licensed using CERN OHL v2.0

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

--Entity Declaration
entity decoder is
    Port ( rst : in STD_LOGIC;
           clk : in STD_LOGIC;
           s_tdata : in STD_LOGIC_VECTOR (7 downto 0);
           s_tlast : in STD_LOGIC;
           s_tvalid : in STD_LOGIC;
           s_tready : out STD_LOGIC;
           m_tdata : out STD_LOGIC_VECTOR (7 downto 0);
           m_tlast : out STD_LOGIC;
           m_tvalid : out STD_LOGIC;
           m_tready : in STD_LOGIC);
end decoder;

--Architecture
architecture Behavioral of decoder is
    -- internal copy of s_tdata
    signal input_data : STD_LOGIC_VECTOR (7 downto 0);
    -- s_tdata delayed by one clk where s_tvalid is high
    signal input_data_d : STD_LOGIC_VECTOR (7 downto 0);
    -- internal version of s_tlast
    -- not used for anything
    -- We create our own m_tlast from the COBS framing instead
    signal s_tlast_i : STD_LOGIC;
    -- internal version of input handshaking signal s_tvalid
    signal s_tvalid_i : STD_LOGIC;
    -- s_tvalid delayed by one clk
    -- thus, a version of s_tvalid aligned with input_data_d
    signal s_tvalid_i_d : STD_LOGIC;
    -- s_tvalid delayed by two clks
    -- thus, a version of s_tvalid aligned with output_data
    signal s_tvalid_i_d_d : STD_LOGIC;
    -- countdown of bytes in this sequence
    -- loads from input_data_d when it represents the frame count
    -- as determined by signal counter_load
    -- contains the count while m_tdata contains the implied 0 at
    -- the end of a sequence, or the next count that replaced it
    -- counts down thereafter, to contain 1 when the last non-zero
    -- byte of the sequence is on m_tdata
    -- allowed to count down to 0, but not beyond
    signal count : STD_LOGIC_VECTOR (7 downto 0);
    -- enable to load count from input_data_d on this clk edge
    -- two cases detected:
    -- * first valid non-zero byte after a frame separator
    -- * first valid byte after count is exhausted
    -- allowed to be high for multiple cycles
    signal counter_load : STD_LOGIC;
    -- counter_load delayed by one clk where s_tvalid is high
    -- used to identify the first valid data byte of any sequence,
    -- for purposes of computing m_tvalid (via pre_tvalid)
    signal counter_load_d : STD_LOGIC;
    -- detection of a valid frame separator (zero) byte in input_data_d
    signal frame_sep : STD_LOGIC;
    -- frame_sep delayed by one clk where s_tvalid is high
    -- used to compute counter_load
    -- used to compute rising edge of pre_tvalid
    signal frame_sep_d : STD_LOGIC;
    -- frame_sep_d delayed by an additional clk (not depending on s_tvalid)
    -- used to find the first non-zero byte of the new frame
    signal frame_sep_d_d : STD_LOGIC;
    -- move the frame_sep signal that occurred during m_tready low
    -- out to the first cycle when m_tready is high again
    signal use_saved_frame_sep : STD_LOGIC;
    -- flag to remember that the frame count for this sequence was 255,
    -- to handle the special case that such a sequence does not have
    -- an implied zero byte at the end.
    -- set when loading count with 255
    -- cleared when the counter is reloaded with anything else
    signal case_255 : STD_LOGIC;
    -- internal version of m_tdata output
    signal output_data : STD_LOGIC_VECTOR (7 downto 0);
    -- internal version of m_tlast output
    -- high when the last byte of a frame is valid on m_tdata
    signal m_tlast_i : STD_LOGIC;
    -- delayed versions of m_tlast
    signal m_tlast_i_d : STD_LOGIC;
    signal m_tlast_i_d_d : STD_LOGIC;
    -- intermediate result for m_tvalid.
    -- high across all data bytes of each sequence on m_tdata
    -- does not go low for bytes on m_tdata corresponding to
    -- bytes invalidated by s_tvalid.
    signal pre_tvalid : STD_LOGIC;
    -- internal version of m_tvalid output.
    -- pre_tvalid with periods of low s_tvalid_d_d punched out
    signal m_tvalid_i : STD_LOGIC;
    -- internal version of m_tready input
    -- also the internal version of s_tready output
    -- passes through m_tready to s_tready with no clk delays
    signal m_tready_i : STD_LOGIC;
    -- constant byte value 0xFF, for comparison purposes
    signal all_ones : STD_LOGIC_VECTOR(input_data'range) := (others => '1');
    -- constant byte value 0x00, for comparison purposes
    signal all_zeros : STD_LOGIC_VECTOR(input_data'range) := (others => '0');
begin
    -- asynchronous assignments
    frame_sep <= '1' when input_data_d = all_zeros and s_tvalid_i_d = '1'
                 else '0';
    m_tlast_i <= '1' when ((frame_sep = '1' and m_tvalid_i = '1' and m_tready = '1'))
                 else '0';
    counter_load <= '1' when (input_data_d /= all_zeros and frame_sep_d = '1' and s_tvalid_i_d = '1') -- start of frame
                        or (to_integer(unsigned(count)) = 1 and s_tvalid_i_d = '1') -- start of next sequence in frame
                    else '0';
    m_tvalid_i <= '1' when ((pre_tvalid = '1' and s_tvalid_i_d_d = '1' -- usual case, if input_data was valid
                        and not (to_integer(unsigned(count)) = 1 and s_tvalid_i_d = '0')) -- defer last byte; might be m_tlast
                        or (pre_tvalid = '1' and to_integer(unsigned(count)) = 1
                        and s_tvalid_i_d = '1' and s_tvalid_i_d_d = '0')) -- pick up that deferred last byte
                  else '0';
    s_tready <= m_tready_i;
    m_tdata <= output_data;
    input_data <= s_tdata;
    s_tvalid_i <= s_tvalid;
    m_tready_i <= m_tready;
    m_tvalid <= m_tvalid_i;
    m_tlast <= m_tlast_i;
    -- processes
    set_case_255 : process (rst, clk)
    begin
        if rst = '1' then
            case_255 <= '0';
        elsif rising_edge(clk) and m_tready_i = '1' then
            if counter_load = '1' and input_data_d = all_ones then
                case_255 <= '1';
            elsif counter_load = '1' and input_data_d /= all_ones then
                case_255 <= '0';
            end if;
        end if;
    end process set_case_255;

    delay_s_tvalid : process (rst, clk)
    begin
        if rst = '1' then
            s_tvalid_i_d <= '0';
            s_tvalid_i_d_d <= '0';
        elsif rising_edge(clk) and m_tready_i = '1' then
            s_tvalid_i_d <= s_tvalid_i;
            s_tvalid_i_d_d <= s_tvalid_i_d;
        end if;
    end process delay_s_tvalid;

    create_pre_tvalid : process (rst, clk)
    begin
        if rst = '1' then
            counter_load_d <= '0';
            pre_tvalid <= '0';
        elsif rising_edge(clk) and m_tready_i = '1' then
            if s_tvalid_i_d = '1' then
                counter_load_d <= counter_load;
                if (frame_sep_d_d = '1' and frame_sep_d = '0') -- normal last byte of frame
                   or (counter_load_d = '1' and frame_sep_d = '0') -- normal first byte of a sequence
                then
                    pre_tvalid <= '1';
                end if;
            end if;
            if frame_sep = '1'
            then
                pre_tvalid <= '0';
            end if;
            if counter_load = '1' and case_255 = '1' then
                pre_tvalid <= '0';
            end if;
        end if;
    end process create_pre_tvalid;

    delay_m_tlast_i : process (rst, clk)
    begin
        if rst = '1' then
            m_tlast_i_d <= '0';
            m_tlast_i_d_d <= '0';
        elsif rising_edge(clk) and m_tready_i = '1' then
            m_tlast_i_d <= m_tlast_i;
            m_tlast_i_d_d <= m_tlast_i_d;
        end if;
    end process delay_m_tlast_i;

    set_counter : process (rst,clk)
    begin
        if rst = '1' then
            count <= (others => '0');
            frame_sep_d <= '0';
            frame_sep_d_d <= '0';
        elsif rising_edge(clk) and m_tready_i = '1' then
            frame_sep_d_d <= frame_sep_d;
            if s_tvalid_i_d = '1' then
                frame_sep_d <= frame_sep;
                if counter_load = '1' then
                    count <= input_data_d;
                elsif count /= all_zeros then
                    count <= STD_LOGIC_VECTOR(unsigned(count) - 1);
                end if;
            end if;
        end if;
    end process set_counter;

    create_output : process (rst, clk)
    begin
        if rst = '1' then
            output_data <= (others => '0');
        elsif rising_edge(clk) and m_tready_i = '1' then
            if counter_load = '1'
            then
                output_data <= all_zeros;
            elsif s_tvalid_i_d = '1' then
                output_data <= input_data_d;
            end if;
        end if;
    end process create_output;

    selective_delay_of_input_data : process (rst,clk)
    begin
        if rst = '1' then
            input_data_d <= all_zeros;
        elsif rising_edge(clk) and m_tready_i = '1' then
            if s_tvalid_i = '1' then
                input_data_d <= input_data;
            end if;
        end if;
    end process selective_delay_of_input_data;
end Behavioral;
-=-=-=-=-=-=-=-=-=-=-=-=-=-
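For readers who do not read VHDL, the intended behavior of the decoder can be approximated in software. The Python sketch below is a behavioral model only, written under stated assumptions: it ignores clocking and the tvalid/tready handshake entirely, and the function name `cobs_stream_decode` and its one-byte holdback are illustrative choices, not a transcription of the VHDL. It consumes COBS bytes one at a time and yields `(byte, last)` pairs, with `last` playing the role of m_tlast.

```python
def cobs_stream_decode(stream):
    """Yield (byte, last) pairs decoded from a zero-delimited COBS byte stream.

    Hypothetical software model: no clock, no backpressure, no error reporting.
    """
    remaining = 0     # data bytes still expected in the current sequence
    prev_code = None  # count byte of the current sequence; None between frames
    held = None       # most recent output byte, held back until we know
                      # whether it is the last byte of its frame
    for value in stream:
        if value == 0:
            # Frame separator: the held byte, if any, ends the frame.
            if held is not None:
                yield held, True
            held, remaining, prev_code = None, 0, None
        elif remaining == 0:
            # This byte is a count byte that starts a new sequence.
            if prev_code is not None and prev_code != 0xFF:
                # The previous sequence carried an implied zero at its end.
                # A sequence with count 255 does not (the case_255 flag).
                if held is not None:
                    yield held, False
                held = 0
            prev_code = value
            remaining = value - 1
        else:
            # Ordinary data byte: release the held byte, hold this one.
            if held is not None:
                yield held, False
            held = value
            remaining -= 1


frame = [0x03, 0x11, 0x22, 0x02, 0x33, 0x00]
print(list(cobs_stream_decode(frame)))
# [(17, False), (34, False), (0, False), (51, True)]
```

Holding one byte back mirrors the "defer last byte; might be m_tlast" comment in the VHDL: the decoder cannot know that a byte is the last of its frame until the separator arrives.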
The ORI codebase includes not just the source code but also a testbench file, a design document, timing diagrams, a Python notebook that creates test vectors, and Python code that checks the results the testbench produces from those test vectors.
None of this sort of thing is produced by ChatGPT.
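To give a sense of what the test-vector side involves, here is a minimal Python sketch of a COBS encoder that could feed a decoder testbench. It is not the code from the ORI notebook; the names `cobs_encode` and `make_test_vectors`, the seed, and the choice of payload lengths are all assumptions made for illustration. This encoder variant always emits a final count byte, even directly after a full 254-byte block; some encoders omit it in that case.

```python
import random


def cobs_encode(payload: bytes) -> bytes:
    """Encode one payload as a COBS frame terminated by a 0x00 separator."""
    out = bytearray()
    block = bytearray()  # nonzero bytes collected since the last count byte
    for value in payload:
        if value == 0:
            # A zero closes the block; the count byte replaces the zero.
            out.append(len(block) + 1)
            out += block
            block.clear()
        else:
            block.append(value)
            if len(block) == 254:
                # Full block: count 0xFF means 254 bytes and no implied zero.
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)  # final (possibly empty) block
    out += block
    out.append(0x00)            # frame separator
    return bytes(out)


def make_test_vectors(count: int, seed: int = 0):
    """Return (payload, frame) pairs; early entries hit the edge-case lengths."""
    rng = random.Random(seed)   # fixed seed so the vectors are reproducible
    edge_lengths = [0, 1, 253, 254, 255, 600]
    vectors = []
    for i in range(count):
        n = edge_lengths[i] if i < len(edge_lengths) else rng.randrange(1, 600)
        payload = bytes(rng.randrange(256) for _ in range(n))
        vectors.append((payload, cobs_encode(payload)))
    return vectors


for payload, frame in make_test_vectors(8):
    # The only zero in a valid frame is the trailing separator.
    assert 0 not in frame[:-1] and frame[-1] == 0

print(cobs_encode(bytes([0x11, 0x22, 0x00, 0x33])).hex())  # 031122023300
```

The payload of each pair is what a correct decoder should emit for the corresponding frame, so a comparison script only needs to check the testbench output against it byte for byte.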
At this point, AI can correctly write code for some functions in some languages.
This is a situation similar to algorithmic musical composition. Algorithms can come up with useful ideas, motifs, sketches, or traverse a search space more quickly than a human. They can help a composer, but do not replace one. Algorithms can produce some sparkling gems of ideas. These are fresh cuts and views. This is mixed in with a lot of music that is just plain wrong.
If your programming job happens to be in a well-defined area that is easily automated, then you do have something to worry about today.
All jobs are a target for eventual automation with AI/ML. Yes, all of them. As AI/ML becomes more capable, it will automate even things like testbench writing (HDL Coder already does something like this) and support code (like a Python notebook that creates a solid set of test vectors for a testbench) and design documents. All of us should be aware of what we do being replaced by an algorithm some number of tomorrows in the future.
That day is not yet “today” with ChatGPT, when it comes to writing #VHDL.
What do you think? We’re interested in your experiences with, and opinions on, AI/ML writing code.
Hello,
neat article. I can’t find the Python notebooks where you generated your test vectors. Can you point me to them?
They are buried in the part of the project where we were looking to integrate a COBS decoder in HDL, here:
https://github.com/OpenResearchInstitute/adi_adrv9371_zc706/tree/adrv9371-zc706-dvbs2-integ/test/cobs_decoder