M4 Board – The SPI IPC

A little progress on the M4 Board: I managed to get the IPC (inter-processor comms) system between the Cortex-M4 and the ESP8266 going.
It uses the HSPI interface + GPIO0 on the ESP and the SPI1 interface + a GPIO on the M4.

SPI comms from ESP to Cortex-M4
The Cortex-M4 acts as the SPI slave. It listens for any incoming HSPI command from the ESP via the SPI_RXNE interrupt (in the SPI1 IRQ handler, max. 42 MHz).
The IRQ handler is written in heavily optimized assembler and simply buffers whatever is sent from the ESP whenever there is something in the rx buffer. It also sends out any data waiting in the transmit buffer.
HSPI CS will trigger an external interrupt on both edges on the M4.
When HSPI CS goes low, it will clear the index of the spi1 rx buffer.
When HSPI CS goes high, it will process the command/data received in the spi1 buffer.
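The CS framing above can be sketched in portable C. This is only a rough model, not the actual assembler handler: the names (`spi_rx_t`, `on_cs_low`, `on_rx_byte`, `on_cs_high`) and the 64-byte buffer size are made up for illustration, and dropping bytes on overflow is my assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u  /* hypothetical size, not taken from the real firmware */

typedef struct {
    uint8_t  buf[RX_BUF_SIZE];
    size_t   idx;       /* next free slot in buf */
    size_t   last_len;  /* length of the last completed frame */
    unsigned frames;    /* number of completed frames */
} spi_rx_t;

/* CS falling edge: a new transfer starts, so rewind the buffer index. */
static void on_cs_low(spi_rx_t *rx)
{
    rx->idx = 0;
}

/* RXNE interrupt: store one received byte (assumption: drop it if full). */
static void on_rx_byte(spi_rx_t *rx, uint8_t b)
{
    if (rx->idx < RX_BUF_SIZE) {
        rx->buf[rx->idx++] = b;
    }
}

/* CS rising edge: the transfer is complete, so the frame can be processed. */
static void on_cs_high(spi_rx_t *rx)
{
    assert(rx->idx <= RX_BUF_SIZE);
    rx->last_len = rx->idx;  /* a real handler would parse buf[0..idx) here */
    rx->frames++;
}
```

On the real board the first and last functions would be called from the external interrupt on the CS pin, and the middle one from the SPI1 IRQ.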

Cortex-M4 to ESP
GPIO0 is set as an input on the ESP, and the M4 pulls it low whenever it wants to send a command or data to the ESP. Before that, the command and data are written to the transmit buffer of the SPI IRQ handler.
This triggers an interrupt on the ESP, which will then start clocking dummy data out on the SPI to read/process the responses (full duplex).
This won't come in any faster than the ESP is driving the clock. The HSPI clock is set to 40 MHz.
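Here is a small host-side model of that handshake in C. Everything in it is hypothetical (`m4_tx_t`, `m4_send`, `m4_exchange`, `esp_on_request`, the 0xFF dummy byte), and one detail is purely my assumption for the sketch: the M4 releases the request line once its transmit buffer is empty, which is how the ESP side knows when to stop clocking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_BUF_SIZE 64u  /* hypothetical size */

typedef struct {
    uint8_t buf[TX_BUF_SIZE];
    size_t  pos;      /* next byte to shift out */
    size_t  len;      /* number of valid bytes in buf */
    int     req_low;  /* 1 while the M4 holds the request line low */
} m4_tx_t;

/* M4 side: fill the transmit buffer first, then pull the request line low. */
static int m4_send(m4_tx_t *tx, const uint8_t *data, size_t n)
{
    if (n == 0 || n > TX_BUF_SIZE || tx->req_low) {
        return -1;  /* nothing to send, too big, or a transfer is pending */
    }
    memcpy(tx->buf, data, n);
    tx->pos = 0;
    tx->len = n;
    tx->req_low = 1;
    return 0;
}

/* M4 side: one full-duplex byte exchange, paced by the master's clock. */
static uint8_t m4_exchange(m4_tx_t *tx, uint8_t mosi)
{
    (void)mosi;  /* dummy byte from the ESP, ignored in this sketch */
    assert(tx->pos < tx->len);
    uint8_t miso = tx->buf[tx->pos++];
    if (tx->pos == tx->len) {
        tx->req_low = 0;  /* assumption: release the line when drained */
    }
    return miso;
}

/* ESP side: on the request interrupt, clock dummy bytes and collect replies. */
static size_t esp_on_request(m4_tx_t *m4, uint8_t *out, size_t max)
{
    size_t n = 0;
    while (m4->req_low && n < max) {
        out[n++] = m4_exchange(m4, 0xFF);
    }
    return n;
}
```

The point of the model is the ordering: the data has to be in the transmit buffer before the line goes low, because the slave cannot push anything on its own and only answers while the ESP clocks.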

Results
This system works pretty well now, but it’s not as fast as I’d like. Currently the transfer rate is about 500 KB/s, while I can download to the ESP at about 900 KB/s under optimal conditions.
So the SPI IPC is a bottleneck, but there is still room for improvement.
The SPI clock speed itself is fine; the problem is the delay between each byte written by the ESP, since every write is done by the CPU.
Ideally I could use DMA, but there seems to be no info on this anywhere for the ESP.
Another option is to use 16-bit reads/writes instead of 8-bit ones and probably almost double the speed.
Or I could use the entire 64-byte buffer of the ESP HSPI interface and clock the data out without delays, 64 bytes at a time.
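A back-of-the-envelope estimate for these options, as a C sketch. It assumes KB means 1000 bytes and that the CPU overhead per write is a fixed cost regardless of how many bits go out per write; neither is measured. At 40 MHz a bit takes 25 ns, so a byte takes 200 ns on the wire, while 500 KB/s means 2000 ns per byte in total, leaving roughly 1800 ns of overhead per write. The function name and constants are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_NS      25u    /* one bit time at a 40 MHz HSPI clock */
#define OVERHEAD_NS 1800u  /* derived: 2000 ns per byte at 500 KB/s minus 200 ns on the wire */

/* Estimated throughput in bytes/s when each CPU write clocks out
 * bytes_per_burst bytes and costs a fixed overhead_ns on top. */
static uint32_t est_bytes_per_sec(uint32_t bytes_per_burst, uint32_t overhead_ns)
{
    assert(bytes_per_burst > 0);
    uint32_t burst_ns = bytes_per_burst * 8u * BIT_NS + overhead_ns;
    return (uint32_t)((uint64_t)bytes_per_burst * 1000000000ull / burst_ns);
}
```

With these numbers, `est_bytes_per_sec(1, OVERHEAD_NS)` gives back the 500 KB/s I see today, 2 bytes per write comes out at roughly 909 KB/s, and 64 bytes per burst at roughly 4.4 MB/s. The last figure is optimistic, since filling the 64-byte buffer also takes CPU writes that this model ignores.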

Other
On the hardware side, I have redesigned the board and ordered new PCBs. I screwed up the footprint of the GTL2000 chip and there simply was no way around it. Other improvements & fixes went in as well.

– Duke
