Introduction
I am currently developing “Mia,” a talking cat-shaped robot that speaks in dialect.
In a previous article here, I wrote that the PlatformIO IDE on VS Code now supports the ESP-IDF framework as well as Arduino.
I was relieved to finally get a working build, but once the device connected to Wi-Fi via SmartConfig and established MQTT communication, the following error appeared and the device rebooted endlessly.
SyncShadow is already started
MQTTPubSubClient::onMessage: $aws/things/XXXXXXX
***ERROR*** A stack overflow in task loopTask has been detected.
Backtrace: 0x400829fa:0x3ffbdec0 0x401684e5:0x3ffbdee0 0x4008a3d2:0x3ffbdf00 0x4008c622:0x3ffbdf80 0x4008a548:0x3ffbdfa0 0x4008a4fa:0x3ffbe020 0x4008a548:0x0000523a |<-CORRUPTED
ELF file SHA256: a98d6ccbac74aac1
Rebooting...
The error above is described in this issue.
What is loopTask?
loopTask is the main loop task of an ESP32 program: the FreeRTOS task that, in the Arduino framework, calls the loop() function over and over for as long as the ESP32 runs. The overflow is therefore most likely caused by the processing that void loop() in main.cpp executes repeatedly.
Causes of stack overflow
A stack overflow occurs when a task uses more stack memory than was allocated to it. Typical causes are:
- Excessive recursion: functions call themselves so deeply that the stack is exhausted.
- Large local variables: too many, or too large, temporary local variables.
- Infinite loops: stack-consuming processing keeps running inside an infinite loop.
For stacks, see below.
In this case, a stack overflow can be triggered by receiving and parsing MQTT messages. Be especially careful when processing large messages.
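To make the “large local variables” cause concrete, here is a minimal sketch that contrasts a buffer placed in the function's stack frame with the same buffer allocated on the heap. The function names and the 4096-byte size are hypothetical and chosen only for illustration; they are not taken from the Mia firmware or from any MQTT library.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical buffer size, chosen only for illustration.
constexpr std::size_t kBufferSize = 4096;

// Stack version: the whole 4 KB buffer is part of this call's stack frame,
// so every call needs at least that much free stack in the calling task.
std::size_t copyToStackBuffer(const std::string &payload) {
  char buffer[kBufferSize];
  const std::size_t n = std::min(payload.size(), kBufferSize - 1);
  std::memcpy(buffer, payload.data(), n);
  buffer[n] = '\0';
  return std::strlen(buffer);  // length of the copied C string
}

// Heap version: only the smart pointer lives on the stack; the 4 KB buffer
// comes from the heap and is released when the function returns.
std::size_t copyToHeapBuffer(const std::string &payload) {
  std::unique_ptr<char[]> buffer(new char[kBufferSize]);
  const std::size_t n = std::min(payload.size(), kBufferSize - 1);
  std::memcpy(buffer.get(), payload.data(), n);
  buffer[n] = '\0';
  return std::strlen(buffer.get());  // length of the copied C string
}
```

Both functions return the same result; the difference is only where the 4 KB lives. On a task with an 8192-byte stack, a single buffer like the first one already takes half of the budget.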
Why does this occur with ESP-IDF when it did not occur with Arduino?
The two frameworks differ in their default stack sizes and in how tasks are managed, so code that runs fine under Arduino can overflow its stack under ESP-IDF.
Default stack size:
- Arduino: The task stack size is relatively large, so stack overflow is unlikely to occur.
- ESP-IDF: By default, the task stack size may be smaller, so the same code that works under the Arduino framework can overflow the stack.
Task management:
- Arduino Framework: High level of abstraction and simple task management.
- ESP-IDF framework: lower-level task management, finer granularity, but this can lead to problems if misconfigured.
Add debug messages
To narrow down the cause, add debug messages to the void loop() function, where the problem is likely to be.
The current void loop() function and the functions called within loop() are as follows.
// Routines executed while Wi-Fi is connected
void executeWiFiConnectedRoutines() {
  // Time at which the once-per-minute event last ran
  static unsigned long lastOneMinutesEventUnixTime = getNowUnixTime();
  // Monitor the device shadow state
  SyncShadow::getInstance().loop();
  // Apply device configuration changes
  applyAndReportConfigUpdates();
  // Run once per minute
  if ((getNowUnixTime() - lastOneMinutesEventUnixTime) > 60) {
    executeOneMinutesAction();
    lastOneMinutesEventUnixTime = getNowUnixTime();
  }
}
void loop() {
  if (inSafeMode) {
    safeModeLoop();
    return;
  }
  monitorWiFiConnectionChange();
  if (isWiFiConnected()) {
    executeWiFiConnectedRoutines();
  }
  buttonManager.handleButtonPress();
  ExpressionService::getInstance().render();
  delay(10);
}
Use uxTaskGetStackHighWaterMark(NULL) to find out how much stack space the current task has left unused, and print that value to the serial port.
uxTaskGetStackHighWaterMark(NULL)
- This function returns the “high water mark” of the specified task's stack: the smallest amount of free stack (in bytes on ESP-IDF) that the task has had since it started. In other words, it is the stack remaining at the moment of peak usage, which tells you how much stack the task has actually needed. Because it is a running minimum, the value never increases while the task runs.
- Passing NULL as the argument returns the high water mark of the currently executing task.
Therefore, add debug messages using this function before and after each step that is likely to reduce the amount of unused stack.
// Routines executed while Wi-Fi is connected
void executeWiFiConnectedRoutines() {
  // Time at which the once-per-minute event last ran
  static unsigned long lastOneMinutesEventUnixTime = getNowUnixTime();
  // Debug message
  uint32_t freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack at executeWiFiConnectedRoutines start: ");
  Serial.println(freeStack);
  // Monitor the device shadow state
  SyncShadow::getInstance().loop();
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack after SyncShadow loop: ");
  Serial.println(freeStack);
  // Apply device configuration changes
  applyAndReportConfigUpdates();
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack after applyAndReportConfigUpdates: ");
  Serial.println(freeStack);
  // Run once per minute
  if ((getNowUnixTime() - lastOneMinutesEventUnixTime) > 60) {
    executeOneMinutesAction();
    lastOneMinutesEventUnixTime = getNowUnixTime();
  }
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack at executeWiFiConnectedRoutines end: ");
  Serial.println(freeStack);
}
void loop() {
  // Debug message to check stack usage
  uint32_t freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack at loop start: ");
  Serial.println(freeStack);
  if (inSafeMode) {
    safeModeLoop();
    return;
  }
  monitorWiFiConnectionChange();
  if (isWiFiConnected()) {
    // Debug message
    freeStack = uxTaskGetStackHighWaterMark(NULL);
    Serial.print("Free stack before executeWiFiConnectedRoutines: ");
    Serial.println(freeStack);
    executeWiFiConnectedRoutines();
    // Debug message
    freeStack = uxTaskGetStackHighWaterMark(NULL);
    Serial.print("Free stack after executeWiFiConnectedRoutines: ");
    Serial.println(freeStack);
  }
  buttonManager.handleButtonPress();
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack before ExpressionService: ");
  Serial.println(freeStack);
  ExpressionService::getInstance().render();
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack after ExpressionService: ");
  Serial.println(freeStack);
  delay(10);
  // Debug message
  freeStack = uxTaskGetStackHighWaterMark(NULL);
  Serial.print("Free stack at loop end: ");
  Serial.println(freeStack);
}
Build again and look at the log.
Results of debug messages
The debug messages show that the loop task's free stack has already dropped to a minimum of 1916 bytes. The stack then most likely ran out inside the MQTTPubSubClient::onMessage callback, which is the last line logged before the crash.
The heavy stack usage of the MQTT subscribe and publish operations and of message processing (the onMessage function) was the direct cause of the stack overflow.
MQTTPubSubClient::subscribe: $aws/things/XXXXXXXXXXXXX/shadow/get/accepted
[ 6987][V][ssl_client.cpp:369] send_ssl_data(): Writing HTTP request with 71 bytes...
MQTTPubSubClient::subscribe: $aws/things/XXXXXXXXXXXXXXX/shadow/update/delta
[ 7064][V][ssl_client.cpp:369] send_ssl_data(): Writing HTTP request with 71 bytes...
MQTTPubSubClient::publish: $aws/things/XXXXXXXXXXXXX/shadow/get
[ 7135][V][ssl_client.cpp:369] send_ssl_data(): Writing HTTP request with 59 bytes...
setup done
Free stack at loop start: 1916
start initialize service after wifi setup
SyncShadow is already started
Free stack before executeWiFiConnectedRoutines: 1916
Free stack at executeWiFiConnectedRoutines start: 1916
Free stack after SyncShadow loop: 1916
Free stack after applyAndReportConfigUpdates: 1916
Free stack at executeWiFiConnectedRoutines end: 1916
Free stack after executeWiFiConnectedRoutines: 1916
Free stack before ExpressionService: 1916
Free stack after ExpressionService: 1916
Free stack at loop end: 1916
Free stack at loop start: 1916
Free stack before executeWiFiConnectedRoutines: 1916
Free stack at executeWiFiConnectedRoutines start: 1916
MQTTPubSubClient::onMessage: $aws/things/XXXXXXXXXXXXXXX/shadow/get/accepted
***ERROR*** A stack overflow in task loopTask has been detected.
Backtrace: 0x400829fe:0x3ffbf190 0x40168fb5:0x3ffbf1b0 0x4008a3d6:0x3ffbf1d0 0x4008c626:0x3ffbf250 0x4008a54c:0x3ffbf270 0x4008a4fe:0x000000ff |<-CORRUPTED
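A note on reading this log: every checkpoint prints 1916 because the high water mark is a running minimum, not the free stack at that instant. The sketch below is a host-side model of that behaviour only. HighWaterMarkModel is a hypothetical class written for this explanation; FreeRTOS itself derives the value by inspecting the task's stack, not by calling anything like observe().

```cpp
#include <algorithm>
#include <cstdint>

// Host-side model of a stack high water mark: it remembers the smallest
// amount of free stack ever observed, so it can only stay the same or drop.
class HighWaterMarkModel {
 public:
  explicit HighWaterMarkModel(std::uint32_t stackSize) : minFree_(stackSize) {}

  // Record how much stack is free at some instant, for example at the
  // deepest point of a call chain.
  void observe(std::uint32_t freeNow) { minFree_ = std::min(minFree_, freeNow); }

  // Smallest free-stack value seen so far.
  std::uint32_t value() const { return minFree_; }

 private:
  std::uint32_t minFree_;
};
```

If a deep call during setup once leaves 1916 bytes free, later checkpoints keep reporting 1916 even when much more stack is free at that moment. The number changes only when some call path goes deeper than anything before it.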
Change stack size setting in sdkconfig file
As a fix, try increasing the stack size: open the sdkconfig file in the root of the ESP-IDF project and change the following setting to raise the loop task's stack size from 8192 to 16384 bytes.
# sdkconfig.default
# before
CONFIG_ARDUINO_LOOP_STACK_SIZE=8192
# after
CONFIG_ARDUINO_LOOP_STACK_SIZE=16384
By the way, someone else encountered the same stack overflow after enabling MQTT.
https://esp32.com/viewtopic.php?t=17104
In the answer at the URL above, the main task stack size (CONFIG_MAIN_TASK_STACK_SIZE) is increased, but here I changed CONFIG_ARDUINO_LOOP_STACK_SIZE instead.
This is because the project uses the following platformio.ini settings so that both Arduino and ESP-IDF libraries and functions are available, and CONFIG_AUTOSTART_ARDUINO is defined so that the Arduino environment starts automatically and calls the setup() and loop() functions. As a result, Arduino's main loop task (the loop() function) runs separately from the ESP-IDF main task, and in this case the stack overflow occurred in Arduino's loop() task.
[env]
platform = espressif32 @ 6.7.0
framework = arduino, espidf
After rebuilding, the stack overflow error after MQTT communication was resolved!
The log shows that, even at the end of the loop() function, 7356 bytes of stack remain, so there is now enough headroom. Peak usage is therefore 16384 - 7356 = 9028 bytes, which exceeds the original 8192 bytes and explains the overflow.
Free stack at loop start: 7356
Free stack before executeWiFiConnectedRoutines: 7356
Free stack at executeWiFiConnectedRoutines start: 7356
Free stack after SyncShadow loop: 7356
Free stack after applyAndReportConfigUpdates: 7356
Free stack at executeWiFiConnectedRoutines end: 7356
Free stack after executeWiFiConnectedRoutines: 7356
Free stack before ExpressionService: 7356
Free stack after ExpressionService: 7356
Free stack at loop end: 7356
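The arithmetic behind that conclusion can be written down as a small compile-time check. The two function names are hypothetical and exist only for this illustration; the numbers are the ones from the log above.

```cpp
#include <cstdint>

// Peak stack usage = configured stack size - high water mark
// (the high water mark being the minimum free stack observed).
constexpr std::uint32_t peakStackUsage(std::uint32_t stackSize,
                                       std::uint32_t highWaterMark) {
  return stackSize - highWaterMark;
}

// True when the observed peak fits in a stack of the given size.
constexpr bool fitsInStack(std::uint32_t peakUsage, std::uint32_t stackSize) {
  return peakUsage <= stackSize;
}

// With the 16384-byte stack, 7356 bytes were left at the lowest point.
static_assert(peakStackUsage(16384, 7356) == 9028, "peak usage from the log");
// That peak does not fit in the original 8192-byte stack...
static_assert(!fitsInStack(9028, 8192), "8192 bytes is too small");
// ...but it does fit in 16384 bytes.
static_assert(fitsInStack(9028, 16384), "16384 bytes is enough");
```

In practice some margin above the measured peak is advisable, since a different message or code path may go deeper than the ones exercised in this run.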
This solved the problem, but to confirm the cause, I added debugging code that measures stack usage in the MQTT onMessage handler.
void MqttClient::onMessage(const std::function<void(const String &topic, const String &payload)> &callback) {
  client->onMessage([callback](const String &topic, const String &payload) {
    // Get the stack high water mark before the callback
    uint32_t freeStackBefore = uxTaskGetStackHighWaterMark(NULL);
    Serial.print("Free stack before onMessage: ");
    Serial.println(freeStackBefore);
    Serial.println("MQTTPubSubClient::onMessage: " + topic + " " + payload);
    // Execute the callback function
    callback(topic, payload);
    // Get the stack high water mark after the callback
    uint32_t freeStackAfter = uxTaskGetStackHighWaterMark(NULL);
    Serial.print("Free stack after onMessage: ");
    Serial.println(freeStackAfter);
    // Calculate the stack consumed by the callback
    uint32_t stackUsed = freeStackBefore - freeStackAfter;
    Serial.print("Stack used by onMessage: ");
    Serial.println(stackUsed);
  });
}
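The build produced the following results: the onMessage callback pushed the stack 2736 bytes deeper (10100 - 7364).
Since 16384 - 10100 = 6284 bytes had already been used before the callback, peak usage reached about 9020 bytes, which again confirms that CONFIG_ARDUINO_LOOP_STACK_SIZE=8192 was not sufficient.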
Free stack before onMessage: 10100
MQTTPubSubClient::onMessage: $aws/things/XXXXXXXX/shadow/get/accepted
preload phrases after initial sync
SELECT * FROM phrase WHERE phrase_type = ? AND kind = 'other' ORDER BY talk_count, RANDOM () LIMIT ? ;, Args: natural, 10,
Matched: 10, Time taken:276963
Free stack after onMessage: 7364
Stack used by onMessage: 2736
MQTTPubSubClient::publish: $aws/things/XXXXXXXX/shadow/update {"state":{"reported":{"config":{"talk_frequency":60,"weather_announcement_time":{"hour":8,"minute":0},"birth_date":{},"phrase_type":"natural","talk_start_time":{"hour":7,"minute":0},"talk_end_time":{"hour":22,"minute":0},"work_start_time":{},"work_end_time":{},"volume":50,"firmware_version":"1.0.0"},"wakeup_time":25,"last_action_time":1717276989,"downloading":{"status":"NORMAL","progress":0,"start_time":1717234673,"end_time":1717235067,"type":"voice"},"connected":true,"last_periodic_talk_time":1716968737}}
[ 59654][V][ssl_client.cpp:369] send_ssl_data(): Writing HTTP request with 63 bytes...
[ 59667][V][ssl_client.cpp:369] send_ssl_data(): Writing HTTP request with 512 bytes...
Free stack at loop end: 7364
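The before/after measurement used around the callback can also be packaged as a small scope guard, so that any block of code can be measured the same way. This is a sketch under assumptions: ScopedStackProbe is a hypothetical class, and the probe is passed in as a function so the logic can be checked on a PC. On the device the probe would wrap uxTaskGetStackHighWaterMark(NULL).

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical scope guard: reads the probe on construction and again on
// destruction, then stores the difference in `usedOut`.
class ScopedStackProbe {
 public:
  using Probe = std::function<std::uint32_t()>;

  ScopedStackProbe(Probe probe, std::uint32_t &usedOut)
      : probe_(std::move(probe)), before_(probe_()), usedOut_(usedOut) {}

  // How far the high water mark dropped while the guard was alive.
  ~ScopedStackProbe() { usedOut_ = before_ - probe_(); }

  ScopedStackProbe(const ScopedStackProbe &) = delete;
  ScopedStackProbe &operator=(const ScopedStackProbe &) = delete;

 private:
  Probe probe_;             // declared first so before_ can call it
  std::uint32_t before_;    // reading taken when the guard was created
  std::uint32_t &usedOut_;  // where the result is written
};

// Possible use on the device (assumes the FreeRTOS API is available):
//   std::uint32_t used = 0;
//   {
//     ScopedStackProbe guard(
//         [] { return (std::uint32_t)uxTaskGetStackHighWaterMark(NULL); }, used);
//     callback(topic, payload);
//   }
//   Serial.println(used);
```

Because the high water mark is a running minimum, the result is the additional depth reached inside the scope, not the scope's total stack usage. A result of 0 means the code stayed within a depth that had already been reached earlier.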
I’m glad we found the cause and solved the problem.