Beware of footguns with C++ std::string_view

I recently tracked down a bug that had popped up in our C++ code. We knew the problem was in the C++ implementation because the Python version worked fine, even though the unit tests passed for both implementations. Some initial debugging narrowed it down to a string matching routine that was behaving strangely.

Stepping through the code, I saw that the matching logic inside the function worked just fine, but as soon as we returned to the caller, the resulting strings were a jumbled mess. This suggested a memory issue: we were pointing to freed memory. We use a std::vector<std::string_view> to collect the results of the string matching. That becomes a problem in one particular case, where we construct a new std::string inside the function and push_back() it to the results container. As a newcomer to C++17 (where std::string_view was introduced), it was news to me that there is an implicit conversion from std::string to std::string_view (but not the other way around), so this compiles without any warning.
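To make the pattern concrete, here is a minimal sketch of what went wrong. It is not our actual code: find_matches_buggy and view_aliases_buffer are hypothetical names, and the "matching" is reduced to a substring and a concatenation. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <vector>

// std::string converts to std::string_view implicitly...
static_assert(std::is_convertible_v<std::string, std::string_view>);
// ...but going back to an owning std::string requires an explicit construction.
static_assert(!std::is_convertible_v<std::string_view, std::string>);
static_assert(std::is_constructible_v<std::string, std::string_view>);

// Hypothetical stand-in for the matching routine (illustrative names only).
std::vector<std::string_view> find_matches_buggy(std::string_view input) {
    std::vector<std::string_view> results;

    // Fine: this view points into `input`, which the caller keeps alive.
    results.push_back(input.substr(0, 3));

    // Bug: `built` is a local std::string. push_back() silently converts it
    // to a view of its buffer, and that buffer dies when the function returns.
    std::string built = std::string(input) + "_suffix";
    results.push_back(built);

    return results;  // results[1] dangles; reading its characters is undefined behavior
}

// A view never owns characters; it only aliases the owner's buffer.
void view_aliases_buffer() {
    std::string owner = "hello";
    std::string_view view = owner;
    assert(view.data() == owner.data());
}
```

The first element stays valid for as long as the caller's input lives. The second element must not be read after the function returns, which is exactly what our calling code did.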

In our case, that meant that by the time we left the scope of the function, the std::string created inside it had been destroyed, and the std::string_view stored in our container was pointing to freed memory. The reason this wasn’t caught in the unit tests is that we were using very short strings that were probably stored using SSO (short string optimization). With SSO the characters live inside the string object itself rather than on the heap, so the bytes were most likely still sitting untouched on the stack when the tests read them. That is still undefined behavior; it just happened to look correct. I really recommend this talk for a summary of things to think about when using std::string_view. It describes exactly the problem I encountered.
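One way to avoid this class of bug is to let the results own their characters whenever they may outlive the data they were built from. The sketch below is an illustration under that assumption, not necessarily the right trade-off for every code base; find_matches_owning and stored_inline are hypothetical names. The second helper is a rough probe for whether a string is stored inline (SSO), which is implementation-specific and not guaranteed by the standard.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical fixed variant: every result owns its characters.
std::vector<std::string> find_matches_owning(std::string_view input) {
    std::vector<std::string> results;

    // Explicit copy out of the view (the string_view constructor is explicit,
    // which emplace_back's direct initialization is allowed to use).
    results.emplace_back(input.substr(0, 3));

    // The temporary std::string is moved into the container, not viewed.
    results.push_back(std::string(input) + "_suffix");

    return results;
}

// Rough SSO probe: true if the characters live inside the string object
// itself rather than in a separate heap allocation.
bool stored_inline(const std::string& s) {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto chars = reinterpret_cast<std::uintptr_t>(s.data());
    return chars >= object_begin && chars < object_begin + sizeof(std::string);
}
```

Returning owning strings costs a copy per result, which is the price for not having to reason about lifetimes at every call site. The probe can also help when writing tests: a 100-character string cannot fit inside the string object, so it is a more reliable way to exercise the heap-allocated path than the very short strings we had been using.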

Written on June 11, 2021