ctypes return a string from c function
Solution 1
In hello.c you return a local array. You have to return a pointer to an array, which has to be dynamically allocated using malloc.
#include <stdlib.h>
#include <string.h>

char* hello(char* name)
{
    char hello[] = "Hello ";
    char excla[] = "!\n";
    /* room for all three parts plus the terminating NUL */
    char *greeting = malloc ( sizeof(char) * ( strlen(name) + strlen(hello) + strlen(excla) + 1 ) );
    if( greeting == NULL) exit(1);
    strcpy( greeting , hello);
    strcat(greeting, name);
    strcat(greeting, excla);
    return greeting;
}
Solution 2
Your problem is that greeting was allocated on the stack, but the stack is destroyed when the function returns. You could allocate the memory dynamically:
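#include <stdlib.h>
#include <stdio.h>
#include <string.h>
const char* hello(char* name) {
    char* greeting = malloc(100);
    if (greeting == NULL) return NULL;
    /* snprintf takes the destination buffer and its size first */
    snprintf(greeting, 100, "Hello, %s!\n", name);
    printf("%s\n", greeting);
    return greeting;
}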
But that's only part of the battle because now you have a memory leak. You could plug that with another ctypes call to free().
...or a much better approach is to read up on the official Python C API (Python 2.x at http://docs.python.org/2/c-api/ and Python 3.x at http://docs.python.org/3/c-api/). Have your C function create a Python string object and hand that back. It will be garbage collected by Python automatically. Since you are writing the C side, you don't have to play the ctypes game.
...edit..
I didn't compile and test, but I think this .py would work:
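import ctypes
import ctypes.util
# define the interface
hello_lib = ctypes.cdll.LoadLibrary('./hello.so')
# find the C runtime library (this lookup works on Linux)
libc = ctypes.CDLL(ctypes.util.find_library('c'))
# declare the functions we use
hello_lib.hello.argtypes = (ctypes.c_char_p,)
# c_void_p keeps the raw address; c_char_p would be converted to bytes
# and the pointer needed for free() would be lost
hello_lib.hello.restype = ctypes.c_void_p
libc.free.argtypes = (ctypes.c_void_p,)
libc.free.restype = None
# wrap hello to make sure the free is done
def hello(name):
    ptr = hello_lib.hello(name)
    try:
        # copy the C string into a Python bytes object
        return ctypes.string_at(ptr)
    finally:
        libc.free(ptr)
# do the deed
print(hello(b"Frank"))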
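For anyone without hello.so at hand, the same allocate-in-C, free-from-Python pattern can be sketched against libc alone. This is only an illustration under assumptions: strdup() stands in for hello() (both hand back malloc'ed memory), copy_via_c is a made-up name, and the library lookup assumes a POSIX system. The point is that restype is c_void_p rather than c_char_p, so the raw address survives long enough to be passed to free().

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system where the C runtime can be located this way.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# strdup() stands in for hello(): it returns a malloc'ed copy of its argument.
libc.strdup.argtypes = (ctypes.c_char_p,)
libc.strdup.restype = ctypes.c_void_p  # keep the raw address so it can be freed
libc.free.argtypes = (ctypes.c_void_p,)
libc.free.restype = None


def copy_via_c(text):
    """Call the C function, copy its result into Python bytes, then free it."""
    ptr = libc.strdup(text)
    if not ptr:  # a NULL pointer comes back as None
        raise MemoryError("strdup returned NULL")
    try:
        return ctypes.string_at(ptr)  # copies up to the terminating NUL
    finally:
        libc.free(ptr)


greeting = copy_via_c(b"Hello, Frank!")
print(greeting)
```

Because the bytes object is a copy, it stays valid after the C memory has been released, and calling the wrapper many times should not accumulate allocations.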
Solution 3
I ran into this same problem today and found you must override the default return type (int) by setting restype on the method. See Return types in the ctypes doc.
import ctypes
hello = ctypes.cdll.LoadLibrary('./hello.so')
name = "Frank"
c_name = ctypes.c_char_p(name)
hello.hello.restype = ctypes.c_char_p # override the default return type (int)
foo = hello.hello(c_name)
print c_name.value
print ctypes.c_char_p(foo).value
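A minimal sketch of the same restype fix that runs without hello.so, assuming a POSIX libc: strerror() stands in here for any C function that returns a char*. Its result is owned by libc, so nothing has to be freed in this case.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system where the C runtime can be located this way.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# char *strerror(int errnum);
libc.strerror.argtypes = (ctypes.c_int,)
# Without this line ctypes treats the return value as a C int,
# which cannot hold a 64-bit pointer.
libc.strerror.restype = ctypes.c_char_p

message = libc.strerror(2)  # the returned char* is converted to bytes
print(type(message), message)
```

With restype set to c_char_p the call already returns a Python bytes object, so there is no need to wrap the result in another c_char_p.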
Solution 4
Here's what happens, and why it's breaking. When hello() is called, the C stack pointer is moved up, making room for any memory needed by your function. Along with some function call overhead, all of your function locals are managed there. So that static char greeting[100] means that 100 bytes of the increased stack are for that string. You then use some functions that manipulate that memory. At the end you place a pointer on the stack to the greeting memory. And then you return from the call, at which point the stack pointer is retracted back to its original before-call position. So those 100 bytes that were on the stack for the duration of your call are essentially up for grabs again as the stack is further manipulated, including the address field which pointed to that value and that you returned. At that point, who knows what happens to it, but it's likely set to zero or some other value. And when you try to access it as if it were still viable memory, you get a segfault.
To get around this, you need to manage that memory differently somehow. You can have your function allocate the memory on the heap, but you'll need to make sure it gets free()'ed at a later date by your binding. Or you can write your function so that the binding language passes it a chunk of memory to be used.
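A rough sketch of the second option, where the binding passes in the memory. It assumes a POSIX libc and uses strncat() as a stand-in for a C function that fills a caller-supplied buffer; the Python-side hello helper and its size parameter are made up for illustration.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system where the C runtime can be located this way.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# char *strncat(char *dest, const char *src, size_t n);
libc.strncat.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_size_t)
libc.strncat.restype = ctypes.c_void_p


def hello(name, size=100):
    """Build the greeting inside a buffer that Python allocates and owns."""
    buf = ctypes.create_string_buffer(size)  # zero-filled, so it starts as ""
    for part in (b"Hello, ", name, b"!"):
        room = size - 1 - len(buf.value)  # always leave space for the NUL
        libc.strncat(buf, part, room)     # C writes into Python's buffer
    return buf.value


print(hello(b"Frank"))
```

Nothing has to be freed here: the buffer is an ordinary Python object and is reclaimed by the garbage collector, and an over-long name is truncated instead of overflowing.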
Solution 5
I also ran into the same problem but used a different approach. I was supposed to find a string in a list of strings matching a certain value.
Basically I initialized a char array with the size of the longest string in my list, then passed that as an argument to my function to hold the corresponding value.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void find_gline(char **ganal_lines, /* line array */
                size_t size,        /* array size */
                char *idnb,         /* id number for check */
                char *resline) {    /* caller-provided output buffer */
    /* Iterates over lines, finds the first one that contains idnb,
       then copies it into resline */
    for (size_t i = 0; i < size; i++) {
        char *line = ganal_lines[i];
        if (strstr(line, idnb) != NULL) {
            size_t llen = strlen(line);
            for (size_t k = 0; k < llen; k++) {
                resline[k] = line[k];
            }
            resline[llen] = '\0'; /* terminate; the buffer needs llen + 1 bytes */
            return;
        }
    }
    return;
}
This function was wrapped by the corresponding python function:
import ctypes

def find_gline_wrap(lines: list, arg: str, cdll):
    """Return the first line containing arg; lines is a list of bytes."""
    mlen = max(len(line) for line in lines)  # length of the longest line
    linelen = len(lines)
    line_array = ctypes.c_char_p * linelen
    # set arg types
    cdll.find_gline.argtypes = [
        line_array,
        ctypes.c_size_t,
        ctypes.c_char_p,
        ctypes.c_char_p,
    ]
    cdll.find_gline.restype = None
    ganal_lines = line_array(*lines)
    idnb = arg.encode("utf-8")
    # writable, zero-filled buffer: room for the longest line plus the NUL
    resline = ctypes.create_string_buffer(mlen + 1)
    cdll.find_gline(ganal_lines, linelen, idnb, resline)
    # .value stops at the first NUL byte, so nothing has to be trimmed
    return resline.value.decode("utf-8")
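A small illustration of how a ctypes buffer behaves, which may help with the null-byte handling above. It needs no C library: memmove() merely stands in for the C function writing into the buffer, and the size of 8 is arbitrary.

```python
import ctypes

buf = ctypes.create_string_buffer(8)  # 8 zero bytes, writable from C
ctypes.memmove(buf, b"abc", 3)        # stand-in for C code filling the buffer

print(ctypes.sizeof(buf))  # 8
print(buf.value)           # b'abc'
print(buf.raw)             # b'abc\x00\x00\x00\x00\x00'
```

So .value already excludes the terminating NUL and everything after it, while .raw returns the whole buffer.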
Thane Brimhall
Updated on March 22, 2020
Comments
-
Thane Brimhall about 4 years
I'm a Python veteran, but haven't dabbled much in C. After half a day of not finding anything on the internet that works for me, I thought I would ask here and get the help I need.
What I want to do is write a simple C function that accepts a string and returns a different string. I plan to bind this function in several languages (Java, Obj-C, Python, etc.) so I think it has to be pure C?
Here's what I have so far. Notice I get a segfault when trying to retrieve the value in Python.
hello.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
const char* hello(char* name) {
    static char greeting[100] = "Hello, ";
    strcat(greeting, name);
    strcat(greeting, "!\n");
    printf("%s\n", greeting);
    return greeting;
}
main.py
import ctypes
hello = ctypes.cdll.LoadLibrary('./hello.so')
name = "Frank"
c_name = ctypes.c_char_p(name)
foo = hello.hello(c_name)
print c_name.value # this comes back fine
print ctypes.c_char_p(foo).value # segfault
I've read that the segfault is caused by C releasing the memory that was initially allocated for the returned string. Maybe I'm just barking up the wrong tree?
What's the proper way to accomplish what I want?
-
David Heffernan about 11 years
You need to set foo.restype appropriately. Do you really want to use static? Not threadsafe. Wouldn't you be better allocating memory in Python and letting the C code populate it with content? Or allocate in the C code, and export a deallocator too.
-
Fred Foo about 11 years
You should probably return a copy of the string; use strdup or malloc for that. But really, if you want to do this kind of thing in C, then invest in a C book. C is quite different from higher-level languages such as Python.
-
Admin about 11 years
Aside from the problem you describe, your buffer is static, so there's only one for all calls, so the next call would change what the first return value points at. Keeping it local and not static means its lifetime ends when the function returns, which makes it unsuitable. That's not even touching on the buffer overflow vulnerability!
-
Thane Brimhall about 11 years
Heh, obviously a C noob here. :) If I remove static gcc gives me a warning. What's the proper way to allocate the memory for return? I'm just looking for something safe and straightforward.
-
Admin about 11 years
There is little safe or straightforward in C ;-) At least not if you work with a Python mindset. Read a good C book. Reading existing questions and answers here on Stack Overflow works in a pinch but I wouldn't bet on it. (Btw, gcc gives a warning for the very reason I hinted at: It's incorrect, you're returning the address of something that doesn't exist any more.)
-
-
Thane Brimhall about 11 years
Very nice! Thank you. A followup question on this answer: Since we're allocating the memory here, where/when is it deallocated? If I call this function 10k times, will I have an awful leak?
-
Geesh_SO about 11 years
Beat me to it. This should do the job (unless there are any other unforeseen problems). The reason it didn't work before was that by creating the string like you did, it was allocated on the stack and was thus lost once the function exited. The solution instead uses malloc to allocate on the heap, and returns to Python the location where to find the string.
-
Admin about 11 years
@ThaneBrimhall Of course, for every malloc you need to make a free.
-
Thane Brimhall about 11 years
Excellent explanation of how it all works. How would I deallocate the memory once I use the result in my binding?
-
Thane Brimhall about 11 years
I suppose I'd have to free it in the Python binding? Or where else would I do that?
-
Admin about 11 years
I don't know Python, but according to the rules set by your question, you would have to make a C function and pass the char* pointer to it.
-
Thane Brimhall about 11 years
Perfect, thank you! See this question for one way to do it.
-
Warren Weckesser about 11 years
The code should also check for malloc failing, and deal with it appropriately.
-
Thane Brimhall about 11 years
I can't do the "much better approach" you recommended (return a Python object) because I need to bind this function in multiple languages.
-
Admin about 11 years
More importantly, it should check the length. malloc rarely fails and if it does you get a crash. On the other hand, it's very easy to cause a buffer overflow with this code (or OP's code, to be fair). This isn't the nineties.
-
Admin about 11 years
@delnan Of course it should do both things, but I wasn't writing the perfect function, just an example of how to do it in the first place.
-
Admin about 11 years
Still, this is such a glaring problem, and these bugs have such a horrible history, that I would rather see a huge warning (or just fix it, strlen should help). I actually disagree with the malloc check proposed above, and it triggered me to comment at all.
-
Travis Griggs about 11 years
free(pointer). That's the opposite of malloc and friends. You'll have to provide a binding for that, or hope that you have one already; most languages that bind to C have some mechanism for doing that.
-
Admin about 11 years
Looks good to me, modulo stylistic issues that go beyond nitpicking. But I don't do this on a daily basis, so don't take it as a guarantee. (And again, I for one don't think the null check is worth the line of code it takes. But this is subjective.)
-
tdelaney about 11 years
Okay, that can be a problem! Another option is SWIG, which can bind several languages (swig.org/compat.html#SupportedLanguages). I use ctypes from time to time, but it can be unwieldy when the interface is complex.
-
tdelaney about 11 years
I added the Python code to the example. It's not tested but looks right to me (lol).
-
Pavel Minaev about 4 years
static variables are not allocated on the stack, and remain after the function returns, so this is wrong.
-
h0r53 over 3 years
Any ideas how to find free on Windows? libc doesn't exist and util.find_library('c') returns None.
-
h0r53 over 3 years
As a workaround I just defined my own C routine that accepts a char * and calls free. Then I call free using my own shared library code without needing to import libc or anything else.
-
Valentin Safonnikov about 2 years
This code produces memory leaks. Use the Python function create_string_buffer and fill the buffer in C code.