Hi there! First time posting on a Rebol/Ren-C/Red forum, I hope this is the right place for my question. I have been testing the ODBC functionality in Ren-C build 2.102.0.3.1 from 10-Aug-2017/5:25:53 on Windows 10 Pro.
Is there any way how to increase the available memory for the ODBC query? Because I have some kind of memory allocation problem while running my queries..
>> insert conn-port "select * from [Sheet0$] where Meno = 'ALENA'"
** Error: Couldn't allocate column buffer!
** Where: insert-odbc --anonymous-- insert
** Near: reduce compose [(sql)] ??
The query should return 1 record from an XLSX file with 13 columns, where 2 of the columns contain strings longer than 1500 characters. When I run the select query without these two columns the query runs fine and I receive my data.
I am a bit confused with this problem because in Rebol2/View I can run the same query with all 100 records (without the where clause) in a single statement without any memory problems. Am I missing something or is it a bug in the ODBC Ren-C implementation? Thank you very much for an answer and for your excellent work.
P.S. running TRACE ON crashes the executable immediately..
Welcome! Yes, this is a good place to ask questions regarding Rebol. I was once a skeptic of the idea of making a forum (I preferred our chat, GitHub Issues, Trello, StackOverflow Q&A...) and wondered if it would just be a burden to have another "place". But I think it turned out to be a good medium for longer-attention-span discussions that don't fit elsewhere.
The ODBC support was an attempt by me to get the ball rolling by hacking on an R3-Alpha extension written by Christian Ensel so that it could work in a minimal sense. I don't know how true that extension was to Rebol2, as I have no real experience with historical ODBC in Rebol (I did not arrive on the scene until the R3 times).
So over the span of a week or so I managed to get something going, against MySQL's ODBC driver on Windows. I concocted a single test file to see what it could do:
That's about as far as we've gotten, though it's better than nothing!!
So I don't presume to know what it should or should not do. It's mostly on those in the community who have a stake to come in and say what it would take for them to feel engaged and want to be involved. I've got a lot of questions about the PORT! design in the first place, and I also feel like there's a lot more STRING! use than I'd like to see for a dialect. One of Rebol's strengths is supposed to be DSLs, so it would seem a smoother and more integrated SQL using WORD!s and such would be an objective...
We could definitely use some more developers to hack on it with us. C programmers would be great but even writing tests and defining what it should do helps.
In any case, please feel free to introduce yourself and your stake in the project here!
Rebol2 was a mature product of many decades of work. We don't have the source code but it would seem some of its magic was not safe!
The error message suggested that the ODBC driver was not allocating enough memory for an unexpected result hence my suggestion that you cast the result first so that the driver would know to allocate the right amount of memory.
While the ODBC driver itself can work with the big columns without any problems (even though the datatype itself is "unsupported") I have not found a way how to tell the Ren-C ODBC module to accept it. Unfortunately I am not much of a C programmer and really can't fix any C code.
I have tested various SQL string operations and all work with the full column data, i.e. beyond the 255 char limit. So a potential work-around would be to split the big columns in SQL to several 255 char parts and then join them back in Ren-C. Or go back to Rebol2/View where it strangely enough doesn't have this problem.
Regarding myself, I use Rebol mostly for scripting short "glue" code to collect data from various sources, modify it and export it for further use in databases, making statistics, etc. I like it very much because it is small, multiplatform (I have used it on Solaris, NetBSD, MS Windows, Linux, MacOSX, even classic MacOS, ...), it is dynamic, easy to use, etc.. A real swiss army knife for any quick work. I have to admit I still fight with the parse rules for string processing so sometimes I use awk (which is an excellent small tool as well).
Yeah, I have tried. But even after reinstalling MS Office, installing the dependencies, it doesn't work for me (I have tried yesterday). But I have to admit I do have a strange combination of Office 2003 and Office 2010 Starter on my Windows computer so I am not complaining.
ODBC seemed to me like a more universal, cleaner and faster way. Hey, and it mostly works!
I'm not that surprised if SQL statements issued to some random "database" willingly works with an overlong VARCHAR(255), because you're just throwing instructions over to the database; it does all the work on its side, and there's not any issue with whether ODBC understands what it's asking.
What I might find surprising would be if the type indicator is formally marked VARCHAR(255) in its specification, yet using some clients it returns longer than 255 and this is tolerated...and for instance Rebol2 is such a client? Are you sure?
It might be possible, though sketchy. The way ODBC works, is that you provide memory buffers of sufficient size for each column, in a process called binding. It is the case that for some types, if you provide too small a buffer, you can get back a status code indicating this... SQL_SUCCESS_WITH_INFO with SQLSTATE 01004 (Data truncated)
It says:
When data truncation occurs, the application can sometimes allocate a larger buffer and refetch the data; this is not true in all cases. (emphasis mine)
The routine that would be returning SQLSTATE 01004 would be, I guess, SQLFetch...here:
So the deal would be that it's possible to notice you've been lied to about the length and a field wound up truncated, update the binding, and then fetch again perhaps. Of course you're operating in an inconsistent world, where the length limit is a lie, and I'd certainly think it's something any given ODBC client or driver would consider undefined behavior.
If you've got proof it's being done then it has been done, though it would help if that proof came along with source code to see how they were doing it. Hence if you can point to an open source application which has the feature, that would be more useful.
All this said, I don't know how much work we can promise. We'd like to have more participants here, but we don't really have a huge developer "budget" for things beyond our core focus unless people are willing to contribute in kind.
So ODBC wants Ren-C to allocate 1 GB of memory and this fails. However Rebol2/View can somehow deduce that this is some kind of a special type (by the SQL_DATA_TYPE property?) and allocates just enough for it to make it work.
Could it be that the datatype -1 means BLOB datatype? I have tried to search through Microsoft ODBC documentation but I have found nothing. They refer to types only by their names not by the constant values.
This is taken from Rebol2/View, so you believe me, (the big columns are number 7 and 8)..
I will try to search for some other ODBC implementation where it works and will report. Any tips what would be the easiest to understand and to port later on? Ruby? Scheme?
According to Microsoft ODBC sqlext.h header file, the SQL_DATA_TYPE -1 corresponds to SQL_LONGVARCHAR. This data-type has no fixed length limit, hence the ODBC reported BUFFER_LENGTH and COLUMN_SIZE of 1073741824 (which equals 1 GB).
So i think the problematic code is the line number 834 of the ODBC_BindColumns function in the ren-c/src/extensions/odbc/mod-odbc.c file, where the buffer size is determined:
The Microsoft recommended way how to handle this data-type is to use maximum allowed string size or a maximum known size of the source data. I have checked for Excel where the maximum cell size is 32K. It could be different (larger) for other ODBC databases. Actually that https://stackoverflow.com/a/9547441 reference talks about the 32K limit but as far as I can tell it is not enforced anywhere in the mod-odbc.c code.
So could some one able to compile Ren-C please devise a way how to cap the column_size to 32K and test it..
Great, thanks for looking at it and into the code itself...if people are willing to do more of that, I'm sure all this could be improved much faster! I've committed a change.
The Ren-C download page's HTML isn't getting updated right now, as @gchiu is in China and is firewalled out of the place where he runs the script. But the builds are being generated by Travis anyway, and are available by URL...if you just put the commit hash in. e.g. if you were downloading Windows 64-bit debug, the now-outdated last link that was at on the page is:
(Builds should be available after all their Travis entries complete, which can take some time as the Mac builds often wait in a queue.)
Of course if people build the system themselves, then they can start making patches, and I'd argue we're doing a reasonable job of keeping it the kind of C you can read with one book's worth of knowledge.
I have even got it running under both the 32bit and 64bit versions (had to do some black magic first to have both versions of the ODBC drivers installed at the same time).
A minor nitpick is that the 64bit r3-e6718d6-debug.exe version reports a memory leak? after quitting:
>> close c
>> close p
>> quit
*** PG_Mem_Usage = 136436 ***
Memory accounting imbalance: Rebol internally tracks how much
memory it uses to know when to garbage collect, etc. For
some reason this accounting did not balance to zero on exit.
Run under Valgrind with --leak-check=full --track-origins=yes
to find out why this is happening.
I can test further if you give me any instructions.
I'm not surprised there are presently leaks in the ODBC. It really was just a first cut at taking Christian's code and getting it to build and do something.
I'm more surprised you've managed to accomplish productive work with it.
At this moment there are several higher priorities in the core, but if you're blocked by an issue please bring it up. If your code is open source, it would also be interesting to see what you're doing and perhaps suggest how you could take advantage of new features.
This warning is displayed after the r3 runtime quits.
I was not doing anything fancy really, just run the r3-... exe REPL, opened the ODBC connection, run a simple query, closed the connection and quit. It did not crash, freeze or anything.
And it is only displayed by the 64bit r3-...exe. Could it be because of the -debug compile? The 32bit runtime does not have this message.
Yes: the memory balancing is only checked in the debug build, and program termination is the only time you can be sure the memory isn't being held onto for later use. A 32-bit debug build would have the same issue.
If you're only getting the message when you've run an ODBC query, then that's where the problem nearly certainly is.