ODBC: Maybe a Relevant Target for Tooling?

It turned out that @orr721 had been using Ren-C with ODBC for a couple of years (to process data from Microsoft Excel). I'd been on the fence about its relevance, because it seems a very "legacy" type of technology... I'm more interested in graph databases and non-SQL-oriented storage engines. Also, the PORT!-related cruft of its interface seemed poorly thought out (as I feel pretty much all ports have been shown to be).

But since he'd been using it and filed a bug report--and I've been asking people to give me reproducible tests for things they care about to put on Travis--I figured I could patch it up. I did so in a way that pushed it along even further as a libRebol client; the various commits I made flexed the muscles of the API and its approach.

But not wanting the work to atrophy again, I created a new rebol-odbc GitHub repository with its own .travis.yml file and issue database. I made it run a primordial but somewhat broad insert-and-select test, %odbc-test.reb, which puts various datatypes into a table from Rebol values and then checks that the same values can be gotten back out.

The test is now automated, running, and succeeding on MySQL as well as under SQLite on unixODBC. Linux is the more complicated environment to get things working on; ODBC originates from Microsoft, so it's more turnkey on Windows.
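For anyone who wants the flavor of what that test does without reading the Rebol: here's a minimal Python sketch of the same round-trip idea, using the stdlib sqlite3 module in place of an ODBC connection (the table name and values are just illustrative, not taken from %odbc-test.reb):

```python
import sqlite3

# In-memory SQLite database standing in for an ODBC data source
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test_int (id INTEGER PRIMARY KEY, val INTEGER NOT NULL)")

# Insert native values via parameterized statements...
original = [-10, 0, 10]
for v in original:
    cur.execute("INSERT INTO test_int (val) VALUES (?)", (v,))

# ...then read them back and check they survived the round trip
cur.execute("SELECT val FROM test_int ORDER BY id")
roundtrip = [row[0] for row in cur.fetchall()]

print("QUERY MATCHED ORIGINAL DATA" if roundtrip == original
      else "QUERY DID NOT MATCH ORIGINAL DATA")
```

The real test does this per-datatype (BIGINT, DECIMAL, BINARY, etc.), which is where the interesting failures show up.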

I Might Be Underestimating How Useful This Is

Rebol2's ODBC abilities were considered valuable enough that they were a paid feature. One implication of that is that few people ever had a chance to experience a single-download tool that can manipulate ODBC databases and translate fields to native types, while offering the language abilities of Rebol2. (So they sure haven't seen what such a tool with Ren-C can do!)

I went hunting for a vetted open-source alternative client that could do arbitrary SQL commands on ODBC databases on your machine. There was pretty much no "simple" option.

But I already had Python available on the machine:

shell> python --version
Python 2.7.15+

So since there was a "pyodbc" I thought I'd install it. However to get it, I had to install the Python Package Manager. This is what sudo apt install python-pip gave back:

The following additional packages will be installed:
  libexpat1-dev libjs-sphinxdoc libpython-all-dev libpython-dev libpython2-dev libpython2.7 libpython2.7-dev
  libpython2.7-minimal libpython2.7-stdlib python-all python-all-dev python-asn1crypto python-cffi-backend
  python-configparser python-crypto python-cryptography python-dev python-entrypoints python-enum34 python-idna
  python-ipaddress python-keyring python-keyrings.alt python-pip-whl python-pkg-resources python-secretstorage
  python-setuptools python-six python-wheel python-xdg python2-dev python2.7 python2.7-dev python2.7-minimal
Suggested packages:
  python-crypto-doc python-cryptography-doc python-cryptography-vectors python-enum34-doc gnome-keyring
  gir1.2-gnomekeyring-1.0 python-fs python-gdata python-keyczar python-secretstorage-doc python-setuptools-doc
  python2.7-doc binfmt-support
The following NEW packages will be installed:
  libexpat1-dev libjs-sphinxdoc libpython-all-dev libpython-dev libpython2-dev libpython2.7-dev python-all
  python-all-dev python-asn1crypto python-cffi-backend python-configparser python-crypto python-cryptography
  python-dev python-entrypoints python-enum34 python-idna python-ipaddress python-keyring python-keyrings.alt
  python-pip python-pip-whl python-pkg-resources python-secretstorage python-setuptools python-six python-wheel
  python-xdg python2-dev python2.7-dev
The following packages will be upgraded:
  libpython2.7 libpython2.7-minimal libpython2.7-stdlib python2.7 python2.7-minimal
5 upgraded, 30 newly installed, 0 to remove and 191 not upgraded.
Need to get 40.1 MB of archives.
After this operation, 66.7 MB of additional disk space will be used.

So I need to download and decompress 40.1 MB of data, to take up 66.7 MB of disk (spread across who knows how many files), when I already had a python interpreter on the machine? And this won't even get me pyodbc! :frowning:

All right, fine. After installing, I'd gone from 2.7.15+ to 2.7.16:

shell> python --version
Python 2.7.16

Running pip install pyodbc wasn't egregiously big, but still seemed dauntingly complex:

Collecting pyodbc
  Downloading https://files.pythonhosted.org/packages/75/29/aa190749bac37ede0f11a68a75e7055254699c11572bd94213f1163dfd8f/pyodbc-4.0.27.tar.gz (242kB)
100% |████████████████████████████████| 245kB 1.9MB/s 
Building wheels for collected packages: pyodbc
  Running setup.py bdist_wheel for pyodbc ... done
  Stored in directory: /home/hostilefork/.cache/pip/wheels/87/b7/78/0740a2ed98bfe463525ad42d535370e34141c5d36b2d00dcaf
Successfully built pyodbc
Installing collected packages: pyodbc
Successfully installed pyodbc-4.0.27

Putting aside the "what the heck is" ...wheels/87/b7/78/0740a2..., the package manager actually built the shared library. So the specification of the package is parallel to our "extensions"--presumably if you're on Windows, then pip install pyodbc would adapt its compiler settings and make a .DLL against Windows APIs, as opposed to a .so against POSIX. Unlike apt, it seems pip does installation on a per-user basis by default.

-rwxrwxr-x 1 hostilefork hostilefork 832848 Oct 18 17:33 /home/hostilefork/.local/lib/python2.7/site-packages/pyodbc.so*

So over 800K of binary just for the ODBC part. That doesn't make much sense, as nearly all the ODBC functionality already lives in shared libraries. What could all of that be?

Maybe There Is Opportunity Here

I don't know how big the cross-platform ODBC market is, and I've already mentioned my personal reluctance in getting involved in packaging binaries. But I did have a thought...

I've already suggested that the Trojan Horse of distributing the system might be to give people utilities built on Rebol that don't try to sell them the language, but try to sell them particular functionality (a dialect, etc.) built with the language.

What if every encapped application could sense its filename and then if it was r3 (or rebol, or some other list of special terms) it would automatically act as plain Rebol?

This means userbases could be identified to market a tool to: you build for the platforms of interest to them and encap the tool inside it.

I don't know what kind of ODBC-powered tool people might imagine, but it seems the world doesn't really even have a basic cross-platform, string-based SQL query tool sorted out. If you imagine just a few nice features built on the usermode console and a dialect, I wonder... might there even be money in that?

My own attention should be on the web build--but if there were others who came with an interest in this, it wouldn't hurt. I don't know. As I say, I just sensed a vacuum and was struck at how there might be an opportunity here. :man_shrugging:

I built my EMR using ODBC, and having it on both Windows and Linux is a big plus for me.

@gchiu : On the topic of this thread (cc: @BlackATTR, @orr721), please see the improvements to %odbc-test.reb which the small dialect change I posted about was able to make:

commit 3338fd0: "Make use of new ODBC-EXECUTE functionality"

When you think about how little code it took to write ODBC-EXECUTE and how clean it is, this may show some of the "opportunity" for building whatever is in the imagination. It's hard to think of anything else that packs that much punch with that much flexibility and can look good doing it. (Much less anything that can be squeezed down into a little standalone executable on any platform you can think of...)

I put more effort than I probably should have into seeing if I could set up Firebird on Linux as one of the Travis tests. But the unixODBC driver for it does not appear to be maintained, and as databases go its relative unpopularity makes it hard to deal with. There was a long series of problems at every point: not only is their driver only available from SourceForge (spam central, with no direct download link to use on Travis--you have to host it yourself)...the archive was corrupt and the driver didn't work.

I did eventually get it built, configured, and working with Python's pyodbc myself. But it seems a time sink to keep pushing on it. We'd be better off verifying and checking the PostgreSQL support.

When you get back in the mood to do database testing, please re-file any bugs on the relevant code you use.

Hi,

I have tested it with SQLite ODBC driver on Windows and it works as expected. (a simple SELECT statement for all records)

However, for some reason it can't find my tables in the "database" of the Excel ODBC connection (I get an error stating that ODBC could not find the table). I suspect it's some error with escaping special characters, as the Excel tables have names like Sheet0$, Sheet1$, etc. I can't test further because the 'tables statement itself doesn't work (I have filed an issue about this on GitHub).
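For reference, my understanding is that Excel's ODBC driver wants worksheet names bracket-quoted in the SQL text, since $ is not valid in a bare identifier. A tiny Python sketch of that quoting rule (the excel_table_ref helper is purely hypothetical illustration, not part of any API):

```python
# Worksheet names like Sheet1$ contain characters that are not valid in a
# bare SQL identifier, so Excel ODBC SQL typically wants them in brackets.
def excel_table_ref(sheet_name):
    # hypothetical helper: wrap the sheet name in [ ] for Excel ODBC SQL
    return "[" + sheet_name + "]"

query = "SELECT * FROM " + excel_table_ref("Sheet1$")
print(query)  # SELECT * FROM [Sheet1$]
```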

This is blocking further testing of my UTF-8 problem with the Excel ODBC connection as well. In the meantime I have verified that UTF-8 works when I convert the files to SQLite.

Regarding your previous suggestion to use the binary! type: how would I get a binary! result from the odbc-execute function?

thanks again!

The whole odbc-test.reb passed correctly for me on Windows 10 64-bit, SQLite3 ODBC Driver 32/64-bit v0.9996,

run with
r3-5ce6dff-debug-cpp.exe odbc-test.reb testsql3 --sqlite --show-sql

all tests returned
QUERY MATCHED ORIGINAL DATA

great..

I will try to modify it for Excel ODBC connection when my problems mentioned above are somehow resolved.

I guess it's PostgreSQL then, though I've avoided using it since I've mainly been doing embedded work.

Sorry, I was too excited and a bit too quick to report. I did not test the NUMBER=9 version (mingw-x86, OS_ID=0.3.1), as it failed the automatic Travis build. So this is the result now with the latest exe:

r3-e44b696.exe odbc-test.reb testsql3 --sqlite --show-sql

bigint_s does not match (is this expected as well?):

>> SQL: "DROP TABLE test_bigint_s"                                                                                                  
>> SQL: {CREATE TABLE test_bigint_s ( id INTEGER PRIMARY KEY NOT NULL , val BIGINT NOT NULL )}                                      
Inserting as BIGINT                                                                                                                 
[-9223372036854775808 -10 0 10 9223372036854775807]                                                                                 
>> SQL: "INSERT INTO test_bigint_s (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_s (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_s (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_s (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_s (val) VALUES ( ? )"                                                                              
>> SQL: "SELECT val FROM test_bigint_s"                                                                                             
=> [0 -10 0 10 -1]                                                                                                                  
QUERY DID NOT MATCH ORIGINAL DATA                                                                                                   

bigint_u fails hard: (this is expected according to comments in odbc-test.reb)

>> SQL: "DROP TABLE test_bigint_u"                                                                                                  
>> SQL: {CREATE TABLE test_bigint_u ( id INTEGER PRIMARY KEY NOT NULL , val BIGINT UNSIGNED NOT NULL )}                             
Inserting as BIGINT UNSIGNED                                                                                                        
[0 10 20 30 9223372036854775807]                                                                                                    
>> SQL: "INSERT INTO test_bigint_u (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_u (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_u (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_u (val) VALUES ( ? )"                                                                              
>> SQL: "INSERT INTO test_bigint_u (val) VALUES ( ? )"                                                                              
>> SQL: "SELECT val FROM test_bigint_u"                                                                                             
Test had an error: make error! [                                                                                                    
    type: _                                                                                                                         
    id: _                                                                                                                           
    message: "INTEGER! can't hold some unsigned 64-bit values"                                                                      
    near: [                                                                                                                         
        copy-odbc port/locals part ~~]                                                                                              
    where: [copy-odbc _ copy for-each trap do catch either else _ do console]                                                       
    file: _                                                                                                                         
    line: 113                                                                                                                       
]  
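That error is inherent to a signed 64-bit slot like Rebol's INTEGER!: one bit is the sign, so nothing above 2**63 - 1 can be represented. A quick Python illustration of the boundary (using struct.pack purely to stand in for a signed 64-bit field):

```python
import struct

# "<q" packs a signed little-endian 64-bit integer, the same width as
# Rebol's INTEGER!; the largest value it can hold is 2**63 - 1.
max_signed = 2**63 - 1
struct.pack("<q", max_signed)  # fits without complaint

try:
    struct.pack("<q", max_signed + 1)  # one past the signed range
    overflowed = False
except struct.error:
    overflowed = True

print(overflowed)  # True
```

The upper half of BIGINT UNSIGNED's range falls past that boundary, which is why the test comments say this failure is expected.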

decimal does not match:

>> SQL: "DROP TABLE test_decimal"                                                                         
>> SQL: {CREATE TABLE test_decimal ( id INTEGER PRIMARY KEY NOT NULL , val DECIMAL(3,2) NOT NULL )}       
Inserting as DECIMAL(3,2)                                                                                 
[-3.4 -1.2 0.0 5.6 7.8]                                                                                   
>> SQL: "INSERT INTO test_decimal (val) VALUES ( ? )"                                                     
>> SQL: "INSERT INTO test_decimal (val) VALUES ( ? )"                                                     
>> SQL: "INSERT INTO test_decimal (val) VALUES ( ? )"                                                     
>> SQL: "INSERT INTO test_decimal (val) VALUES ( ? )"                                                     
>> SQL: "INSERT INTO test_decimal (val) VALUES ( ? )"                                                     
>> SQL: "SELECT val FROM test_decimal"                                                                    
=> ["-3.^@" "-1.^@" "0" "5.6" "7.8"]                                                                      
QUERY DID NOT MATCH ORIGINAL DATA 

and finally binary does not match:

>> SQL: "DROP TABLE test_binary"                                                                          
>> SQL: {CREATE TABLE test_binary ( id INTEGER PRIMARY KEY NOT NULL , val BINARY(3) NOT NULL )}           
Inserting as BINARY(3)                                                                                    
[#{000000} #{010203} #{FFFFFF}]                                                                           
>> SQL: "INSERT INTO test_binary (val) VALUES ( ? )"                                                      
>> SQL: "INSERT INTO test_binary (val) VALUES ( ? )"                                                      
>> SQL: "INSERT INTO test_binary (val) VALUES ( ? )"                                                      
>> SQL: "SELECT val FROM test_binary"                                                                     
=> ["X'0^@剅^@^@^@譪" "X'0^@剅^@^@^@譪" "X'F^@剅^@^@^@譪"]                                                
QUERY DID NOT MATCH ORIGINAL DATA

I will be more careful next time..

Error reports should go to GitHub, right? I will create an issue there next time if that is the case...

Yes, please report errors on GitHub.

Let's tidy this thread up to be about the wisdom of taking ODBC-based tooling projects seriously, as something to try to get someone to invest in and support (despite ODBC's "legacy"-ish status and divergence from the main web focus).

I think ODBC is useful and a solid feature for Ren-C. The ability to connect to a DBMS is table stakes. On the other hand, I don't think ODBC is necessarily a significant draw for Ren-C either. For Windows development, I think LINQ has things sewn up, and on other platforms my guess is you run into PHP/WordPress, Ruby on Rails, the many frameworks of Python, Java, and then Node.js and that ilk. And while Python may have a lot of libraries/dependencies, I still think it would take a lot to peel devs away from the safety of that community, its documentation, support, etc.

Then again, a lot comes down to whatever compelling features there are in a Ren-C solution. While I don't think being smaller/more compact is enough to pull many devs to a new language paradigm, if there are other ideas in the mix it might tick enough boxes for some--e.g., WASM db-record editing (or dashboard UIs) that connect to cloud-based doc stores, graph databases, web APIs, etc. Buzzwords aside, that's the direction the industry is headed. Projects looking to replatform will justify their funding/ROI using the benefits of the new infrastructure (serverless, Node, WASM) rather than investing in code rewrites for legacy stacks. New projects will start from day one in the newer paradigm, as we've witnessed for the past few years.
